Git storage file system watcher is leaked #750
Comments
Dropping a |
Not sure about the exact mechanism but the daemon, if it is used
correctly, does make an effort to properly shut down the protocol.
The `Peer` holds on to `Caches` in order to share it across protocol restarts.
Since the protocol itself also holds a reference to it, the file watchers won't
be dropped until the protocol is dropped. From the linked code, you can see that
the thread stops once the events iterator is exhausted (which it is when the
`Watcher` is dropped, and there is no other way because Rust developers like
their RAII).
Iow, if the protocol outlives the `Peer`, it will leak resources (and not
limited to caches). Unfortunately, I don't know of a way to ensure this
statically.
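To illustrate the mechanism described above, here is a minimal sketch using only the standard library. It is not librad's code: `Watcher`, `Caches` and `spawn_caches` are hypothetical stand-ins, and an `mpsc` channel plays the role of the file system event stream. The point is only that the thread exits once the events iterator is exhausted, which happens when the last shared reference to the caches is dropped, not when the first one is.

```rust
use std::sync::{mpsc, Arc};
use std::thread;

// Hypothetical stand-in for a file system watcher: it owns the sending half
// of the event channel, so dropping it closes the channel.
struct Watcher {
    _events: mpsc::Sender<&'static str>,
}

// Hypothetical stand-in for the caches shared by `Peer` and the protocol.
struct Caches {
    _watcher: Watcher,
}

// Spawns the event thread and returns the shared caches plus a join handle.
// The thread returns the number of events it saw before the channel closed.
fn spawn_caches() -> (Arc<Caches>, thread::JoinHandle<usize>) {
    let (tx, rx) = mpsc::channel();
    // Runs until the events iterator is exhausted, i.e. until every sender
    // (here: the single `Watcher`) has been dropped.
    let handle = thread::spawn(move || rx.iter().count());
    let caches = Arc::new(Caches {
        _watcher: Watcher { _events: tx },
    });
    (caches, handle)
}

fn main() {
    let (peer_ref, handle) = spawn_caches();
    // The "protocol" holds a second reference to the same caches.
    let protocol_ref = Arc::clone(&peer_ref);

    // Dropping the "Peer" reference alone does not stop the thread,
    // because the watcher is still alive through `protocol_ref`.
    drop(peer_ref);
    assert!(!handle.is_finished());

    // Only when the last reference goes away is the watcher dropped,
    // the channel closed, and the thread allowed to exit.
    drop(protocol_ref);
    let seen = handle.join().unwrap();
    assert_eq!(seen, 0);
}
```

In this sketch, anything that keeps a clone of the `Arc` alive keeps the thread (and, in the real case, the watcher's file descriptors) alive too.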
The only ways I can think of to fix this would be to either not share the caches
(which can incur significant startup delay on large repositories), persist
caches to disk (uh-oh), or tie the lifetime of the `Peer` handle to that of the
protocol it spawned (i.e. disallow protocol restarts). The latter would mean that
callers wishing to outlive the `Peer` need to manage a mutable reference to it,
or guard it behind a mutex, which isn't great either.
Thanks for clarifying this. I guess this means that the daemon doesn’t shut down the protocol properly. Am I right in assuming that dropping the |
It should.
Hm, I see that this actually consumes |
And it looks like it does on latest master. I’m able to reproduce the “Too many open files” error on a previous version of |
librad leaks file system watchers. When a librad `Peer` is dropped, threads that watch the file system remain. When a lot of `Peer`s are created, e.g. in tests, this leads to “Too many open files” errors. This issue may also cause the “Bad file descriptor” errors upstream is seeing on CI.
The leak is caused by spawned threads here
radicle-link/librad/src/net/protocol/cache.rs
Lines 89 to 92 in bfdc3ec
and here
radicle-link/librad/src/net/protocol/cache.rs
Lines 139 to 142 in bfdc3ec