Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Git storage file system watcher is leaked #750

Closed
geigerzaehler opened this issue Aug 11, 2021 · 6 comments
Closed

Git storage file system watcher is leaked #750

geigerzaehler opened this issue Aug 11, 2021 · 6 comments

Comments

@geigerzaehler
Copy link
Contributor

librad leaks file system watchers. When a librad Peer is dropped there remain threads which watch the file system. When a lot of Peers are created e.g. in tests this leads to “Too many open files” errors. This issue may also cause “Bad file descriptor errors” upstream is seeing on CI.

The leak is caused by spawned threads here

thread::spawn({
let filter = Arc::clone(&inner);
move || recache_thread(storage, filter, events, observe)
});

and here

let bob = thread::spawn({
let span = span.clone();
let shutdown = Arc::clone(&shutdown);
let rebuild = Arc::clone(&rebuild);

@kim
Copy link
Contributor

kim commented Aug 11, 2021

Dropping a Peer does not drop the protocol state.

@geigerzaehler
Copy link
Contributor Author

Dropping a Peer does not drop the protocol state.

Not sure about the exact mechanism but the daemon, if it is used correctly, does make an effort to properly shutdown the protocol.

@kim
Copy link
Contributor

kim commented Aug 11, 2021 via email

@geigerzaehler
Copy link
Contributor Author

Thanks for clarifying this. I guess this means that the daemon doesn’t shut down the protocol properly. Am I right in assuming that dropping the Peer and calling the function returned by Bound::accept() while completing the future shuts down everything? Or is there something missing from radicle_daemon::peer::Peer::start()?

@kim
Copy link
Contributor

kim commented Aug 11, 2021

Am I right in assuming that dropping the Peer and calling the function returned by Bound::accept() while completing the future shuts down everything?

It should.

Or is there something missing from radicle_daemon::peer::Peer::start()?

Hm, I see that this actually consumes self which is not Clone. So I’m not sure how we could end up with a “dangling” Peer. Is this used somehow differently in upstream?

@geigerzaehler
Copy link
Contributor Author

It should.

And it looks like it does on latest master. I’m able to reproduce the “Too many open files” error on a previous version of radicle-proxy but not anymore. Using lsof also confirmed that inotify was the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants