Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make sure to iterate over all keys when cancelling keys in a SelectorPool #4471

Merged
merged 1 commit into from Feb 4, 2017

Conversation

@cheister
Copy link

cheister commented Feb 3, 2017

When we updated our server from 9.1.2.0 to 9.1.7.0 we started seeing a file descriptor leak which, if left for long enough, would run the machine out of file descriptors.

We tracked down the problem to #3952 which changed how SelectionKeys were closed.

It looks like if exceptions aren't caught while calling cancel on the SelectionKeys then the loop can exit early and leave some keys un-cancelled.

When we went back to the same key cancelling code that was used in 9.1.2.0 the problem went away.

@cheister
Copy link
Author

cheister commented Feb 3, 2017

@mohamedhafez
Copy link
Contributor

mohamedhafez commented Feb 3, 2017

Hmm, what exceptions are being thrown at https://github.com/jruby/jruby/pull/4471/files#diff-f99fc044de4e2aa68b3bb1f212348b07R92 ? If the cancelling fails for some reason and the SelectionKey is still valid and uncancelled, then it will lead to old SelectionKeys being returned in Selectors returned by retrieveFromPool, which is what was causing the problems that lead me to file #3952 in the first place.

If the SelectionKey isn't valid (i.e. calling isValid on it returns false), then it won't cause the problems I observed and I suppose it would be fine to ignore them the way we were doing prior to my PR, but perhaps it would be best to handle them?

@cheister
Copy link
Author

cheister commented Feb 3, 2017

The exceptions are getting swallowed somewhere so I haven't seen what they are. I could try adding logging to capture them if you're interested?

From what you're saying it sounds like I should remove the if (key.isValid()) check from the loop?

@mohamedhafez
Copy link
Contributor

mohamedhafez commented Feb 3, 2017

@cheister yeah could you put in logging to see what the exceptions are, and also to check the status of key.isValid() inside the catch statement at https://github.com/jruby/jruby/pull/4471/files#diff-f99fc044de4e2aa68b3bb1f212348b07R93

@cheister cheister force-pushed the cheister:file-descriptor-leak branch from 91a7b27 to cfff122 Feb 3, 2017
@cheister
Copy link
Author

cheister commented Feb 3, 2017

The exception being thrown was an NPE because the key was null. So changing the code to check for null also makes it work.

@cheister cheister force-pushed the cheister:file-descriptor-leak branch from cfff122 to ac1d2da Feb 3, 2017
@headius headius merged commit 3f97c7a into jruby:master Feb 4, 2017
1 check failed
1 check failed
continuous-integration/travis-ci/pr The Travis CI build failed
Details
@headius
Copy link
Member

headius commented Feb 4, 2017

Great find, thank you!

@headius headius added this to the JRuby 9.1.8.0 milestone Feb 4, 2017
@cheister cheister deleted the cheister:file-descriptor-leak branch Feb 4, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

3 participants
You can’t perform that action at this time.