Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use same parameter for DNS.allocate_request_id/free_request_id to fix leak #5722

Merged
merged 1 commit into from Oct 28, 2019

Conversation

@colinsurprenant
Copy link
Contributor

colinsurprenant commented Apr 29, 2019

As described in logstash-plugins/logstash-filter-dns#52

resolv.rb was updated from upstream Ruby stdlib starting at JRuby 9.2.0.0 which essentially re-introduced issue logstash-plugins/logstash-filter-dns#40 where all DNS request will eventually systematically time out.

This problem will only be triggered if there are multiple nameserver hosts specified OR if there is no nameserver host but there are more than one host in /etc/resolv.conf resulting in the usage of the UnconnectedUDP class in resolv.rb.

Essentially the problem is that the DNS.allocate_request_id(host, port) call will cache the allocated id using the host and port parameters.

But the cache cleanup will be done in the close method by the DNS.free_request_id(service[0], service[1], id) call where the service[0] will be IPAddr.new(host) and not the original host string used in the sender method leading to not finding the original cache key and not freeing it from the cache which is 64k elements wide and once full, the allocate_request_id method will loop forever.

This fix just makes sure that the exact same parameters are used in both DNS.allocate_request_id and DNS.free_request_id using the IPAddr#to_s which will always yield the same host string.

I am planning on submitting this issue upstream in Ruby but I figured I'd also submit here to provide a potential quicker turnaround for this nasty bug.

@kares kares added this to the JRuby 9.2.8.0 milestone Apr 30, 2019
@colinsurprenant

This comment has been minimized.

Copy link
Contributor Author

colinsurprenant commented Apr 30, 2019

I just checked in upstream Ruby and it actually does not have this problem.
This bug in the JRuby version of resolv.rb was actually introduced by #4496

@headius

This comment has been minimized.

Copy link
Member

headius commented Apr 30, 2019

@colinsurprenant Ahh thanks for the footwork on this. We need to sort out what's the correct logic and do our best to sync it up between JRuby and CRuby.

@headius

This comment has been minimized.

Copy link
Member

headius commented Apr 30, 2019

@colinsurprenant Oh, so...if we pull CRuby's actual resolv.rb unmodified are we reintroducing that other issue?

@colinsurprenant

This comment has been minimized.

Copy link
Contributor Author

colinsurprenant commented Apr 30, 2019

@headius yeah if you pull the latest CRuby resolv.rb it will revert #4496. I haven't really paid attention to what exactly #4496 does/fixes, I can certainly verify that and see if that fix is still needed and/or relevant in CRuby and to maybe submit it upstream?

@headius

This comment has been minimized.

Copy link
Member

headius commented Apr 30, 2019

@colinsurprenant Ah yes, @enebo refreshed my memory. So I believe #4496 was because our IPAddr and related structures were not quite matching CRuby...just another part of the complicated socket subsystem that we're not emulating exactly right. But things have changed in the meantime and a whole bunch of socket compat fixes got merged in within the last two years...we might be ok?

@colinsurprenant

This comment has been minimized.

Copy link
Contributor Author

colinsurprenant commented Apr 30, 2019

@headius sounds good, I can certainly take a look!

@headius

This comment has been minimized.

Copy link
Member

headius commented Aug 2, 2019

@colinsurprenant Any status update on this? I don't have a problem merging this to fix JRuby but it would obviously be nicer if we could move toward MRI's version rather than further away.

@headius headius modified the milestones: JRuby 9.2.8.0, JRuby 9.2.9.0 Aug 12, 2019
@colinsurprenant

This comment has been minimized.

Copy link
Contributor Author

colinsurprenant commented Aug 12, 2019

Dropped the ball on this :(
Let's try for 9.2.9.0.

@headius

This comment has been minimized.

Copy link
Member

headius commented Oct 27, 2019

@colinsurprenant Ping again. 😁

I'm still unclear on whether we should merge this or update to a different version of MRI's resolv.rb (or whether we might already have the best version due to the stdlib 2.7 update going into 9.2.9.)

@headius

This comment has been minimized.

Copy link
Member

headius commented Oct 28, 2019

Updating some details on my own...

The patch that @colinsurprenant proposes here does make our resolv.rb diverge further from the CRuby version. However the bug is caused by a separate divergence, and this just fixes that patch to work better.

If I'm understanding correctly, the change introduced in #4496 causes this to break because the service variable accessed in the free_request_id call is the version from IPAddr.new, which does not properly match the direct host version used by the call to allocate_request_id.

So I think the patch is is fine; the fix in #4496 was incomplete in that it only used the processed host along one path.

@headius headius merged commit 439aceb into jruby:master Oct 28, 2019
2 checks passed
2 checks passed
continuous-integration/travis-ci/pr The Travis CI build passed
Details
jruby.jruby Build #20190429.1 succeeded
Details
headius added a commit to jruby/ruby that referenced this pull request Oct 28, 2019
This patch is from jruby/jruby#5722, which fixes some missed logic
in the patch from jruby/jruby#4496.
@jsvd jsvd mentioned this pull request Nov 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

3 participants
You can’t perform that action at this time.