Never-ending getaddress() call when using compressed IPv6 nameservers in /etc/resolv.conf #3663
I managed to reproduce the following bug on a CentOS 7.2 box with only IPv6 nameservers and dual stack, jRuby 220.127.116.11 and java-1.7.0-openjdk-18.104.22.168-22.214.171.124.el7_2. The same code works fine on MRI 2.0.0. Perhaps this has already been fixed in newer versions, since the bug looks too obvious to me to still be alive. Anyway, here we go:
The following program and the resolv.conf above make the interpreter hang for a long time and then return a failure:
require 'resolv'
puts Resolv.getaddress 'web.cern.ch'
These are the last lines in a debugging session before the first exception is raised. The program blocks in the select() call until it times out and ResolvTimeout is raised.
I think that this is happening because during the previous iteration this statement was evaluated as false because
was nil. Why was it nil? Because the search keys didn't match, as 'from' contains the uncompressed flavor of the IPv6 address of the DNS server, whereas 'senders' has the compacted form (presumably coming from /etc/resolv.conf):
This situation leads to the outer loop not stopping (see "unexpected DNS message ignored"), so the program executes IO.select again, but there is nothing left to read and ResolvTimeout is raised. This exception is probably caught by the caller further up, and at some point .request is called again, leading to another loop. This repeats many times until, after a few minutes, the call returns a ResolvError all the way back up to the user.
Handcrafting resolv.conf so all addresses are expanded there makes resolv.rb happy:
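The mismatch described above can be sketched in isolation. This is only an illustration, not the actual resolv.rb code: `find_sender` and the plain hash are hypothetical stand-ins for the sender table, and the expanded string is an assumption about how the JVM reports the peer address.

```ruby
# Hypothetical stand-in for the sender lookup in resolv.rb:
# keys are [[host, port], request_id], values are sender objects.
def find_sender(senders, host, port, request_id)
  senders[[[host, port], request_id]]
end

# Compressed form, as it would be written in /etc/resolv.conf.
CONFIGURED_HOST = '2001:4860:4860::8888'
# Expanded form, assumed here to be what recvfrom reports on the JVM.
RECEIVED_HOST = '2001:4860:4860:0:0:0:0:8888'

senders = { [[CONFIGURED_HOST, 53], 4242] => :pending_sender }

# Same server, same request id, but the strings differ, so the lookup misses
# and the reply is treated as an unexpected DNS message.
puts find_sender(senders, RECEIVED_HOST, 53, 4242).inspect
# Looking up with the configured spelling finds the sender.
puts find_sender(senders, CONFIGURED_HOST, 53, 4242).inspect
```

Because the keys are compared as plain strings, two spellings of the same IPv6 address count as different servers.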
This way the program quickly exits without hanging:
Configuring a Resolv::DNS object by hand with a compressed IPv6 address seems to work.
I can reproduce the problem using the latest jRuby 1.x:
On the other hand:
The program exits normally.
Same as above but stopping to inspect:
And the same debugging process but using MRI (which works with compacted addresses):
(It's getting quite amusing to speak to myself :))
It's also interesting to observe that the problem is only triggered when more than one (compressed) IPv6 nameserver is available and therefore UnconnectedUDP is used. I realized this after trying to reproduce the problem by creating Resolv::DNS objects with manual configuration, aiming to take resolv.conf out of the equation.
So first test case, two IPv6 nameservers with compressed addresses. As shown in previous comments this should trigger the bug and therefore fail.
Now with only one compressed nameserver. Note that Resolv::DNS::Requester::ConnectedUDP is used instead:
Finally, the last test case for completeness' sake: more than one IPv6 nameserver without compressed addresses. This should also work as seen before (when I expanded the addresses in resolv.conf by hand).
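The three test cases above could be set up roughly as follows. This is a sketch under assumptions: it uses Google's public IPv6 resolvers rather than the nameservers from the original session, and `resolver_for`, `TEST_CASES` and `run_test_cases` are names made up for illustration. `Resolv::DNS.new` accepts a `:nameserver` list and does not contact the servers until the first query.

```ruby
require 'resolv'

# Google's public IPv6 resolvers, in compressed and expanded notation.
COMPRESSED = ['2001:4860:4860::8888', '2001:4860:4860::8844']
EXPANDED   = ['2001:4860:4860:0:0:0:0:8888', '2001:4860:4860:0:0:0:0:8844']

# Build a resolver from an explicit nameserver list, bypassing /etc/resolv.conf.
def resolver_for(nameservers)
  Resolv::DNS.new(nameserver: nameservers)
end

TEST_CASES = {
  'two compressed nameservers' => resolver_for(COMPRESSED),
  'one compressed nameserver'  => resolver_for(COMPRESSED.first(1)),
  'two expanded nameservers'   => resolver_for(EXPANDED)
}

# Performs real DNS queries over IPv6, so it only runs when asked to.
def run_test_cases(name = 'web.cern.ch')
  TEST_CASES.each do |label, dns|
    begin
      puts "#{label}: #{dns.getaddress(name)}"
    rescue Resolv::ResolvError, Resolv::ResolvTimeout => e
      puts "#{label}: #{e.class}"
    end
  end
end

run_test_cases if ARGV.include?('--run')
```

Run with `--run` on a host with IPv6 connectivity; on an affected jRuby the first case is the one expected to hang.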
This could be a candidate for a patch, so the responsibility to compare addresses is delegated to IPAddr (instead of merely comparing strings):
--- lib/ruby/1.9/resolv.rb.orig 2016-02-14 12:16:06.916368212 +0100
+++ lib/ruby/1.9/resolv.rb 2016-02-14 12:30:17.925784003 +0100
@@ -2,6 +2,7 @@
 require 'fcntl'
 require 'timeout'
 require 'thread'
+require 'ipaddr'
 
 begin
   require 'securerandom'
@@ -728,13 +729,13 @@
 
         def recv_reply(readable_socks)
           reply, from = readable_socks[0].recvfrom(UDPSize)
-          return reply, [from[3],from[1]]
+          return reply, [IPAddr.new(from[3]),from[1]]
         end
 
         def sender(msg, data, host, port=Port)
           sock = @socks_hash[host.index(':') ? "::" : "0.0.0.0"]
           return nil if !sock
-          service = [host, port]
+          service = [IPAddr.new(host), port]
           id = DNS.allocate_request_id(host, port)
           request = msg.encode
           request[0,2] = [id].pack('n')
This way it does not matter if the address returned by recvfrom is compressed or expanded, as the one it has to be compared to is also an IPAddr object.
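A small standalone sketch of why this works: IPAddr objects built from the compressed and expanded spellings of the same address compare equal and hash alike, so they can be used inside hash keys. `service_key` is a hypothetical helper for illustration, not part of resolv.rb.

```ruby
require 'ipaddr'

# Build a lookup key with the host as an IPAddr, so equality no longer
# depends on how the address was written.
def service_key(host, port, request_id)
  [[IPAddr.new(host), port], request_id]
end

senders = {}
senders[service_key('2001:4860:4860::8888', 53, 4242)] = :pending_sender

# Same server in expanded notation: the key still matches.
puts senders[service_key('2001:4860:4860:0:0:0:0:8888', 53, 4242)].inspect
# A different server does not match.
puts senders[service_key('2001:4860:4860::8844', 53, 4242)].inspect
```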
BTW, the IPv6 nameservers used above for all the examples are not accessible from the Internet, so expect a ResolvTimeout to be raised anyway, masking the bug. It should be easy to reproduce the problem, though, using Google's public nameservers (2001:4860:4860::8888 and 2001:4860:4860::8844).
Now using Google's nameservers. Slightly different test environment though (OSX, Java8) but same results:
And unpatched again but with uncompressed addresses:
Ok, this got backburnered for a while, but I'm not quite sure what to do at this point. I need to know...
Sorry this got left for so long!
Sorry, I missed your update :(
It does not, as stated in the third comment.
It is, I believe; however, I selfishly only care about 1.7, which is what we have to run as part of Puppetserver 1.x :)
The way to reproduce the bug is explained above in detail, you need:
Demos (using the latest 9k that I could find):
MRI, compacted IPv6 nameservers: it works.
jRuby 126.96.36.199, compacted nameservers, it hangs:
jRuby 188.8.131.52, expanded nameservers, it works:
And for completeness, jRuby 1.7.26 and compacted nameservers, it hangs:
Those nameservers cannot be used from the Internet, however this should be reproducible using Google's public IPv6 nameservers: 2001:4860:4860::8888 and 2001:4860:4860::8844.
Hope this helps.
And yes, this is still an issue in jRuby 184.108.40.206.
I commented on #4496 but I'll mention it again here: I'd be more comfortable patching the standard library if we could prove MRI exhibits the same problem.
I also apologize for missing all your activity! Much of your self-conversation here took place while @enebo and I were at FOSDEM and not monitoring issues as closely. We'll get this wrapped up for 220.127.116.11.