DNS resolver locking up entire MRI process #2175
If you look at the methods that are patched by |
Ok, but why would that block all threads on a modern MRI version? |
If I call |
I did this:
require 'thread'
require 'rubydns'
class MyServer < RubyDNS::Server
def process(name, resource_class, transaction)
sleep 3
end
end
RubyDNS.run_server(asynchronous: true, server_class: MyServer)
t = Thread.new do
while true
print '.'
sleep 1
end
end
Resolv.getaddress "fake.host"
t.join
And updated the local resolv.conf to use 127.0.0.1. I got back NXDOMAIN after 3 seconds, just as expected, with 3 dots printed to screen. |
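The same check can be made a little more mechanical. The following is only a sketch: `ticks_during` is a hypothetical helper (not part of Sidekiq or Ruby), and `sleep` stands in for the slow lookup so the snippet runs without a DNS server. The idea is to count how often a background thread gets scheduled while a block runs; a count near zero would suggest the block held the GVL the whole time.

```ruby
# Hypothetical helper: counts how many times a background thread gets to run
# while the given block executes.
def ticks_during(interval: 0.05)
  ticks = 0
  ticker = Thread.new do
    loop do
      sleep interval
      ticks += 1
    end
  end
  yield
  ticks
ensure
  # Always stop the ticker, even if the block raised.
  ticker&.kill
  ticker&.join
end

# `sleep` is a stand-in for a slow lookup; it releases the GVL, so the
# ticker keeps running and the count is well above zero.
count = ticks_during { sleep 0.3 }
puts "background thread ran #{count} times"
```

To test a real lookup, the block could be swapped for something like `Resolv.getaddress("fake.host")` or `TCPSocket.new(host, port)`; which of those (if any) stalls other threads is exactly the open question in this thread.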
What about the |
There's no such method. |
> require 'socket'
=> true
> TCPSocket.connect
NoMethodError: undefined method `connect' for TCPSocket:Class |
I can totally buy that a native gem like mysql2 or pg might call into their native client libraries which might perform a native DNS lookup without releasing the GIL. Is that the scenario you are referring to? |
Er, sorry, I meant As for native gems, if that were the cause, using |
I'm still struggling to put together sample code which actually shows a full process lockup. Making a bad request with either Typhoeus or Curb works fine. TCPSocket.new works fine. Socket.gethostbyname works fine. I wonder if it is platform-dependent? I'm using Ruby 2.2 on Ubuntu 14.04 LTS. |
We just had a half-hour worker brown-out, shortly after upgrading from Ruby 2.7.2 to 2.7.3, caused entirely by hundreds of errors from sidekiq workers, which were all occurring in During investigation I found this: https://bugs.ruby-lang.org/issues/17781 which includes a repro that works on 2.7.3, 3.0.0, and 3.0.1 (but not 2.7.2).
I found this issue while looking to figure out a Question 1: Should I use
SOLUTION: Remove
Question 2: How do I fix the freeze caused by the bug in SOLUTION: Switch to pulling in Gemfile:
and then:
Confirmed that does patch the bug in Ruby 2.7.3:
|
@pboling Hi there, first off, thanks for the info! For context, I'm coming to this issue after debugging a deadlocking issue described best by New Relic's docs in which I intend to use I'm a little confused by your rationale. I'm reading your suggestion as "don't use Are you using |
Correct. The way Ruby ships with "bundled" gems now is far different than what it used to be. A new version of the When you specify the gem So to get a fixed version of the standard If that doesn't fix it, then it does sound like using the pure Ruby Just be aware of the downsides... (which I gleaned from third party sources all over the internet, not from directly understanding the source code, so could be wrong): • no IPv6 support, |
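When it is unclear which copy of the resolver a process actually loaded, a quick check along these lines may help. This is a sketch, not something from the thread: the helper names are made up for illustration, and it only reports what RubyGems and `$LOADED_FEATURES` know about, so the gem version can legitimately come back as nil on Rubies where the file was loaded straight from the standard library.

```ruby
require 'resolv'

# Hypothetical helper: path of the resolv.rb file that was actually loaded.
def loaded_resolv_path
  $LOADED_FEATURES.find { |path| File.basename(path) == 'resolv.rb' }
end

# Hypothetical helper: version of the resolv gem if RubyGems activated one,
# or nil when the file was loaded without gem activation.
def activated_resolv_version
  return nil unless defined?(Gem)
  spec = Gem.loaded_specs['resolv']
  spec && spec.version.to_s
end

puts "resolv loaded from: #{loaded_resolv_path}"
puts "resolv gem version: #{activated_resolv_version || 'not activated as a gem'}"
```

Running this inside the application (for example from a console started with Bundler) shows whether a Gemfile entry actually changed which file gets loaded.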
Several customers have reported Sidekiq "freezing" over the last 3-6 months. When I had them put
require 'resolv-replace'
in their initializer, the problem went away. What might cause this?
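For anyone wanting to confirm that the require took effect, here is a small sketch. It relies on one detail of the library as I understand it: `resolv-replace` keeps the original C-backed lookup under the alias `original_resolv_getaddress` on `IPSocket`. The helper name `resolv_replace_active?` is hypothetical. An IP literal is used for the lookup so nothing goes over the network.

```ruby
require 'socket'
require 'resolv-replace'

# Hypothetical helper: resolv-replace aliases the original lookup before
# overriding it, so the presence of the alias indicates the patch is loaded.
def resolv_replace_active?
  IPSocket.respond_to?(:original_resolv_getaddress)
end

puts "resolv-replace active: #{resolv_replace_active?}"

# With the patch in place this call is served by the pure-Ruby Resolv library.
# An IP literal needs no DNS query, so it is safe to run offline.
puts IPSocket.getaddress('127.0.0.1')
```

This only shows that the socket classes were patched; it does not by itself explain why the patched path avoids the freeze.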