New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
c-ares 1.22.1 (and newer) does not change servers on a request timeout #650
Comments
Hmm, that seems really bad if true. I thought the test cases handled the retry scenarios and would switch servers. Let me investigate. |
ok, confirmed. Created PR #651 which also adds a test case for this. Once all CI/CD tests complete I'll merge into main as well as the 1.23 branch. |
bradh352
added a commit
that referenced
this issue
Dec 7, 2023
As of c-ares 1.22.0, server timeouts were erroneously not incrementing server failures meaning the server in use wouldn't rotate. There was apparently never a test case for this condition. This PR fixes the bug and adds a test case to ensure it behaves properly. Fixes Bug: #650 Fix By: Brad House (@bradh352)
bradh352
added a commit
that referenced
this issue
Dec 7, 2023
As of c-ares 1.22.0, server timeouts were erroneously not incrementing server failures meaning the server in use wouldn't rotate. There was apparently never a test case for this condition. This PR fixes the bug and adds a test case to ensure it behaves properly. Fixes Bug: #650 Fix By: Brad House (@bradh352)
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
If I understand correctly, with the new dynamic server list behaviour of #594, the first server in the server list is always used by
ares__requeue_query()
. But, when a query times out inprocess_timeouts()
, there is no call toserver_increment_failure()
, so the server list always stays in the same order, and thus the retry is send to the same server.This means that when a DNS server is not responding, c-ares gets stuck querying it forever, at least in all the cases I've seen. Real users often have misconfigurations such as non-functional IPv6 DNS server entries, etc. If a "bad" server is first in the list, no DNS lookups work at all.
All other (similar) calls to
ares__requeue_query()
are preceded by a call toserver_increment_failure()
. Adding that call inprocess_timeouts()
immediately beforeares__requeue_query()
locally fixed the behaviour in my testing so far.The text was updated successfully, but these errors were encountered: