Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[24.0 backport] libnetwork: just forward the external DNS response #45573

Merged
merged 1 commit into from May 18, 2023

Conversation

thaJeztah
Copy link
Member

Changes to the DNS resolver in #44664 fixed some bugs, including load-bearing bugs which masked other bugs. One bug not fixed by the aforementioned PR is that the resolver handles non-authoritative NXDOMAIN responses by failing over to the next external DNS server in the list. That behaviour does not make sense as authoritative responses only come from the authoritative name servers for a domain, and most of the time none of the servers in the list will be authoritative. What would make sense for handling a non-authoritative NXDOMAIN is to recursively resolve up to the authoritative name servers to get an authoritative answer to cache. But our DNS resolver is a stub resolver, not a recursive resolver, so there is no good reason for it to treat any NXDOMAIN response differently from a NOERROR response. (If you depended on this fail-over behaviour to successfully resolve domains because the first external DNS server in your list returns NXDOMAIN for the public internet, that DNS server is broken and needs to be fixed. If we can make a split-horizon DNS resolver which resolves public domain names by forwarding, so can you.)

Commit 9cf8c4f fixed a bug in the fail-over behaviour by replying SERVFAIL if none of the external name servers provided a successful response instead of falling through to forward the last non-successful response received. Unfortunately that was a load-bearing bug which masked the aforementioned NXDOMAIN issue.

- What I did
Our resolver is just a forwarder for external DNS—a stub resolver—so it should act like it. Unless it's a server failure or refusal, take the response at face value and forward it along to the client. RFC 8020 is only applicable to caching recursive name servers and our resolver is neither caching nor recursive.

- How I did it
Deleted code which had no business existing in a DNS stub resolver.

- How to verify it

$ docker network create mybridge
$ docker run --rm -t --network mybridge --dns 8.8.8.8 alpine sh -c 'apk add -qU bind-tools && dig nxdomain.test'

v24.0.0:

; <<>> DiG 9.18.14 <<>> nxdomain.test
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 45178
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;nxdomain.test.			IN	A

;; Query time: 4 msec
;; SERVER: 127.0.0.11#53(127.0.0.11) (UDP)
;; WHEN: Thu May 18 14:12:18 UTC 2023
;; MSG SIZE  rcvd: 31

This PR:

; <<>> DiG 9.18.14 <<>> nxdomain.test
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 34098
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;nxdomain.test.			IN	A

;; AUTHORITY SECTION:
.			86398	IN	SOA	a.root-servers.net. nstld.verisign-grs.com. 2023050900 1800 900 604800 86400

;; Query time: 6 msec
;; SERVER: 127.0.0.11#53(127.0.0.11) (UDP)
;; WHEN: Thu May 18 14:15:05 UTC 2023
;; MSG SIZE  rcvd: 117

- Description for the changelog

  • Fixed an issue where DNS query NXDOMAIN replies from external servers were forwarded to the client as SERVFAIL

- A picture of a cute animal (not mandatory but encouraged)

Our resolver is just a forwarder for external DNS so it should act like
it. Unless it's a server failure or refusal, take the response at face
value and forward it along to the client. RFC 8020 is only applicable to
caching recursive name servers and our resolver is neither caching nor
recursive.

Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 4135622)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
@thaJeztah thaJeztah merged commit 0e605cf into moby:24.0 May 18, 2023
94 checks passed
@thaJeztah thaJeztah deleted the 24.0_backport_fix_dns_servfail branch May 18, 2023 22:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants