
kube-dns never resolves if a domain returns NOERROR with 0 answer records once #121

Closed
ahmetb opened this issue Jun 29, 2017 · 22 comments · Fixed by kubernetes/kubernetes#53604

@ahmetb
Member

ahmetb commented Jun 29, 2017

tl;dr: If a nameserver replies with status=NOERROR and an empty answer section to a DNS A query, kube-dns caches this result indefinitely. If the domain name actually gets an A record after it has been queried through kube-dns, it never resolves from the Pods (I waited a few days), even though it resolves just fine outside the cluster (e.g. on my laptop).

Repro steps

Prerequisites

  • Have a domain name (alp.im) whose nameservers point to CloudFlare.
  • Have nslookup/dig installed on your workstation.
  • Have a minikube cluster ready on your workstation
    • running kubernetes v1.6.0
    • kube-dns is deployed by default, running gcr.io/google_containers/k8s-dns-kube-dns-amd64:1.14.1

Step 1: Domain does not exist, query from your laptop

Note ANSWER: 0, and status: NOERROR

$ dig A z.alp.im

; <<>> DiG 9.8.3-P1 <<>> A z.alp.im
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 64978
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;z.alp.im.			IN	A

;; AUTHORITY SECTION:
alp.im.			1799	IN	SOA	ivan.ns.cloudflare.com. dns.cloudflare.com. 2025042470 10000 2400 604800 3600

;; Query time: 196 msec
;; SERVER: 2401:fa00:fa::1#53(2401:fa00:fa::1)
;; WHEN: Thu Jun 29 10:51:35 2017
;; MSG SIZE  rcvd: 99
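
A NOERROR reply with zero answer records is the "NODATA" case that triggers this bug. As a quick sanity check, here is a sketch that classifies a captured dig output; the heredoc stands in for the real output shown above:

```shell
# Classify a DNS reply: NODATA = rcode NOERROR but zero answer records.
# The heredoc stands in for captured `dig` output (copied from the run above).
reply=$(cat <<'EOF'
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 64978
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
EOF
)

status=$(printf '%s\n' "$reply" | sed -n 's/.*status: \([A-Z]*\),.*/\1/p')
answers=$(printf '%s\n' "$reply" | sed -n 's/.*ANSWER: \([0-9]*\),.*/\1/p')

if [ "$status" = NOERROR ] && [ "$answers" -eq 0 ]; then
  echo "NODATA (NOERROR with 0 answers)"   # this is the case kube-dns caches
else
  echo "status=$status answers=$answers"
fi
```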

Step 2: Domain does not exist, query from Pod on Kubernetes

Start a toolbelt/dig container with a shell and run the same query:

⚠️ Do not exit this container as you will reuse it later.

Note the response is the same, ANSWER: 0 and NOERROR.

$ kubectl run -i -t --rm --image=toolbelt/dig dig --command -- sh
If you don't see a command prompt, try pressing enter.
/ # dig A z.alp.im

; <<>> DiG 9.11.1-P1 <<>> A z.alp.im
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 11209
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;z.alp.im.			IN	A

;; AUTHORITY SECTION:
alp.im.			1724	IN	SOA	ivan.ns.cloudflare.com. dns.cloudflare.com. 2025042470 10000 2400 604800 3600

;; Query time: 74 msec
;; SERVER: 10.0.0.10#53(10.0.0.10)
;; WHEN: Thu Jun 29 17:55:46 UTC 2017
;; MSG SIZE  rcvd: 99

(Also note SERVER: 10.0.0.10#53, which is kube-dns.)

Step 3: Create an A record for the domain

Here I use CloudFlare as it manages my DNS.

[Screenshot: adding an A record for z.alp.im in the CloudFlare dashboard]

Step 4: Test DNS record from your laptop

Run dig on your laptop (note the ;; ANSWER SECTION: with the 8.8.8.8 answer):

$ dig A z.alp.im

; <<>> DiG 9.8.3-P1 <<>> A z.alp.im
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 37570
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;z.alp.im.			IN	A

;; ANSWER SECTION:
z.alp.im.		299	IN	A	8.8.8.8

;; Query time: 196 msec
;; SERVER: 2401:fa00:fa::1#53(2401:fa00:fa::1)
;; WHEN: Thu Jun 29 10:54:44 2017
;; MSG SIZE  rcvd: 53

Step 5: Test DNS record from Pod on Kubernetes

Run the same command again:

/ # dig A z.alp.im

; <<>> DiG 9.11.1-P1 <<>> A z.alp.im
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 45420
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;z.alp.im.			IN	A

;; Query time: 0 msec
;; SERVER: 10.0.0.10#53(10.0.0.10)
;; WHEN: Thu Jun 29 18:00:24 UTC 2017
;; MSG SIZE  rcvd: 37

Note the diff:

  • still ANSWER: 0 and status: NOERROR (but it resolves just fine outside the cluster)
  • ;; AUTHORITY SECTION: disappeared and AUTHORITY: changed to 0 from the previous time we ran this.
  • ;; Query time: 0 msec (was 74 msec), which I assume means this is a cached response.
    • Query time stays at 0 msec no matter how many times I run the same command.

What else I tried

  • Try it on GKE: I tried with k8s v1.5.x and v1.6.4. → Same issue. (cc: @bowei)

  • Query from a different pod on minikube: I started a new Pod and queried from there → Same issue.

  • Restart kube-dns Pod → This worked on GKE, but not on minikube.

    $ kubectl delete pods -n kube-system -l k8s-app=kube-dns
    pod "kube-dns-268032401-69xk5" deleted
    

Impact

I am not sure why this has not been discovered before. I noticed the behavior while using kube-lego on GKE. Once kube-lego applies for a TLS certificate, it polls the domain name of the service (e.g. example.com/.well-known/<token>) before asking Let's Encrypt to validate it. Before I create an Ingress with the kube-lego annotation, I don't have the external IP yet, so I can't configure the domain, but the kube-lego Pod already picks it up and starts querying my domain in an infinite loop. It never succeeds: the first time it looked up the hostname, the A record didn't exist, so that result is cached forever, and after I add the A record it still can't resolve. The moment I delete the kube-dns Pods and they get recreated, it immediately resolves the hostname and completes the kube-lego challenge.

@bowei
Member

bowei commented Jun 29, 2017

@ahmetb I think it's legal in DNS to cache a response with rcode == 0 and 0 entries. This looks to be the behavior of the Cloudflare server (it sends rcode 0 instead of NXDOMAIN). It looks like the TTL was around 30 minutes for the DNS record. If the record is going to change, it would be advisable to reduce the TTL to get faster cache updates.

@ahmetb
Member Author

ahmetb commented Jun 29, 2017

@bowei I think @viglesiasce reproduced this with Google Domains (or Cloud DNS) too.

In my experience, the cache was not invalidated even after 24 hours when I left it at that.

@jpap

jpap commented Jul 28, 2017

Linking to #119 with respect to Cloudflare.

@vavrusa

vavrusa commented Jul 30, 2017

RCODE=0 with no answer is the NODATA pseudo-RCODE. For caching purposes it shouldn't be treated differently from NXDOMAIN, with one exception: it says nothing about the non-existence of names below the requested name. See https://tools.ietf.org/html/rfc2308#section-2.2 for guidelines. It's possibly related to miekg/dns#428
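
For reference, RFC 2308 caps how long a negative answer may be cached: the minimum of the SOA record's own TTL and its MINIMUM (last) field. A sketch applying that rule to the SOA line from the dig AUTHORITY section earlier in this issue:

```shell
# RFC 2308: negative-cache TTL = min(SOA record TTL, SOA MINIMUM field).
# SOA line copied from the dig output above; fields: name ttl class type
# mname rname serial refresh retry expire minimum.
soa='alp.im. 1799 IN SOA ivan.ns.cloudflare.com. dns.cloudflare.com. 2025042470 10000 2400 604800 3600'

neg_ttl=$(printf '%s\n' "$soa" | awk '{ print ($2 < $11 ? $2 : $11) }')
echo "negative-cache TTL: ${neg_ttl}s"   # min(1799, 3600) = 1799
```

So a compliant cache should have dropped this NODATA entry after roughly 30 minutes; the indefinite caching reported in this issue goes well beyond that bound.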

@ahmetb
Member Author

ahmetb commented Aug 1, 2017

I reported this to various folks at CloudFlare and am still waiting for a response. However, if anyone can help pinpoint where the caching happens, under what circumstances, and why it lasts so long (i.e. >24h or, in my experience, indefinitely), that would help fix this problem, too.

@vavrusa

vavrusa commented Aug 1, 2017

I work at Cloudflare, so I'm happy to answer any questions. It's not specific to Cloudflare DNS, however; NODATA is the kind of answer you get from an authoritative server when the requested name exists but the record type you're looking for doesn't, which is quite common. RFC 2308, which I linked, provides a guideline on how clients should implement negative caching for all cases of negative answers - hope that helps.

@ahmetb
Member Author

ahmetb commented Aug 1, 2017

Got an answer from the CloudFlare support:

We are aware of this behavior and it has been escalated previously to our DNS team. Their response is that at this moment we could not change/improve this behaviour. It's our design feature.

We are aware of this and will be working in improving the behavior in the future - but this will not happen earlier than 6 months.

Unfortunately, at this point, this is how our DNS is working.

We should look at fixing the caching behavior in kube-dns (or miekg/dns, or wherever it lives) as a mitigation. Not caching 0-record answers sounds like it would add only a low-impact cache-miss rate. @bowei thoughts?

@bowei
Member

bowei commented Aug 1, 2017

@ahmetb caching is done with dnsmasq (http://www.thekelleys.org.uk/dnsmasq/doc.html) with no special tunables. Maybe there is a flag that can disable caching that kind of response? I'm surprised this does not impact more people, not just Kubernetes users; dnsmasq is a popular piece of software and the standard resolver on some Linux distros.

@hugorut

hugorut commented Aug 2, 2017

I have this exact same issue (using kube-lego on GCE), but with Google Cloud DNS. Kube-lego cannot resolve my domain when requesting the token in order to issue a certificate. Outside any kube Pod, the domain name resolves fine. Digging the domain within the Pod still gets ANSWER: 0 and status: NOERROR.

I tried restarting the kube-dns with

kubectl delete pods -n kube-system -l k8s-app=kube-dns

but to no avail.

Is there anything I can do to expedite invalidating the DNS cache or is it a matter of waiting it out? (It's been close to 24h for me)

@bowei
Member

bowei commented Aug 2, 2017

Can you post the dig output for the entry? dnsmasq uses the TTL of the SOA record for negative replies; otherwise it will be 0.

@hugorut

hugorut commented Aug 2, 2017

From the pod:

; <<>> DiG 9.10.4-P8 <<>> my.domain.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 34318
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;my.domain.com			IN	A

;; Query time: 0 msec
;; SERVER: 10.39.240.10#53(10.39.240.10)
;; WHEN: Wed Aug 02 06:45:50 UTC 2017
;; MSG SIZE  rcvd: 43

but I will occasionally get this answer instead:

; <<>> DiG 9.10.4-P8 <<>> my.domain.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 18668
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;my.domain.com.			IN	A

;; AUTHORITY SECTION:
my.domain.com		895	IN	SOA	ns-1660.awsdns-15.co.uk. awsdns-hostmaster.amazon.com. 1 7200 900 1209600 86400

;; Query time: 1 msec
;; SERVER: 10.39.240.10#53(10.39.240.10)
;; WHEN: Wed Aug 02 06:35:03 UTC 2017
;; MSG SIZE  rcvd: 130

This SOA is from AWS which was my prior DNS provider.

@bowei bowei self-assigned this Aug 2, 2017
@bowei
Member

bowei commented Aug 2, 2017

I will try playing with dnsmasq flags and see if we can change its negative caching behavior.

@ahmetb
Member Author

ahmetb commented Aug 4, 2017

@bowei Any luck?

@coresolve

Looks like adding --no-negcache to the dnsmasq args ought to do it.

Credit to https://rsmitty.github.io/KubeDNS-Tweaks/
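
For anyone applying this by hand, the change goes in the dnsmasq container's args in the kube-dns Deployment. A sketch of the relevant fragment; the container name and surrounding args shown here are from a typical kube-dns manifest and may differ by version:

```yaml
# kube-dns Deployment (kubectl -n kube-system edit deployment kube-dns),
# dnsmasq container args -- add --no-negcache to disable negative caching:
- name: dnsmasq
  args:
    - -v=2
    - -logtostderr
    - -restartDnsmasq=true
    - --
    - -k
    - --no-negcache          # do not cache NXDOMAIN/NODATA replies
    - --cache-size=1000
    - --log-facility=-
    - --server=/cluster.local/127.0.0.1#10053
```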

@ahmetb
Member Author

ahmetb commented Oct 4, 2017

@coresolve whoa this is amazing. @bowei do you think it's sensible to incorporate this as a default in kube-dns distribution?

@bowei
Member

bowei commented Oct 4, 2017

yes, since we don't enable neg caching

@cblecker
Member

cblecker commented Oct 4, 2017

@bowei
Member

bowei commented Oct 9, 2017

yes, that should be a one-line change to the yaml

@cblecker
Member

cblecker commented Oct 9, 2017

Opened kubernetes/kubernetes#53604 to add this

@miekg

miekg commented Oct 10, 2017

Has anyone looked into the impact of removing negative caching on the volume of DNS requests that now need to be resolved again and again?

@ahmetb
Member Author

ahmetb commented Oct 10, 2017

@miekg I don't think we know what this change will break. However, unless it's changed, a lot of software that relies on domains eventually resolving stays broken. I'm not sure we have enough tools to answer this question properly.

k8s-github-robot pushed a commit to kubernetes/kubernetes that referenced this issue Oct 13, 2017
Automatic merge from submit-queue (batch tested with PRs 53604, 53751). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md

Add no-negcache flag to kube-dns

**What this PR does / why we need it**:
Adds the [`--no-negcache`](https://linux.die.net/man/8/dnsmasq) flag to prevent dnsmasq from caching negative (NXDOMAIN) responses. More details on why this is desirable [here](kubernetes/dns#121).

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes kubernetes/dns#121

**Special notes for your reviewer**:
Thanks to @rsmitty (https://rsmitty.github.io/KubeDNS-Tweaks/) and @coresolve (kubernetes/dns#121 (comment)) for pointing us in the right direction.

**Release note**:
```release-note
Add --no-negcache flag to kube-dns to prevent caching of NXDOMAIN responses.
```
@krogon

krogon commented Feb 23, 2018

Why did we disable neg-caching by default instead of setting a reasonable TTL with --neg-ttl=600?
With the huge number of queries in Kubernetes related to the ndots setting, this could have a negative impact.
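
For those who want to keep negative caching but bound it, dnsmasq's --neg-ttl flag caps how long negative replies are cached regardless of the SOA TTL. A sketch of that alternative in the same dnsmasq args block (the 600-second value is illustrative; tune it to your setup):

```yaml
# Alternative to --no-negcache: keep negative caching, but cap its lifetime.
- --neg-ttl=600   # cache NXDOMAIN/NODATA replies for at most 10 minutes
```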
