
coredns has caching plugin installed which causes non-authoritative responses most of the time #1512

Open
joejulian opened this issue Apr 18, 2019 · 10 comments

@joejulian commented Apr 18, 2019

What keywords did you search in kubeadm issues before filing this one?

coredns cache

Is this a BUG REPORT or FEATURE REQUEST?

BUG REPORT

Versions

kubeadm version (use kubeadm version):
kubeadm version: &version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.0", GitCommit:"641856db18352033a0d96dbc99153fa3b27298e5", GitTreeState:"clean", BuildDate:"2019-03-25T15:51:21Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Kubernetes version (use kubectl version):
    Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.1", GitCommit:"b7394102d6ef778017f2ca4046abbaa23b88c290", GitTreeState:"clean", BuildDate:"2019-04-08T17:11:31Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration: aws
  • OS (e.g. from /etc/os-release):
    NAME="CentOS Linux"
    VERSION="7 (Core)"
    ID="centos"
    ID_LIKE="rhel fedora"
    VERSION_ID="7"
    PRETTY_NAME="CentOS Linux 7 (Core)"
    ANSI_COLOR="0;31"
    CPE_NAME="cpe:/o:centos:centos:7"
    HOME_URL="https://www.centos.org/"
    BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

  • Kernel (e.g. uname -a): Linux ip-10-0-2-199.us-west-2.compute.internal 3.10.0-957.1.3.el7.x86_64 #1 SMP Thu Nov 29 14:49:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

  • Others:

What happened?

The coredns configmap is:

apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        health
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           pods insecure
           upstream
           fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }
kind: ConfigMap
metadata:
  creationTimestamp: "2019-04-17T21:48:42Z"
  name: coredns
  namespace: kube-system
  resourceVersion: "244"
  selfLink: /api/v1/namespaces/kube-system/configmaps/coredns
  uid: 9164b9b9-615a-11e9-b7a6-0a76de0932ee

What you expected to happen?

When querying a service DNS name, I expect the result to be authoritative ("aa"), i.e.:

;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

This is a successful query as expected:

; <<>> DiG 9.12.3-P4 <<>> +search kube-dns
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 2917
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 913282e2b788674b (echoed)
;; QUESTION SECTION:
;kube-dns.kube-system.svc.cluster.local.	IN A

;; ANSWER SECTION:
kube-dns.kube-system.svc.cluster.local.	5 IN A	10.0.0.10

;; Query time: 0 msec
;; SERVER: 10.0.0.10#53(10.0.0.10)
;; WHEN: Thu Apr 18 03:24:31 UTC 2019
;; MSG SIZE  rcvd: 133

This is an unsuccessful query:

; <<>> DiG 9.12.3-P4 <<>> +search kube-dns
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 3084
;; flags: qr rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 2ac1a88733f8227d (echoed)
;; QUESTION SECTION:
;kube-dns.kube-system.svc.cluster.local.	IN A

;; ANSWER SECTION:
kube-dns.kube-system.svc.cluster.local.	3 IN A	10.0.0.10

;; Query time: 0 msec
;; SERVER: 10.0.0.10#53(10.0.0.10)
;; WHEN: Thu Apr 18 03:24:47 UTC 2019
;; MSG SIZE  rcvd: 133

This query is unsuccessful because it was not made within the first second of the entry being cached. Once the TTL of the entry drops below the plugin's minimum TTL, the response is, by definition, being served from the cache rather than from the authoritative zone data, so it is no longer authoritative.

How to reproduce it (as minimally and precisely as possible)?

Install a cluster.

kubectl --generator=run-pod/v1 -n kube-system run tmp --rm -it --image alpine -- /bin/sh -c 'apk update && apk add bind-tools && sh'
# dig kube-dns.kube-system.svc.cluster.local. ; sleep 1; dig kube-dns.kube-system.svc.cluster.local.

The first query will be authoritative because that query populates the cache. One second later, the next query is served from the cache and is not authoritative.

Anything else we need to know?

This problem was reported to me as affecting some of our customers' software, written in Python, which fails outright if the DNS response is not authoritative.

Removing the cache 30 line from the ConfigMap resolves this problem. Caching should not be necessary unless mirroring a high-latency remote zone.
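For reference, the workaround amounts to the same Corefile with only the cache line removed (a sketch; note that since the Corefile already enables the reload plugin, CoreDNS should pick up the ConfigMap change on its own without recreating the pods):

```
.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
       pods insecure
       upstream
       fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    forward . /etc/resolv.conf
    loop
    reload
    loadbalance
}
```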

@joejulian (Author) commented Apr 18, 2019

/sig network

@neolit123 (Member) commented Apr 18, 2019

@joejulian

Removing the cache 30 line from the ConfigMap resolves this problem. Caching should not be necessary unless mirroring a high latency remote zone.

cc @chrisohaver @rajansandeep

what is your take on the cache 30 value and this use case?

@chrisohaver commented Apr 18, 2019

The cache plugin in the default Corefile is used to reduce traffic to the upstream DNS.
As the default Corefile is structured, it also happens to cache k8s responses (for 5 seconds, the default TTL for the kubernetes plugin).
There is no significant performance benefit to CoreDNS caching the k8s responses, because the kubernetes plugin more or less keeps its own cache (as part of the k8s client-go API watch).

If you don't want kubernetes records to be cached, you have a couple of options, each has possible drawbacks:

  1. You can set the TTL for kubernetes records to zero. This will prevent them from entering the cache. In theory this could confuse clients that look at the TTL, but I think the TTL is usually ignored by DNS clients.
  2. You can remove the cache plugin. This would result in increased traffic, and higher average latency for queries of external zones.
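As a sketch of option 1 (assuming a CoreDNS version whose kubernetes plugin accepts the ttl option), the plugin stanza in the Corefile would become:

```
kubernetes cluster.local in-addr.arpa ip6.arpa {
   pods insecure
   upstream
   ttl 0
   fallthrough in-addr.arpa ip6.arpa
}
```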

FWIW, I recall a recent issue opened requesting that we cache kubernetes records for longer than 5 seconds. It's hard to pick a default value that suits everyone.

@joejulian (Author) commented Apr 18, 2019

That would be true if kubeadm allowed the Corefile to be configured, but that's been brought up before, too, and rejected as too difficult to maintain during the upgrade process.

@joejulian (Author) commented Apr 18, 2019

My view on this is that caching entries breaks existing user software, while not caching increases latency. One is breaking; the other is inconvenient. IMHO, the breakage should take precedence, and optimizing for a particular use case should be the responsibility of the cluster maintainer.

@neolit123 (Member) commented Apr 18, 2019

That would be true if kubeadm allowed the Corefile to be configured, but that's been brought up before, too, and rejected as too difficult to maintain during the upgrade process.

that is true, the umbrella ticket for allowing such customization of kubeadm generated manifests is here:
#1379

if the coredns maintainers give their +1 on modifying the default Corefile in kubeadm, we can proceed to change it; otherwise this ticket should be closed and mentioned in a comment in the above ticket, e.g. "allow customization of the CoreDNS deployment".

is modifying the coredns ConfigMap of a running cluster and restarting the pods a viable immediate solution for you?

@joejulian (Author) commented Apr 18, 2019

That is our immediate workaround, yes.

@chrisohaver commented Apr 18, 2019

IMHO, the breakage should take precedent and optimizing

It depends on how wide the breakage is. I don't think it is common for clients to reject non-authoritative responses, but I'm not a DNS expert. Is this a Python-wide behavior, or is it something specific to your customer's application?

If this is something that is fairly common, then we should accommodate it in the default config. If it turns out to be an unusual special case, then probably not.

@samba commented Apr 18, 2019

@chrisohaver it appears to be common to Python, via the socket.getaddrinfo call, if I'm reading correctly, in cases like this:
https://stackoverflow.com/questions/54778160/python-requests-library-not-resolving-non-authoritative-dns-lookups
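Not from the thread, but as a stdlib-only sketch of what "rejecting a non-authoritative response" means at the wire level: the AA flag the stub resolver sees is a single bit in the DNS message header, and the two dig outputs above differ only in that bit.

```python
import struct

# The AA (Authoritative Answer) bit is mask 0x0400 in the second
# 16-bit word of a DNS message header (RFC 1035, section 4.1.1).
AA_MASK = 0x0400

def is_authoritative(message: bytes) -> bool:
    """Return True if the AA bit is set in a raw DNS response."""
    if len(message) < 12:
        raise ValueError("a DNS header is 12 bytes; message too short")
    (flags,) = struct.unpack_from("!H", message, 2)
    return bool(flags & AA_MASK)

# Header words: id, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT.
# 0x8500 = qr,aa,rd -- the flags of the successful dig output above.
fresh = struct.pack("!HHHHHH", 0x2917, 0x8500, 1, 1, 0, 1)
# 0x8100 = qr,rd -- the cached, non-authoritative response.
cached = struct.pack("!HHHHHH", 0x3084, 0x8100, 1, 1, 0, 1)

print(is_authoritative(fresh))   # True
print(is_authoritative(cached))  # False
```

(The header IDs 0x2917 and 0x8100 values are illustrative; only the flags word matters for the AA check.)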

@chrisohaver commented Apr 18, 2019

In CoreDNS you can disable the cache so that all local cluster-zone responses will be authoritative, but that won't change responses from upstream servers. Those would mostly be non-authoritative, retrieved from the caches of intermediate recursive servers. This is normal, which is why it would be confounding for Python to only resolve names coming directly from authoritative servers.

I just sanity checked this on a k8s cluster running CoreDNS with cache enabled: in my test, Python 3 (3.6.5) seems to be fine with non-AA responses from CoreDNS.

>>> socket.getaddrinfo("kubernetes.default.svc.cluster.local.", 0, 0, 0, socket.SOCK_STREAM)
[(<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_STREAM: 1>, 6, 'kubernetes.default.svc.cluster.local.', ('10.96.0.1', 0)), (<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_DGRAM: 2>, 17, 'kubernetes.default.svc.cluster.local.', ('10.96.0.1', 0))]
>>> socket.getaddrinfo("kubernetes.default.svc.cluster.local.", 0, 0, 0, socket.SOCK_STREAM)
[(<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_STREAM: 1>, 6, 'kubernetes.default.svc.cluster.local.', ('10.96.0.1', 0)), (<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_DGRAM: 2>, 17, 'kubernetes.default.svc.cluster.local.', ('10.96.0.1', 0))]

... and the CoreDNS logs ...

2019-04-18T21:14:59.987Z [INFO] 172.17.0.4:36730 - 37402 "A IN kubernetes.default.svc.cluster.local. udp 54 false 512" NOERROR qr,aa,rd 106 0.00078162s
2019-04-18T21:15:01.179Z [INFO] 172.17.0.4:39902 - 2144 "A IN kubernetes.default.svc.cluster.local. udp 54 false 512" NOERROR qr,rd 106 0.000080777s
