fix kubernetes in-cluster CNAME lookup, proper fix? #2492

Closed
wizard580 opened this issue Jan 23, 2019 · 35 comments

Comments

@wizard580

wizard580 commented Jan 23, 2019

1. What happened?
Intro: this is about issue #2040, which was not fully/properly fixed, and is related to kubernetes/kubernetes#67962

What happened:

I have a cluster with CoreDNS enabled

First case: define this Service, which should resolve to the ClusterIP of the target Service (kubernetes-2):

apiVersion: v1
kind: Service
metadata:
  name: test
  namespace: default
spec:
  externalName: kubernetes-2.default.svc.cluster.local
  type: ExternalName

A dig for this Service's DNS name fails to return the CNAME record.

Second case: with the resource from the first case defined, define another resource:

apiVersion: v1
kind: Service
metadata:
  name: kubernetes-2
  namespace: default
spec:
  type: ClusterIP
  selector:
    app: some-label
  ports:
  - name: some-port-8080
    port: 80
    protocol: TCP
    targetPort: 8080

The exact Service definition isn't important; we just need any normal Service.

A dig for this Service's DNS name works, returning the correct A record.
A dig for the first Service's DNS name still fails to return the CNAME record.

Then re-apply the first definition, without changing anything.
Now dig returns the correct CNAME record.

It looks like:

  1. CoreDNS tries to check/resolve the CNAME target, and if that fails (which could be a temporary condition, as I can imagine) it does not add the CNAME record.
  2. When the CNAME target actually becomes available/created, the CNAME is still missing.

2. Which issues (if any) are related?

#2040

Why does the DNS server care about the content of a CNAME record? This is a blocker for our use cases, and a K8S regression after migrating from kube-dns.
Can't we skip the CNAME check mentioned above?
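
To make the sequence easier to reproduce, here is a sketch of the failing case (the server IP 10.96.0.10 is the cluster DNS Service IP used in the dig output later in this thread; the file name is hypothetical):

$ kubectl apply -f test-externalname.yaml           # the ExternalName Service above, applied while kubernetes-2 does not exist
$ dig @10.96.0.10 test.default.svc.cluster.local.   # no CNAME record in the answer (the reported problem)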

@chrisohaver
Member

IIUC, this works as intended. I think it is an issue with timing and caching.

In the first case, there is an ExternalName that points to a non-existing Service.
This results in a CNAME that points to a target that has no A record.
If you query that name for an A record (typically the default if you don't specify a record type), then you'll get an NXDOMAIN. This answer gets cached.
If you query for a CNAME record, then you'll get the CNAME.
This is, if I'm not mistaken, standard DNS behavior for CNAMEs that point to non-existing targets.

In the second case, you create the service, so the CNAME now points to an existing record.
However, the response from when it did not exist is still cached. So, you'll still get an NXDOMAIN until that cache entry expires. The default TTL for k8s records is 5 seconds.

In CoreDNS, the cache is more or less indexed by query name/type, and entire responses are cached. It's a per-response cache, not a per-record cache. So, an answer to a query for test.default.svc.cluster.local. is cached entirely independently from kubernetes-2.default.svc.cluster.local..
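
For reference, the two query types described above can be compared directly with dig, using the names from this issue (a sketch; the expected results are per the explanation above):

$ dig @10.96.0.10 test.default.svc.cluster.local. A       # target has no A record: negative answer, which gets cached
$ dig @10.96.0.10 test.default.svc.cluster.local. CNAME   # explicit CNAME query: should return the CNAME itself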

@wizard580
Author

Nope, I think that's wrong.
First of all, it's not about caching at all. CoreDNS restarts don't help. All other changes are visible immediately (within about 1 sec). And I gave it time to do cache invalidation; 1h looks like enough... just in case... ;) I did this 1h wait once while querying constantly (every 2s) and once without touching it at all.

If you try changing externalName: kubernetes-2.default.svc.cluster.local to externalName: kubernetes-2.default.svc.cluster.locals (still invalid, but not in the cluster zone), this works. Immediately.

As far as I understand, it's not the job of a CNAME record to do any kind of validation. What's more, I haven't seen behavior like this in other DNS servers.

The issue is that no CNAME record is returned. Whether it resolves correctly is a different question.

To make it work, I have to create the K8S resources in a specific order (the normal Service definition first, and only after that the Service with externalName).

@chrisohaver
Member

What version of coredns are you using?

@wizard580
Author

The issue was initially found on 1.1.3; after googling, I also checked with 1.2.6 and 1.3.1.

@chrisohaver
Member

I see the issue... looks like an easy fix... testing now...

@wizard580
Author

Can we hope for a fix in 1.1.x, 1.2.x and 1.3.x?
We would be super thankful.

@wizard580
Author

For 1.1.x that means a backport of the initial fix + this fix of the fix.

@chrisohaver
Member

I cannot reproduce the issue where you have to reapply the ExternalName service definition to get the response to appear. For me, this appears within a few seconds, as the cache entry expires.

The fix I have made addresses the issue where no CNAME record is returned when the target is in a zone handled by the same plugin as the CNAME, and the target does not exist.

I made a recent fix (#2452) not included in a release yet, which corrects the TTL for negative responses. But this bug would have only cached the negative response for 30 seconds, not 1 hour (which is how long you waited).

@wizard580
Author

Could it be somehow related to the number of services?

@wizard580
Author

I misinformed you. You are right, reapplying the ExternalName service definition is not needed, at least on 1.2.6.

@chrisohaver
Member

reapplying the ExternalName service definition is not needed, at least on 1.2.6

Ah ok cool, thanks. This makes more sense now. I just need confirmation on #2493, since there are some pre-existing unit tests that validate the opposite behavior (and they start failing when I fix this).

@miekg
Member

miekg commented Feb 11, 2019

Can someone paste some dig output to show the issue more clearly?

@wizard580
Author

$ nslookup service-test.ps.svc.cluster.local 10.96.0.10

Server:         10.96.0.10
Address:        10.96.0.10#53

*** Can't find service-test.ps.svc.cluster.local: No answer
$ dig service-test.ps.svc.cluster.local @10.96.0.10

; <<>> DiG 9.10.3-P4-Ubuntu <<>> service-test.ps.svc.cluster.local @10.96.0.10
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 61218
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;service-test.ps.svc.cluster.local.        IN A

;; AUTHORITY SECTION:
cluster.local.          30      IN      SOA     ns.dns.cluster.local. hostmaster.cluster.local. 1549976240 7200 1800 86400 30

;; Query time: 0 msec
;; SERVER: 10.96.0.10#53(10.96.0.10)
;; WHEN: Tue Feb 12 13:57:54 CET 2019
;; MSG SIZE  rcvd: 176

@chrisohaver
Member

The other half of the story is that when the CNAME target is not in the same zone, the answer is different. In both this example and the example @wizard580 gives above, the CNAME targets do not exist.

Here the test.default.svc.cluster.local. ExternalName/CNAME service points to kubernetes-2.default.svc.cluster.localnot. (not the same zone), and we get a CNAME with no A record as the answer...

dnstools# dig test.default.svc.cluster.local. A

; <<>> DiG 9.11.3 <<>> test.default.svc.cluster.local. A
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 39996
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;test.default.svc.cluster.local.	IN	A

;; ANSWER SECTION:
test.default.svc.cluster.local.	5 IN	CNAME	kubernetes-2.default.svc.cluster.localnot.

;; Query time: 2 msec
;; SERVER: 10.96.0.10#53(10.96.0.10)
;; WHEN: Tue Feb 12 13:38:38 UTC 2019
;; MSG SIZE  rcvd: 126

@aamirpinger

I don't know if this is the correct place to raise my issue, but I'm also facing an issue resolving an externalName.

kind: Pod
apiVersion: v1
metadata:
  name: tmp-pod
  labels:
    app: tmp-pod
spec:
 containers:
 - name: test
   image: tutum/curl:alpine
   command: ["sh","-c","sleep 90000"]
   ports:
   - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: my-ext-svc
spec:
  ports:
    - name: http
      port: 80
      targetPort: 8080
      protocol: TCP
  type: ExternalName
  externalName: google.com
  selector:
    app: tmp-pod

when I curl

curl http://my-ext-svc.default.svc.cluster.local

<title>DNS resolution error | my-ext-svc.default.svc.cluster.local | Cloudflare</title> ...

My minikube configuration is:
minikube version: v0.33.1
coredns:1.2.6

minikube addons list

  • addon-manager: enabled
  • dashboard: disabled
  • default-storageclass: enabled
  • efk: disabled
  • freshpod: disabled
  • gvisor: disabled
  • heapster: disabled
  • ingress: enabled
  • kube-dns: disabled
  • metrics-server: disabled
  • nvidia-driver-installer: disabled
  • nvidia-gpu-device-plugin: disabled
  • registry: disabled
  • registry-creds: disabled
  • storage-provisioner: enabled
  • storage-provisioner-gluster: disabled

@chrisohaver
Member

@aamirpinger, on the surface this doesn't seem like the same issue.

Have you gone through the troubleshooting steps in https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/ ?

@chrisohaver
Member

@aamirpinger, looking at your yaml examples again, it looks like you are trying to set up an ExternalName service that selects some pods. AFAIK, this is not possible. An ExternalName service points to another service - like an alias, or in DNS terms, a CNAME. In your case, the ExternalName service points to google.com. I'm guessing that the selector would be ignored by k8s, and my-ext-svc would have no relation to tmp-pod.
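
For reference, a minimal sketch of an ExternalName Service as described above, without a selector or ports (the name my-ext-svc and the target google.com are taken from the example earlier in this thread):

apiVersion: v1
kind: Service
metadata:
  name: my-ext-svc
spec:
  type: ExternalName
  externalName: google.com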

@aamirpinger

@chrisohaver thank you for your time. I had seen an example somewhere with a selector in an ExternalName service, so I tried whether that works; I have now removed it from my svc yaml, but it is still not working. You are right, it is a kind of alias for the real URI. I revisited a few articles on ExternalName services; the following one is from Google Cloud. What I have understood is that we use it to redirect to an external URI so that you don't have to expose the real URI in your app. So whenever you hit your-external-svc.default, it is supposed to redirect you to the URI you wrote as externalName in your service.

https://cloud.google.com/blog/products/gcp/kubernetes-best-practices-mapping-external-services

Scenario 2: Remotely hosted database with URI

If you are using a hosted database service from a third party, chances are they give you a unified resource identifier (URI) that you can use to connect to.

For example, I have two MongoDB databases hosted on mLab. One of them is my dev database, and the other is production.

You can create an "ExternalName" Kubernetes service, which gives you a static Kubernetes service that redirects traffic to the external service. This service does a simple CNAME redirection at the kernel level, so there is very minimal impact on your performance.

If what I'm trying to do with my pods and the ExternalName service I mentioned in my initial comment is not conceptually wrong, then there must be a problem with coredns (I even manually updated its image from 1.2.6 to k8s.gcr.io/coredns:1.3.1).

I am saying this because the first troubleshooting step from the link you shared fails. The rest of the steps show positive results.

after creating temp busybox pod

$ kubectl exec -ti busybox -- nslookup kubernetes.default
Server: 10.96.0.10
Address: 10.96.0.10:53

** server can't find kubernetes.default: NOTIMP

*** Can't find kubernetes.default: No answer

@chrisohaver
Member

Agreed, there is something else preventing coredns from working.

When doing the troubleshooting steps, be sure to use the exact version of busybox they suggest to use. There are bugs that prevent the nslookup kubernetes.default command from working correctly in newer versions of busybox.

If the troubleshooting guide doesn't help, please feel free to open a new issue (since it does not appear to be directly related to this issue).
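
For example (assuming the guide still suggests busybox 1.28; check the current version of the doc), the lookup can be run from a known-good busybox image like this:

$ kubectl run -it --rm --restart=Never busybox --image=busybox:1.28 -- nslookup kubernetes.default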

@wizard580
Author

So.. is there any chance to get it fixed? PR closed... :(

@miekg
Member

miekg commented Feb 17, 2019

That PR was not the correct fix

@chrisohaver
Member

chrisohaver commented Feb 19, 2019

So.. is there any chance to get it fixed? PR closed... :(

The PR, which allowed the return of un-terminated CNAMEs (CNAMEs with targets that don't have an A record), also allowed it to return CNAME loops (CNAME loops also being un-terminated). So, that fix may need additional safeguards in place to prevent it from returning CNAME loops to the client. Rather than take a garbage-in, garbage-out approach, I think @miekg would like CoreDNS to filter out loops to protect the client.

@miekg, CNAME loops aside, what is the correct DNS behavior for A queries to a CNAME with a target that doesn't have an A record: should we return NODATA, or a CNAME without an A record?

@wizard580
Author

wizard580 commented Feb 19, 2019

My vote goes for a CNAME without an A record.
We're asking for a CNAME (a Service with externalName), and we should get it.

@miekg
Member

miekg commented Feb 19, 2019 via email

@chrisohaver
Member

Cool, thanks. I'll re-open #2493, and figure out how to filter CNAME loops from the response.

@miekg
Member

miekg commented Feb 19, 2019 via email

@chrisohaver
Member

Why is this cname there?

In k8s, Services of type ExternalName are represented as CNAMEs in the cluster DNS (per spec).

And why isn't there a A or AAAA to complete it?

In this example, the ExternalName Service points to a target that doesn't have an A record because the target Service hasn't been created yet.
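
Concretely, using the names from this issue, the synthesized record has the form already shown in the dig output above (a sketch; the TTL depends on the plugin configuration):

test.default.svc.cluster.local.  5  IN  CNAME  kubernetes-2.default.svc.cluster.local.
; kubernetes-2.default.svc.cluster.local. has no A record until that Service is created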

@miekg
Member

miekg commented Feb 19, 2019 via email

@wizard580
Author

We are not really talking about the K8s+DNS part. We are talking just about DNS.

You configured a CNAME record and expect to get it. I don't see why we should try to be smarter than needed and do 'some optimizations' in a place where nobody expects or asked for them.

Please show me GoDaddy/AWS Route53/BIND/Unbound/whatever that has this behavior. Maybe I missed it?

@wizard580
Author

Sorry, is there any progress?

@miekg
Member

miekg commented Mar 23, 2019

You configured a CNAME record and expect to get it. I don't see why we should try to be smarter than needed and do 'some optimizations' in a place where nobody expects or asked for them.

But why? What possible good does this do for a client??

@wizard580
Author

Do what the service was asked to do? Isn't that enough?
I like examples, so I'll proceed with one...

You have a symlink in, let's say, an ext4 filesystem (a good example, isn't it? a close match?).
Please answer a few questions:

  • ln -s not-existent-file symlink-file
    What's the expected result of it? Real and visible to the 'user'.

Your way: do something 'internally' but don't show anything until not-existent-file appears. Let the user guess whether something is wrong or whether it's a Schrödinger's symlink. What possible good does this do for a client??

How it should be, imho: show the symlink pointing to some other file, which may not exist yet. The user will see that the symlink is there and will see where it points. It's the user's task to ensure that the destination is reachable in whatever sense matters.

  • Let's assume that not-existent-file actually was there initially, and we saw the symlink with the ls -la command, etc. What should happen when not-existent-file disappears? Should the symlink disappear (actually, be hidden) as well? What possible good does this do for a client??

I just expect the service to do what it was instructed to do. No more, no less.
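
For illustration, the dangling-symlink behavior described above (standard shell behavior; listing output abbreviated):

$ ln -s not-existent-file symlink-file
$ ls -la symlink-file
lrwxrwxrwx 1 user user 17 ... symlink-file -> not-existent-file
$ cat symlink-file
cat: symlink-file: No such file or directory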

@miekg
Member

miekg commented Mar 24, 2019 via email

@chrisohaver
Member

It's a transient state that occurs when an ExternalName service points to a Service that is not created yet.

IMO, this is academic, and while technically it should be fixed for consistency's sake (currently CoreDNS behaves one way when the non-existing CNAME target is in another zone, and another way when it's in the same zone), I don't suspect it's really hurting anyone. So if there is any resistance to fixing it, I'm OK with leaving this harmless bug unfixed.

One could argue that if this behavior is a problem, an affected user could work around it by creating the Service before creating the ExternalName Service.
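
A sketch of that workaround, using the manifests from the top of this issue (the file names here are hypothetical):

$ kubectl apply -f kubernetes-2-service.yaml   # the normal ClusterIP Service (the CNAME target) first
$ kubectl apply -f test-externalname.yaml      # then the ExternalName Service that points at it
$ dig @10.96.0.10 test.default.svc.cluster.local. CNAME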

@miekg
Member

miekg commented Mar 26, 2019 via email
