fix kubernetes in-cluster CNAME lookup, proper fix? #2492
Comments
IIUC, this works as intended. I think it is an issue with timing, and cache. In the first case, there is an external name that points to a non existing service. In the second case, you create the service, so the In CoreDNS cache is more or less indexed by query name/type, and entire responses are cached. It's a per response cache, not a per record cache. So, an answer to a query for |
Nope, I think it's wrong. If you try changing As far as I understand, it's not the task of a CNAME resource to do any kind of validation. I can say more: I haven't seen behavior like this in other DNS servers. The issue is that no CNAME record is returned. Whether that is correct is a different question. To make it work, I have to create the K8S resources in a specific order (the normal service definition first, and only after that the service with externalName) |
What version of coredns are you using? |
The issue was initially found on 1.1.3; after googling, I also checked 1.2.6 and 1.3.1 |
I see the issue... looks like an easy fix... testing now... |
Can we hope for a fix in 1.1.x, 1.2.x and 1.3.x? |
1.1.x means a backport of the initial fix plus this fix of the fix |
I cannot reproduce the issue where you have to reapply the ExternalName service definition to get the response to appear. For me, this appears within a few seconds, as the cache entry expires. The fix I have made fixes the issue where there is no CNAME record returned when the target is in a zone handled by the same plugin as the CNAME, and the target does not exist. I made a recent fix (#2452) not included in a release yet, which corrects the TTL for negative responses. But this bug would have only cached the negative response for 30 seconds, not 1 hour (which is how long you waited). |
Could it be somehow related to amount of services? |
I misinformed you. You are right, |
Ah ok cool, thanks. This makes more sense now. I just need confirmation on #2493, since there are some pre-existing unit tests that validate the opposite behavior (and they start failing when I fix this). |
Can someone paste some dig output to show the issue more clearly? |
|
The other half of the story is that when the CNAME target is not in the same zone, the answer is different. In both this example, and the above example that @wizard580 gives, the CNAME targets do not exist. Here the
|
I don't know if this is the correct place to raise my issue, but I'm also facing an issue resolving externalName
when I curl <title>DNS resolution error | my-ext-svc.default.svc.cluster.local | Cloudflare</title> ... my minikube configuration is minikube addons list
|
@aamirpinger, On the surface, this doesn't seem like the same issue. Have you gone through the troubleshooting steps in https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/ ? |
@aamirpinger, looking at your yaml examples again, it looks like you are trying to set up an ExternalName service that selects some pods. AFAIK, this is not possible. An ExternalName service points to another service - like an alias, or in DNS terms, a CNAME. In your case, the ExternalName service points to |
@chrisohaver thank you for your time. I had seen an example somewhere with a selector in an ExternalName service, so I tried it to see if it works. I have now removed the selector from my svc yaml, but it is still not working. You are right, it is a kind of alias for the real URI. I revisited a few articles on ExternalName services; the following one is from Google Cloud. What I have understood is that we use it to redirect to an external URI so that you don't have to expose the real URI in your app. So whenever you hit ur-external-svc.default, it is supposed to redirect you to the URI you wrote as externalName in your service. https://cloud.google.com/blog/products/gcp/kubernetes-best-practices-mapping-external-services Scenario 2: Remotely hosted database with URI
If what I'm trying to do with my pods and the ExternalName service I mentioned in my initial comment is not conceptually wrong, then there must be a problem with coredns (I even manually updated its image from 1.2.6 to k8s.gcr.io/coredns:1.3.1). I am saying this because the first troubleshooting step from the link you shared fails. The rest of the steps show positive results after creating a temp busybox pod
|
Agreed, there is something else preventing coredns from working. When doing the troubleshooting steps, be sure to use the exact version of busybox they suggest to use. There are bugs that prevent the If the troubleshooting guide doesn't help, please feel free to open a new issue (since it does not appear to be directly related to this issue). |
So.. is there any chance to get it fixed? PR closed... :( |
That PR was not the correct fix |
The PR, which allowed returning un-terminated CNAMEs (CNAMEs with targets that don't have an A record), also allowed it to return CNAME loops (CNAME loops also being un-terminated). So, that fix may need additional safeguards in place to prevent it from returning CNAME loops to the client. Rather than take a garbage-in-garbage-out approach, I think @miekg would like CoreDNS to filter out loops to protect the client. @miekg, CNAME loops aside, what is the correct DNS behavior for A queries to a CNAME with a target that doesn't have an A record: should we return NODATA, or a CNAME without an A record? |
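The distinction drawn above between a dangling CNAME and a CNAME loop can be sketched in Go. This is only an illustration under stated assumptions, not the code from the PR: `followCNAMEs` and the two maps standing in for zone data are hypothetical names.

```go
package main

import "fmt"

// followCNAMEs walks a CNAME chain starting at name.
// cnames maps an owner name to its CNAME target; addrs maps a name to an A record.
// It returns the names visited, the address at the end of the chain
// (empty if the chain is dangling), and whether a loop was detected.
func followCNAMEs(name string, cnames, addrs map[string]string) (chain []string, addr string, loop bool) {
	seen := map[string]bool{}
	for {
		if seen[name] {
			// We have come back to a name already visited: un-terminated by a loop.
			return chain, "", true
		}
		seen[name] = true
		chain = append(chain, name)

		target, ok := cnames[name]
		if !ok {
			// End of the chain: addr stays empty when the target has no A record.
			return chain, addrs[name], false
		}
		name = target
	}
}

func main() {
	cnames := map[string]string{
		"ext.default.svc.cluster.local.": "real.default.svc.cluster.local.",
	}
	addrs := map[string]string{} // the target Service has not been created yet

	chain, addr, loop := followCNAMEs("ext.default.svc.cluster.local.", cnames, addrs)
	fmt.Println("chain:", chain)
	fmt.Println("dangling:", addr == "", "loop:", loop)
}
```

With a check like this, a server could still return the dangling chain to the client while refusing to return a looping one.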
My vote goes for CNAME without an A record |
Yeah. You need to return the CNAME; the client should then determine this is a useless response and error out. Maybe this case is described somewhere in the RFCs, have to check.
|
Cool, thanks. I'll re-open #2493, and figure out how to filter CNAME loops from the response. |
I'm also very much confused about why giving the client a bullshit answer fixes anything. Why is this cname there? And why isn't there an A or AAAA to complete it?
|
In k8s, Services of type ExternalName are represented as CNAMEs in the cluster DNS (per spec).
In this example, The ExternalName Service points to a target that doesn't have an A record, because the target Service doesn't exist until the target Service is created. |
Ok. So why does that need a fix? To give a better useless answer? Any
nodata response will be ok here
|
We are not really talking about the K8s+DNS part. We are talking just about DNS. You configured a CNAME record and expect to get it. I don't see why we should try to be smarter than needed and do 'some optimizations' in a place where nobody expects or asked for them. Please show me a GoDaddy/AWS Route53/Bind/Unbound/whatever that has this behavior. Maybe I missed that? |
Sorry, is there any progress? |
But why? What possible good does this to a client?? |
Do what the service was asked to do? Isn't that enough?
I like examples, so I'll proceed with them...
You have a symlink in, let's say, an ext4 filesystem (good example, isn't it? matching?).
Please answer a few questions:
- ln -s not-existent-file symlink-file
  What's the expected result of it, real and visible to the 'user'?
  Your way: do something 'internally' but don't show anything until not-existent-file appears. Let the user guess whether something is wrong or it's Schrödinger's symlink. What possible good does this to a client??
  How it should be, imho: show a symlink pointing to some other file, which may not exist yet. The user will see that the symlink is there and where it points to. It's the user's task to ensure that the destination is reachable in any sense.
- Let's assume that not-existent-file was actually there initially, and we saw the symlink with ls -la etc. What should happen when not-existent-file disappears? Should the symlink disappear (actually, be hidden) as well? What possible good does this to a client??
I just expect the service to do what it was instructed to do. Not more, not less. |
Sorry. I wasn't clear. What's the use for a dangling cname for a client?
|
It's a transient state that occurs when an ExternalName service points to a Service that is not created yet. IMO, this is academic, and while technically it should be fixed for consistency's sake (currently CoreDNS behaves one way when the non-existing CNAME target is in another zone, and another way when it's in the same zone), I don't suspect it's really hurting anyone. So if there is any resistance to fixing it, I'm OK with leaving this harmless bug unfixed. One could argue that if this behavior is a problem, an affected user could work around it by creating the Service before creating the ExternalName Service. |
I don't believe we're currently caching such responses (which is probably good).
My main problem is that we need to do an extra loop on a message and then *still* return a useless one.
|
1. What happened?
Intro: it's about the not fully/properly fixed issue #2040, related to kubernetes/kubernetes#67962
What happened:
I have a cluster with CoreDNS enabled.
First case: Defined this service, which should resolve to the ClusterIP of the kubernetes service:
dig for this service's DNS name will fail to show the CNAME record.
Second case: Having defined the resource from the first case, define another resource:
The exact service definition isn't important. We just need any normal service.
dig for this service's DNS name will work, returning the correct A record.
dig for the first service's DNS name will still fail to show the CNAME record.
Then, re-apply the first definition, without changing anything.
And now dig will return the correct CNAME record.
Looks like:
CNAME-target and if it fails (this could be a temporary issue, as I can imagine) it will not add the CNAME record
CNAME-target actually become available/created - we still miss CNAME
2. Which issues (if any) are related?
#2040
Why does the DNS server care about the content of a CNAME record? This is a blocker for our use cases, and a K8S regression after migration from kube-dns.
Can't we skip the mentioned CNAME check?