[Known Issue] Private DNS with .local entries won't work after Kubernetes 1.18/Ubuntu 18. #2052

Closed
paulgmiller opened this issue Jan 13, 2021 · 17 comments
Labels
known-issue resolution/answer-provided Provided answer to issue, question or feedback.

Comments

@paulgmiller
Member

paulgmiller commented Jan 13, 2021

If your virtual network has a custom DNS server and uses a DNS record that ends with .local, then it will no longer work after your nodes go up to Ubuntu 18. This happens automatically when you upgrade to k8s 1.18.
This is because Ubuntu 18 uses systemd-resolved:
https://askubuntu.com/questions/917784/systemd-resolved-does-not-query-dns-server-for-local-domain
https://www.man7.org/linux/man-pages/man8/systemd-resolved.service.8.html

lookups for domains with the ".local" suffix are
not routed to DNS servers, unless the domain is specified
explicitly as routing or search domain for the DNS server and
interface. This means that on networks where the ".local"
domain is defined in a site-specific DNS server, explicit
search or routing domains need to be configured to make
lookups work within this DNS domain. Note that these days,
it's generally recommended to avoid defining ".local" in a
DNS server, as RFC6762[2] reserves this domain for exclusive
MulticastDNS use.
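
To make the quoted rule concrete, here is a minimal Python sketch of it. This is a simplified model of the man page paragraph above, not systemd-resolved's actual routing algorithm; the function name and the example hostnames are hypothetical.

```python
def routed_to_unicast_dns(name, routing_domains=()):
    """Model of the quoted rule: would `name` be sent to a DNS server?

    Only the ".local" rule is modeled. Every other name is assumed to be
    routed to DNS, which is a simplification.
    """
    labels = name.rstrip(".").lower().split(".")
    if labels[-1] != "local":
        return True  # the quoted rule only restricts ".local" names

    # A ".local" name is routed to DNS only if a configured search or
    # routing domain covers it. A leading "~" marks a routing-only domain.
    for domain in routing_domains:
        suffix = domain.lstrip("~").rstrip(".").lower().split(".")
        if suffix == [""]:
            continue  # empty or catch-all entries are not modeled here
        if labels[-len(suffix):] == suffix:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical record in a private DNS zone.
    print(routed_to_unicast_dns("registry.corp.local"))
    print(routed_to_unicast_dns("registry.corp.local", ["corp.local"]))
```

With no matching search or routing domain the first call reports that the name is not routed, which is the failure described in this issue; adding `corp.local` as a domain flips the result.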

Temporary mitigations:
You can leave one agentpool behind on 1.17, but eventually you will need to use a different record in your private DNS server.
Changing the /etc/resolv.conf symlink to point at the static /run/systemd/resolve/resolv.conf file may also work, though this is not tested yet and we don't have a daemonset to do this for you.
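
For illustration only, the symlink change mentioned above could be sketched in Python as follows. As stated, this mitigation is untested, so treat the sketch as an assumption about how one might do it rather than a supported procedure; the helper names are hypothetical, and running `repoint` against the real /etc/resolv.conf requires root.

```python
import os

# Target path taken from the mitigation described above.
STATIC_TARGET = "/run/systemd/resolve/resolv.conf"


def current_target(link="/etc/resolv.conf"):
    """Return where `link` points, or None if it is not a symlink."""
    return os.readlink(link) if os.path.islink(link) else None


def repoint(link, target):
    """Replace `link` with a symlink to `target`.

    The new symlink is created under a temporary name and then renamed
    over the old one, so the path never disappears in between.
    """
    tmp = link + ".tmp"
    os.symlink(target, tmp)
    os.replace(tmp, link)  # rename replaces the old symlink itself


if __name__ == "__main__":
    # Read-only check; call repoint("/etc/resolv.conf", STATIC_TARGET)
    # yourself if you decide to try the mitigation.
    print(current_target())
```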

We are working on fixing this.

@paulgmiller
Member Author

Relevant discussion of this in a systemd issue: systemd/systemd#13763
Someone else is trying to use a daemonset to bypass systemd-resolved as mentioned above: https://gist.github.com/levimm/c27a8940479e4f17e53da0d9d477defe#file-dnsresolv-yaml
Not vetted or tested by AKS, so use entirely at your own risk.

@paulgmiller
Member Author

Also curious if setting ResolveUnicastSingleLabel=yes would unblock .local without completely disabling systemd. @xuto2

@joaguas

joaguas commented Feb 4, 2021

Seems that pods are not affected by this issue, even when using hostNetwork:

[image]

After digging for a while, it seems that --resolv-conf=/run/systemd/resolve/resolv.conf on the kubelet is what holds it together: coredns and other host-networked pods actually get the upstream resolvers, effectively bypassing the nss-resolve/systemd-resolved resolver.

This makes sense, since otherwise the loop plugin in the coredns ConfigMap would halt coredns due to the loopback circular reference.
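
A minimal Python sketch of that difference: it parses the `nameserver` lines of a resolv.conf and flags loopback entries such as the systemd-resolved stub listener at 127.0.0.53. The function names and the 10.0.0.4 upstream address are hypothetical, chosen only for the example.

```python
import ipaddress


def parse_nameservers(resolv_conf_text):
    """Return the addresses listed on `nameserver` lines."""
    servers = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped[0] in "#;":
            continue  # skip blank lines and comments
        fields = stripped.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def loopback_nameservers(servers):
    """Return the servers that point back at the local host."""
    found = []
    for server in servers:
        try:
            if ipaddress.ip_address(server).is_loopback:
                found.append(server)
        except ValueError:
            pass  # not a plain IP address; ignored in this sketch
    return found


if __name__ == "__main__":
    stub = "nameserver 127.0.0.53\noptions edns0\n"
    static = "nameserver 10.0.0.4\nsearch corp.local\n"  # hypothetical upstream
    print(loopback_nameservers(parse_nameservers(stub)))
    print(loopback_nameservers(parse_nameservers(static)))
```

A file whose only nameserver is a loopback address is the case that would send a host-networked resolver back to itself; the file passed via --resolv-conf lists the upstream servers instead.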

As discussed with @paulgmiller, this will still impact nodes (e.g., pulling images from a .local registry).
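
To check the node-level impact, something like the following Python sketch could be run directly on a node (not inside a pod, since pods get a different resolv.conf as noted above). The function name and the hostnames are hypothetical; the resolver is injectable so the logic can be exercised without network access.

```python
import socket


def check_names(names, resolve=socket.getaddrinfo):
    """Map each hostname to True if it resolves, False otherwise."""
    results = {}
    for name in names:
        try:
            resolve(name, None)
            results[name] = True
        except socket.gaierror:
            results[name] = False
    return results


if __name__ == "__main__":
    # Hypothetical private registry name ending in .local.
    for host, ok in check_names(["registry.corp.local"]).items():
        print(host, "resolves" if ok else "does NOT resolve")
```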

@xuto2

@xuto2
Contributor

xuto2 commented Feb 5, 2021

@joaguas thanks a lot for sharing. That matches our understanding: this doesn't affect pod traffic. The daemonset approach could be a temporary mitigation while we evaluate a permanent solution on the AKS node side.

@bbgobie

bbgobie commented Feb 8, 2021

Also curious if setting ResolveUnicastSingleLabel=yes would unblock .local without completely disabling systemd. @xuto2

I'm not sure if it would, but a fix needs to encompass more than just .local.
The systemd-resolved change seems to make our nodes not use the DNS servers from the network, so we also cannot resolve internal resources like private endpoints.

@ghost ghost added the action-required label Mar 5, 2021
@miwithro miwithro unpinned this issue Mar 9, 2021
@ghost ghost removed the action-required label Mar 9, 2021
@bbgobie

bbgobie commented Mar 16, 2021

1.17 end of support is approaching pretty quickly. Is there official word from AKS on what they want us to do with this? Do we all run an unsupported daemonset to fix it?

@xuto2
Contributor

xuto2 commented Mar 16, 2021

We're disabling systemd-resolved on all new Ubuntu 18.04 VMs. The change is being released and is expected to be done by next week. The AKS release notes at https://github.com/Azure/AKS/releases/tag/2021-03-08 also mention it.

@ghost

ghost commented Apr 16, 2021

Action required from @Azure/aks-pm

@ghost ghost added the Needs Attention 👋 Issues needs attention/assignee/owner label Apr 16, 2021
@ghost

ghost commented May 1, 2021

Issue needing attention of @Azure/aks-leads

@miwithro miwithro added the resolution/answer-provided Provided answer to issue, question or feedback. label Aug 16, 2021
@ghost ghost removed action-required Needs Attention 👋 Issues needs attention/assignee/owner labels Aug 16, 2021
@ghost
Copy link

ghost commented Aug 19, 2021

Thanks for reaching out. I'm closing this issue as it was marked with "Answer Provided" and it hasn't had activity for 2 days.

@ghost ghost closed this as completed Aug 19, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Sep 18, 2021