Install on a system using systemd-resolved leads to broken DNS #273

Closed
gjcarneiro opened this issue May 19, 2017 · 18 comments
Labels: kind/bug, priority/important-soon

@gjcarneiro

What keywords did you search in kubeadm issues before filing this one?

systemd resolved dns

Is this a BUG REPORT or FEATURE REQUEST?

BUG REPORT

Versions

kubeadm version (use kubeadm version): v1.6.3
Environment:

  • Kubernetes version (use kubectl version): v1.6.3
  • Cloud provider or hardware configuration: bare metal
  • OS (e.g. from /etc/os-release): Ubuntu 17.04
  • Kernel (e.g. uname -a): Linux gjc-XPS-8500 4.10.0-21-generic #23-Ubuntu SMP Fri Apr 28 16:14:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
  • Others:

What happened?

Installed Kubernetes on bare metal using kubeadm. DNS inside pods did not work.

What you expected to happen?

I would expect DNS inside pods to work.

How to reproduce it (as minimally and precisely as possible)?

Anything else we need to know?

As noted in kubernetes/kubernetes#45828, the problem is that on a normal Ubuntu desktop (and maybe other desktop Linux OSes), /etc/resolv.conf contains 127.0.0.35, which doesn't work inside Pods.

The correct thing to do is to add --resolv-conf=/run/systemd/resolve/resolv.conf to the kubelet config in case systemd-resolved is running with DNSStubListener and /etc/resolv.conf is configured with the local resolver (solution suggested by @antoineco and @thockin).
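
To make the rule above concrete, here is a minimal sketch of the decision, assuming the stub resolver shows up as a loopback address in /etc/resolv.conf. The helper names (`nameservers`, `is_loopback`, `choose_resolv_conf`) are hypothetical and are not part of kubeadm or kubelet; only the two file paths come from this issue.

```python
import ipaddress

# Paths taken from the discussion; the selection logic itself is an assumption.
STUB_RESOLV_CONF = "/etc/resolv.conf"
UPLINK_RESOLV_CONF = "/run/systemd/resolve/resolv.conf"


def nameservers(resolv_conf_text):
    """Return the addresses found on 'nameserver' lines, skipping comments."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        # Comment lines start with '#' or ';', so their first field never matches.
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def is_loopback(address):
    """True for loopback addresses such as 127.0.0.53 or ::1."""
    try:
        return ipaddress.ip_address(address).is_loopback
    except ValueError:
        return False


def choose_resolv_conf(etc_resolv_conf_text):
    """Pick the file to pass to kubelet's --resolv-conf (hypothetical helper)."""
    servers = nameservers(etc_resolv_conf_text)
    if servers and all(is_loopback(s) for s in servers):
        # Only a local stub is listed, which Pods cannot reach.
        return UPLINK_RESOLV_CONF
    return STUB_RESOLV_CONF
```

For example, `choose_resolv_conf("nameserver 127.0.0.53\n")` selects the file under /run/systemd/resolve, while a file listing only non-loopback servers is left alone. A real check would probably also confirm that the chosen file exists before passing it to kubelet.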

@timothysc
Member

So kubeadm doesn't lay down the kubelet startup configuration; that's done in the systemd unit file, which comes from here: https://github.com/kubernetes/release

/cc @marcoceppi @castrojo - this appears to be an ubuntu default for desktop setups.

@timothysc added the canonical and kind/bug labels and removed the kind/bug label May 25, 2017
@luxas
Member

luxas commented May 29, 2017

@timothysc @marcoceppi @castrojo Critical for v1.7?

@timothysc
Member

@luxas no.

@erikbgithub

Sorry, not sure if anybody will still look at closed issues. #272 is not resolved by the solution suggested here.

@erikbgithub

Please reopen #272 or start working on this issue considering the other context as well.

@fasaxc

fasaxc commented Dec 11, 2017

I'm hitting this when I try to use kubeadm with GCE's ubuntu-1710 image so it looks like it's not limited to the desktop install.

@mt-inside

As an FYI: as I commented on kubernetes/kubernetes#45828, I don't believe that overriding kubelet's resolv.conf reference will work anyway. This will just dump a broken resolv.conf (one referencing 127.0.0.53) into all the pods and bypass cluster-local resolution. The current state of affairs is that only external resolution is broken, because kube-dns has a broken upstream, but it is still able to stub the cluster-local zones off to k8s. The only fix I can see is adding or editing config for kube-dns / CoreDNS.

NB

  • It's not just Ubuntu desktop. This isn't a NetworkManager thing; it's systemd-resolved, which is used on server version 17.10 at least.
  • It's 127.0.0.53 (as in the DNS port), not 35

@antoineco

antoineco commented Jan 9, 2018

@mt-inside that's why pointing kubelet to /run/systemd/resolve/resolv.conf makes sense, because in an environment running systemd-resolved:

  1. /etc/resolv.conf contains only one entry: localhost
  2. /run/systemd/resolve/resolv.conf contains your actual DNS servers

kube-dns merely uses whatever nameservers kubelet provides as its forwarders, so if kubeadm configures kubelet to use 2) instead of 1) you're all set.

@mt-inside

@antoineco I agree that'll get kube-dns forwarding correctly, but won't every other user-level Pod in the system then go straight to your upstream servers and not query kube-dns at all? When I tried the --resolv-conf option, it just used that file verbatim and didn't inject the kube-dns Service ClusterIP (the --resolv-conf option was ignored until I removed the --cluster-dns option)

@antoineco

antoineco commented Jan 9, 2018

By default, if --cluster-dns is set (should be!), all user workloads send DNS requests to kube-dns, which in turn does the forwarding job for you.

What you described is the behaviour of ClusterFirst.

ref https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pods-dns-policy

@mt-inside

@antoineco Ah, you're right. I was confused about dnsPolicy. I was confused about what CoreDNS is running as, because Default isn't the default. I also confused myself by looking at a ClusterFirst Pod that was falling back to Default when I didn't specify --cluster-dns in some of my tests. Also, the scope of --resolv-conf (not applying to ClusterFirst) and --cluster-dns (not applying to Default) isn't documented, and I didn't think of it until I really grokked the different DNS modes.

I agree this fix is perfectly sensible.
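
The interaction between dnsPolicy, --cluster-dns and --resolv-conf described above can be summarised in a small model. This is only a sketch of the behaviour as reported in this thread, not kubelet's actual implementation; the function name `pod_nameservers` and the address 10.96.0.10 used below are illustrative assumptions.

```python
def pod_nameservers(dns_policy, cluster_dns, node_nameservers):
    """Simplified model of which nameservers end up in a Pod's resolv.conf.

    dns_policy:       "ClusterFirst" or "Default"
    cluster_dns:      addresses from kubelet's --cluster-dns (may be empty)
    node_nameservers: nameservers read from the file given by --resolv-conf
    """
    if dns_policy == "ClusterFirst" and cluster_dns:
        # Pods query the cluster DNS Service, which forwards external names itself.
        return list(cluster_dns)
    if dns_policy in ("ClusterFirst", "Default"):
        # Default, or ClusterFirst falling back because --cluster-dns is unset.
        return list(node_nameservers)
    raise ValueError(f"unsupported dnsPolicy in this sketch: {dns_policy}")
```

Under this model, `pod_nameservers("ClusterFirst", ["10.96.0.10"], ["127.0.0.53"])` never exposes the stub address to an ordinary workload, whereas a Pod running with Default (as the DNS add-on itself does, per the comments above) inherits whatever --resolv-conf points at. That is why changing --resolv-conf fixes the forwarders without bypassing cluster DNS.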

@timothysc added the help wanted, priority/important-longterm, and triaged labels Jan 31, 2018
@timothysc
Member

So what is the consensus?

@mt-inside

@timothysc Sorry, it's not spelt out. A combination of what @antoineco says here and @thockin says on kubernetes/kubernetes#45828
Kubelet needs the argument --resolv-conf=/run/systemd/resolve/resolv.conf
My kubeadm wrapper script adds that to KUBELET_DNS_ARGS in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

However (deferring to the kubeadm authors here):

  • I don't know what kubelet's behaviour is wrt nonexistent files. If it doesn't like them, this should only be done on systems running systemd-resolved
  • You seem to think the kubelet's args file isn't laid down by kubeadm, but by https://github.com/kubernetes/release ? I take it .../10-kubeadm.conf comes from this project at least and could be used?
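
As a rough illustration of the wrapper-script approach mentioned above, the sketch below appends the flag to a KUBELET_DNS_ARGS line exactly once. The helper name `add_resolv_conf_flag` and the sample line layout are assumptions about what such a drop-in might look like, not a description of what kubeadm or the release packages actually ship.

```python
RESOLV_CONF_FLAG = "--resolv-conf=/run/systemd/resolve/resolv.conf"
# Assumed shape of the line in the kubelet drop-in; adjust to the real file.
PREFIX = 'Environment="KUBELET_DNS_ARGS='


def add_resolv_conf_flag(drop_in_text):
    """Append --resolv-conf to the KUBELET_DNS_ARGS line if it is not there yet."""
    out = []
    for line in drop_in_text.splitlines():
        stripped = line.rstrip()
        if (stripped.startswith(PREFIX) and stripped.endswith('"')
                and "--resolv-conf=" not in stripped):
            # Insert the flag just before the closing quote.
            line = stripped[:-1] + " " + RESOLV_CONF_FLAG + '"'
        out.append(line)
    return "\n".join(out) + "\n"
```

Running the function twice yields the same text as running it once, so it is safe to call from a provisioning script. After changing a drop-in, systemd has to reload its units and kubelet has to be restarted for the new argument to take effect.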

@codepainters

I've hit the very same issue with kubeadm 1.10.0 and CoreDNS, with even worse results: when CoreDNS is asked to resolve any external name, it starts looping back to itself, consuming all allowed RAM and getting OOM-killed.

Obviously it can be fixed either with the kubelet --resolv-conf param (as mentioned above) or by editing the config map that holds the Corefile, but it takes a moment to realise what's failing and why. It's unfortunate that the default setup fails so miserably.

I've raised an issue in CoreDNS tracker for better handling of such a misconfiguration on CoreDNS side: coredns/coredns#1647

@timothysc
Member

/assign @detiber @timothysc

@neolit123
Member

seems like a duplicate of #787
which is being worked on.

k8s-github-robot pushed a commit to kubernetes/kubernetes that referenced this issue May 11, 2018
Automatic merge from submit-queue (batch tested with PRs 63673, 63712, 63691, 63684). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md

kubeadm - add preflight warning when using systemd-resolved

**What this PR does / why we need it**:

This PR adds a preflight warning when the host is running systemd-resolved.

Newer Ubuntu releases (artful and bionic in particular) run systemd-resolved by default and, in the default configuration, have an /etc/resolv.conf file that references 127.0.0.53, which is not accessible from containers running on the host. We will now provide a warning to the user to tell them that the kubelet args should include `--resolv-conf=/run/systemd/resolve/resolv.conf`.

**Which issue(s) this PR fixes**:
This does not resolve the following issues, but it does provide better output to the users affected by the issues: kubernetes/kubeadm#273 kubernetes/kubeadm#787

**Release note**:
```release-note
NONE
```
@luxas
Member

luxas commented May 11, 2018

Yes, this one and #787 are duplicates. I'll close #787 as this one is older.

@luxas changed the title from "Install on system with systemd-resolved with DNSStubListener leads to broken kube-dns" to "Install on system with systemd-resolved leads to broken DNS" May 14, 2018
@luxas changed the title from "Install on system with systemd-resolved leads to broken DNS" to "Install on a system using systemd-resolved leads to broken DNS" May 14, 2018
@timothysc assigned timothysc and unassigned detiber May 15, 2018
@luxas
Member

luxas commented May 29, 2018

As we have the preflight check (added in kubernetes/kubernetes#63691), I'm gonna close this.
To make this work automatically, we have filed #845.

Thanks a lot to everyone who has contributed to fixing this!

@luxas closed this as completed May 29, 2018
asksven added a commit to asksven/kubernetes-the-hard-way-vagrant that referenced this issue Sep 12, 2018
Make sure kubelets use `/run/systemd/resolve/resolv.conf` and not `/etc/resolv.conf`, so that any dnsmasq / resolved installed on the workers does not interfere with the cluster's DNS resolution

Refs:
kubernetes/kubeadm#273
https://blog.sophaskins.net/blog/misadventures-with-kube-dns/
vannrt added a commit to platform9/nodeadm that referenced this issue Dec 7, 2018
This problem occurs because systems using systemd-resolved copy
127.0.0.53 from the host's /etc/resolv.conf.

More discussion here: kubernetes/kubernetes#45828

Related issues:
kubernetes/kubeadm#787
kubernetes/kubeadm#273
kubernetes/kubeadm#845

The upstream fix is now in v1.11.
vannrt added a commit to platform9/nodeadm that referenced this issue Dec 7, 2018
This problem occurs because kube-dns on systems using systemd-resolved
copies 127.0.0.53 from the host's /etc/resolv.conf.

Since 127.0.0.53 is a loopback address, DNS queries never get past
kube-dns, causing our conformance tests to fail on DNS-related issues.

More discussion here: kubernetes/kubernetes#45828

Related issues:
kubernetes/kubeadm#787
kubernetes/kubeadm#273
kubernetes/kubeadm#845

The upstream fix is now in v1.11.

Without the fix, the kubedns and dnsmasq containers would copy the host's `/etc/resolv.conf`:
```
# This file is managed by man:systemd-resolved(8). Do not edit.
#
# This is a dynamic resolv.conf file for connecting local clients to the
# internal DNS stub resolver of systemd-resolved. This file lists all
# configured search domains.
#
# Run "systemd-resolve --status" to see details about the uplink DNS servers
# currently in use.
#
# Third party programs must not access this file directly, but only through the
# symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a different way,
# replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

nameserver 127.0.0.53
search platform9.sys
```

After the fix:
```
# This file is managed by man:systemd-resolved(8). Do not edit.
#
# This is a dynamic resolv.conf file for connecting local clients directly to
# all known uplink DNS servers. This file lists all configured search domains.
#
# Third party programs must not access this file directly, but only through the
# symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a different way,
# replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

nameserver 10.105.16.2
nameserver 10.105.16.4
search platform9.sys
```
wkandek added a commit to wkandek/kwth-vbox that referenced this issue Mar 28, 2020