
coreDNS unable to resolve upstream #53

Closed
latchmihay opened this issue Feb 26, 2019 · 11 comments

@latchmihay

latchmihay commented Feb 26, 2019

Hello, I have a plain installation of k3s on Ubuntu 18.04.

I am running a container which is failing to resolve DNS:

# nslookup index.docker.io 10.43.0.10
Server:    10.43.0.10
Address 1: 10.43.0.10 kube-dns.kube-system.svc.cluster.local

nslookup: can't resolve 'index.docker.io': Try again

# nslookup quay.io 10.43.0.10
Server:    10.43.0.10
Address 1: 10.43.0.10 kube-dns.kube-system.svc.cluster.local

nslookup: can't resolve 'quay.io': Try again
# k3s kubectl logs -f pod/coredns-7748f7f6df-8htwl -n kube-system
2019-02-26T22:52:50.556Z [ERROR] plugin/errors: 2 index.docker.io. AAAA: unreachable backend: read udp 10.42.0.6:50878->1.1.1.1:53: i/o timeout
2019-02-26T22:52:50.556Z [ERROR] plugin/errors: 2 index.docker.io. A: unreachable backend: read udp 10.42.0.6:38587->1.1.1.1:53: i/o timeout
2019-02-26T22:53:18.425Z [ERROR] plugin/errors: 2 quay.io. AAAA: unreachable backend: read udp 10.42.0.6:48427->1.1.1.1:53: i/o timeout
2019-02-26T22:53:18.425Z [ERROR] plugin/errors: 2 quay.io. A: unreachable backend: read udp 10.42.0.6:53214->1.1.1.1:53: i/o timeout

I am not sure what 1.1.1.1 is or where it's coming from.

@jadsonlourenco

I am not sure what 1.1.1.1 is or where it's coming from.

That is Cloudflare's public DNS service, like Google's public DNS at 8.8.8.8.

@latchmihay
Author

Hmm, it's probably being blocked on my network. Any idea how it's being configured and how I could change it?

@ibuildthecloud
Contributor

@latchmihay We may have hardcoded 1.1.1.1; we will make that configurable. The default behavior of Kubernetes is to use the host's /etc/resolv.conf as the upstream DNS, but because systemd-resolved is the default these days (and older setups use dnsmasq), that file typically points at a 127.0.0.x IP and then things break. It's genuinely hard to figure out what the upstream DNS should actually be, so we probably hardcoded it to 1.1.1.1.

We will add this as an option to the agent and also document it.
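
For context, this is roughly what a stock systemd-resolved host (e.g. Ubuntu 18.04) exposes, which is why the host's /etc/resolv.conf can't simply be handed to CoreDNS as an upstream. The output below is illustrative, not taken from this cluster:

# cat /etc/resolv.conf
# This file is managed by man:systemd-resolved(8). Do not edit.
nameserver 127.0.0.53
options edns0

# cat /run/systemd/resolve/resolv.conf
nameserver 192.168.0.1

The real upstream servers live in /run/systemd/resolve/resolv.conf; the 192.168.0.1 entry is just an example.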

@jadsonlourenco

@ibuildthecloud Please keep the current behavior and just make it configurable. Keeping it saves a lot of time precisely because of the issue you described; I've hit it many times, and otherwise it would become an extra step on every new server installation.
Anyway, thank you. I was hoping to migrate to the new Rancher v2.

@bechampion

I fixed it by changing the coredns ConfigMap from 1.1.1.1 to 8.8.8.8 ... for whatever reason I could not reach 1.1.1.1:53.

@DMW007

DMW007 commented Mar 3, 2019

This can be done by replacing proxy . 1.1.1.1 with your own DNS server in the coredns ConfigMap. I wrote a detailed guide on how to change this manually and in an automated way for tools like Ansible here: https://devops.stackexchange.com/a/6521/6923
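
For example, a one-liner along these lines does the replacement (192.168.0.19 is a placeholder for your own DNS server, and this assumes the Corefile still uses the proxy plugin, as it did at the time):

k3s kubectl -n kube-system get configmap coredns -o yaml \
  | sed 's/proxy . 1.1.1.1/proxy . 192.168.0.19/' \
  | k3s kubectl -n kube-system apply -f -

Depending on whether the reload plugin is enabled in the Corefile, you may also need to restart the coredns pod for the change to take effect.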

@erikwilson
Contributor

We have created a release candidate v0.3.0-rc3 which will hopefully fix these DNS issues. Please try it out and let me know if it helps!

The setting is configurable: we accept a --resolv-conf flag that is passed down to the kubelet, and a K3S_RESOLV_CONF environment variable works as well. We now try to use the system resolv.conf files (from /etc and systemd), and will create a /tmp/k3s-resolv.conf file with nameserver 8.8.8.8 if the nameservers in the system files are not global unicast IPs.
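
A quick sketch of both options, assuming a custom resolv.conf at /etc/k3s-resolv.conf that points at a reachable upstream (the path and nameserver here are examples, not defaults):

# Write a resolv.conf with a reachable upstream
cat > /etc/k3s-resolv.conf <<'EOF'
nameserver 192.168.0.19
EOF

# Option 1: pass the flag, which is handed down to the kubelet
k3s server --resolv-conf /etc/k3s-resolv.conf

# Option 2: let the install script pick it up from the environment
curl -sfL https://get.k3s.io | K3S_RESOLV_CONF=/etc/k3s-resolv.conf sh -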

@DMW007

DMW007 commented Mar 30, 2019

I tried it out on Ubuntu 16.04.6 LTS with v0.3.0 (9a1a1ec), since the final 0.3 got released a few hours ago. Using curl -sfL https://get.k3s.io | K3S_RESOLV_CONFIG=192.168.0.19 sh - and removing my sed workaround from cm/coredns, it works, but only without providing a custom TLD:

root@rocket-chat:/# ping my-pc
PING my-pc.fritz.box (192.168.0.20) 56(84) bytes of data.
64 bytes from my-PC.fritz.box (192.168.0.20): icmp_seq=1 ttl=61 time=0.787 ms

But when I try ping my-pc.fritz.box, the name can't be resolved. nslookup also times out:

root@rocket-chat:/# nslookup my-pc.fritz.box
;; connection timed out; no servers could be reached

On other machines in the same network that use 192.168.0.19 as their DNS server, both names resolve successfully. Although I'm able to resolve my-pc.fritz.box inside Vagrant itself, it may have something to do with the fact that I'm trying this in Vagrant on Ubuntu 18.04. Content of /etc/resolv.conf inside Vagrant:

nameserver 10.0.2.2
search fritz.box

Update: It's a Kubernetes issue

Found out that this was caused by Kubernetes' ndots config. By default, pods have options ndots:5 set in their resolv.conf. This means that DNS names must contain at least five dots before they are treated as absolute names. my-pc doesn't contain any dots, so it ends up being resolved via our upstream 192.168.0.19, where an alias without the .fritz.box suffix exists by default.

But my-pc.fritz.box contains two dots. The usual resolver default is ndots:1, so any DNS name with at least one dot would be resolved as an absolute name. Since Kubernetes uses ndots:5, my-pc.fritz.box is treated as a relative name, so all suffixes from the search list are applied. This can't work, because another .fritz.box suffix gets appended and my-pc.fritz.box becomes my-pc.fritz.box.fritz.box.

I assume this is meant to speed things up for internal cluster DNS entries, but for external DNS it can slow things down. Using apt-get to install some debug packages like netutils was very slow; since I switched away from the default ndots:5 it got pretty fast, like on my working machine. You can also find blog posts about this issue. In my case, though, the primary problem was that it breaks my absolute external DNS entries.
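
To make this concrete, the resolver config inside such a pod looks roughly like this with the Kubernetes defaults (the search entries are illustrative and depend on the namespace and the host's search domains):

# /etc/resolv.conf inside the pod
nameserver 10.43.0.10
search default.svc.cluster.local svc.cluster.local cluster.local fritz.box
options ndots:5

Because my-pc.fritz.box has only two dots, fewer than ndots:5, the search suffixes are tried first, producing lookups such as my-pc.fritz.box.fritz.box.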

To solve this, customize the pod's DNS configuration by setting dnsConfig in the pod spec:

spec:
  containers:
  # ...
  dnsConfig:
    options:
      - name: ndots
        value: "1"
But regarding Kubernetes' own DNS, I'd consider this a workaround for local use, since I'm not completely sure about the performance impact in production yet. As another solution, we could force absolute domain names by appending a trailing dot (e.g. my-pc.fritz.box.).

Currently I'm using the dnsConfig entry, and DNS works well with my custom server. So this problem wasn't related to k3s directly, and the fix in 0.3 works well :)

@lindhe

lindhe commented Sep 25, 2020

Is there any way to set the dnsConfig options globally instead of on a per-pod basis?

@brettinternet

For anyone arriving here from a search engine, I was able to resolve my cluster's DNS issues by

(a) using the legacy iptables backend rather than nftables, (b) ensuring the CNI is correctly installed (I use Calico on hardware with multiple NICs, which requires additional setup for IP detection), and (c) flushing the iptables rules left over from the CNI between cluster installs.

iptables --version
# iptables v1.8.7 (legacy)
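# Reset default policies to ACCEPT and flush all rules and user-defined chains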
iptables -P INPUT ACCEPT
iptables -P FORWARD ACCEPT
iptables -P OUTPUT ACCEPT
iptables -t nat -F
iptables -t mangle -F
iptables -F
iptables -X
# ... Install k3s
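
If point (a) is what you need, on Debian/Ubuntu hosts the switch to the legacy backend is typically done with update-alternatives (a sketch; verify the paths on your distro and restart k3s or reboot afterwards):

update-alternatives --set iptables /usr/sbin/iptables-legacy
update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy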

@akelge

akelge commented Jan 19, 2024

Not completely on topic, but the fact that issue #53 is about DNS seems done on purpose :)
