DNS resolution fails on Ubuntu k8s cluster #1017

venilnoronha opened this Issue Jul 10, 2018 · 4 comments

venilnoronha commented Jul 10, 2018

The nslookup command fails to resolve service addresses like kubernetes.default from a busybox pod running on a k8s cluster deployed over Ubuntu VMs with the flannel plugin. The same setup, however, works with the Weave Net plugin.

Expected Behavior

DNS resolution (nslookup) should work right out of the box on a busybox pod for addresses like kubernetes.default.

Current Behavior

The following error is observed.

root@master-0-ubuntu1604:~# kubectl exec -ti busybox -- nslookup kubernetes.default
Server:    10.96.0.10
Address 1: 10.96.0.10

nslookup: can't resolve 'kubernetes.default'
command terminated with exit code 1

Steps to Reproduce (for bugs)

  1. Deploy a k8s cluster using kubeadm. Ensure that --pod-network-cidr=10.244.0.0/16 is set with kubeadm init.
  2. Deploy flannel: kubectl create -f https://raw.githubusercontent.com/coreos/flannel/v0.10.0/Documentation/kube-flannel.yml
  3. Deploy a busybox pod: kubectl create -f https://k8s.io/examples/admin/dns/busybox.yaml
  4. Look up kubernetes.default via nslookup: kubectl exec -ti busybox -- nslookup kubernetes.default (see the DNS sanity check sketched below)
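
Before step 4, it can help to confirm that the cluster DNS pieces are actually up. A rough sanity check, assuming the default kube-system namespace and the stock k8s-app=kube-dns labels:

# Confirm the DNS pods are Running and the kube-dns service ClusterIP exists.
kubectl -n kube-system get pods -l k8s-app=kube-dns
kubectl -n kube-system get svc kube-dns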

Context

Pods from Istio failed to start on my k8s cluster when using Flannel; however, they worked perfectly fine with Weave Net. istio/istio#5379 describes the issue in further detail.

Your Environment

  • Flannel version: v0.10.0/v0.9.0
  • Backend used (e.g. vxlan or udp): vxlan
  • Kubernetes version: v1.11.0
  • Operating System and version: Ubuntu 16.04 (Kernel: 4.4.0-42-generic x86_64 GNU/Linux)

randyrue commented Jul 18, 2018

I'm having the same problem with Kubernetes 1.10.5 and Ubuntu 18.04 LTS. I'm new to Kubernetes and am probably misunderstanding how this all works, so I'd appreciate any guidance.

If I go to the shell on a busybox pod, I can't resolve kubernetes.default or kubernetes.x, where x is one of the three namespaces I've created.

But I'm getting slightly farther than the OP. When I run nslookup I get:

/ # nslookup kubernetes.default
Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

nslookup: can't resolve 'kubernetes.default'
/ #

This suggests that the nameserver IP is successfully resolving to a PTR record of some sort. Sure enough, an nslookup against the nameserver's name shown above works:

/ # nslookup kube-dns.kube-system.svc.cluster.local
Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name:      kube-dns.kube-system.svc.cluster.local
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
/ #

So nslookup appears to be able to reach a 10.96 IP even though there are no interfaces, IPs or routing entries on my nodes or pods related to any 10.xxx network.
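
For what it's worth, 10.96.0.10 is a Kubernetes ClusterIP, and on a kubeadm cluster kube-proxy normally implements ClusterIPs as iptables NAT rules on each node rather than as interfaces or routes. A rough way to see this (run as root on a node; the 10.96.0.10 address is taken from the output above):

# Show the kube-proxy NAT rules that match the DNS ClusterIP and DNAT
# the traffic to the kube-dns pod IP(s).
iptables-save -t nat | grep 10.96.0.10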

And it appears that kube-dns is not forwarding other requests to the node's nameserver entries, which are present in /etc/netplan/bond0.yaml.

What am I missing?
How is traffic reaching kube-dns at a 10.96 IP?
Where/how is kube-dns supposed to know to forward lookups to the node's nameserver setting?

Hope to hear from you...

randyrue commented Jul 18, 2018

OK, I found this earlier issue: #983

It's much like mine; they're also getting the reverse lookup for the kube-dns pod.

I added the kube-dns configmap with the upstreamNameservers setting pointed at my on-premise nameservers, and I can now nslookup both external and on-premise hostnames, but it takes almost exactly 20s to return an address. In fact, for google.com it takes 20s to return an IPv6 address and then another 20s to return the IPv4 one.

This smells like a timeout setting that needs to be changed.
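
For reference, roughly what I applied, following the kube-dns ConfigMap format for upstream nameservers (the IPs below are placeholders for my on-premise resolvers):

# Point kube-dns at upstream resolvers; substitute the real on-premise IPs.
kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
data:
  upstreamNameservers: |
    ["10.0.0.53", "10.0.0.54"]
EOF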

randyrue commented Jul 18, 2018

OK, it's getting weirder.

After deleting and relaunching the busybox pod, nslookup now immediately returns a "non-authoritative" answer, but then it hangs for exactly 5s and returns the error "*** Can't find s---.org: No answer".

However, calls like ping resolve and reply right away. I think this is an nslookup quirk, and I'm calling it good for now.

hokiegeek2 commented Aug 24, 2018

From inside the busybox pod, after applying the upstream nameserver config @lukaszpy mentioned above, I get the following for external nslookups (minikube v0.28.1 on Ubuntu 18.04):

/ # nslookup 8.8.8.8
Server: 10.96.0.10
Address: 10.96.0.10:53

Non-authoritative answer:
8.8.8.8.in-addr.arpa name = google-public-dns-a.google.com

/ # nslookup google.com
Server: 10.96.0.10
Address: 10.96.0.10:53

Non-authoritative answer:
Name: google.com
Address: 172.217.5.238

*** Can't find google.com: No answer

/ #

When I attempt kubernetes.default, I get this:

/ # nslookup kubernetes.default
Server: 10.96.0.10
Address: 10.96.0.10:53

** server can't find kubernetes.default: NXDOMAIN

*** Can't find kubernetes.default: No answer

/ #
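
A short name like kubernetes.default depends on the search domains in the pod's /etc/resolv.conf, so a rough way to narrow this down (service name and namespace assumed from a default kubeadm/minikube setup):

# Check which nameserver and search domains the pod is actually using.
kubectl exec -ti busybox -- cat /etc/resolv.conf
# Make sure the kube-dns service has healthy endpoints behind it.
kubectl -n kube-system get endpoints kube-dns
# Query the fully qualified name to separate search-path problems from
# resolution problems.
kubectl exec -ti busybox -- nslookup kubernetes.default.svc.cluster.local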
