
kube-dns 1.14.7 does not resolve cluster services without external dns #169

Closed
tuminoid opened this issue Nov 29, 2017 · 16 comments
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@tuminoid

If the host does not have a working DNS nameserver in /etc/resolv.conf, kube-dns fails to resolve cluster services.

I created a single-machine k8s cluster (1.7.4, kube-dns 1.14.7) in a VM, for a hack lab on a machine that has no internet/intranet connectivity. The host therefore has no working DNS: /etc/resolv.conf lists a DNS IP, but it is blocked by a firewall.

In this case, kube-dns fails to resolve any cluster service in any namespace, including kube-dns itself, unless it is queried by FQDN (kube-dns.kube-system.svc.cluster.local), even though /etc/resolv.conf in the container points correctly to kube-dns and contains the correct search options (kube-system.svc.cluster.local svc.cluster.local cluster.local).

@tuminoid
Author

This is 100% reproducible on a vagrant box too. Just change the nameserver to an IP that doesn't point anywhere.
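
For illustration, a host /etc/resolv.conf like this reproduces the no-route case (192.0.2.1 is an example value from the RFC 5737 TEST-NET-1 documentation range, so queries to it should go nowhere):

# /etc/resolv.conf on the host
# 192.0.2.1 is a TEST-NET-1 documentation address (RFC 5737); queries to it just time out
nameserver 192.0.2.1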

As a side note, if /etc/resolv.conf is missing from the VM, kube-dns won't even start. Does that warrant a separate issue?

@bowei
Member

bowei commented Nov 29, 2017

Can you try creating a pod with an unbound server that serves NXDOMAIN (see this gist) and setting it as the upstream nameserver using the kube-dns configmap?
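
Not the gist itself, but a minimal sketch of that setup, assuming unbound's static local-zone type (names with no local data get NXDOMAIN) and an example Service IP:

# unbound.conf — answer NXDOMAIN for every name
server:
  interface: 0.0.0.0
  access-control: 0.0.0.0/0 allow
  local-zone: "." static

kube-dns would then be pointed at it through its ConfigMap (upstreamNameservers is a documented kube-dns key; the value is a JSON list inside a string):

apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
data:
  upstreamNameservers: |
    ["10.254.0.3"]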

@tuminoid
Author

Using the configurations in this gist (please correct me if something is wrong), this does not help. It also doesn't make a difference if nameserver 192.168.200.7 is used in the host's /etc/resolv.conf. I also tried without hostNetwork, no difference. Removing unbound's access-control makes no difference either.

If I use kube-dns 1.9, which we had prior to upgrading to 1.14.7, it just works, no matter what the host's resolv.conf contains.

@tuminoid
Author

Made it work with unbound and upstreamNameservers. It appears that local-zone "." is not valid (or doesn't trigger the right response), but adding local-zone "local." and local-zone "cluster.local." did the trick. I'm now running unbound in a container with clusterIP: 10.254.0.3 and pointing kube-dns upstreamNameservers to that IP.
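
A minimal sketch of that unbound configuration as described above (the exact file is not in this thread, and the static zone type is an assumption):

# unbound.conf
server:
  interface: 0.0.0.0
  access-control: 0.0.0.0/0 allow
  # local-zone "." did not trigger the right response here;
  # declaring the cluster suffixes explicitly did
  local-zone: "local." static
  local-zone: "cluster.local." static

with the kube-dns ConfigMap's upstreamNameservers set to ["10.254.0.3"], the clusterIP of the unbound Service.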

That said, I'd consider needing such tricks a bug, especially since it's a regression from kube-dns 1.9.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 12, 2018
@tuminoid
Author

/remove-lifecycle stale

Still very much a valid issue.

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 12, 2018
@mikehollinger

Hitting this as well!

@bowei bowei added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label May 3, 2018
@jjustinwhite

I had a similar issue, but removing hostNetwork fixed it for me, so mine was not the same as above. Figured I'd mention it though, since this post helped me realize that having hostNetwork set to true was what was causing my pods to be unable to resolve the FQDNs.
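
For context: a pod with hostNetwork: true inherits the node's /etc/resolv.conf unless its dnsPolicy is set to ClusterFirstWithHostNet. A minimal sketch of a pod that keeps hostNetwork but still queries kube-dns (pod name and image are example values):

apiVersion: v1
kind: Pod
metadata:
  name: dnstest
spec:
  hostNetwork: true
  # without this, a hostNetwork pod uses the node's resolv.conf
  # and never talks to kube-dns
  dnsPolicy: ClusterFirstWithHostNet
  containers:
  - name: busybox
    image: busybox:1.28
    command: ["sleep", "3600"]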

@tuminoid
Author

Thanks @MrHohn for the ping, but unfortunately kubernetes/kubernetes#67302 does not resolve this issue. With a non-responding DNS server, kube-dns still fails to resolve cluster-local DNS names and just times out. Tested with k8s 1.10.4 and kube-dns 1.14.10.

@chrisohaver
Contributor

chrisohaver commented Aug 15, 2018

kube-dns fails to resolve any cluster service in any namespace, including kube-dns itself, unless queried using FQDN

@tuminoid, how are you executing the queries? For example, if using dig, I recall it does not follow the search path unless you specify +search ...

(edit) Also, recent busybox builds have a "broken" nslookup that does not follow the search path.
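
For example, from inside a pod (illustrative commands):

# dig treats the name as absolute and skips the search path by default
dig kubernetes.default
# +search applies the search list from the pod's /etc/resolv.conf
dig +search kubernetes.default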

@MrHohn
Member

MrHohn commented Aug 15, 2018

With non-responding DNS server, kube-dns still fails to resolve cluster-local DNS names and just times out. Tested with k8s 1.10.4 and kube-dns 1.14.10.

Hmm, I got a different result from what you described; probably our setups are different.

I basically set upstreamNameservers in the kube-dns configmap to [127.0.0.1] so that any external name should be unresolvable. With kubernetes/kubernetes#67302, below is what I got within the cluster.

:~# nslookup kubernetes.default.svc.cluster.local.
Server:         10.0.0.10
Address:        10.0.0.10#53

Name:   kubernetes.default.svc.cluster.local
Address: 10.0.0.1

:~# nslookup google.com
Server:         10.0.0.10
Address:        10.0.0.10#53

** server can't find google.com: REFUSED

cc @hedayat

@tuminoid
Author

tuminoid commented Aug 16, 2018

@chrisohaver I'm using busybox with nslookup, and it works as it should in both the positive and negative tests. I'm aware of dig behaving differently and won't use it for testing.

@MrHohn:
The test case is whether kubernetes and kubernetes.default resolve correctly (test commands are sketched after the list below). kubernetes.default.svc.cluster.local always resolves; there is never an issue with that.

The result is the same whether you set upstreamNameservers in the kube-dns config or put the same directly in /etc/resolv.conf:

  1. An IP with no route to it: each DNS request hangs until timeout, and nothing gets resolved.

  2. 127.0.0.1 (something has a route, but it refuses the connection): instant lookup failure for everything, no timeouts.

  3. No IP address at all: same as above. Note that when upstreamNameservers is empty, it falls back to resolv.conf, which then also has to be empty.

  4. The IP of any DNS server that accepts connections, regardless of query response: everything resolves as it should.
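
A sketch of the positive/negative tests referenced above, from a throwaway pod (busybox:1.28, since newer busybox images ship the broken nslookup noted earlier; the pod name and bogus hostname are example values):

# positive test: a short name should resolve via the search path
kubectl run -it --rm dnstest --image=busybox:1.28 --restart=Never -- nslookup kubernetes.default
# negative test: a bogus external name should fail fast, not hang until timeout
kubectl run -it --rm dnstest --image=busybox:1.28 --restart=Never -- nslookup no-such-host.invalid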

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 14, 2018
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 14, 2018
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
