DNS searches across namespaces is BROKEN on some OSes #10161

Closed
thockin opened this issue Jun 21, 2015 · 28 comments
Labels
priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now.

@thockin
Member

thockin commented Jun 21, 2015

Debian wheezy:

root@d7c52f99509a:/# dig +search +noall +answer kubernetes.default
kubernetes.default.svc.cluster.local. 30 IN A   10.0.0.1
root@d7c52f99509a:/# dig +search +noall +answer +ndots=3 kubernetes.default
kubernetes.default.svc.cluster.local. 30 IN A   10.0.0.1

Debian jessie (and ubuntu 14.04):

root@8f7524ea72c0:/# dig +search +noall +answer kubernetes.default
root@8f7524ea72c0:/# dig +search +noall +answer +ndots=3 kubernetes.default
kubernetes.default.svc.cluster.local. 30 IN A   10.0.0.1

Our DNS tester just happens to be based on wheezy, as is our container VM image. Apparently this does not work (and has never worked?) on other OSes. There is a resolv.conf option called "ndots" that governs this; it is documented as defaulting to 1, while we need it to be at least 3. I found a thread where someone else also saw this on Debian in late 2014, but with no root cause.
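
For readers unfamiliar with ndots, here is a minimal Python sketch of the ordering rule documented in resolv.conf(5): a name with at least ndots dots is tried as an absolute name first, otherwise the search domains are tried first. `candidate_queries` is a hypothetical helper written for illustration, not part of any resolver library. Real resolvers also differ in whether they fall back to the search list after the absolute query fails, and the jessie output above suggests that dig there does not.

```python
from typing import List


def candidate_queries(name: str, search: List[str], ndots: int = 1) -> List[str]:
    """Return the fully qualified names a resolver would try, in order.

    Simplified model of the rule in resolv.conf(5); illustrative only.
    """
    if name.endswith("."):
        # A trailing dot marks the name as absolute; the search list is skipped.
        return [name]
    absolute = name + "."
    searched = [f"{name}.{domain.rstrip('.')}." for domain in search]
    if name.count(".") >= ndots:
        # Enough dots: the absolute name is tried before the search list.
        return [absolute] + searched
    # Too few dots: the search domains are tried first.
    return searched + [absolute]


search = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]
print(candidate_queries("kubernetes.default", search, ndots=1))
print(candidate_queries("kubernetes.default", search, ndots=3))
```

With ndots=1 the first candidate is the bare `kubernetes.default.`, which does not exist in cluster DNS. With ndots=3 the search domains come first, so `kubernetes.default.svc.cluster.local.` is reached on the second attempt.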

As far as I can tell, there is no way to get docker to write this resolv.conf option. I will open a bug with them.

@smarterclayton @bgrant0607 @brendandburns @ArtfulCoder

I am marking this as P0 for discussion so we can decide what, if anything, to do.

@thockin thockin added priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. team/cluster labels Jun 21, 2015
@thockin thockin self-assigned this Jun 21, 2015
@ArtfulCoder
Contributor

Per https://docs.docker.com/articles/networking/, Docker copies the host's /etc/resolv.conf into the container if we don't pass DNS flags when creating the container.

So one way could be to modify the /etc/resolv.conf file on each node to have the additional nameserver, the new search paths and the ndots option.

This also helps make cluster DNS available to nodes without having to specify 10.0.0.10.

Proof-of-concept below..
kubelet could potentially be responsible for updating the /etc/resolv.conf on the node it runs on.

root@kubernetes-minion-35cn:/home/abshah# cat /etc/resolv.conf 
domain c.abshah-kubernetes-001.internal.
search cluster.local c.abshah-kubernetes-001.internal. 808746819879.google.internal. google.internal.
nameserver 10.0.0.10
nameserver 169.254.169.254
nameserver 10.240.0.1
options ndots:3

docker run -ti ubuntu:14.04 /bin/bash
root@37e8c6e04b7b:/# apt-get install -y dnsutils
root@37e8c6e04b7b:/# cat /etc/resolv.conf 
domain c.abshah-kubernetes-001.internal.
search cluster.local c.abshah-kubernetes-001.internal. 808746819879.google.internal. google.internal.
nameserver 10.0.0.10
nameserver 169.254.169.254
nameserver 10.240.0.1
options ndots:3

root@37e8c6e04b7b:/# dig +search +noall +answer kubernetes.default
kubernetes.default.cluster.local. 30 IN A       10.0.0.1
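
A resolv.conf like the one in the proof-of-concept above could be generated along these lines. This is only a sketch: `render_resolv_conf` and its parameters are hypothetical names chosen for illustration, and the values are copied from the transcript.

```python
from typing import List, Optional


def render_resolv_conf(
    nameservers: List[str],
    search: List[str],
    ndots: Optional[int] = None,
    domain: Optional[str] = None,
) -> str:
    """Render resolv.conf text in the same line order as the transcript above."""
    lines = []
    if domain:
        lines.append(f"domain {domain}")
    if search:
        lines.append("search " + " ".join(search))
    # One "nameserver" line per server, in priority order.
    lines.extend(f"nameserver {ns}" for ns in nameservers)
    if ndots is not None:
        lines.append(f"options ndots:{ndots}")
    return "\n".join(lines) + "\n"


print(
    render_resolv_conf(
        nameservers=["10.0.0.10", "169.254.169.254", "10.240.0.1"],
        search=["cluster.local", "c.abshah-kubernetes-001.internal."],
        ndots=3,
        domain="c.abshah-kubernetes-001.internal.",
    ),
    end="",
)
```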

@smarterclayton
Contributor

That would explain why we didn't see this in most of our (OpenShift) testing - we moved the resolver entry for service DNS to the node resolv.conf so that the docker daemon could resolve service DNS names and apply SSL certificates to the internal hosted Docker registry (which was a service).

@thockin
Member Author

thockin commented Jun 21, 2015

Do you set ndots? How do you do namespace-relative resolving without a search path for each namespace?

@ArtfulCoder
Contributor

To make the resolv.conf pod-specific, kubelet could create a pod-specific foo-pod-resolv.conf and mount it at /etc/resolv.conf for the pod's pause container.
I tested it and verified that Docker does not interfere with the mount to /etc/resolv.conf on the container.

This is not a clean solution, but it could be a workaround until Docker provides an ndots flag.

@thockin
Member Author

thockin commented Jun 21, 2015

Abhi,

Great idea, but I can't find a way to make Docker respect that as a volume mount. It seems to always write its own, no matter what.

@smarterclayton
Contributor

Yeah it's not possible to mount over

@ArtfulCoder
Contributor

// node has the custom conf file (foo.conf)
root@kubernetes-minion-35cn:/home/abshah# cat /etc/foo.conf 
domain c.abshah-kubernetes-001.internal.
search cluster.local c.abshah-kubernetes-001.internal. 808746819879.google.internal. google.internal.
nameserver 10.0.0.10
nameserver 169.254.169.254
nameserver 10.240.0.1
options ndots:3

// create a docker container with the volume mount
root@kubernetes-minion-35cn:/home/abshah# docker run -ti -v /etc/foo.conf:/etc/resolv.conf ubuntu:14.04 /bin/bash

// cat the /etc/resolv.conf on the container
root@40974bb94981:/# cat /etc/resolv.conf 
domain c.abshah-kubernetes-001.internal.
search cluster.local c.abshah-kubernetes-001.internal. 808746819879.google.internal. google.internal.
nameserver 10.0.0.10
nameserver 169.254.169.254
nameserver 10.240.0.1
options ndots:3

// run a second container with --net = first container 
docker run -ti --net=container:40974bb94981 ubuntu:14.04  /bin/bash

root@40974bb94981:/# dig +search +noall +answer kubernetes.default
kubernetes.default.cluster.local. 30 IN A       10.0.0.1

root@40974bb94981:/# cat /etc/resolv.conf 
domain c.abshah-kubernetes-001.internal.
search cluster.local c.abshah-kubernetes-001.internal. 808746819879.google.internal. google.internal.
nameserver 10.0.0.10
nameserver 169.254.169.254
nameserver 10.240.0.1
options ndots:3

If we modify kubelet to mount the custom resolv.conf on the pause container, it should work.

Edit: The above doesn't work. See comments below.

@thockin
Member Author

thockin commented Jun 22, 2015

That is not the behavior I see.

$ cat /tmp/res
nameserver 10.0.0.10
search hockin.org
options ndots:3

$ docker run -ti -v /tmp/res:/etc/resolv.conf busybox cat /etc/resolv.conf
# $Id: //depot/google3/googledata/corp/puppet/goobuntu/common/modules/network/files/resolvconf/head#4 $

# Warning:
# --------
# The head ('/etc/resolvconf/resolv.conf.d/head') of '/etc/resolv.conf' is
# managed by Puppet and changes to this file will be overwritten!
# Furthermore '/etc/resolv.conf' is managed by 'resolvconf' and changes will
# be overwritten! See 'man 8 resolvconf' for more information.

# IMPORTANT NOTE:
# On desktops and laptops, NetworkManager invokes dnsmasq to provide a caching
# local DNS servers. DnsMasq is usually configured to _not_ query DNS servers
# in the order listed here, but to instead prefer servers that are known to be
# up. To change, run:
#   sudo goobuntu-config -U set dns_strict_order true
#   sudo bash -c 'echo strict-order >/etc/NetworkManager/dnsmasq.d/strict_order'
#   sudo service network-manager restart
# You might need to disconnect and reconnect your network.
# To revert, run:
#   sudo goobuntu-config -u set dns_strict_order recommended
# Your DNS config will revert to the default when you reboot (or restart
# network-manager and reconnect).
search hockinhome.org

# $Id: //depot/google3/googledata/corp/puppet/goobuntu/common/modules/network/files/resolvconf/tail#4 $

# Warning:
# --------
# The tail ('/etc/resolvconf/resolv.conf.d/tail') of '/etc/resolv.conf' is
# managed by Puppet and changes to this file will be overwritten!

# It is common within Google to lookup hostnames with a dot in it like for an
# example '<machine>.<site>'. We set 'options ndots:2' so that such hostnames
# get tried first with the Google search domains appended. Without this such
# queries might be tried first against an external DNS resolver.
options ndots:2

nameserver 8.8.8.8
nameserver 8.8.4.4

On Sun, Jun 21, 2015 at 5:25 PM, Abhi Shah notifications@github.com wrote:

The reason the mount didn't work from the pod definition is that the volume was not mounted on the pause container, which controls the network namespace. If we modify kubelet to mount the custom resolv.conf on the pause container, it should work.


Reply to this email directly or view it on GitHub
#10161 (comment)
.

@ArtfulCoder
Contributor

@thockin I see the behavior you see.
In my initial test my node's /etc/resolv.conf had been modified as well, so the file I mounted over /etc/resolv.conf was identical to it and I couldn't see that Docker was overriding it.

Node's /etc/resolv.conf always wins.

Edit: Check the comment below. We can simply umount /etc/resolv.conf and the user-defined mount becomes visible.

@ArtfulCoder
Contributor

@thockin @smarterclayton umount /etc/resolv.conf does the trick. (if container is running as privileged)

$ docker run -ti --privileged -v /tmp/foo.conf:/etc/resolv.conf ubuntu:latest /bin/bash

root@ab7c3cd9f8d3:/# cat /etc/resolv.conf 
search corp.google.com prod.google.com prodz.google.com google.com
options ndots:2
nameserver 8.8.8.8
nameserver 8.8.4.4

root@ab7c3cd9f8d3:/# umount /etc/resolv.conf

root@ab7c3cd9f8d3:/# cat /etc/resolv.conf 
nameserver 10.0.0.10
search hockin.org
options ndots:3

@ArtfulCoder
Contributor

Unfortunately, this means that every container would have to be able to run as privileged to execute umount. Maybe we can use nsenter to unmount without having to run the docker container as privileged.

@ArtfulCoder
Contributor

To run the container in non-privileged mode, and still override resolv.conf we can do the following:

$ docker run -ti --privileged -v /tmp/foo.conf:/etc/resolv.conf ubuntu:latest /bin/bash
root@ab7c3cd9f8d3:/# cat /etc/resolv.conf 
search corp.google.com prod.google.com prodz.google.com google.com
options ndots:2
nameserver 8.8.8.8
nameserver 8.8.4.4

$ pid_of_container=`docker inspect --format "{{ .State.Pid }}" ab7c3cd9f8d3`
$ nsenter -m -u -n -i -p -t $pid_of_container umount /etc/resolv.conf

root@9c872e33a219:/# cat /etc/resolv.conf 
domain c.abshah-kubernetes-001.internal.
search cluster.local c.abshah-kubernetes-001.internal. 808746819879.google.internal. google.internal.
nameserver 10.0.0.10
nameserver 169.254.169.254
nameserver 10.240.0.1
options ndots:3

@thockin
Member Author

thockin commented Jun 22, 2015

And they would all have to have and run umount. This is a non-starter, I think.

@ArtfulCoder
Contributor

Steps would be:

  1. kubelet starts every container (non-privileged) with an additional mount /etc/containerFoo_resolv.conf:/etc/resolv.conf
  2. kubelet runs nsenter -m -u -n -i -p -t $pid_of_container umount /etc/resolv.conf
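
The two steps above could be driven from a small script roughly like the following. It only builds the argument lists and does not execute anything. The `docker inspect` format string and the `nsenter` flags are the ones used earlier in this thread; the function names and the example PID are hypothetical, and this is not how kubelet is actually implemented.

```python
import shlex
from typing import List


def inspect_pid_command(container_id: str) -> List[str]:
    """Command that prints the host PID of the container's main process."""
    return ["docker", "inspect", "--format", "{{ .State.Pid }}", container_id]


def umount_resolv_conf_command(pid: int) -> List[str]:
    """Command that enters the container's namespaces and unmounts the
    resolv.conf that Docker bind-mounted on top of the user-supplied file."""
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    return [
        "nsenter", "-m", "-u", "-n", "-i", "-p",
        "-t", str(pid),
        "umount", "/etc/resolv.conf",
    ]


# Print the commands instead of running them; 12345 is a made-up PID.
for cmd in (inspect_pid_command("ab7c3cd9f8d3"), umount_resolv_conf_command(12345)):
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Running `nsenter` from the node means the container itself needs neither privileges nor a `umount` binary, though the caller on the node must have sufficient privileges to enter the container's namespaces.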

@thockin
Member Author

thockin commented Jun 22, 2015

This is a race with containers actually starting up.

@ArtfulCoder
Contributor

The node's /etc/resolv.conf isn't wrong. It is just a subset and does not resolve cluster DNS names.
We will have a more complete /etc/resolv.conf eventually.

Also, since ndots is not specified in the original /etc/resolv.conf, we don't have to worry about incorrect resolutions.
We do this only where the ndots workaround is required.

// Node's /etc/resolv.conf: 
root@kubernetes-minion-35cn:/home/abshah# cat /etc/resolv.conf 
domain c.abshah-kubernetes-001.internal.
search cluster.local c.abshah-kubernetes-001.internal. 808746819879.google.internal. google.internal.
nameserver 169.254.169.254
nameserver 10.240.0.1

// Container's /etc/resolv.conf
root@kubernetes-minion-35cn:/home/abshah# docker exec -ti  5c1943bfbbd9 sh
/ # cat /etc/resolv.conf 
nameserver 10.0.0.10
nameserver 169.254.169.254
nameserver 10.240.0.1
search default.svc.cluster.local svc.cluster.local cluster.local c.abshah-kubernetes-001.internal. 808746819879.google.internal. google.internal.
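
The container's search line above follows a simple pattern: the pod's namespace domain, then the service domain, then the cluster domain, then whatever the node already had. A sketch of that pattern, with `pod_search_path` as a hypothetical helper name and the values taken from the transcript:

```python
from typing import List, Sequence


def pod_search_path(
    namespace: str,
    cluster_domain: str = "cluster.local",
    host_search: Sequence[str] = (),
) -> List[str]:
    """Build a per-pod search list: namespace first, then broader scopes."""
    return [
        f"{namespace}.svc.{cluster_domain}",  # same-namespace services
        f"svc.{cluster_domain}",              # <service>.<namespace> lookups
        cluster_domain,
        *host_search,                         # domains inherited from the node
    ]


path = pod_search_path(
    "default",
    host_search=[
        "c.abshah-kubernetes-001.internal.",
        "808746819879.google.internal.",
        "google.internal.",
    ],
)
print("search " + " ".join(path))
```

A cross-namespace name such as `kubernetes.default` only resolves through the second entry, `svc.cluster.local`, which is why the resolver has to consult the search list for names that already contain a dot.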

@smarterclayton
Contributor

We could also use dnsmasq on the node and configure the container to point to that. I believe we can do anything at that point (assuming dnsmasq supports arbitrary ndots values).

@smarterclayton
Contributor

Although I guess that's still controlled inside the container. I don't know if there's a way around having ndots written to inner container. Everything else is arbitrary search.

@ArtfulCoder
Contributor

We can do the following:

  • kubelet starts the container with all the dns flags it uses today (additional nameserver, domain etc)
  • Additionally kubelet adds the /etc/resolv.conf mount

At this point, functionality-wise, nothing has changed. The mounted /etc/resolv.conf is ignored.
For containers without the ndots problem, life goes on as before.
Containers with the ndots problem still face the same problems they face today.

  • kubelet umounts /etc/resolv.conf

Containers with no problems keep working, since the new /etc/resolv.conf is essentially the same.
Containers with cluster DNS resolution problems now see cluster DNS names being resolved.

So we don't go backwards with this approach.
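
As a rough illustration of the first two bullets, the container start could look like the argument list built below: the usual DNS flags plus an extra bind mount over /etc/resolv.conf. `--dns`, `--dns-search` and `-v` are standard `docker run` flags; the helper name, the host path and the image are hypothetical, and the CLI form is used purely for illustration.

```python
from typing import List, Sequence


def docker_run_args(
    image: str,
    nameservers: Sequence[str],
    search: Sequence[str],
    resolv_conf_path: str,
) -> List[str]:
    """Build a `docker run` argument list with DNS flags and a resolv.conf mount."""
    args = ["docker", "run", "-d"]
    for ns in nameservers:
        args += ["--dns", ns]
    for domain in search:
        args += ["--dns-search", domain]
    # Docker mounts its own generated file on top of this one, so the mount
    # only becomes visible after the later umount step.
    args += ["-v", f"{resolv_conf_path}:/etc/resolv.conf"]
    args.append(image)
    return args


print(
    docker_run_args(
        "ubuntu:14.04",
        nameservers=["10.0.0.10"],
        search=["default.svc.cluster.local", "svc.cluster.local", "cluster.local"],
        resolv_conf_path="/etc/containerFoo_resolv.conf",
    )
)
```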

@thockin
Member Author

thockin commented Jun 22, 2015

This might just work. It's a little scary, but it might work.

@smarterclayton
Contributor

Scary. Seems like ndots is a pretty valuable thing for Docker in general - is there already an issue open to consider it?

@thockin
Member Author

thockin commented Jun 22, 2015

Yes, I have a sketch PR for them, too.

@ArtfulCoder
Contributor

I am working on a proof-of-concept PR to do the mount/umount stuff. I started working on it today.

@alex-mohr alex-mohr added this to the v1.0 milestone Jun 23, 2015
@roberthbailey roberthbailey modified the milestones: v1.0-candidate, v1.0 Jun 23, 2015
@ArtfulCoder
Contributor

#10241

Experimental Proof-of-concept

I tried the PR by launching a cluster with it and verifying the created pod-containers.
It seems to do the job.

@dchen1107 dchen1107 modified the milestones: v1.0, v1.0-candidate Jun 23, 2015
@dchen1107
Member

I moved this back to 1.0 because it is currently broken for any Docker image that uses Ubuntu as the base image. @thockin?

@thockin
Member Author

thockin commented Jun 23, 2015

I will take a look later today.

@ArtfulCoder
Contributor

There might also be a simpler option:
When Docker creates a container with DNS flags passed to it, Docker itself creates a resolv.conf file.
kubelet can simply modify that resolv.conf file and add the ndots:3 option.

That way we never mount or umount anything.
I manually verified this and it works as well.
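
The edit described above amounts to a small text transformation on the generated file. A minimal sketch, assuming the file contents are already in hand (locating the file Docker writes for a given container is left out), with `ensure_ndots` as a hypothetical helper name; this is not necessarily what any eventual fix does:

```python
def ensure_ndots(resolv_conf: str, minimum: int = 3) -> str:
    """Return resolv.conf text whose ndots option is at least `minimum`."""
    out = []
    seen = False
    for line in resolv_conf.splitlines():
        fields = line.split()
        if fields and fields[0] == "options":
            for i, opt in enumerate(fields):
                if opt.startswith("ndots:"):
                    seen = True
                    value = opt[len("ndots:"):]
                    current = int(value) if value.isdigit() else 0
                    # Never lower a value that is already high enough.
                    fields[i] = f"ndots:{max(current, minimum)}"
            line = " ".join(fields)
        out.append(line)
    if not seen:
        # No ndots option anywhere: add a dedicated options line.
        out.append(f"options ndots:{minimum}")
    return "\n".join(out) + "\n"


original = "nameserver 10.0.0.10\nsearch default.svc.cluster.local svc.cluster.local cluster.local\n"
print(ensure_ndots(original), end="")
```

The function leaves nameserver and search lines untouched, so containers that already resolve correctly see the same configuration as before.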

@thockin
Member Author

thockin commented Jun 25, 2015

Fixed by #10266, I think
