Kubernetes Core via Canonical Ubuntu localinstall DNS lookup of services does not work #286
Comments
Hi @jayeshnazre Can you help me reproduce this? I understand you start with a deployment of kubernetes-core in a localhost (LXD) based environment. With this setup all units are deployed in separate LXC containers within the VM you use. From there you deploy some services. Is it possible to share the YAMLs you apply, or the steps you follow to deploy these services, so I can reproduce your setup? As a quick test, can you deploy a shell-demo pod and try to resolve the names of other services, both from the master and from inside shell-demo?
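A minimal sketch of such a check, assuming the default kube-dns cluster IP (10.152.183.10 here) and the shell-demo pod from the upstream docs; adjust names and addresses to your deployment:

```sh
# Inside the master: query kube-dns directly for a well-known service name
nslookup kubernetes.default.svc.cluster.local 10.152.183.10

# Inside shell-demo: the pod's resolv.conf already points at kube-dns
kubectl exec -it shell-demo -- nslookup kubernetes.default
```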
Thanks
jayeshnazre commented May 15, 2017
Thanks for responding. The steps I followed are very simple; that's it. Everything was installed successfully behind the scenes, but DNS lookup of logical service names does not work. Everything works via direct IP. A few things to be aware of:
After step 3 you have an Ubuntu VM on VMware that hosts LXC containers running Kubernetes. Inside the LXC containers is where the Docker containers spawn. You mention "DNS logical service name lookup does not work"; since there are a lot of levels of nested machines (Docker inside LXC inside a VM inside your physical host), can you please give me the command you are issuing to resolve a logical service name? Where are you issuing this command? What output were you expecting, and what output did you get?

You are not supposed to load Docker containers using Docker primitives directly on the kubernetes workers. Kubernetes manages the containers itself, so it needs to be aware of what is running at any point in time. You should be using pod/service/deployment YAML description files to spawn your services; that way the KubeDNS service running within Kubernetes gets populated with the right entries.

The work we are doing at Canonical in the Kubernetes distribution indeed abstracts the complexity of setting up and managing the cluster. There might be use cases we are not covering, and we are willing to work on them with the community (with you). Before you go and take on the Kubernetes beast by yourself, here is what you can do: join us on the freenode IRC channel #juju and ping either me (user: kjackal, European time) or @chuckbutler (user: lazyPower, US time) so we can have a real-time interaction.
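As a hedged illustration of what the "pod/service/deployment yaml description files" above refers to, a minimal Deployment plus Service (the service2 name and nginx image are placeholders) applied through the API server is what lets KubeDNS learn the service name:

```sh
cat <<'EOF' | kubectl apply -f -
apiVersion: apps/v1beta1        # Deployment API group in the 1.6/1.7 era
kind: Deployment
metadata:
  name: service2
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: service2
    spec:
      containers:
      - name: service2
        image: nginx
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: service2
spec:
  selector:
    app: service2
  ports:
  - port: 80
EOF

# Other pods in the same namespace can then reach it as http://service2
# (or service2.default.svc.cluster.local).
```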
@Cynerva I believe this is related to the discovery that LXD mandates we move kube-proxy from kernel-space (iptables) into a user-space proxy, yeah?
@chuckbutler This does sound like the same issue that setting proxy-mode=userspace fixed, yeah. I don't know if it's a universal problem in LXD deployments though; haven't tried to repro myself.
chuckbutler added kind/bug, status/needs-feedback labels Jun 13, 2017
@jayeshnazre Can you confirm that by adjusting the proxy-mode flag on the kube-proxy unit files you're able to resolve DNS correctly and communicate between pods using the DNS name? We've seen cases where service VIPs misbehave when running in iptables mode. This leads me to believe we need to expose proxy-mode as a tunable option on the worker charm so it can be set explicitly, regardless of the resolution here. There are scenarios where one mode works and the other does not, and our sensible defaults are breaking a class of deployment.
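A hedged sketch of that adjustment; the unit name, file location and service name depend on the charm version, so treat them as assumptions rather than exact paths:

```sh
juju ssh kubernetes-worker/0

# Append --proxy-mode=userspace to kube-proxy's invocation (its systemd unit
# or snap args file, depending on how the charm installed it), then restart:
sudo systemctl restart snap.kube-proxy.daemon   # or the equivalent kube-proxy service
```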
chuckbutler added the area/kubernetes-worker label Jun 21, 2017
thomas-riccardi commented Jul 10, 2017
I have the same issue using a brand new Ubuntu 16.04 VM from the Vagrant box 'bento/ubuntu-16.04' (v2.3.7), following https://kubernetes.io/docs/getting-started-guides/ubuntu/ with conjure-up, the localhost provider, and the default CDK deployment: 3 kubernetes-workers, etc.
In this case it fails on the pod scheduled on the same node as the one … Digging deeper with tshark (logs from an older VM):
Here the DNS query works, but the TCP connection to the service VIP doesn't:
The DNAT works for the SYN packet, but not for the response SYN-ACK: the source IP is the pod's IP, not the service VIP, so the packet is dropped by the network stack. I reproduced issues each time I re-created the Vagrant VM (5-6 times now), but sometimes it's the kube-dns service that fails, sometimes the kubernetes-bootcamp one; sometimes some pods fail systematically, sometimes it fails once every ~5-10 requests... Reproduced with channel 1.6/stable and 1.7/stable, with Linux 4.4 and 4.8. Adding … Should we open an issue on kubernetes? Is the iptables proxy mode expected to always work?
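A hedged sketch of the kind of inspection described above; the service VIP and pod IP values are placeholders, not the addresses from this deployment:

```sh
SVC_VIP=10.152.183.10    # example service cluster IP
POD_IP=10.1.31.5         # example pod IP

# Watch the SYN / SYN-ACK exchange on the worker to see whether the reply
# comes back from the VIP (expected) or from the pod IP (the failure mode):
sudo tshark -i any -f "host ${SVC_VIP} or host ${POD_IP}" -Y "tcp.flags.syn == 1"

# Inspect the DNAT rules kube-proxy programs in iptables mode:
sudo iptables -t nat -S KUBE-SERVICES | grep "${SVC_VIP}"
```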
Since we've only seen reports of this on LXD deployments, I'm suspicious of some sort of conflict happening with iptables between the LXD containers. Hard to say since I can't repro. We could update the charm to set proxy-mode=userspace on LXD by default. AFAIK it's not as performant, but as long as it still performs reasonably then I think that should be fine.
thomas-riccardi commented Jul 11, 2017
@Cynerva Is my scenario not enough for the repro? It is all running in a VM, so it should be sufficiently isolated (hopefully the VM network setup and the virtualization technology do not impact internal routing; I use Vagrant with VirtualBox 5.0 and tested with the VM network both bridged and NATed).
jayeshnazre commented May 12, 2017
Hi,
I am new to Canonical Kubernetes, Juju and conjure-up, but I followed the instructions to the letter to install kubernetes-core, and everything got installed correctly on my local VM running Ubuntu 16.04.2. However, service names are not resolving to IPs, nor can I access the service cluster IP, from within the pods. I have only one worker node since I am using kubernetes-core. NOTE: my kube-dns pod is running, and I am not able to identify what's wrong. Can someone help me with this issue? I have exposed the service as a NodePort, and the only way I can reach the services is via the NodePort and the worker node's IP, but I want to access the services by name from within my cluster.
NOTE: All my services are in the default namespace, so this is just internal service1-to-service2 communication via the logical name that I am trying to make work (service1 and service2 are only used here as an example to explain my problem). My VM runs on VMware Workstation 12.
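A minimal sketch of the symptom as described; service1/service2, the pod names and the addresses are placeholders rather than the actual values from this deployment:

```sh
kubectl get svc service2 -o wide          # note the CLUSTER-IP and NodePort

# From inside a service1 pod, by logical name (fails as reported):
kubectl exec -it <service1-pod> -- nslookup service2
kubectl exec -it <service1-pod> -- curl -s http://service2/

# By cluster IP (also fails as reported):
kubectl exec -it <service1-pod> -- curl -s http://<cluster-ip>/

# From outside the cluster, via the worker node and NodePort (works as reported):
curl -s http://<worker-node-ip>:<node-port>/
```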