Unable to resolve hostname using `kubectl logs` #22770

Closed
ddysher opened this Issue Mar 10, 2016 · 24 comments

Contributor

ddysher commented Mar 10, 2016

Using kubectl logs shows the following error:

Error from server: Get https://node-1:10250/containerLogs/kube-system/kibana-logging-v1-x1hr3/kibana-logging: dial tcp: lookup node-1: no such host

The Kubernetes master uses nodeName to fetch logs from the kubelet, and the node name doesn't always resolve. There are two workarounds I can think of:

  1. Manually add a host entry for each node in the master's /etc/hosts (a sketch is below), but this doesn't scale;
  2. Use hostname override to set the node name to the raw IP address. This is better, but raw IPs are hard to manage and we hit this error: #22109

Since the master already has the mapping from nodeName to IP address, could we get rid of the problem by just using the internal IP?
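For illustration, workaround 1 would look something like this on the master (the IP and node name are placeholders for your own nodes):

# one static entry per node; has to be maintained by hand as nodes come and go
echo "10.240.0.11 node-1" | sudo tee -a /etc/hosts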


ernoaapa commented Mar 22, 2016

👍 Yes, this is causing issues when I'm setting up from scratch. I want to use the public IP as the node name but can't, because the apiserver uses it for communication. It would be great if I could give the kubelet the flags --hostname-override=$public_ipv4 and --node-ip=$private_ipv4 and have it use the InternalIP for communication while displaying the public IP as the hostname. I assumed this would work, but it didn't :(
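Roughly what I expected to be able to do (a sketch only; the variables are placeholders for the machine's public and private addresses, and the rest of the kubelet flags are omitted):

# register under the public IP as the node name, but expose the private IP for apiserver -> kubelet traffic
kubelet --hostname-override=$public_ipv4 --node-ip=$private_ipv4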


ernoaapa commented Mar 22, 2016

Some closely related discussion is going on in #22063.


Tha-Robert commented Sep 9, 2016

What's up with this?

/T


ernoaapa commented Sep 10, 2016

I guess this hasn't been fixed. We're still using the internal IP as the hostname, and in the node labels we have a 'public-ip' label that holds the actual public IP.
It's really annoying if you want to, for instance, SSH to a server: first check the node's labels for the public IP, then use that. You can't use the node name, because it's the internal IP 😞
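In practice the lookup is something like this (the public-ip label key is just our own convention, and the SSH user depends on your distro):

NODE=ip-10-1-2-3   # node name, which for us is based on the internal IP
PUBLIC_IP=$(kubectl get node "$NODE" -o jsonpath='{.metadata.labels.public-ip}')
ssh core@"$PUBLIC_IP"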


Aidamina commented Sep 11, 2016

This issue got worse because of #31311, although #32050 might alleviate some of the pain.


nelsonenzo commented Sep 23, 2016

Is there any known workaround for this? We just launched a new cluster in development, and not being able to tail service logs is a big disappointment / near blocker for moving to production.


ernoaapa commented Sep 25, 2016

@nelsonenzo we're still using the workaround I explained in my previous comment.

Contributor

lukemarsden commented Sep 26, 2016

This is about to become a much bigger problem as it affects everyone using the new kubeadm tool: https://lukemarsden.github.io/docs/getting-started-guides/kubeadm/#limitations

Member

marun commented Sep 27, 2016

I hit this problem a while ago and resorted to scripting a periodic sync of node addresses (from kubectl get nodes) into /etc/hosts on the master:

https://github.com/marun/origin/blob/ozone/images/oz/master/sync-etc-hosts.sh
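A rough sketch of the idea, not the linked script itself (it assumes every node reports an InternalIP address, and it leaves the actual merge into /etc/hosts out):

#!/bin/bash
# Periodically regenerate a hosts fragment mapping internal IP -> node name for every node.
while true; do
  kubectl get nodes -o jsonpath='{range .items[*]}{.status.addresses[?(@.type=="InternalIP")].address}{"\t"}{.metadata.name}{"\n"}{end}' > /tmp/k8s-node-hosts
  # merge /tmp/k8s-node-hosts into /etc/hosts here (omitted)
  sleep 60
done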

Member

errordeveloper commented Sep 28, 2016

@kubernetes/sig-cluster-lifecycle could this be a low-hanging fruit for 1.5?

Contributor

mkulke commented Sep 28, 2016

I ran into this issue with home-brewed clusters on AWS & OpenStack and had to resort to workarounds. Eventually I wrote a patch which untangles NodeName from the node resolver in logs/exec/port-forward: #25532

Member

justinsb commented Sep 29, 2016

We should be using nodeutil.GetNodeHostIP, not the NodeName. It is a long-standing bug. I'll try to put together a PR.

justinsb added a commit to justinsb/kubernetes that referenced this issue Sep 29, 2016

Use nodeutil.GetHostIP consistently when talking to nodes
Most of our communications from apiserver -> nodes used
nodutil.GetNodeHostIP, but a few places didn't - and this
meant that the node name needed to be resolvable _and_ we needed
to populate valid IP addresses.

Fix the last few places that used the NodeName.

Issue kubernetes#18525
Issue kubernetes#9451
Issue kubernetes#9728
Issue kubernetes#17643
Issue kubernetes#11543
Issue kubernetes#22063
Issue kubernetes#2462
Issue kubernetes#22109
Issue kubernetes#22770
Issue kubernetes#32286
Member

justinsb commented Sep 29, 2016

PR in #33718

chrislovecnm added a commit to chrislovecnm/kubernetes that referenced this issue Oct 10, 2016

Use nodeutil.GetHostIP consistently when talking to nodes
Member

markthink commented Oct 13, 2016

I also encountered a similar problem; the weave network won't come up.

[root@master schema]# kubectl get po --namespace=kube-system -o wide
NAME                             READY     STATUS              RESTARTS   AGE       IP          NODE
etcd-master                      1/1       Running             3          10h       10.0.2.15   master
kube-apiserver-master            1/1       Running             3          10h       10.0.2.15   master
kube-controller-manager-master   1/1       Running             3          10h       10.0.2.15   master
kube-discovery-982812725-mtqfh   1/1       Running             0          18m       10.0.2.15   master
kube-dns-2247936740-3k2gn        0/3       ContainerCreating   0          5m        <none>      node1
kube-proxy-amd64-izxxe           1/1       Running             3          10h       10.0.2.15   master
kube-proxy-amd64-z8nj8           1/1       Running             1          10h       10.0.2.15   node1
kube-scheduler-master            1/1       Running             3          10h       10.0.2.15   master
weave-net-7m5jm                  2/2       Running             8          10h       10.0.2.15   master
weave-net-ed15l                  1/2       CrashLoopBackOff    1          20s       10.0.2.15   node1
[root@node1 schema]# kubectl logs -f weave-net-ed15l --namespace=kube-system -c weave
Error from server: Get https://node1:10250/containerLogs/kube-system/weave-net-ed15l/weave?follow=true: dial tcp: lookup node1 on 10.0.2.3:53: no such host

euank added a commit to euank/kubernetes that referenced this issue Oct 14, 2016

Use nodeutil.GetHostIP consistently when talking to nodes

chadswen added a commit to chadswen/kargo that referenced this issue Oct 17, 2016

Change the kubelet --hostname-override flag to use the ansible_hostname variable which should be more consistent with the value required by cloud providers

Add inventory_hostname_short alias to /etc/hosts when it is different from inventory_hostname to overcome node name limitations see kubernetes/kubernetes#22770

Signed-off-by: Chad Swenson <chadswen@gmail.com>

chadswen added a commit to chadswen/kargo that referenced this issue Oct 18, 2016

Change the kubelet --hostname-override flag to use the ansible_hostname variable which should be more consistent with the value required by cloud providers

Add ansible_hostname alias to /etc/hosts when it is different from inventory_hostname to overcome node name limitations see kubernetes/kubernetes#22770

Signed-off-by: Chad Swenson <chadswen@gmail.com>

chadswen added a commit to chadswen/kargo that referenced this issue Oct 18, 2016


chadswen added a commit to chadswen/kargo that referenced this issue Oct 18, 2016


chadswen added a commit to chadswen/kargo that referenced this issue Oct 18, 2016

Hostname alias fixes

buchireddy commented Nov 9, 2016

It seems like the same issue could impact exec'ing into an existing pod or running one-off pods too (when using kubeadm to bring up the cluster, of course).

This is the error message I'm seeing when I try to log in to an existing pod:

root@ip-172-31-14-223:~# kubectl exec -it bash-2739707304-m14sw -- bash
Error from server: dial tcp 172.31.14.224:10250: i/o timeout

This is the error message when I tried kubectl run:

root@ip-172-31-14-223:~# kubectl run -it bash --image=buchireddy/docker-bash -- bash
Waiting for pod default/bash-2739707304-m14sw to be running, status is Pending, pod ready: false
Waiting for pod default/bash-2739707304-m14sw to be running, status is Pending, pod ready: false
If you don't see a command prompt, try pressing enter.
                                                      Error attaching, falling back to logs: dial tcp 172.31.14.224:10250: i/o timeout

Error from server: Get https://ip-172-31-14-224:10250/containerLogs/default/bash-2739707304-m14sw/bash: dial tcp 172.31.14.224:10250: i/o timeout

hectorj2f commented Nov 21, 2016

We also have a similar problem. We defined a specific (non-resolvable) hostname to get a customized node name, and we also set --node-ip to a resolvable IP address in the kubelet parameters, so we expected the master API to use that node IP instead of the hostname or node name.

Surprisingly for us, the Kubernetes master uses the hostname to communicate with the kubelet instead of the node IP, so it cannot fetch the logs of our apps.


derailed commented Nov 26, 2016

Having the same issue here, with timeouts using either kubectl logs or kubectl exec on pods. The cluster was hydrated via kubeadm. I understand I can always SSH to the instance and get logs or exec there, but this is a royal pain. Any update on this issue?

Contributor

mkulke commented Nov 27, 2016

I haven't had the time to test a 1.5 cluster yet, but from looking at the code, I guess what works now is:

  • Set something master-resolvable via --hostname-override on the kubelet.
  • Have the cloud-provider rename the node to something non-resolvable.

@hectorj2f By default the node name is the node's hostname, so when you use --hostname-override with some non-resolvable display name (e.g. "my-node-1"), the master will not be able to talk to the kubelet, because by default the hostname is preferred as the means of talking to the kubelet.

However, in 1.5 there is also --kubelet-preferred-address-types (https://github.com/kubernetes/kubernetes/blob/release-1.5/cmd/kube-apiserver/app/options/options.go#L112), which you can use to specify a priority list of address types (Hostname, InternalIP, ExternalIP, LegacyHostIP).

The flag is not documented yet, but I think this should solve most issues around apiserver -> kubelet communication.
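For example, something like this on the apiserver (untested on my side; the rest of the apiserver flags are omitted):

# prefer the InternalIP, fall back to the hostname, then the external IP
kube-apiserver --kubelet-preferred-address-types=InternalIP,Hostname,ExternalIP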


bonovoxly commented Dec 1, 2016

@mkulke minor correction (because I just ran into it), use --kubelet-preferred-address-types.


xmik commented Dec 10, 2016

I can confirm that kubectl logs and kubectl exec work when using Kubernetes 1.5.0-beta3 with --kubelet-preferred-address-types=InternalIP,Hostname,ExternalIP set on the apiserver.

fouadsemaan added a commit to fouadsemaan/ansible-kubernetes-cluster that referenced this issue Jan 23, 2017

andrewrothstein added a commit to andrewrothstein/ansible-kubernetes-cluster that referenced this issue Jan 25, 2017

kubernetes/kubernetes#22770 couldn't fetch log from dashboard or run kubectl exec (#8)

* couldn't fetch log from dashboard or run kubectl exec:  kubernetes/kubernetes#22770

* put kubelet preferred address types in separate default property

vchan2002 commented Feb 7, 2017

I haven't had the time to test a 1.5 cluster yet, but from looking at the code, I guess what works now is:

Set something master-resolvable via --hostname-override on the kubelet.
Have the cloud-provider rename the node to something non-resolvable.

That is the exact problem I ran into. We use AWS, but we don't use the default Amazon internal DNS resolver for our VPCs. The --cloud-provider flag basically clobbers every attempt we make to set a hostname that our DNS can resolve, and instead changes it to the default {name}.ec2.internal hostname, which we cannot resolve without making major changes to our existing infrastructure.

We already have an existing infrastructure that we have to adhere to, so having the ability to name our own hosts while still taking advantage of what the --cloud-provider flag gives us (such as setting up the route table entry per node) is paramount for us.

chadswen added a commit to chadswen/kargo that referenced this issue Jul 31, 2017

Hostname alias fixes

estechnical commented Sep 15, 2017

This issue also affects me. I have just set up a Kubernetes cluster with MAAS + Juju without using conjure-up, because conjure-up was failing to place the services correctly on our machines.

This issue, conjure-up/conjure-up#1150, details how I got to where I am now...

When I try to use kubectl logs, it clearly shows that the DNS request is going to Google DNS:

https://jump:10250/containerLogs/default/kubernetes-bootcamp-2457653786-g7msl/kubernetes-bootcamp: dial tcp: lookup jump on 8.8.4.4:53: no such host

Is this the same issue or does this need a new one to be opened?


estechnical commented Oct 2, 2017

Sorry for the really long delay!
I can confirm that adding the names of our nodes to /etc/hosts on the Kubernetes master resolves this problem for us too :)

Member

luxas commented Oct 19, 2017

I'm closing this as this feature has been around for quite some time now.

luxas closed this Oct 19, 2017
