
kubelet cannot access rancher-metadata #7160

Closed
Flowman opened this Issue Dec 21, 2016 · 11 comments


Flowman commented Dec 21, 2016

Rancher Version:
1.2.1
Docker Version:
1.12.5
OS and where are the hosts located? (cloud, bare metal, etc):
Ubuntu 16.04 LTS
Setup Details: (single node rancher vs. HA rancher, internal DB vs. external DB)
Single node Rancher
Environment Type: (Cattle/Kubernetes/Swarm/Mesos)
Kubernetes 1.4.6
Steps to Reproduce:
Create a new environment
Add a host to environment
Results:
Everything is green, but 'kubectl get nodes' returns no nodes.

Looks like the kubelet service cannot access the internal DNS, so it just loops on:

21/12/2016 18:29:39 + curl -s -f http://rancher-metadata/2015-12-19/stacks/Kubernetes/services/kubernetes/uuid
21/12/2016 18:29:39 Waiting for metadata
21/12/2016 18:29:39 + echo Waiting for metadata
21/12/2016 18:29:39 + sleep 1
21/12/2016 18:29:40 + curl -s -f http://rancher-metadata/2015-12-19/stacks/Kubernetes/services/kubernetes/uuid
21/12/2016 18:29:40 Waiting for metadata
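
Judging by the '+'-prefixed xtrace lines, the entrypoint is essentially a one-second retry loop. A reconstruction of it from the log above (only the URL is taken verbatim from the trace; the rest is assumed, not the actual Rancher script):

#!/bin/sh
# Sketch of the wait loop implied by the trace above; only the URL is from the log.
URL=http://rancher-metadata/2015-12-19/stacks/Kubernetes/services/kubernetes/uuid
until curl -s -f "$URL" > /dev/null; do
    echo "Waiting for metadata"
    sleep 1
done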

Tried to do an nslookup on
Expected:
The environment to start up and work.

Flowman commented Dec 23, 2016

So I get this issue when I run it in Azure. I specify the agent IP on install as the local IP address of the VM.
The weird thing is that all the other services, like proxy and scheduler, can resolve the internal DNS name for rancher-metadata.

If I run it locally on VirtualBox, everything works fine.

Flowman commented Dec 23, 2016

After some more digging: DNS resolution works, but the container cannot reach the resolved host.
[screenshot]

Example of the working proxy container.
[screenshot]
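
The comparison in the screenshots amounts to something like this (both container names are assumptions; check 'docker ps' for the real ones):

docker exec kube-proxy nslookup rancher-metadata   # works, per the screenshots
docker exec kubelet nslookup rancher-metadata      # resolves, but curl to the address fails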

Flowman changed the title from 'kubelet cannot access internal dns' to 'kubelet cannot access rancher-metadata' Dec 23, 2016

janeczku (Contributor) commented Dec 24, 2016

@Flowman, in the kubelet container please run 'nslookup rancher-metadata' (without the rancher.internal part) and paste the contents of resolv.conf.
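
For example, from the host (the container name 'kubelet' is an assumption here):

docker ps --filter name=kubelet                     # find the actual container name
docker exec -it kubelet nslookup rancher-metadata   # test internal DNS resolution
docker exec -it kubelet cat /etc/resolv.conf        # show which nameservers are configured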

Flowman commented Dec 28, 2016

@janeczku here is the info you wanted.

[screenshot]

I have updated to Rancher 1.2.2 and still see the same issue.

antmanler commented Dec 30, 2016

Same problem.

It seems that curl used the nameservers in /run/resolvconf/resolv.conf, since it is mounted from the host's /run.
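
A quick way to check which file is actually consulted (container name 'kubelet' assumed; Rancher 1.x serves internal DNS at 169.254.169.250):

docker exec kubelet ls -l /etc/resolv.conf            # is it a symlink into the host-mounted /run?
docker exec kubelet cat /etc/resolv.conf              # should contain "nameserver 169.254.169.250"
docker exec kubelet cat /run/resolvconf/resolv.conf   # host nameservers, visible because /run is mounted
# If the file curl ends up reading lacks 169.254.169.250, rancher-metadata will not resolve.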

niusmallnan (Member) commented Jan 1, 2017

I have tested this issue on AWS; my EC2 OS information:

root@ip-172-31-12-49:~# uname -a
Linux ip-172-31-12-49 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
root@ip-172-31-12-49:~# lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 14.04.2 LTS
Release:	14.04
Codename:	trusty

Everything is OK.

Also tested on AliyunCloud; OS information:

root@iZ2zej1dt14chgmm82oy4iZ:~# uname -a
Linux iZ2zej1dt14chgmm82oy4iZ 4.4.0-53-generic #74~14.04.1-Ubuntu SMP Fri Dec 2 03:43:31 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
root@iZ2zej1dt14chgmm82oy4iZ:~# lsb_release -a
LSB Version:	core-2.0-amd64:core-2.0-noarch:core-3.0-amd64:core-3.0-noarch:core-3.1-amd64:core-3.1-noarch:core-3.2-amd64:core-3.2-noarch:core-4.0-amd64:core-4.0-noarch:core-4.1-amd64:core-4.1-noarch:security-4.0-amd64:security-4.0-noarch:security-4.1-amd64:security-4.1-noarch
Distributor ID:	Ubuntu
Description:	Ubuntu 14.04.5 LTS
Release:	14.04
Codename:	trusty

Everything is OK.

I think this might be an OS compatibility problem.

zzp8164 commented Feb 4, 2017

Same problem; is there any way to work around it?

zzp8164 commented Feb 4, 2017

Temporarily worked around it by overwriting /run/resolvconf/resolv.conf on the host machine with the contents of /etc/resolv.conf from the container.
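
Roughly, assuming the kubelet container is named 'kubelet' (check 'docker ps'); back up the host file first, since resolvconf regenerates it on network changes, which is why this is only a temporary fix:

cp /run/resolvconf/resolv.conf /run/resolvconf/resolv.conf.bak           # keep a backup on the host
docker exec kubelet cat /etc/resolv.conf > /run/resolvconf/resolv.conf   # overwrite with the container's nameservers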

zzp8164 commented Feb 28, 2017

Still the same problem after updating to Rancher v1.4.1.

niusmallnan (Member) commented Mar 3, 2017

@ALL
I think I have found the root cause: #8038

aemneina commented May 17, 2017

Should be resolved with the latest 1.5. Closing this out. If anyone hits this in a recent version of 1.5, please holler.

aemneina closed this May 17, 2017
