kube-flannel pods are in Error state on the agent node after installing a cluster with kubeadm #39701

Open
maweina opened this Issue Jan 11, 2017 · 1 comment

maweina commented Jan 11, 2017

Is this a request for help? (If yes, you should use our troubleshooting guide and community support channels, see http://kubernetes.io/docs/troubleshooting/.):
request

What keywords did you search in Kubernetes issues before filing this one? (If you have found any duplicates, you should instead reply there.):
kubeadm, flannel

Is this a BUG REPORT or FEATURE REQUEST? (choose one):
BUG REPORT

Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.1", GitCommit:"82450d03cb057bab0950214ef122b67c83fb11df", GitTreeState:"clean", BuildDate:"2016-12-14T00:57:05Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.1", GitCommit:"82450d03cb057bab0950214ef122b67c83fb11df", GitTreeState:"clean", BuildDate:"2016-12-14T00:52:01Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Cloud provider or hardware configuration: Vagrant + VirtualBox on Windows 10
  • OS (e.g. from /etc/os-release): CentOS 7
  • Kernel (e.g. uname -a): 3.10.0-327.4.5.el7.x86_64
  • Install tools: kubeadm
  • Others:

What happened:
In step (4/4), "Joining your nodes", of http://kubernetes.io/docs/getting-started-guides/kubeadm/, after the node joined the cluster, the kube-flannel pod on the newly added agent node is in Error state.

What you expected to happen:
The kube-flannel pods are in Running state.

How to reproduce it (as minimally and precisely as possible):
Follow the instructions at http://kubernetes.io/docs/getting-started-guides/kubeadm/ and choose flannel as the pod network. In step 2, specify both --api-advertise-addresses= and --pod-network-cidr=10.244.0.0/16 when running "kubeadm init". In step 4, edit kube-flannel.yml and change the flannel command to 'command: [ "/opt/bin/flanneld", "--ip-masq", "--kube-subnet-mgr", "--iface=enp0s8" ]', because enp0s8 is the correct network interface in the Vagrant environment.

Anything else do we need to know:
The Docker logs of the failed container show:

E0110 23:48:52.242726 1 main.go:127] Failed to create SubnetManager: error retrieving pod spec for 'kube-system/kube-flannel-ds-xbbqs': Get https://10.96.0.1:443/api/v1/namespaces/kube-system/pods/kube-flannel-ds-xbbqs: dial tcp 10.96.0.1:443: i/o timeout

10.96.0.1 is the service IP of the kube-apiserver (the default/kubernetes service).

On the newly added agent node, the related iptables rules are configured as follows:

Chain KUBE-SEP-MUTNCW54Z2U4XKXF (2 references)
num  target          prot opt source     destination
1    KUBE-MARK-MASQ  all  --  k8snode1   anywhere     /* default/kubernetes:https */
2    DNAT            tcp  --  anywhere   anywhere     /* default/kubernetes:https */ recent: SET name: KUBE-SEP-MUTNCW54Z2U4XKXF side: source mask: 255.255.255.255 tcp to:10.211.56.101:6443

k8snode1 is the hostname of the master node and 10.211.56.101 is its IP address.

On the newly added agent node, I tried using nc to connect to 10.96.0.1:443:

"nc -4 -v 10.96.0.1 443" fails.
"nc -4 -v -s 10.211.56.102 10.96.0.1 443" works. 10.211.56.102 is the IP address of the enp0s8 network interface on the newly added agent node.
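The only difference between the two nc invocations is the explicit source address given with -s. As a rough illustration of what that means for a Go program, here is a minimal sketch that pins the source address of an outgoing TCP connection with the standard library's net.Dialer. The helper name dialFrom is hypothetical and this is not flannel's code; the addresses in main are the ones from this report.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// dialFrom (hypothetical helper) opens a TCP connection to target while
// binding the local side to sourceIP, roughly what `nc -s sourceIP` does.
func dialFrom(sourceIP, target string, timeout time.Duration) (net.Conn, error) {
	ip := net.ParseIP(sourceIP)
	if ip == nil {
		return nil, fmt.Errorf("invalid source IP %q", sourceIP)
	}
	d := net.Dialer{
		Timeout: timeout,
		// Port 0 lets the kernel choose an ephemeral source port.
		LocalAddr: &net.TCPAddr{IP: ip},
	}
	return d.Dial("tcp4", target)
}

func main() {
	// Addresses taken from the report: enp0s8 on the agent node and the
	// service IP of the API server.
	conn, err := dialFrom("10.211.56.102", "10.96.0.1:443", 5*time.Second)
	if err != nil {
		fmt.Println("connect failed:", err)
		return
	}
	defer conn.Close()
	fmt.Println("connected from", conn.LocalAddr())
}
```

Without LocalAddr, the kernel chooses the source address from its routing table, which is presumably what happens in the failing nc call.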

I wonder why flanneld does not bind to 10.211.56.102 when communicating with kube-apiserver, even though "--iface=enp0s8" is specified on the command line.

I suspect that if the rule "KUBE-MARK-MASQ all -- k8snode1 anywhere" in chain KUBE-SEP-MUTNCW54Z2U4XKXF were changed to "KUBE-MARK-MASQ all -- anywhere anywhere", the issue might be resolved. However, I have not tried this yet.

maweina commented Jan 11, 2017

After checking the flannel and kubeclient code, I found that "--iface" is not used by flanneld when communicating with kube-apiserver. In fact, a caller of kubeclient cannot specify the source IP address.

After hardcoding the IP address of kube-apiserver in kube-flannel.yml as follows, the issue is worked around. However, we still need an official solution for installing Kubernetes with kubeadm in a Vagrant + VirtualBox environment.

...
containers:
- name: kube-flannel
  image: quay.io/coreos/flannel-git:v0.6.1-28-g5dde68d-amd64
  command: [ "/opt/bin/flanneld", "--ip-masq", "--kube-subnet-mgr", "--iface=enp0s8" ]
  securityContext:
    privileged: true
  env:
  - name: POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
  - name: POD_NAMESPACE
    valueFrom:
      fieldRef:
        fieldPath: metadata.namespace
  - name: KUBERNETES_SERVICE_HOST
    value: "10.211.56.101" # IP address of the host where kube-apiserver is running
  - name: KUBERNETES_SERVICE_PORT
    value: "6443"
...
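This workaround presumably works because an in-cluster Kubernetes client builds the API server URL from the KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT environment variables, so overriding them in the pod spec points flanneld at the master's real address instead of the 10.96.0.1 service IP. The following Go sketch mimics that lookup with the standard library only. The function name apiServerURL is hypothetical, and the code is an approximation of the convention rather than the actual client code.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
)

// apiServerURL (hypothetical helper) builds the API server base URL from the
// two environment variables an in-cluster client conventionally reads.
// getenv is injected so the lookup can be exercised without touching the
// real process environment.
func apiServerURL(getenv func(string) string) (string, error) {
	host := getenv("KUBERNETES_SERVICE_HOST")
	port := getenv("KUBERNETES_SERVICE_PORT")
	if host == "" || port == "" {
		return "", errors.New("KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT must both be set")
	}
	// JoinHostPort also handles IPv6 literals by adding brackets.
	return "https://" + net.JoinHostPort(host, port), nil
}

func main() {
	url, err := apiServerURL(os.Getenv)
	if err != nil {
		fmt.Println("not running in a cluster:", err)
		return
	}
	fmt.Println("API server URL:", url)
}
```

With the values from the YAML above, the function yields https://10.211.56.101:6443, which bypasses the service IP and the iptables DNAT rule on the agent node.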
