kubeadm blocks waiting for 'control plane' #33544

Closed
sebgoa opened this issue Sep 27, 2016 · 66 comments

sebgoa (Contributor) commented Sep 27, 2016

Hi @kubernetes/sig-cluster-lifecycle

I tried to follow the kubeadm docs on CentOS 7.1.

It seems that kubeadm init blocks waiting for the control plane to become ready, even though all the containers are running.

# kubeadm init --token foobar.1234
<util/tokens> validating provided token
<master/tokens> accepted provided token
<master/pki> created keys and certificates in "/etc/kubernetes/pki"
<util/kubeconfig> created "/etc/kubernetes/kubelet.conf"
<util/kubeconfig> created "/etc/kubernetes/admin.conf"
<master/apiclient> created API client configuration
<master/apiclient> created API client, waiting for the control plane to become ready

Here are the running containers on the master machine:

$ sudo docker ps
CONTAINER ID        IMAGE                                                           COMMAND                  CREATED             STATUS              PORTS               NAMES
30aff4f98753        gcr.io/google_containers/kube-apiserver-amd64:v1.4.0            "/usr/local/bin/kube-"   3 minutes ago       Up 3 minutes                            k8s_kube-apiserver.c44dda3f_kube-apiserver-k8ss-head_kube-system_6b83c87a9bf5c380c6f948f428b23dd1_408af885
8fd1842776ab        gcr.io/google_containers/kube-controller-manager-amd64:v1.4.0   "/usr/local/bin/kube-"   3 minutes ago       Up 3 minutes                            k8s_kube-controller-manager.a2978680_kube-controller-manager-k8ss-head_kube-system_5f805ed49f6fd9f0640be470e3dea2a2_7ac41d83
32b7bfb55dc0        gcr.io/google_containers/kube-scheduler-amd64:v1.4.0            "/usr/local/bin/kube-"   3 minutes ago       Up 3 minutes                            k8s_kube-scheduler.1b5cde04_kube-scheduler-k8ss-head_kube-system_586d16be4ecaac95b0162c5d11921019_0ca14012
8a1797fdb1df        gcr.io/google_containers/etcd-amd64:2.2.5                       "etcd --listen-client"   8 minutes ago       Up 8 minutes                            k8s_etcd.4ffa9846_etcd-k8ss-head_kube-system_42857e4bd57d261fc438bcb2a87572b9_f1b219d3
292bcafb3316        gcr.io/google_containers/pause-amd64:3.0                        "/pause"                 8 minutes ago       Up 8 minutes                            k8s_POD.d8dbe16c_kube-controller-manager-k8ss-head_kube-system_5f805ed49f6fd9f0640be470e3dea2a2_fe9592ab
ab929dd920a2        gcr.io/google_containers/pause-amd64:3.0                        "/pause"                 8 minutes ago       Up 8 minutes                            k8s_POD.d8dbe16c_kube-apiserver-k8ss-head_kube-system_6b83c87a9bf5c380c6f948f428b23dd1_c93e3a3b
71c28763aeab        gcr.io/google_containers/pause-amd64:3.0                        "/pause"                 8 minutes ago       Up 8 minutes                            k8s_POD.d8dbe16c_kube-scheduler-k8ss-head_kube-system_586d16be4ecaac95b0162c5d11921019_eb12a865
615cb42e0108        gcr.io/google_containers/pause-amd64:3.0                        "/pause"                 8 minutes ago       Up 8 minutes                            k8s_POD.d8dbe16c_etcd-k8ss-head_kube-system_42857e4bd57d261fc438bcb2a87572b9_891fc5db

I tried to join a node but I get a connection refused error, even though there is no firewall...

# kubeadm join --token foobar.1234 <master_ip>
<util/tokens> validating provided token
<node/discovery> created cluster info discovery client, requesting info from "http://185.19.30.178:9898/cluster-info/v1/?token-id=foobar"
error: <node/discovery> failed to request cluster info [Get http://MASTER_IP:9898/cluster-info/v1/?token-id=foobar: dial tcp MASTER_IP:9898: getsockopt: connection refused]

Now I am actually wondering whether init is blocking while waiting for nodes to join. According to the docs it should not block, but the kubeadm output seems to indicate that it does.
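
A quick way to check whether anything is actually listening on the discovery port before retrying the join (a minimal sketch; the port and URL path are taken from the join output above, and the exact tooling available will vary by distribution):

# on the master: is anything bound to 9898?
ss -lntp | grep 9898

# from the node: probe the discovery endpoint; "connection refused" here means nothing is listening
curl -v http://<master_ip>:9898/cluster-info/v1/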

k8s-github-robot added the area/nodecontroller and sig/node labels Sep 27, 2016
DaspawnW commented Sep 27, 2016

Same issue for me on an AWS installation, but I can't see any Docker containers running.
Some information:
Ubuntu 16.04.1 LTS (GNU/Linux 4.4.0-38-generic x86_64)
Using http_proxy and https_proxy

export https_proxy=http://<proxy>:<port>
export http_proxy=http://<proxy>:<port>
kubeadm init --cloud-provider aws

Looked at the logs of the API server; it keeps logging errors:

I0927 11:44:47.425374       1 handlers.go:162] GET /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (793.43µs) 200 [[kube-scheduler/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:46848]
I0927 11:44:47.427858       1 handlers.go:162] PUT /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (1.682203ms) 200 [[kube-scheduler/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:46848]
I0927 11:44:47.606685       1 logs.go:41] http: TLS handshake error from 10.10.10.10:57328: remote error: bad certificate
I0927 11:44:47.722809       1 logs.go:41] http: TLS handshake error from 10.10.10.10:57330: remote error: bad certificate
I0927 11:44:47.728099       1 logs.go:41] http: TLS handshake error from 10.10.10.10:57332: remote error: bad certificate
I0927 11:44:48.251368       1 logs.go:41] http: TLS handshake error from 10.10.10.10:57334: remote error: bad certificate
I0927 11:44:48.256871       1 logs.go:41] http: TLS handshake error from 10.10.10.10:57336: remote error: bad certificate
I0927 11:44:48.262479       1 logs.go:41] http: TLS handshake error from 10.10.10.10:57338: remote error: bad certificate
I0927 11:44:48.267460       1 logs.go:41] http: TLS handshake error from 10.10.10.10:57340: remote error: bad certificate
I0927 11:44:48.608406       1 logs.go:41] http: TLS handshake error from 10.10.10.10:57342: remote error: bad certificate
I0927 11:44:48.724428       1 logs.go:41] http: TLS handshake error from 10.10.10.10:57344: remote error: bad certificate
I0927 11:44:48.729680       1 logs.go:41] http: TLS handshake error from 10.10.10.10:57346: remote error: bad certificate
I0927 11:44:48.777612       1 handlers.go:162] GET /healthz: (39.187µs) 200 [[Go-http-client/1.1] 127.0.0.1:49808]
I0927 11:44:49.429761       1 handlers.go:162] GET /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (762.498µs) 200 [[kube-scheduler/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:46848]
I0927 11:44:49.432267       1 handlers.go:162] PUT /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (2.070905ms) 200 [[kube-scheduler/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:46848]
I0927 11:44:49.614084       1 logs.go:41] http: TLS handshake error from 10.10.10.10:57354: remote error: bad certificate
I0927 11:44:49.727405       1 logs.go:41] http: TLS handshake error from 10.10.10.10:57356: remote error: bad certificate
I0927 11:44:49.732888       1 logs.go:41] http: TLS handshake error from 10.10.10.10:57358: remote error: bad certificate
I0927 11:44:50.080279       1 logs.go:41] http: TLS handshake error from 10.10.10.10:57360: remote error: bad certificate
I0927 11:44:50.085570       1 logs.go:41] http: TLS handshake error from 10.10.10.10:57362: remote error: bad certificate
I0927 11:44:50.617384       1 logs.go:41] http: TLS handshake error from 10.10.10.10:57364: remote error: bad certificate
I0927 11:44:50.730144       1 logs.go:41] http: TLS handshake error from 10.10.10.10:57366: remote error: bad certificate
I0927 11:44:50.735525       1 logs.go:41] http: TLS handshake error from 10.10.10.10:57368: remote error: bad certificate
I0927 11:44:51.433824       1 handlers.go:162] GET /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (769.066µs) 200 [[kube-scheduler/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:46848]
I0927 11:44:51.436359       1 handlers.go:162] PUT /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (1.713977ms) 200 [[kube-scheduler/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:46848]
I0927 11:44:51.620964       1 logs.go:41] http: TLS handshake error from 10.10.10.10:57370: remote error: bad certificate
I0927 11:44:51.731724       1 logs.go:41] http: TLS handshake error from 10.10.10.10:57372: remote error: bad certificate
I0927 11:44:51.761983       1 logs.go:41] http: TLS handshake error from 10.10.10.10:57374: remote error: bad certificate
I0927 11:44:52.622487       1 logs.go:41] http: TLS handshake error from 10.10.10.10:57376: remote error: bad certificate
I0927 11:44:52.732927       1 logs.go:41] http: TLS handshake error from 10.10.10.10:57378: remote error: bad certificate
I0927 11:44:52.762908       1 logs.go:41] http: TLS handshake error from 10.10.10.10:57380: remote error: bad certificate
I0927 11:44:53.438270       1 handlers.go:162] GET /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (805.346µs) 200 [[kube-scheduler/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:46848]
I0927 11:44:53.440909       1 handlers.go:162] PUT /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (1.82773ms) 200 [[kube-scheduler/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:46848]
I0927 11:44:53.627293       1 logs.go:41] http: TLS handshake error from 10.10.10.10:57382: remote error: bad certificate

yoojinl commented Sep 27, 2016

@sebgoa Looks similar to #33541; do you have SELinux enabled?
Try running docker ps -a | grep discovery to get the ID of the kube-discovery container, then run docker logs <id> to see if there is a permission denied error for the /tmp/secret directory.
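
A minimal sketch of those checks (the container ID is a placeholder; the grep pattern assumes the container name contains "discovery", as in the suggestion above):

# locate the kube-discovery container, running or exited
docker ps -a | grep discovery

# look for a "permission denied" error on /tmp/secret in its logs
docker logs <container_id>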

oz123 (Contributor) commented Sep 27, 2016

@RustyRobot, disabling SELinux on Ubuntu 16.04 does solve the hanging problem.
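
For reference, a minimal sketch of putting SELinux into permissive mode on CentOS/RHEL (the config-file edit is the usual way to make it persistent and is an assumption here, not something spelled out in this thread):

# stop enforcing for the current boot
setenforce 0

# keep it permissive across reboots
sed -i 's/^SELINUX=enforcing/SELINUX=permissive/' /etc/selinux/config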

oz123 added a commit to oz123/kubernetes.github.io that referenced this issue Sep 27, 2016
sebgoa (Contributor, Author) commented Sep 27, 2016

OK, disabling SELinux got me further and kubeadm init finished. But now there is nothing listening on 9898.

What component is supposed to be listening on that port for cluster joins?

sebgoa (Contributor, Author) commented Sep 27, 2016

OK, so the discovery pod uses a hostPort on 9898.

Logs for that pod return this:

$ kubectl logs kube-discovery-1971138125-yry3x --namespace=kube-system
Error from server: Get https://kube-head:10250/containerLogs/kube-system/kube-discovery-1971138125-yry3x/kube-discovery: dial tcp: lookup kube-head on 8.8.8.8:53: no such host
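
The "lookup kube-head on 8.8.8.8:53: no such host" part suggests the master's hostname is not resolvable from where kubectl runs. A minimal sketch of one common workaround (the IP is a placeholder, not a value confirmed in this thread):

# make the master hostname resolvable locally
echo "<master_ip>  kube-head" >> /etc/hosts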

I am following the docs

The DNS pod is not starting:

Events:
  FirstSeen LastSeen    Count   From            SubobjectPath   Type        Reason      Message
  --------- --------    -----   ----            -------------   --------    ------      -------
  27m       27m     1   {default-scheduler }            Normal      Scheduled   Successfully assigned kube-dns-2247936740-igptf to kube-head
  27m       3s      662 {kubelet kube-head}         Warning     FailedSync  Error syncing pod, skipping: failed to "SetupNetwork" for "kube-dns-2247936740-igptf_kube-system" with SetupNetworkError: "Failed to setup network for pod \"kube-dns-2247936740-igptf_kube-system(00cf8b74-84c2-11e6-9dfa-061eca000139)\" using network plugins \"cni\": cni config unintialized; Skipping pod"
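
The "cni config unintialized" message usually means no CNI configuration has been written yet, so kube-dns stays pending until a pod network add-on is installed (applying Weave is what worked later in this thread). A minimal sketch of confirming that, assuming the default CNI config directory:

# an empty directory here means no pod network add-on has been applied yet
ls /etc/cni/net.d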

sebgoa (Contributor, Author) commented Sep 27, 2016

@errordeveloper looks like this might be right up your alley..

@lukemarsden (Contributor):

@sebgoa can you try starting from scratch, following the instructions at http://deploy-preview-1321.kubernetes-io-vnext-staging.netlify.com/docs/getting-started-guides/kubeadm/ please?

oz123 (Contributor) commented Sep 27, 2016

@lukemarsden, I followed the instructions you posted, and it seems that systemd is immediately starting kubelet:

# apt-get install -y kubelet kubeadm kubectl kubernetes-cni
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following NEW packages will be installed:
  kubeadm kubectl kubelet kubernetes-cni
0 upgraded, 4 newly installed, 0 to remove and 0 not upgraded.
Need to get 0 B/40.9 MB of archives.
After this operation, 328 MB of additional disk space will be used.
Selecting previously unselected package kubernetes-cni.
(Reading database ... 69111 files and directories currently installed.)
Preparing to unpack .../kubernetes-cni_0.3.0.1-07a8a2-00_amd64.deb ...
Unpacking kubernetes-cni (0.3.0.1-07a8a2-00) ...
Selecting previously unselected package kubelet.
Preparing to unpack .../kubelet_1.4.0-00_amd64.deb ...
Unpacking kubelet (1.4.0-00) ...
Selecting previously unselected package kubectl.
Preparing to unpack .../kubectl_1.4.0-00_amd64.deb ...
Unpacking kubectl (1.4.0-00) ...
Selecting previously unselected package kubeadm.
Preparing to unpack .../kubeadm_1.5.0-alpha.0-1495-g1e7fa1f-00_amd64.deb ...
Unpacking kubeadm (1.5.0-alpha.0-1495-g1e7fa1f-00) ...
Setting up kubernetes-cni (0.3.0.1-07a8a2-00) ...
Setting up kubelet (1.4.0-00) ...
Setting up kubectl (1.4.0-00) ...
Setting up kubeadm (1.5.0-alpha.0-1495-g1e7fa1f-00) ...
root@saltmaster:/home/vagrant# kubeadm init --api-advertise-addresses 172.16.80.80
<master/tokens> generated token: "99f2d4.26fdd8fe96143456"
<master/pki> created keys and certificates in "/etc/kubernetes/pki"
error: <util/kubeconfig> failed to create "/etc/kubernetes/kubelet.conf", it already exists [open /etc/kubernetes/kubelet.conf: file exists]

yoojinl commented Sep 27, 2016

@oz123 If you already did kubeadm init, you need to start from scratch, i.e. remove the /etc/kubernetes and /var/lib/etcd directories. We have plans to introduce a --reset flag in the future to do this automatically.
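
That flag later shipped as a subcommand (kubeadm reset also appears further down in this thread); a minimal sketch, assuming a kubeadm build recent enough to have it:

# one-step cleanup on newer kubeadm builds
kubeadm reset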

@benmathews:

There is an uninstall script referenced at http://deploy-preview-1321.kubernetes-io-vnext-staging.netlify.com/docs/getting-started-guides/kubeadm/. After running it, my init ran correctly again.

systemctl stop kubelet;
docker rm -f $(docker ps -q); mount | grep "/var/lib/kubelet/*" | awk '{print $3}' | xargs umount 1>/dev/null 2>/dev/null;
rm -rf /var/lib/kubelet /etc/kubernetes /var/lib/etcd /etc/cni;
ip link set cbr0 down; ip link del cbr0;
ip link set cni0 down; ip link del cni0;
systemctl start kubelet

@errordeveloper (Member):

I think this can be closed now, as soon as new packages become available.

sebgoa (Contributor, Author) commented Sep 27, 2016

Can you give me time to test again tomorrow?

@errordeveloper (Member):

Sure!

@errordeveloper (Member):

Just trying to do housekeeping...

@DaspawnW:

Can't disable SELinux; Ubuntu says it is not installed...

luxas (Member) commented Sep 28, 2016

This only affects CentOS and Fedora; Ubuntu doesn't have this issue.

@DaspawnW:

It still doesn't work on Ubuntu. After following the updated manual on kubernetes.io, the system hangs at the same point:

<master/tokens> generated token: "token"
<master/pki> created keys and certificates in "/etc/kubernetes/pki"
<util/kubeconfig> created "/etc/kubernetes/kubelet.conf"
<util/kubeconfig> created "/etc/kubernetes/admin.conf"
<master/apiclient> created API client configuration
<master/apiclient> created API client, waiting for the control plane to become ready

But the apiserver log changed:

I0928 07:01:21.795743       1 handlers.go:162] PATCH /api/v1/namespaces/default/events/ip-10-10-10-10.14786a8fdad24347: (2.145843ms) 200 [[kubelet/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 10.10.10.10:57954]
I0928 07:01:21.995791       1 handlers.go:162] PATCH /api/v1/namespaces/default/events/ip-10-10-10-10.14786a8fdad4cf01: (2.148172ms) 200 [[kubelet/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 10.10.10.10:57954]
I0928 07:01:22.195801       1 handlers.go:162] PATCH /api/v1/namespaces/default/events/ip-10-10-10-10.14786a8fdad74c69: (2.210056ms) 200 [[kubelet/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 10.10.10.10:57954]
I0928 07:01:22.395995       1 handlers.go:162] PATCH /api/v1/namespaces/default/events/ip-10-10-10-10.14786a8fdad24347: (2.291263ms) 200 [[kubelet/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 10.10.10.10:57954]
I0928 07:01:22.595948       1 handlers.go:162] PATCH /api/v1/namespaces/default/events/ip-10-10-10-10.14786a8fdad4cf01: (2.29452ms) 200 [[kubelet/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 10.10.10.10:57954]
I0928 07:01:22.795835       1 handlers.go:162] PATCH /api/v1/namespaces/default/events/ip-10-10-10-10.14786a8fdad74c69: (2.200798ms) 200 [[kubelet/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 10.10.10.10:57954]
I0928 07:01:22.995283       1 handlers.go:162] POST /api/v1/namespaces/kube-system/events: (1.638475ms) 201 [[kubelet/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 10.10.10.10:57954]
I0928 07:01:23.195307       1 handlers.go:162] POST /api/v1/namespaces/kube-system/events: (1.709752ms) 201 [[kubelet/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 10.10.10.10:57954]
I0928 07:01:23.395272       1 handlers.go:162] POST /api/v1/namespaces/kube-system/events: (1.650112ms) 201 [[kubelet/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 10.10.10.10:57954]
I0928 07:01:23.655487       1 handlers.go:162] GET /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (807.398µs) 200 [[kube-scheduler/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:60274]
I0928 07:01:23.658048       1 handlers.go:162] PUT /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (1.948487ms) 200 [[kube-scheduler/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:60274]
I0928 07:01:23.710908       1 handlers.go:162] GET /api/v1/namespaces/kube-system/endpoints/kube-controller-manager: (789.41µs) 200 [[kube-controller-manager/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:60550]
I0928 07:01:23.728159       1 handlers.go:162] PUT /api/v1/namespaces/kube-system/endpoints/kube-controller-manager: (3.595941ms) 200 [[kube-controller-manager/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:60550]
I0928 07:01:24.814084       1 handlers.go:162] GET /api/v1/nodes/ip-10-10-10-10: (1.120047ms) 200 [[kubelet/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 10.10.10.10:57954]
I0928 07:01:24.822929       1 handlers.go:162] PUT /api/v1/nodes/ip-10-10-10-10/status: (5.303192ms) 200 [[kubelet/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 10.10.10.10:57954]
I0928 07:01:25.660182       1 handlers.go:162] GET /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (807.112µs) 200 [[kube-scheduler/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:60274]
I0928 07:01:25.662714       1 handlers.go:162] PUT /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (1.910111ms) 200 [[kube-scheduler/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:60274]
I0928 07:01:25.715300       1 handlers.go:162] GET /api/v1/nodes?resourceVersion=0: (437.108µs) 200 [[kube-controller-manager/v1.4.0 (linux/amd64) kubernetes/a16c0a7/node-controller] 127.0.0.1:60550]
I0928 07:01:25.731729       1 handlers.go:162] GET /api/v1/namespaces/kube-system/endpoints/kube-controller-manager: (803.973µs) 200 [[kube-controller-manager/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:60550]
I0928 07:01:25.734265       1 handlers.go:162] PUT /api/v1/namespaces/kube-system/endpoints/kube-controller-manager: (1.914164ms) 200 [[kube-controller-manager/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:60550]
I0928 07:01:27.664747       1 handlers.go:162] GET /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (813.713µs) 200 [[kube-scheduler/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:60274]
I0928 07:01:27.671610       1 handlers.go:162] PUT /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (2.416208ms) 200 [[kube-scheduler/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:60274]
I0928 07:01:27.736170       1 handlers.go:162] GET /api/v1/namespaces/kube-system/endpoints/kube-controller-manager: (796.709µs) 200 [[kube-controller-manager/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:60550]
I0928 07:01:27.738650       1 handlers.go:162] PUT /api/v1/namespaces/kube-system/endpoints/kube-controller-manager: (1.889742ms) 200 [[kube-controller-manager/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:60550]

@errordeveloper (Member):

@DaspawnW thanks for reporting this. There is no SELinux on Ubuntu, but there is AppArmor (which is somewhat similar, but different). In any case, please open another issue. It'd help a lot if you could attach any extra information on how we could reproduce it, and perhaps the output of docker ps as well as journalctl -u kubelet to start with.
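
A minimal sketch of gathering the information requested above (standard docker and journalctl flags; adjust as needed):

# all containers the kubelet has started, including exited ones
docker ps -a

# kubelet service logs since the last boot
journalctl -u kubelet -b --no-pager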

sebgoa (Contributor, Author) commented Sep 28, 2016

OK, I tested the new docs; you can close this.

I wrote this Ansible playbook to automate a bit:
https://github.com/skippbox/kubeadm-centos

I will write a quick blog about it.

luxas (Member) commented Sep 28, 2016

Closing as fixed with the newest packages

luxas closed this as completed Sep 28, 2016
@errordeveloper (Member):

@sebgoa excellent, thank you very much! Also, you could make a PR to docs with a link. Personally, I think that eventually it'd be great to have a set of simple multi-distro config management modules.

Miyurz commented Sep 29, 2016

@errordeveloper It doesn't work even after AppArmor is disabled/uninstalled on Ubuntu 14.04.
Check the log below.

> $ sudo kubeadm init 
> <master/tokens> generated token: <token>
> <master/pki> created keys and certificates in "/etc/kubernetes/pki"
> <util/kubeconfig> created "/etc/kubernetes/kubelet.conf"
> <util/kubeconfig> created "/etc/kubernetes/admin.conf"
> <master/apiclient> created API client configuration
> <master/apiclient> created API client, waiting for the control plane to become ready

@errordeveloper (Member):

Sure, so this is probably a completely different problem. Could you file another issue with more details, or jump on Slack and we will try to figure it out together. Thanks!

Dmitry1987 (Contributor) commented Dec 16, 2016

@sebgoa can you please share which step solved the DNS pod issue for you? I get exactly the same "uninitialized" error (kubeadm init finished OK with no errors, and I was able to join one node; I then see several pods start, but this one can't).

" FailedSync Error syncing pod, skipping: failed to "SetupNetwork" for "kube-dns-2056274620-3wc11_kube-system" with SetupNetworkError: "Failed to setup network for pod "kube-dns-2056274620-3wc11_kube-system(6539409a-c3ae-11e6-84f3-063d095554fd)" using network plugins "cni": cni config unintialized; Skipping pod"

I used this init command:
kubeadm init --use-kubernetes-version=v1.5.1 --cloud-provider=aws --service-dns-domain=newcluster.local --api-advertise-addresses=ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com --api-external-dns-names=ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com

I also tried to install the "network plugins" as kubeadm suggests, e.g. kubectl apply -f https://raw.githubusercontent.com/romana/romana/master/containerize/specs/romana-kubeadm.yml, but it doesn't help. Should I include some flag in kubeadm init for the default CNI network plugin to work (kubenet or something)?
Would be glad for any advice :) thanks!

miry (Contributor) commented Dec 18, 2016

@Dmitry1987 I use the default Weave network plugin: kubectl apply -f https://git.io/weave-kube. It works for me.

It seems the documentation has changed; this command used to be shown by default: http://kubernetes.io/docs/getting-started-guides/kubeadm/#installing-a-pod-network
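
For completeness, the sequence that worked here plus a check that kube-dns recovers afterwards (a sketch based on the commands already given in this thread; pod names will differ per cluster):

# install the Weave pod network add-on
kubectl apply -f https://git.io/weave-kube

# watch the weave and kube-dns pods come up
kubectl get pods --all-namespaces -w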

@Dmitry1987 (Contributor):

Thanks @miry! I also tried Weave and it worked :)

@saidiahd:

Hi folks,
The problem was solved for me just by stopping AppArmor:
# /etc/init.d/apparmor stop
After that, reset kubeadm:
# kubeadm reset
Finally, rerun the initialization of your master:
# kubeadm init

@Blaiserman:

Same problem on CoreOS beta (1235.2.0). I used this to install:
https://github.com/luxas/kubeadm-installer

[kubeadm] WARNING: kubeadm is in alpha, please do not use it for production clusters.
[preflight] Skipping pre-flight checks
[preflight] Starting the kubelet service
[init] Using Kubernetes version: v1.5.1
[tokens] Generated token: "021fc9.50b5b7c476dab7f7"
[certificates] Generated Certificate Authority key and certificate.
[certificates] Generated API Server key and certificate
[certificates] Generated Service Account signing keys
[certificates] Created keys and certificates in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[apiclient] Created API client, waiting for the control plane to become ready

docker ps -a | grep discovery returns nothing.
I checked SELinux (SELINUX=permissive) and also tried setenforce 0.

sudo sestatus
SELinux status:                 enabled
SELinuxfs mount:                /sys/fs/selinux
SELinux root directory:         /etc/selinux
Loaded policy name:             mcs
Current mode:                   permissive
Mode from config file:          permissive
Policy MLS status:              enabled
Policy deny_unknown status:     allowed
Max kernel policy version:      30

 kubectl get cs
Unable to connect to the server: dial tcp: lookup localhost on 8.8.8.8:53: no such host

SELinux: mount invalid. Same superblock, diffrent security settings for (dev mqueue, type  mqueue)

@kenzhaoyihui:

Some problems on CentOS 7.
It also just blocks at this step:
[apiclient] Created API client, waiting for the control plane to become ready.

docker ps -a
There is nothing after this command.

kubectl get nodes
The connection to the server localhost:8080 was refused - did you specify the right host or port?

getenforce
Permissive (SELinux is disabled)

/var/log/messages

Dec 29 07:10:29 master kubelet: E1229 07:10:29.744234 8891 pod_workers.go:184] Error syncing pod b4b25cab578f82fd99198c566860faf7, skipping: failed to "StartContainer" for "POD" with ImagePullBackOff: "Back-off pulling image "gcr.io/google_containers/pause-amd64:3.0""
Dec 29 07:10:30 master kubelet: E1229 07:10:30.680786 8891 reflector.go:188] pkg/kubelet/config/apiserver.go:44: Failed to list *api.Pod: Get https://192.168.121.241:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dmaster&resourceVersion=0: dial tcp 192.168.121.241:6443: getsockopt: connection refused
Dec 29 07:10:30 master kubelet: E1229 07:10:30.680797 8891 reflector.go:188] pkg/kubelet/kubelet.go:386: Failed to list *api.Node: Get https://192.168.121.241:6443/api/v1/nodes?fieldSelector=metadata.name%3Dmaster&resourceVersion=0: dial tcp 192.168.121.241:6443: getsockopt: connection refused
Dec 29 07:10:30 master kubelet: E1229 07:10:30.680835 8891 reflector.go:188] pkg/kubelet/kubelet.go:378: Failed to list *api.Service: Get https://192.168.121.241:6443/api/v1/services?resourceVersion=0: dial tcp 192.168.121.241:6443: getsockopt: connection refused
Dec 29 07:10:31 master kubelet: I1229 07:10:31.144650 8891 kubelet_node_status.go:204] Setting node annotation to enable volume controller attach/detach
Dec 29 07:10:31 master kubelet: I1229 07:10:31.186977 8891 kubelet_node_status.go:74] Attempting to register node master
Dec 29 07:10:31 master kubelet: E1229 07:10:31.187254 8891 kubelet_node_status.go:98] Unable to register node "master" with API server: Post https://192.168.121.241:6443/api/v1/nodes: dial tcp 192.168.121.241:6443: getsockopt: connection refused
Dec 29 07:10:31 master kubelet: I1229 07:10:31.397597 8891 kubelet_node_status.go:204] Setting node annotation to enable volume controller attach/detach
Dec 29 07:10:31 master kubelet: E1229 07:10:31.437996 8891 kubelet.go:1508] Failed creating a mirror pod for "kube-apiserver-master_kube-system(73c001656da6c2ae76abb7d4879d2e36)": Post https://192.168.121.241:6443/api/v1/namespaces/kube-system/pods: dial tcp 192.168.121.241:6443: getsockopt: connection refused
Dec 29 07:10:31 master kubelet: E1229 07:10:31.681357 8891 reflector.go:188] pkg/kubelet/config/apiserver.go:44: Failed to list *api.Pod: Get https://192.168.121.241:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dmaster&resourceVersion=0: dial tcp 192.168.121.241:6443: getsockopt: connection refused
Dec 29 07:10:31 master kubelet: E1229 07:10:31.681376 8891 reflector.go:188] pkg/kubelet/kubelet.go:386: Failed to list *api.Node: Get https://192.168.121.241:6443/api/v1/nodes?fieldSelector=metadata.name%3Dmaster&resourceVersion=0: dial tcp 192.168.121.241:6443: getsockopt: connection refused
Dec 29 07:10:31 master kubelet: E1229 07:10:31.681424 8891 reflector.go:188] pkg/kubelet/kubelet.go:378: Failed to list *api.Service: Get https://192.168.121.241:6443/api/v1/services?resourceVersion=0: dial tcp 192.168.121.241:6443: getsockopt: connection refused
Dec 29 07:10:31 master docker-current: time="2016-12-29T07:10:31.738920860Z" level=error msg="Handler for GET /v1.22/images/gcr.io/google_containers/pause-amd64:3.0/json returned error: No such image: gcr.io/google_containers/pause-amd64:3.0"
Dec 29 07:10:31 master kubelet: E1229 07:10:31.739387 8891 docker_manager.go:2188] Failed to create pod infra container: ImagePullBackOff; Skipping pod "kube-apiserver-master_kube-system(73c001656da6c2ae76abb7d4879d2e36)": Back-off pulling image "gcr.io/google_containers/pause-amd64:3.0"
Dec 29 07:10:31 master kubelet: E1229 07:10:31.739419 8891 pod_workers.go:184] Error syncing pod 73c001656da6c2ae76abb7d4879d2e36, skipping: failed to "StartContainer" for "POD" with ImagePullBackOff: "Back-off pulling image "gcr.io/google_containers/pause-amd64:3.0""
Dec 29 07:10:32 master kubelet: E1229 07:10:32.301850 8891 cni.go:163] error updating cni config: No networks found in /etc/cni/net.d
Dec 29 07:10:32 master kubelet: E1229 07:10:32.681920 8891 reflector.go:188] pkg/kubelet/kubelet.go:378: Failed to list *api.Service: Get https://192.168.121.241:6443/api/v1/services?resourceVersion=0: dial tcp 192.168.121.241:6443: getsockopt: connection refused
Dec 29 07:10:32 master kubelet: E1229 07:10:32.681920 8891 reflector.go:188] pkg/kubelet/config/apiserver.go:44: Failed to list *api.Pod: Get https://192.168.121.241:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dmaster&resourceVersion=0: dial tcp 192.168.121.241:6443: getsockopt: connection refused
Dec 29 07:10:32 master kubelet: E1229 07:10:32.681957 8891 reflector.go:188] pkg/kubelet/kubelet.go:386: Failed to list *api.Node: Get https://192.168.121.241:6443/api/v1/nodes?fieldSelector=metadata.name%3Dmaster&resourceVersion=0: dial tcp 192.168.121.241:6443: getsockopt: connection refused
Dec 29 07:10:32 master kubelet: E1229 07:10:32.954775 8891 eviction_manager.go:202] eviction manager: unexpected err: failed GetNode: node 'master' not found

Could someone give me some help?
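
The ImagePullBackOff lines above mean the node cannot pull the pause image at all, so no pod (including the API server) can start, which in turn explains the connection-refused errors on port 6443. A minimal sketch of pre-pulling it by hand to surface the underlying registry or proxy error (image name and tag taken from the log above):

# pull the infra image manually; any proxy/registry problem will show up directly
docker pull gcr.io/google_containers/pause-amd64:3.0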

miry (Contributor) commented Dec 29, 2016

@kenzhaoyihui you should be root to run these commands.

  1. Disable the CentOS firewall:
systemctl stop firewalld
systemctl disable firewalld
  2. Install and start kubelet and Docker:
yum install -y docker-engine kubelet kubeadm kubectl kubernetes-cni
setenforce 0
systemctl start docker
systemctl start kubelet
  3. kubeadm init ....
  4. Set up the network:
kubectl apply -f https://git.io/weave-kube
  5. Verify that it works:
kubectl get po --all-namespaces

@kenzhaoyihui:

@miry yeah, I ran those commands as root. Could you give me any other advice?

miry (Contributor) commented Dec 29, 2016

@kenzhaoyihui Can you also provide the output of ip addr?

miry (Contributor) commented Dec 29, 2016

@kenzhaoyihui you should be root to run these commands:

  1. Disable the CentOS firewall:
systemctl stop firewalld
systemctl disable firewalld
  2. Install and start kubelet and Docker:
yum install -y docker-engine kubelet kubeadm kubectl kubernetes-cni
setenforce 0
systemctl start docker
systemctl start kubelet
  3. kubeadm init .... and copy the token from the output for joining the other nodes.
  4. Set up the private network:
kubectl apply -f https://git.io/weave-kube
  5. Verify that it works and that DNS is ready:
kubectl get po --all-namespaces

@kenzhaoyihui:

@miry hey miry,
here is the output of ip addr:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:80:49:e7 brd ff:ff:ff:ff:ff:ff
    inet 192.168.121.171/24 brd 192.168.121.255 scope global dynamic eth0
       valid_lft 544sec preferred_lft 544sec
    inet6 fe80::5054:ff:fe80:49e7/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:91:89:69 brd ff:ff:ff:ff:ff:ff
    inet 192.168.200.2/24 brd 192.168.200.255 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe91:8969/64 scope link
       valid_lft forever preferred_lft forever
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN
    link/ether 02:42:57:92:52:06 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
5: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN
    link/ether 52:54:00:17:38:db brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
       valid_lft forever preferred_lft forever
6: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr0 state DOWN qlen 500
    link/ether 52:54:00:17:38:db brd ff:ff:ff:ff:ff:ff

@kenzhaoyihui:

@miry I tried again in a fresh environment: disabled SELinux, disabled firewalld, started the kubelet and Docker services, and followed the Google Kubernetes guide, but the Weave network pods crashed and the kube-dns pods stay stuck in a creating state with no response.
So I want to know whether the kubeadm API affects the Kubernetes installation, or whether CentOS 7 has some conflict with kubeadm. I will give it another try and look into the kubeadm code.
Thanks for your help, miry!

@Dmitry1987 (Contributor):

@kenzhaoyihui it might also be something with the underlying infrastructure, depending on where you run it: AWS / GCE / OpenStack / bare metal...

From your log I see it's trying to connect to an IP that you don't have on the master (assuming you listed the output of "ip addr" from the master where you ran kubeadm init).
It says everywhere "...dial tcp 192.168.121.241:6443: getsockopt: connection refused", but there's no "192.168.121.241" in your "ip addr" output. So I guess kubeadm somehow gets confused about your network environment. It's worth digging in this direction to understand why it can't connect to itself after creating its components.
Hope it helps.
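
One way to pin the address the control plane advertises, using the flag that already appears earlier in this thread (a sketch; replace the address with one actually assigned to the master, e.g. the eth0 address from the ip addr output above, and do a full reset first):

# re-run init advertising the master's real interface address
kubeadm init --api-advertise-addresses 192.168.121.171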

ozbillwang commented Jan 1, 2017

Got this issue as well today; it seems it is still not fixed when SELinux is enabled on CentOS 7.

OS: CentOS 7.2.1511
Docker: 1.12.5
Kubernetes: 1.5.1

@kenzhaoyihui:

@Dmitry1987 Yeah, thanks for your help. I will check the log again.

krishvoor commented Jan 14, 2017

Facing the same issue as well.
SELinux is disabled.
OS: Ubuntu 16.04
ARCH: ppc64le
iptables/firewalld: disabled
Natively compiled Kubernetes (release 1.5) and tried kubeadm init. It hangs here:

[kubeadm] WARNING: kubeadm is in alpha, please do not use it for production clusters.
[preflight] Running pre-flight checks
[preflight] WARNING: kubelet service does not exist
[init] Using Kubernetes version: v1.5.2
[certificates] Generated Certificate Authority key and certificate.
[certificates] Generated API Server key and certificate
[certificates] Generated Service Account signing keys
[certificates] Created keys and certificates in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[apiclient] Created API client, waiting for the control plane to become ready

@kenzhaoyihui:

@harsha544 Could you attach the log from /var/log/messages and the output of docker images?

krishvoor commented Jan 14, 2017

@kenzhaoyihui Nothing in /var/log/syslog. Enclosing the docker images output (both lists come back empty):

docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES

@kenzhaoyihui:

@harsha544 https://github.com/kenzhaoyihui/kubeadm-images-gcr.io/blob/master/pull_kubernetes_images.sh

The shell script pulls all the Docker images that are needed. Could you pull all the images and then execute "kubeadm init"?

@krishvoor:

@kenzhaoyihui Thanks for the URL. I tweaked the script to pull ppc64le Docker images; however, not all of the images appear to be present.

ozbillwang commented Jan 15, 2017

@harsha544

Be careful with the link and script which @kenzhaoyihui provided; that script tries to pass off his own images as the Google ones.

You'd better not run it.

In fact, the solution has already been provided in this ticket, and I fixed my issue with it. It was provided by @benmathews in his comment on Sep 28, 2016. If you missed that comment, you should give it a try.

@krishvoor:

@SydOps I was cautious enough to pull the Docker images from gcr.io/google_containers/ppc64le..
Given that this is the ppc64le architecture, I built the binaries from the GitHub source, and hence I can't restart via systemctl. Instead I'm following the Kubernetes Ansible approach (https://github.com/kubernetes/contrib/tree/master/ansible) to deploy the K8s cluster across my nodes.

luxas (Member) commented Jan 17, 2017

@harsha544 Please open a new issue in kubernetes/kubeadm about this.
It's fully possible to solve it, but requires some manual hacking until we've got ppc64le mainline again, see: #38926

@mohamedbouchriha:

Thanks @saidiahd, it works for me.

@shufanhao (Contributor):

I also hit this issue; I have disabled SELinux.

[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[apiclient] Created API client, waiting for the control plane to become ready
