kubeadm blocks waiting for 'control plane' #33544
Same issue for me on an AWS installation:
export https_proxy=http://<proxy>:<port>
export http_proxy=http://<proxy>:<port>
kubeadm init --cloud-provider aws
Looked at the logs of the apiserver. It returns with an exception:
I0927 11:44:47.425374 1 handlers.go:162] GET /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (793.43µs) 200 [[kube-scheduler/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:46848]
I0927 11:44:47.427858 1 handlers.go:162] PUT /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (1.682203ms) 200 [[kube-scheduler/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:46848]
I0927 11:44:47.606685 1 logs.go:41] http: TLS handshake error from 10.10.10.10:57328: remote error: bad certificate
I0927 11:44:47.722809 1 logs.go:41] http: TLS handshake error from 10.10.10.10:57330: remote error: bad certificate
I0927 11:44:47.728099 1 logs.go:41] http: TLS handshake error from 10.10.10.10:57332: remote error: bad certificate
I0927 11:44:48.251368 1 logs.go:41] http: TLS handshake error from 10.10.10.10:57334: remote error: bad certificate
I0927 11:44:48.256871 1 logs.go:41] http: TLS handshake error from 10.10.10.10:57336: remote error: bad certificate
I0927 11:44:48.262479 1 logs.go:41] http: TLS handshake error from 10.10.10.10:57338: remote error: bad certificate
I0927 11:44:48.267460 1 logs.go:41] http: TLS handshake error from 10.10.10.10:57340: remote error: bad certificate
I0927 11:44:48.608406 1 logs.go:41] http: TLS handshake error from 10.10.10.10:57342: remote error: bad certificate
I0927 11:44:48.724428 1 logs.go:41] http: TLS handshake error from 10.10.10.10:57344: remote error: bad certificate
I0927 11:44:48.729680 1 logs.go:41] http: TLS handshake error from 10.10.10.10:57346: remote error: bad certificate
I0927 11:44:48.777612 1 handlers.go:162] GET /healthz: (39.187µs) 200 [[Go-http-client/1.1] 127.0.0.1:49808]
I0927 11:44:49.429761 1 handlers.go:162] GET /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (762.498µs) 200 [[kube-scheduler/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:46848]
I0927 11:44:49.432267 1 handlers.go:162] PUT /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (2.070905ms) 200 [[kube-scheduler/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:46848]
I0927 11:44:49.614084 1 logs.go:41] http: TLS handshake error from 10.10.10.10:57354: remote error: bad certificate
I0927 11:44:49.727405 1 logs.go:41] http: TLS handshake error from 10.10.10.10:57356: remote error: bad certificate
I0927 11:44:49.732888 1 logs.go:41] http: TLS handshake error from 10.10.10.10:57358: remote error: bad certificate
I0927 11:44:50.080279 1 logs.go:41] http: TLS handshake error from 10.10.10.10:57360: remote error: bad certificate
I0927 11:44:50.085570 1 logs.go:41] http: TLS handshake error from 10.10.10.10:57362: remote error: bad certificate
I0927 11:44:50.617384 1 logs.go:41] http: TLS handshake error from 10.10.10.10:57364: remote error: bad certificate
I0927 11:44:50.730144 1 logs.go:41] http: TLS handshake error from 10.10.10.10:57366: remote error: bad certificate
I0927 11:44:50.735525 1 logs.go:41] http: TLS handshake error from 10.10.10.10:57368: remote error: bad certificate
I0927 11:44:51.433824 1 handlers.go:162] GET /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (769.066µs) 200 [[kube-scheduler/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:46848]
I0927 11:44:51.436359 1 handlers.go:162] PUT /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (1.713977ms) 200 [[kube-scheduler/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:46848]
I0927 11:44:51.620964 1 logs.go:41] http: TLS handshake error from 10.10.10.10:57370: remote error: bad certificate
I0927 11:44:51.731724 1 logs.go:41] http: TLS handshake error from 10.10.10.10:57372: remote error: bad certificate
I0927 11:44:51.761983 1 logs.go:41] http: TLS handshake error from 10.10.10.10:57374: remote error: bad certificate
I0927 11:44:52.622487 1 logs.go:41] http: TLS handshake error from 10.10.10.10:57376: remote error: bad certificate
I0927 11:44:52.732927 1 logs.go:41] http: TLS handshake error from 10.10.10.10:57378: remote error: bad certificate
I0927 11:44:52.762908 1 logs.go:41] http: TLS handshake error from 10.10.10.10:57380: remote error: bad certificate
I0927 11:44:53.438270 1 handlers.go:162] GET /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (805.346µs) 200 [[kube-scheduler/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:46848]
I0927 11:44:53.440909 1 handlers.go:162] PUT /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (1.82773ms) 200 [[kube-scheduler/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:46848]
I0927 11:44:53.627293 1 logs.go:41] http: TLS handshake error from 10.10.10.10:57382: remote error: bad certificate
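A note for anyone hitting the repeated "remote error: bad certificate" lines above: that alert usually means the kubelet on 10.10.10.10 does not trust the certificate the apiserver is serving, which is common when /etc/kubernetes/pki still holds material from an earlier kubeadm run. A minimal diagnostic sketch (the port and file names are assumptions for a 1.4-era kubeadm master; adjust for your setup):

```bash
# Diagnostic sketch: compare the cert the apiserver serves with what kubeadm generated.
MASTER=10.10.10.10   # master address from the log above (assumption)
PORT=443             # 1.4-era kubeadm apiserver port; newer versions use 6443

# Certificate actually presented on the wire
echo | openssl s_client -connect "${MASTER}:${PORT}" 2>/dev/null \
  | openssl x509 -noout -subject -issuer -dates

# Certificates kubeadm wrote on disk (file names vary between versions)
ls /etc/kubernetes/pki/
openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -subject -issuer -dates
```

If the on-the-wire certificate is older than the current kubeadm run, wiping the pki directory and re-running init is usually what fixes it.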
@RustyRobot, disabling SELinux on Ubuntu 16.04 does solve the problem of hanging.
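For readers landing here from search: the SELinux workaround discussed in this thread applies to CentOS/Fedora (stock Ubuntu does not ship SELinux). A minimal sketch of putting SELinux into permissive mode:

```bash
# Switch SELinux to permissive for the running system (CentOS/Fedora)
sudo setenforce 0
# Make the change persistent across reboots
sudo sed -i 's/^SELINUX=enforcing/SELINUX=permissive/' /etc/selinux/config
# Verify
getenforce
```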
OK, disabling SELinux got me further. What component is supposed to be listening on that port for cluster joins?
OK, so the discovery port is using a hostPort on 9898. Logs on that pod return this:
I am following the docs. The DNS pod is not starting:
@errordeveloper looks like this might be right up your alley...
@sebgoa can you try starting from scratch, following the instructions at http://deploy-preview-1321.kubernetes-io-vnext-staging.netlify.com/docs/getting-started-guides/kubeadm/, please?
@lukemarsden, I followed the instructions you posted, and it seems that systemd is immediately starting kubelet:
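If you want to see what systemd is doing with the kubelet before re-running kubeadm init, the standard systemd commands are enough; a small sketch:

```bash
systemctl status kubelet                  # is the unit active, failed, or restarting?
journalctl -u kubelet --no-pager -n 50    # recent kubelet log lines
sudo systemctl stop kubelet               # stop it if you want a clean slate before init
```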
@oz123 If you already did …
There is an uninstall script referenced at http://deploy-preview-1321.kubernetes-io-vnext-staging.netlify.com/docs/getting-started-guides/kubeadm/. After running it, my init ran correctly again.
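The uninstall script itself is not reproduced in this thread, but the manual cleanup commonly used before kubeadm grew a reset command looked roughly like the sketch below. The paths are assumptions based on the default 1.4-era layout, and the commands are destructive, so double-check before running them:

```bash
sudo systemctl stop kubelet
# Remove all containers left over from the previous init (destructive!)
sudo docker rm -f $(sudo docker ps -aq)
# Wipe state from the previous kubeadm run; unmount anything under /var/lib/kubelet first
sudo rm -rf /etc/kubernetes /var/lib/etcd /var/lib/kubelet/pods
sudo systemctl start kubelet
```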
I think this can be closed now, as soon as new packages become available.
Can you give me time to test again tomorrow?
Sure!
Just trying to do housekeeping...
Can't disable SELinux; Ubuntu says it is not installed...
This only affects CentOS and Fedora; Ubuntu doesn't have this issue.
Won't work anymore on Ubuntu. After using the updated manual on kubernetes.io, the system hangs at the same point:
<master/tokens> generated token: "token"
<master/pki> created keys and certificates in "/etc/kubernetes/pki"
<util/kubeconfig> created "/etc/kubernetes/kubelet.conf"
<util/kubeconfig> created "/etc/kubernetes/admin.conf"
<master/apiclient> created API client configuration
<master/apiclient> created API client, waiting for the control plane to become ready
But the apiserver log changed:
I0928 07:01:21.795743 1 handlers.go:162] PATCH /api/v1/namespaces/default/events/ip-10-10-10-10.14786a8fdad24347: (2.145843ms) 200 [[kubelet/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 10.10.10.10:57954]
I0928 07:01:21.995791 1 handlers.go:162] PATCH /api/v1/namespaces/default/events/ip-10-10-10-10.14786a8fdad4cf01: (2.148172ms) 200 [[kubelet/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 10.10.10.10:57954]
I0928 07:01:22.195801 1 handlers.go:162] PATCH /api/v1/namespaces/default/events/ip-10-10-10-10.14786a8fdad74c69: (2.210056ms) 200 [[kubelet/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 10.10.10.10:57954]
I0928 07:01:22.395995 1 handlers.go:162] PATCH /api/v1/namespaces/default/events/ip-10-10-10-10.14786a8fdad24347: (2.291263ms) 200 [[kubelet/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 10.10.10.10:57954]
I0928 07:01:22.595948 1 handlers.go:162] PATCH /api/v1/namespaces/default/events/ip-10-10-10-10.14786a8fdad4cf01: (2.29452ms) 200 [[kubelet/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 10.10.10.10:57954]
I0928 07:01:22.795835 1 handlers.go:162] PATCH /api/v1/namespaces/default/events/ip-10-10-10-10.14786a8fdad74c69: (2.200798ms) 200 [[kubelet/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 10.10.10.10:57954]
I0928 07:01:22.995283 1 handlers.go:162] POST /api/v1/namespaces/kube-system/events: (1.638475ms) 201 [[kubelet/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 10.10.10.10:57954]
I0928 07:01:23.195307 1 handlers.go:162] POST /api/v1/namespaces/kube-system/events: (1.709752ms) 201 [[kubelet/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 10.10.10.10:57954]
I0928 07:01:23.395272 1 handlers.go:162] POST /api/v1/namespaces/kube-system/events: (1.650112ms) 201 [[kubelet/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 10.10.10.10:57954]
I0928 07:01:23.655487 1 handlers.go:162] GET /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (807.398µs) 200 [[kube-scheduler/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:60274]
I0928 07:01:23.658048 1 handlers.go:162] PUT /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (1.948487ms) 200 [[kube-scheduler/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:60274]
I0928 07:01:23.710908 1 handlers.go:162] GET /api/v1/namespaces/kube-system/endpoints/kube-controller-manager: (789.41µs) 200 [[kube-controller-manager/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:60550]
I0928 07:01:23.728159 1 handlers.go:162] PUT /api/v1/namespaces/kube-system/endpoints/kube-controller-manager: (3.595941ms) 200 [[kube-controller-manager/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:60550]
I0928 07:01:24.814084 1 handlers.go:162] GET /api/v1/nodes/ip-10-10-10-10: (1.120047ms) 200 [[kubelet/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 10.10.10.10:57954]
I0928 07:01:24.822929 1 handlers.go:162] PUT /api/v1/nodes/ip-10-10-10-10/status: (5.303192ms) 200 [[kubelet/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 10.10.10.10:57954]
I0928 07:01:25.660182 1 handlers.go:162] GET /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (807.112µs) 200 [[kube-scheduler/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:60274]
I0928 07:01:25.662714 1 handlers.go:162] PUT /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (1.910111ms) 200 [[kube-scheduler/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:60274]
I0928 07:01:25.715300 1 handlers.go:162] GET /api/v1/nodes?resourceVersion=0: (437.108µs) 200 [[kube-controller-manager/v1.4.0 (linux/amd64) kubernetes/a16c0a7/node-controller] 127.0.0.1:60550]
I0928 07:01:25.731729 1 handlers.go:162] GET /api/v1/namespaces/kube-system/endpoints/kube-controller-manager: (803.973µs) 200 [[kube-controller-manager/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:60550]
I0928 07:01:25.734265 1 handlers.go:162] PUT /api/v1/namespaces/kube-system/endpoints/kube-controller-manager: (1.914164ms) 200 [[kube-controller-manager/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:60550]
I0928 07:01:27.664747 1 handlers.go:162] GET /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (813.713µs) 200 [[kube-scheduler/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:60274]
I0928 07:01:27.671610 1 handlers.go:162] PUT /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (2.416208ms) 200 [[kube-scheduler/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:60274]
I0928 07:01:27.736170 1 handlers.go:162] GET /api/v1/namespaces/kube-system/endpoints/kube-controller-manager: (796.709µs) 200 [[kube-controller-manager/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:60550]
I0928 07:01:27.738650 1 handlers.go:162] PUT /api/v1/namespaces/kube-system/endpoints/kube-controller-manager: (1.889742ms) 200 [[kube-controller-manager/v1.4.0 (linux/amd64) kubernetes/a16c0a7] 127.0.0.1:60550]
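When kubeadm keeps printing "waiting for the control plane to become ready" even though the apiserver log above looks healthy, it can help to query the API yourself from the master and see what answers. A sketch; the insecure localhost port 8080 was a common 1.4-era default and is an assumption here:

```bash
# From the master: does the apiserver answer at all?
curl -s http://127.0.0.1:8080/healthz ; echo

# Same checks through the admin kubeconfig that kubeadm wrote earlier
kubectl --kubeconfig=/etc/kubernetes/admin.conf get componentstatuses
kubectl --kubeconfig=/etc/kubernetes/admin.conf get nodes
```

If these succeed while kubeadm still hangs, the problem is more likely the address or credentials kubeadm's own API client is using than the control plane itself.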
@DaspawnW thanks for reporting this. There is no SELinux on Ubuntu, but there is AppArmor (which is somewhat similar, but different). In any case, please open another issue. It'd help a lot if you could attach any extra information on how we could reproduce it, and perhaps the output of …
OK, I tested the new docs; you can close. I wrote this Ansible playbook to automate a bit: … I will write a quick blog about it.
Closing as fixed with the newest packages.
@sebgoa excellent, thank you very much! Also, you could make a PR to docs with a link. Personally, I think that eventually it'd be great to have a set of simple multi-distro config management modules.
@errordeveloper It doesn't work even after AppArmor is disabled/uninstalled on Ubuntu 14.04.
Sure, so this is probably a completely different problem. Could you file a separate issue?
@sebgoa can you please share which step solved the DNS pod issue for you? I get exactly the same "uninitialized" error (kubeadm init finished OK with no errors, and I was able to join one node; several pods started, but this one can't):
"FailedSync Error syncing pod, skipping: failed to "SetupNetwork" for "kube-dns-2056274620-3wc11_kube-system" with SetupNetworkError: "Failed to setup network for pod "kube-dns-2056274620-3wc11_kube-system(6539409a-c3ae-11e6-84f3-063d095554fd)" using network plugins "cni": cni config unintialized; Skipping pod"
I used this init command: … I also tried to install the "network plugins" as kubeadm shows, did …
@Dmitry1987 I use the default Weave network plugin. It seems the documentation was changed; the pod-network step is now described here: http://kubernetes.io/docs/getting-started-guides/kubeadm/#installing-a-pod-network
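For anyone following along: installing a pod network at that point was a single kubectl apply of the plugin's manifest. The Weave manifest URL below is the one the 2016-era docs used and has since been superseded, so treat it as a placeholder and take the current URL from the pod-network docs linked above:

```bash
# Install the Weave Net pod network (2016-era manifest URL; check current docs)
kubectl apply -f https://git.io/weave-kube

# Watch kube-dns leave Pending/ContainerCreating once the CNI config is in place
kubectl get pods -n kube-system -w
```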
thanks @miry! I also tried Weave and it worked :)
Hi folks,
Same problem on CoreOS beta (1235.2.0). I used this to install:
No return to …
Some problems on CentOS 7.
Dec 29 07:10:29 master kubelet: E1229 07:10:29.744234 8891 pod_workers.go:184] Error syncing pod b4b25cab578f82fd99198c566860faf7, skipping: failed to "StartContainer " for "POD" with ImagePullBackOff: "Back-off pulling image "gcr.io/google_containers/pause-amd64:3.0""
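That ImagePullBackOff means the node cannot pull the pause image from gcr.io, which on CentOS is often a proxy or registry-access problem rather than a kubeadm bug. A quick check, sketched below; the systemd drop-in path is the standard Docker proxy setup and the proxy values are placeholders:

```bash
# If this fails, the kubelet cannot pull it either
sudo docker pull gcr.io/google_containers/pause-amd64:3.0

# Behind a proxy? The Docker daemon needs the proxy too (values are placeholders).
sudo mkdir -p /etc/systemd/system/docker.service.d
cat <<'EOF' | sudo tee /etc/systemd/system/docker.service.d/http-proxy.conf
[Service]
Environment="HTTP_PROXY=http://<proxy>:<port>" "HTTPS_PROXY=http://<proxy>:<port>"
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker
```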
@kenzhaoyihui you should be root to run such commands.
@miry yeah, I run such commands as root. Could you give me other advice?
@kenzhaoyihui and can you provide …
@miry hey, miry
@miry I tried again in a fresh env: disabled SELinux, disabled firewalld, and started the kubelet and docker services, …
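Spelling out that preparation as commands (CentOS 7 assumptions; the SELinux step is the same setenforce/permissive change shown earlier in the thread):

```bash
# Stop and disable firewalld so nothing blocks the control-plane ports
sudo systemctl stop firewalld
sudo systemctl disable firewalld

# Enable and start docker and the kubelet before running kubeadm init
sudo systemctl enable docker kubelet
sudo systemctl start docker kubelet
```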
@kenzhaoyihui it might also be something with the underlying infrastructure, depending on where you run it (AWS / GCE / OpenStack / bare metal...). From your log I see it's trying to connect to an IP that you don't have on the master (could you list the output of "ip addr" from the master where you run kubeadm init?).
Got this issue as well today; it seems this is still not fixed when SELinux is enabled on CentOS 7.
@Dmitry1987 Yeah, thanks for your help. I will check the log again.
Facing the same issue as well:
[kubeadm] WARNING: kubeadm is in alpha, please do not use it for production clusters.
@harsha544 Can you attach the log from /var/log/messages and the output of docker images?
@kenzhaoyihui Nothing in /var/log/syslog. Enclosing the docker images output:
docker images
@harsha544 The shell script at https://github.com/kenzhaoyihui/kubeadm-images-gcr.io/blob/master/pull_kubernetes_images.sh pulls all the docker images that are needed; could you pull all the images and then execute "kubeadm init"?
@kenzhaoyihui Thanks for the URL. I tweaked the script to pull ppc64le docker images; however, not all docker images appear to be present.
Be careful with the link and script that @kenzhaoyihui provided; that script tries to substitute Google's images with his own. You'd better not run it. In fact, the solution has already been provided in this ticket and I fixed my issue with it: it was posted by @benmathews on Sep 28, 2016. If you missed that comment, you should give it a try.
@SydOps I was cautious enough to pull the docker images from gcr.io/google_containers/ppc64le...
@harsha544 Please open a new issue in kubernetes/kubeadm about this.
Thanks @saidiahd, it works for me.
I also hit this issue. I have disabled SELinux.
Hi @kubernetes/sig-cluster-lifecycle
I tried to follow the docs for kubeadm on CentOS 7.1.
It seems that kubeadm init blocks waiting for 'control plane to become ready' even though all containers are running. Here are the running containers on the same master machine:
I tried to join a node but I get a connection refused error, even though there is no firewall...
And now I am actually wondering if the init is blocking, waiting for nodes to join. According to the docs it is not blocking, but the logs of kubeadm seem to indicate that it is.
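Regarding the "connection refused" on join in the issue description: earlier comments note that the 1.4-era kubeadm discovery service was exposed on the master via a hostPort on 9898, with the apiserver on 443. A small connectivity sketch from the joining node (the ports are assumptions taken from this thread):

```bash
# From the node that fails to join: can it reach the master's ports?
nc -vz <master-ip> 9898     # kubeadm discovery hostPort mentioned above
nc -vz <master-ip> 443      # apiserver secure port in the 1.4-era setup

# On the master: is anything actually listening on those ports?
sudo ss -lntp | grep -E ':(9898|443)\b'
```

If nothing is listening on 9898, the discovery pod never started, which points back at the control-plane hang rather than at a firewall.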