Docker setup fails due to etcd never getting started #19227
Comments
Thanks @borg286 -- would you be up for making this change in the docs and submitting a pull request? |
No. I felt that whoever made the last major redo of this guide should chime in with his input on the best way to do it. I don't know what is involved with getting the initial kubelet that spins up the other jobs to also spin up etcd and wait for it to be up first. The first version of getting a kubernetes cluster through docker split it out into 3 commands. These were put into one, and that goal may still be wanted. |
@fgrzadkowski and I have worked on the @borg286 Which version were you using when you ran the combined command from the master branch? It should work with |
@borg286 Confirmed that the command in https://github.com/kubernetes/kubernetes/blob/master/docs/getting-started-guides/docker.md works out-of-the-box at least when using |
It seems that
Perhaps we should just update the docs to use 1.1.3. Seems like a simple |
I don't know how to make a pull request. This is as far as I got |
Well, I think it's working with But first you must revert what you have done, then make a branch, commit your changes to that branch, and send your PR from it |
So I just reinstalled my Linux machine, installed an SSH server and Docker, then ran the mega-docker command to bring Kubernetes up, making sure to use 1.1.3, and it doesn't start etcd. I hope you can help me figure out what was so different about your setup. Here are the failed docker containers (notice etcd isn't in that list), then the logs of the failed apiserver complaining, then
borg@borg-1015E ~ $ docker ps -a
Now here is where I run etcd as the old version instructed.
borg@borg-1015E ~ $ docker run --restart=always -v /home/borg/etcd-data:/var/etcd/data --net=host -d gcr.io/google_containers/etcd:2.0.9 /usr/local/bin/etcd --addr=127.0.0.1:4001 --bind-addr=0.0.0.0:4001 --data-dir=/var/etcd/data
borg@borg-1015E ~ $ docker ps
Now everything runs fine. Please inspect which containers the mega-command starts and ensure that etcd is in that list. I don't know how gcr.io/google_containers/hyperkube:v${K8S_VERSION} is generated, or else I'd inspect the Dockerfile myself and check. |
First off, when you are posting logs, command-line output, etc., use this template (I was forced to put IGNORE before the real formatting markers. See more here): IGNORE```bash code here IGNORE``` It improves readability a lot when you do this. As it stands, your comment is really hard to read. Could you update it? You can see how it will look by clicking the Preview tab while you're editing the comment. To the issue
That issue was fixed with: 1c212b8 which was committed 8 Oct 2015. I would suggest you to pull Run this before the mega-command:
|
BTW, the source code for |
Reposting logs as requested.
borg@borg-1015E ~ $ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
4929641da7f3 gcr.io/google_containers/hyperkube:v1.1.3 "/hyperkube apiserver" 29 seconds ago Exited (255) 28 seconds ago k8s_apiserver.9de8159f_k8s-master-127.0.0.1_default_e1376f76a07b85e8b0e4c363ff0fa6c1_163723e9
46a43d31908d gcr.io/google_containers/hyperkube:v1.1.3 "/hyperkube controlle" 29 seconds ago Exited (255) 18 seconds ago k8s_controller-manager.6994021e_k8s-master-127.0.0.1_default_e1376f76a07b85e8b0e4c363ff0fa6c1_f9199172
a90c7224b6bd gcr.io/google_containers/hyperkube:v1.1.3 "/hyperkube apiserver" 46 seconds ago Exited (255) 45 seconds ago k8s_apiserver.9de8159f_k8s-master-127.0.0.1_default_e1376f76a07b85e8b0e4c363ff0fa6c1_e376e352
ed60be876c83 gcr.io/google_containers/hyperkube:v1.1.3 "/hyperkube scheduler" 46 seconds ago Up 45 seconds k8s_scheduler.ed57faf5_k8s-master-127.0.0.1_default_e1376f76a07b85e8b0e4c363ff0fa6c1_5a985de4
293407ada188 gcr.io/google_containers/hyperkube:v1.1.3 "/hyperkube apiserver" 46 seconds ago Exited (255) 45 seconds ago k8s_apiserver.9de8159f_k8s-master-127.0.0.1_default_e1376f76a07b85e8b0e4c363ff0fa6c1_9085809a
ecf0b95ead5a gcr.io/google_containers/hyperkube:v1.1.3 "/hyperkube controlle" 47 seconds ago Exited (255) 36 seconds ago k8s_controller-manager.6994021e_k8s-master-127.0.0.1_default_e1376f76a07b85e8b0e4c363ff0fa6c1_c2ba89fa
1dae04ba616a gcr.io/google_containers/pause:0.8.0 "/pause" 47 seconds ago Up 46 seconds k8s_POD.6d00e006_k8s-master-127.0.0.1_default_e1376f76a07b85e8b0e4c363ff0fa6c1_69181cee
27b5ac1b8988 gcr.io/google_containers/hyperkube:v1.1.3 "/hyperkube kubelet -" About a minute ago Up About a minute thirsty_mccarthy
bac9891deb4c hello-world "/hello" 2 minutes ago Exited (0) 2 minutes ago adoring_lumiere
borg@borg-1015E ~ $ docker logs 4929641da7f3
Flag --portal-net has been deprecated, see --service-cluster-ip-range instead.
Flag --address has been deprecated, see --insecure-bind-address instead
I0110 06:18:46.881849 1 plugins.go:71] No cloud provider specified.
I0110 06:18:46.882282 1 master.go:368] Node port range unspecified. Defaulting to 30000-32767.
I0110 06:18:46.882939 1 master.go:390] Will report 10.1.10.12 as public IP address.
E0110 06:18:46.916216 1 cacher.go:149] unexpected ListAndWatch error: pkg/storage/cacher.go:115: Failed to list *api.Pod: 501: All the given peers are not reachable (failed to propose on members [http://127.0.0.1:4001] twice [last error: Get http://127.0.0.1:4001/v2/keys/registry/pods?quorum=false&recursive=true&sorted=true: dial tcp 127.0.0.1:4001: connection refused]) [0]
E0110 06:18:46.917007 1 cacher.go:149] unexpected ListAndWatch error: pkg/storage/cacher.go:115: Failed to list *api.Endpoints: 501: All the given peers are not reachable (failed to propose on members [http://127.0.0.1:4001] twice [last error: Get http://127.0.0.1:4001/v2/keys/registry/services/endpoints?quorum=false&recursive=true&sorted=true: dial tcp 127.0.0.1:4001: connection refused]) [0]
E0110 06:18:46.921375 1 cacher.go:149] unexpected ListAndWatch error: pkg/storage/cacher.go:115: Failed to list *api.Node: 501: All the given peers are not reachable (failed to propose on members [http://127.0.0.1:4001] twice [last error: Get http://127.0.0.1:4001/v2/keys/registry/minions?quorum=false&recursive=true&sorted=true: dial tcp 127.0.0.1:4001: connection refused]) [0]
[restful] 2016/01/10 06:18:46 log.go:30: [restful/swagger] listing is available at https://10.1.10.12:6443/swaggerapi/
E0110 06:18:46.947404 1 cacher.go:149] unexpected ListAndWatch error: pkg/storage/cacher.go:115: Failed to list *api.Endpoints: 501: All the given peers are not reachable (failed to propose on members [http://127.0.0.1:4001] twice [last error: Get http://127.0.0.1:4001/v2/keys/registry/services/endpoints?quorum=false&recursive=true&sorted=true: dial tcp 127.0.0.1:4001: connection refused]) [0]
E0110 06:18:46.948212 1 cacher.go:149] unexpected ListAndWatch error: pkg/storage/cacher.go:115: Failed to list *api.Node: 501: All the given peers are not reachable (failed to propose on members [http://127.0.0.1:4001] twice [last error: Get http://127.0.0.1:4001/v2/keys/registry/minions?quorum=false&recursive=true&sorted=true: dial tcp 127.0.0.1:4001: connection refused]) [0]
E0110 06:18:46.948492 1 cacher.go:149] unexpected ListAndWatch error: pkg/storage/cacher.go:115: Failed to list *api.Pod: 501: All the given peers are not reachable (failed to propose on members [http://127.0.0.1:4001] twice [last error: Get http://127.0.0.1:4001/v2/keys/registry/pods?quorum=false&recursive=true&sorted=true: dial tcp 127.0.0.1:4001: connection refused]) [0]
[restful] 2016/01/10 06:18:46 log.go:30: [restful/swagger] https://10.1.10.12:6443/swaggerui/ is mapped to folder /swagger-ui/
E0110 06:18:46.973921 1 cacher.go:149] unexpected ListAndWatch error: pkg/storage/cacher.go:115: Failed to list *api.Endpoints: 501: All the given peers are not reachable (failed to propose on members [http://127.0.0.1:4001] twice [last error: Get http://127.0.0.1:4001/v2/keys/registry/services/endpoints?quorum=false&recursive=true&sorted=true: dial tcp 127.0.0.1:4001: connection refused]) [0]
E0110 06:18:46.975956 1 cacher.go:149] unexpected ListAndWatch error: pkg/storage/cacher.go:115: Failed to list *api.Pod: 501: All the given peers are not reachable (failed to propose on members [http://127.0.0.1:4001] twice [last error: Get http://127.0.0.1:4001/v2/keys/registry/pods?quorum=false&recursive=true&sorted=true: dial tcp 127.0.0.1:4001: connection refused]) [0]
E0110 06:18:46.976205 1 cacher.go:149] unexpected ListAndWatch error: pkg/storage/cacher.go:115: Failed to list *api.Node: 501: All the given peers are not reachable (failed to propose on members [http://127.0.0.1:4001] twice [last error: Get http://127.0.0.1:4001/v2/keys/registry/minions?quorum=false&recursive=true&sorted=true: dial tcp 127.0.0.1:4001: connection refused]) [0]
F0110 06:18:46.986612 1 controller.go:80] Unable to perform initial IP allocation check: unable to refresh the service IP block: 501: All the given peers are not reachable (failed to propose on members [http://127.0.0.1:4001] twice [last error: Get http://127.0.0.1:4001/v2/keys/registry/ranges/serviceips?quorum=false&recursive=false&sorted=false: dial tcp 127.0.0.1:4001: connection refused]) [0]
Now here is where I run etcd as the old version instructed.
borg@borg-1015E ~ $ docker run --restart=always -v /home/borg/etcd-data:/var/etcd/data --net=host -d gcr.io/google_containers/etcd:2.0.9 /usr/local/bin/etcd --addr=127.0.0.1:4001 --bind-addr=0.0.0.0:4001 --data-dir=/var/etcd/data
Unable to find image 'gcr.io/google_containers/etcd:2.0.9' locally
Pulling repository gcr.io/google_containers/etcd
b6b9a86dc06a: Download complete
511136ea3c5a: Download complete
4ec7f790b564: Download complete
cfeffad3cf16: Download complete
Status: Downloaded newer image for gcr.io/google_containers/etcd:2.0.9
gcr.io/google_containers/etcd: this image was pulled from a legacy registry. Important: This registry version will not be supported in future versions of docker.
4992f751958775574d4c2c57f665f828ec052107295f500bb44b7189e44d412f
borg@borg-1015E ~ $ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
782a072ae6cc gcr.io/google_containers/hyperkube:v1.1.3 "/hyperkube controlle" 3 minutes ago Up 3 minutes k8s_controller-manager.6994021e_k8s-master-127.0.0.1_default_e1376f76a07b85e8b0e4c363ff0fa6c1_6d5bd038
8740099ecf48 gcr.io/google_containers/hyperkube:v1.1.3 "/hyperkube apiserver" 4 minutes ago Up 4 minutes k8s_apiserver.9de8159f_k8s-master-127.0.0.1_default_e1376f76a07b85e8b0e4c363ff0fa6c1_449bcb6b
4992f7519587 gcr.io/google_containers/etcd:2.0.9 "/usr/local/bin/etcd " 4 minutes ago Up 4 minutes fervent_leakey
ed60be876c83 gcr.io/google_containers/hyperkube:v1.1.3 "/hyperkube scheduler" 7 minutes ago Up 7 minutes k8s_scheduler.ed57faf5_k8s-master-127.0.0.1_default_e1376f76a07b85e8b0e4c363ff0fa6c1_5a985de4
1dae04ba616a gcr.io/google_containers/pause:0.8.0 "/pause" 7 minutes ago Up 7 minutes k8s_POD.6d00e006_k8s-master-127.0.0.1_default_e1376f76a07b85e8b0e4c363ff0fa6c1_69181cee
27b5ac1b8988 gcr.io/google_containers/hyperkube:v1.1.3 "/hyperkube kubelet -" 7 minutes ago Up 7 minutes thirsty_mccarthy
borg@borg-1015E ~ $ kubectl version
Client Version: version.Info{Major:"1", Minor:"1", GitVersion:"v1.1.3", GitCommit:"6a81b50c7e97bbe0ade075de55ab4fa34f049dc2", GitTreeState:"clean"}
Server Version: version.Info{Major:"1", Minor:"1", GitVersion:"v1.1.3", GitCommit:"6a81b50c7e97bbe0ade075de55ab4fa34f049dc2", GitTreeState:"clean"}
|
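The apiserver log above repeats one root cause many times over. As a rough sketch (not part of the original thread), a small filter can boil such a log down to the endpoints that refused connections. The function name `refused_endpoints` is made up for illustration, and the pattern assumes the "dial tcp HOST:PORT: connection refused" wording seen in the log lines above.

```shell
# refused_endpoints (hypothetical helper): read log text on stdin and
# print each distinct host:port that appears in a
# "dial tcp HOST:PORT: connection refused" message.
refused_endpoints() {
  grep -o 'dial tcp [0-9.]*:[0-9]*: connection refused' |
    sed 's/^dial tcp \(.*\): connection refused$/\1/' |
    sort -u
}
```

It could be fed from the failed container, for example `docker logs 4929641da7f3 2>&1 | refused_endpoints`. On the log posted above, the only endpoint it should report is 127.0.0.1:4001, the address where the apiserver expects etcd.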
Please help me figure out what I'm missing. I would have thought that 10 minutes would have been enough time. Obviously I'm missing some parameter because the apiserver never comes up for good.
borg@borg-1015E ~ $ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
borg@borg-1015E ~ $ export K8S_VERSION=1.1.3
borg@borg-1015E ~ $ docker run \
> --volume=/:/rootfs:ro \
> --volume=/sys:/sys:ro \
> --volume=/dev:/dev \
> --volume=/var/lib/docker/:/var/lib/docker:rw \
> --volume=/var/lib/kubelet/:/var/lib/kubelet:rw \
> --volume=/var/run:/var/run:rw \
> --net=host \
> --pid=host \
> --privileged=true \
> -d \
> gcr.io/google_containers/hyperkube:v${K8S_VERSION} \
> /hyperkube kubelet \
> --containerized \
> --hostname-override="127.0.0.1" \
> --address="0.0.0.0" \
> --api-servers=http://localhost:8080 \
> --config=/etc/kubernetes/manifests \
> --cluster-dns=10.0.0.10 \
> --cluster-domain=cluster.local \
> --allow-privileged=true --v=10
Unable to find image 'gcr.io/google_containers/hyperkube:v1.1.3' locally
v1.1.3: Pulling from google_containers/hyperkube
Digest: sha256:004dde049951a4004d99e12846e1fc7274fdc5855752d50288e3be4748778ca2
Status: Downloaded newer image for gcr.io/google_containers/hyperkube:v1.1.3
1425512b60dfb02bc12dd1bdc9e0611abe49b8c57e34d41b5aa540fe99582921
borg@borg-1015E ~ $ date
Sun Jan 10 08:01:03 PST 2016
borg@borg-1015E ~ $ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
7d4893033b48 gcr.io/google_containers/hyperkube:v1.1.3 "/hyperkube scheduler" 15 seconds ago Up 15 seconds k8s_scheduler.ed57faf5_k8s-master-127.0.0.1_default_e1376f76a07b85e8b0e4c363ff0fa6c1_b644973e
dff20fe16c1e gcr.io/google_containers/pause:0.8.0 "/pause" 17 seconds ago Up 16 seconds k8s_POD.6d00e006_k8s-master-127.0.0.1_default_e1376f76a07b85e8b0e4c363ff0fa6c1_ffb4fb26
1425512b60df gcr.io/google_containers/hyperkube:v1.1.3 "/hyperkube kubelet -" 22 seconds ago Up 22 seconds determined_hoover
borg@borg-1015E ~ $ date
Sun Jan 10 08:05:29 PST 2016
borg@borg-1015E ~ $ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
7d4893033b48 gcr.io/google_containers/hyperkube:v1.1.3 "/hyperkube scheduler" 4 minutes ago Up 4 minutes k8s_scheduler.ed57faf5_k8s-master-127.0.0.1_default_e1376f76a07b85e8b0e4c363ff0fa6c1_b644973e
dff20fe16c1e gcr.io/google_containers/pause:0.8.0 "/pause" 4 minutes ago Up 4 minutes k8s_POD.6d00e006_k8s-master-127.0.0.1_default_e1376f76a07b85e8b0e4c363ff0fa6c1_ffb4fb26
1425512b60df gcr.io/google_containers/hyperkube:v1.1.3 "/hyperkube kubelet -" 4 minutes ago Up 4 minutes determined_hoover
borg@borg-1015E ~ $ date
Sun Jan 10 08:10:52 PST 2016
borg@borg-1015E ~ $ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
7d4893033b48 gcr.io/google_containers/hyperkube:v1.1.3 "/hyperkube scheduler" 9 minutes ago Up 9 minutes k8s_scheduler.ed57faf5_k8s-master-127.0.0.1_default_e1376f76a07b85e8b0e4c363ff0fa6c1_b644973e
dff20fe16c1e gcr.io/google_containers/pause:0.8.0 "/pause" 9 minutes ago Up 9 minutes k8s_POD.6d00e006_k8s-master-127.0.0.1_default_e1376f76a07b85e8b0e4c363ff0fa6c1_ffb4fb26
1425512b60df gcr.io/google_containers/hyperkube:v1.1.3 "/hyperkube kubelet -" 10 minutes ago Up 10 minutes determined_hoover
borg@borg-1015E ~ $ |
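Rather than eyeballing repeated `docker ps` tables, the "is etcd in the list?" check can be scripted. This is a minimal sketch, not something from the thread: `etcd_running` is a hypothetical helper name, and the pattern assumes the etcd image is named like `gcr.io/google_containers/etcd:2.0.9`, as in the earlier comment.

```shell
# etcd_running (hypothetical helper): read image names on stdin, one per
# line, and succeed only if at least one of them is an etcd image.
# Assumes an image reference of the form ".../etcd:TAG".
etcd_running() {
  grep -q '/etcd:'
}
```

Assuming a Docker version whose `docker ps` supports `--format`, it could be used as `docker ps --format '{{.Image}}' | etcd_running || echo "etcd is not running"`. For the listings above, which contain only the hyperkube and pause images, the check would fail.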
I think the problem may be that the etcd image isn't pushed. I don't have permission to publish this image as this Makefile would do: https://github.com/kubernetes/kubernetes/blob/master/cluster/images/etcd/Makefile |
No the image is: |
You're using instructions from HEAD, but the image you're using doesn't start etcd. That's all working as intended. I guess we should just update the doc. |
Oh gosh, I used a self-built @fgrzadkowski Since there is a |
@fgrzadkowski And BTW, could #17213 be merged before you do that? Otherwise we'll have kind of the same problems again (with no |
I'm on paternity leave until the 24th of Jan. I won't be able to work on this. Maybe Isaac can help?
|
@ihmccreery Could you chime in and merge or help to merge #17213 and then push the |
Any chance we could simply update 1.1.3 with an image that does spin up etcd while we wait for #19623 to get fixed? |
I do not have push permission, so I can't... |
We can't merge #17213 and then retroactively call it part of v1.1.4. If it gets in before v1.1.5, it'll take effect then. Sorry. |
Hm. Also, #17213 is against |
OK, got it. It's fully logical that you use the code that's in the Well, the But But the easiest way to solve this, is to update the hyperkube version to |
Ref #16087 |
And #18230 |
@borg286 I think it should work from HEAD now. Maybe you want to try. The instructions on HEAD require |
It is fixed. |
I tried to follow the instructions at https://github.com/kubernetes/kubernetes/blob/master/docs/getting-started-guides/docker.md
but everything except the scheduler, kubelet, and /pause fails. I checked the logs, and it seems the apiserver failed because etcd was unavailable.
I went to an older version of these instructions to get a docker command to bring up etcd, then ran the mega-command above, and then it worked.
I suggest either having a separate command to get etcd up and running, or having the mega-command do that like it does for the apiserver and family.
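One way to picture the "separate etcd command first" option is a small retry helper that blocks until etcd answers before the kubelet is launched. This is only a sketch under assumptions: `wait_until` is a hypothetical helper, not part of the guide, and the commented probe assumes that etcd 2.x answers `GET /version` on its client port and that `curl` is installed.

```shell
# wait_until RETRIES DELAY CMD...: run CMD repeatedly until it succeeds,
# sleeping DELAY seconds between attempts. Returns 0 on the first
# success, or 1 once RETRIES attempts have all failed.
wait_until() {
  wu_retries=$1
  wu_delay=$2
  shift 2
  wu_i=0
  while [ "$wu_i" -lt "$wu_retries" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    wu_i=$((wu_i + 1))
    sleep "$wu_delay"
  done
  return 1
}

# Intended use (left commented out so that sourcing this file starts nothing):
#   1. Start etcd with the docker run command from the older instructions.
#   2. Wait for it, assuming etcd serves /version on 127.0.0.1:4001:
#        wait_until 30 1 curl -fs http://127.0.0.1:4001/version
#   3. Only then run the kubelet mega-command.
```

The helper takes the probe as an arbitrary command so that the readiness check can be swapped out without touching the retry loop.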