Kubernetes stuck in starting #3649

Closed
2 tasks done
donfanning opened this issue May 2, 2019 · 15 comments

Comments

@donfanning

  • I have tried with the latest version of my channel (Stable or Edge): stable
  • I have uploaded Diagnostics
  • Diagnostics ID: 11AC44E7-EABF-4581-9758-4781E01B5BDD/20190501234826

Expected behavior

  • Kubernetes in a running state.

Actual behavior

  • Kubernetes stuck in a starting state.

Information

  • macOS Version: 10.13.6 - High Sierra

Diagnostic logs

Docker for Mac: version... - 2.0.0.3 Stable 8858db33c8

Steps to reproduce the behavior

See diagnostic logs.

@nicovillanueva

nicovillanueva commented May 3, 2019

I'm possibly facing the same issue, so rather than open yet another one, I'll piggyback on this one.
Updating to 2.0.4.0 did not help. My Diagnostics ID is: CFFDB6A8-0D6B-4D37-8E21-EB408E348132/20190503141524

What I did find out was that the kube-controller-manager container was complaining about the SSL certs (the container becomes visible after enabling "Show system containers (advanced)"):

E0503 15:28:59.161989 1 leaderelection.go:306] error retrieving resource lock kube-system/kube-controller-manager: Get https://vm.docker.internal:6443/api/v1/namespaces/kube-system/endpoints/kube-controller-manager?timeout=10s: x509: certificate is valid for docker-for-desktop, kubernetes, kubernetes.default, kubernetes.default.svc, kubernetes.default.svc.cluster.local, host.docker.internal, not vm.docker.internal
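To check whether another setup shows the same symptom, the log line above can be reduced to the one host name the certificate does not cover. This is a minimal sketch, not an official diagnostic: the function name `x509_missing_hosts` is made up for illustration, and it only assumes the message format shown in the log line above.

```shell
# Read log lines on stdin. For each x509 host name mismatch, print the
# host name that the certificate does NOT cover (deduplicated).
# Hypothetical helper; relies only on the message format quoted above.
x509_missing_hosts() {
    sed -n 's/^.*x509: certificate is valid for .*, not \([A-Za-z0-9.-]*\)[[:space:]]*$/\1/p' | sort -u
}
```

It could be fed with something like `docker logs <container> 2>&1 | x509_missing_hosts`, where `<container>` is the kube-controller-manager system container. If it prints `vm.docker.internal`, the setup is likely hitting the same certificate mismatch.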

I found this other issue: docker/for-win#3799, which mentions that exact error. It seems that deleting the PKI folder fixes the issue.
The analogous folder on macOS is: ~/Library/Group Containers/group.com.docker/pki
rm it (or mv it), restart the Docker engine, and Kubernetes comes up just fine.
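The workaround above can be sketched as a small shell function that moves the folder aside instead of deleting it, so it can be restored if needed. The function name `backup_pki` and the `.bak.<timestamp>` suffix are assumptions made for illustration; only the default path comes from this thread.

```shell
# Move the Docker Desktop PKI folder aside so that it is recreated on the
# next start. The default path is the macOS location quoted in this thread;
# pass a different path as the first argument to override it.
backup_pki() {
    pki_dir="${1:-$HOME/Library/Group Containers/group.com.docker/pki}"
    if [ ! -d "$pki_dir" ]; then
        echo "no PKI folder at: $pki_dir"
        return 1
    fi
    # Timestamped backup name (illustrative naming choice).
    backup="${pki_dir}.bak.$(date +%Y%m%d%H%M%S)"
    mv "$pki_dir" "$backup" && echo "moved to: $backup"
}
```

After running `backup_pki`, restart Docker Desktop and wait for Kubernetes to come back up. The paths are quoted throughout because "Group Containers" contains a space.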

If any maintainer could roughly point me to where the issue may lie, I could take a stab at fixing it.

@alanmpitts

I am having a similar problem, and would like to piggyback on this issue.
diagID: AD501AE5-F2B9-4B50-8472-37C39D3963FA/20190506214721

Running Docker for Mac 2.0.4.0 (33772) on macOS 10.13.6.

I notice the following in the docker events log:
2019-05-06T17:45:59.326064326-04:00 container exec_die 4997b539fced265bfb1fb664d05acfe13b4135a4741a20920dd43d07e4be9cc6 (annotation.io.kubernetes.container.hash=625b6933, annotation.io.kubernetes.container.restartCount=0, annotation.io.kubernetes.container.terminationMessagePath=/dev/termination-log, annotation.io.kubernetes.container.terminationMessagePolicy=File, annotation.io.kubernetes.pod.terminationGracePeriod=30, execID=3914dadd4e42bcfdb18fceed6a91923d66f0ab3200f93cd3449343262aadd91a, exitCode=0, image=sha256:2c4adeb21b4ff8ed3309d0e42b6b4ae39872399f7b37e0856e673b13c4aba13d, io.kubernetes.container.logpath=/var/log/pods/kube-system_etcd-docker-desktop_3773efb8e009876ddfa2c10173dba95e/etcd/0.log, io.kubernetes.container.name=etcd, io.kubernetes.docker.type=container, io.kubernetes.pod.name=etcd-docker-desktop, io.kubernetes.pod.namespace=kube-system, io.kubernetes.pod.uid=3773efb8e009876ddfa2c10173dba95e, io.kubernetes.sandbox.id=8c1ff3fb3bd47d18fcdad9148af943809dedd85a50ff6ab269f359d18fc90d62, name=k8s_etcd_etcd-docker-desktop_kube-system_3773efb8e009876ddfa2c10173dba95e_0)
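Event lines like the one above are hard to scan. A minimal sketch of a filter that keeps only the event type, the container name, and the exit code follows. The helper name `summarize_events` is hypothetical, and it assumes the attribute layout shown in the line above, with `exitCode=` appearing before the final `name=` attribute.

```shell
# Read `docker events` lines on stdin and print
# "<event> <container name> exitCode=<n>" for container events that
# carry an exit code. Lines in any other shape are dropped.
summarize_events() {
    sed -n 's/^[^ ]* container \([a-z_]*\) .*exitCode=\([0-9]*\),.*[ (]name=\([^,)]*\)).*$/\1 \3 exitCode=\2/p'
}
```

Applied to the line above, this shows an `exec_die` event for the etcd container with `exitCode=0`. That appears to record a command run inside the container finishing successfully, so this particular event may not be an error in itself.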

@mikeparker
Contributor

mikeparker commented May 30, 2019

Possibly this is the same as docker/for-win#3799
An initial investigation suggests this could be an installer bug when you install a new version over the top of an existing one and the certs don't get updated. Uninstalling deletes the folder so a clean uninstall / reinstall should work fine.

@nicovillanueva

@mikeparker Yep, it's the same issue. Deleting the PKI folder (~/Library/Group Containers/group.com.docker/pki) and restarting Docker works around the issue.
I'd love to help out, but I'm anything but familiar with the Docker codebase. Have any idea where the issue may be?

@mikeparker
Contributor

@nicovillanueva sorry, the codebase is closed source, but I appreciate the offer!

@docker-robott
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale comment.
Stale issues will be closed after an additional 30d of inactivity.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so.

Send feedback to Docker Community Slack channels #docker-for-mac or #docker-for-windows.
/lifecycle stale

@nicklaros

Same here. It works after deleting the ~/Library/Group Containers/group.com.docker/pki directory, as mentioned by @nicovillanueva above.

@Issam-Ahmad

@nicovillanueva Your solution worked for me as well
Cheers!

@ph61706c6e

Also worked for me on Docker Desktop 2.2.0.4

@n1t1nv3rma

n1t1nv3rma commented Apr 15, 2020

Docker Desktop 2.2.0.5

It worked for me too after renaming the ~/Library/Group Containers/group.com.docker/pki directory.

It seems one can also disable the cluster via /Users//Library/Group\ Containers/group.com.docker/settings.json file.
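The comment above does not name the setting involved, so a cautious first step is to list the Kubernetes-related entries in the settings file before editing anything. This is only a sketch: the function name `kube_settings` is made up, the default path is the macOS location mentioned in this thread, and which key actually controls the cluster is not confirmed here.

```shell
# Print the lines of the Docker Desktop settings file that mention "kube",
# with line numbers, so the relevant key can be located by hand.
# Pass a different file as the first argument to override the default path.
kube_settings() {
    settings="${1:-$HOME/Library/Group Containers/group.com.docker/settings.json}"
    if [ ! -f "$settings" ]; then
        echo "no settings file at: $settings"
        return 1
    fi
    grep -n -i 'kube' "$settings"
}
```

It is probably safest to quit Docker Desktop and keep a copy of the file before changing any value found this way.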

@benweizhu

Works for me. Thanks

@nitishcse412

This works for me as well. Thanks much

@lucianrica

If anyone is facing this issue in May 2020: I fixed it by editing the Windows environment variables. My setup:

  • Windows 10
  • Docker Desktop
  • Docker Kubernetes
  • Proxy
  1. Windows Search -> "Edit environment var..."
  2. Environment Variables -> Variable -> no_proxy (create it if you don't have one)
  3. Click -> Edit -> add .docker.internal
  4. Restart Docker
  5. Not sure if it is required, but I also have these under the no_proxy variable: .company.com, localhost, 127.0.0.1
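The steps above use the Windows settings dialog. For comparison, the same edit to a `no_proxy` value can be sketched in a POSIX shell. The helper name `add_no_proxy` is hypothetical, and whether Docker Desktop reads a shell-level `no_proxy` on a given platform is not established in this thread; the sketch only shows how to add `.docker.internal` to a comma-separated list without duplicating it.

```shell
# Print a no_proxy value that contains the given entry. The entry is
# appended only if it is not already one of the comma-separated items.
# Usage: add_no_proxy "<current value>" "<entry>"
add_no_proxy() {
    current="$1"
    entry="$2"
    # Compare against a copy with spaces removed, wrapped in commas.
    case ",$(printf '%s' "$current" | tr -d ' ')," in
        *",$entry,"*)
            printf '%s\n' "$current" ;;
        *)
            if [ -z "$current" ]; then
                printf '%s\n' "$entry"
            else
                printf '%s,%s\n' "$current" "$entry"
            fi ;;
    esac
}
```

A possible use is `export no_proxy="$(add_no_proxy "$no_proxy" .docker.internal)"`, followed by restarting Docker as in step 4.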

@luudis

luudis commented Jun 1, 2020

This helped me!
https://github.com/AliyunContainerService/k8s-for-docker-desktop/
Follow the instructions at the link above. If that does not work, remove the '~/Library/Group\ Containers/group.com.docker/pki' directory and restart Docker Desktop. After waiting about 5 minutes, I finally saw Kubernetes turn green with a running status!

@docker-robott
Collaborator

Closed issues are locked after 30 days of inactivity.
This helps our team focus on active issues.

If you have found a problem that seems similar to this, please open a new issue.

Send feedback to Docker Community Slack channels #docker-for-mac or #docker-for-windows.
/lifecycle locked

@docker docker locked and limited conversation to collaborators Jul 1, 2020