"Kubernetes is starting…" state never ends #2990

Open
plambert opened this issue Jun 11, 2018 · 90 comments

Comments

@plambert

commented Jun 11, 2018

  • [x] I have tried with the latest version of my channel (Stable or Edge)
  • [x] I have uploaded Diagnostics
  • Diagnostics ID: 804B8977-06D2-4E2A-BB3E-10FBBA99D1F9/20180611-105749

Expected behavior

Within a few minutes of starting Docker for Mac, Kubernetes should be available.

Actual behavior

After several hours, it is still 'starting…'

Information

  • macOS Version: 10.13.5

Diagnostic logs

(Uploaded as 804B8977-06D2-4E2A-BB3E-10FBBA99D1F9/20180611-105749)

Steps to reproduce the behavior

  1. I've quit Docker, rebooted, and started it again, with the same result
@yuzumikan15

commented Jun 12, 2018

I have had the same problem since updating to version 18.05.0-ce-mac67 (25042).

  • I can use kubectl commands even though Docker for Mac is showing "Kubernetes is starting...".
  • I cannot stop Kubernetes from Preferences... -> Kubernetes.
  • No external IP is attached to the LB (it only shows <pending>), even though I have waited for 20 minutes.
    • The previous version (18.05.0-ce-mac66) attached localhost as the external IP of the LB.
    • This version only shows <pending>, and a cURL command such as curl http://localhost/some-url returns the error curl: (7) Failed to connect to localhost port 80: Connection refused, even though this worked on the previous version.
$ kgs
NAME                                            TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
crazy-greyhound-nginx-ingress-controller        LoadBalancer   10.96.142.184    <pending>     80:31157/TCP,443:30351/TCP   27m
crazy-greyhound-nginx-ingress-default-backend   ClusterIP      10.103.196.198   <none>        80/TCP                       27m
...

Diagnostic logs: 2E0B1819-92A4-408C-9548-B52A29AAF164/20180612-214950
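A minimal sketch, in Python, of spotting the symptom above automatically: it scans a service listing for LoadBalancer services whose EXTERNAL-IP is still <pending>. The function name pending_load_balancers is hypothetical, and the sketch assumes the listing has the column layout shown above (kgs is presumably a shell alias for kubectl get svc, which the comment does not state).

```python
def pending_load_balancers(table_text):
    """Return names of LoadBalancer services whose EXTERNAL-IP is <pending>."""
    rows = [line.split() for line in table_text.strip().splitlines()]
    if not rows:
        return []
    header, body = rows[0], rows[1:]
    name_i = header.index("NAME")
    type_i = header.index("TYPE")
    ext_i = header.index("EXTERNAL-IP")
    return [
        row[name_i]
        for row in body
        # Skip short rows such as a trailing "..." line.
        if len(row) > ext_i
        and row[type_i] == "LoadBalancer"
        and row[ext_i] == "<pending>"
    ]


# Rows transcribed from the listing above.
listing = """
NAME                                            TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
crazy-greyhound-nginx-ingress-controller        LoadBalancer   10.96.142.184    <pending>     80:31157/TCP,443:30351/TCP   27m
crazy-greyhound-nginx-ingress-default-backend   ClusterIP      10.103.196.198   <none>        80/TCP                       27m
"""
print(pending_load_balancers(listing))
```

The whitespace split only works because none of these columns contain spaces; asking kubectl for JSON output would be more robust for anything beyond a quick check.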

@norbertmocsnik

commented Jun 13, 2018

Experiencing the same after upgrading to macOS 10.13.5. Also see #2985

@tuzla0autopilot4

commented Jun 13, 2018

I had exactly the same scenario and issue; even rebooting didn't help. Simply doing the "Reset Kubernetes cluster" on the Reset tab resolved the issue for me. (I didn't have to go so far as to reset to factory defaults.)

@norbertmocsnik

commented Jun 13, 2018

I didn't realize Reset was a tab! I thought it was a button that would reset everything at once (not just Kubernetes). Resetting the Kubernetes cluster did indeed help, although I do hope that this is just a one-time thing. I was happy to move from minikube to Docker for Mac primarily because minikube releases often broke the Kubernetes cluster. Hopefully this will not be the case with Docker for Mac in the future.

@yuzumikan15

commented Jun 14, 2018

"Reset Kubernetes cluster" also worked for my case! Thank you so much 😆

@tjamet

commented Jun 19, 2018

The same problem here: Kubernetes is stuck in the starting state.
[screenshot, 2018-06-19 at 10:45:06]

diagnostic ID 8A8F9D98-E405-4AB4-ABF8-B52AB26DBD61/20180619-104118

Running on OSX version 10.13.4 (17E202)

Docker for Mac: version: 18.05.0-ce-mac67 (1fa4e2acfc1a52f79623add2390604515d32297e)
macOS: version 10.13.4 (build: 17E202)
logs: /tmp/8A8F9D98-E405-4AB4-ABF8-B52AB26DBD61/20180619-104118.tar.gz
[OK]     vpnkit
[OK]     virtualization hypervisor
[OK]     vmnetd
[OK]     dns
[OK]     driver.amd64-linux
[OK]     virtualization VT-X
[OK]     app
[OK]     moby
[OK]     system
[OK]     moby-syslog
[OK]     kubernetes
[OK]     files
[OK]     env
[OK]     virtualization kern.hv_support
[OK]     osxfs
[OK]     moby-console
[OK]     logs
[OK]     docker-cli
[OK]     disk
@djs55

Contributor

commented Jun 19, 2018

@plambert your logs have:

2018-06-11 10:57:46.097206-0700  localhost com.docker.driver.amd64-linux[10740]: Node is not ready: PIDPressure/False kubelet has sufficient PID available
2018-06-11 10:57:47.102135-0700  localhost com.docker.driver.amd64-linux[10740]: Node is not ready: PIDPressure/False kubelet has sufficient PID available
2018-06-11 10:57:48.097550-0700  localhost com.docker.driver.amd64-linux[10740]: Node is not ready: PIDPressure/False kubelet has sufficient PID available
2018-06-11 10:57:49.097725-0700  localhost com.docker.driver.amd64-linux[10740]: Node is not ready: PIDPressure/False kubelet has sufficient PID available
2018-06-11 10:57:50.102071-0700  localhost com.docker.driver.amd64-linux[10740]: Node is not ready: PIDPressure/False kubelet has sufficient PID available

and yet

$ /usr/local/bin/kubectl.docker  --context docker-for-desktop get nodes
NAME                 STATUS    ROLES     AGE       VERSION
docker-for-desktop   Ready     master    7d        v1.10.3

and

$ /usr/local/bin/kubectl.docker  --context docker-for-desktop describe nodes
Name:               docker-for-desktop
Roles:              master
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/hostname=docker-for-desktop
                    node-role.kubernetes.io/master=
Annotations:        node.alpha.kubernetes.io/ttl=0
                    volumes.kubernetes.io/controller-managed-attach-detach=true
CreationTimestamp:  Mon, 04 Jun 2018 09:36:58 -0700
Taints:             <none>
Unschedulable:      false
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  OutOfDisk        False   Mon, 11 Jun 2018 10:58:31 -0700   Mon, 04 Jun 2018 09:36:44 -0700   KubeletHasSufficientDisk     kubelet has sufficient disk space available
  MemoryPressure   False   Mon, 11 Jun 2018 10:58:31 -0700   Mon, 04 Jun 2018 09:36:44 -0700   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Mon, 11 Jun 2018 10:58:31 -0700   Mon, 04 Jun 2018 09:36:44 -0700   KubeletHasNoDiskPressure     kubelet has no disk pressure
  Ready            True    Mon, 11 Jun 2018 10:58:31 -0700   Mon, 04 Jun 2018 09:36:44 -0700   KubeletReady                 kubelet is posting ready status
  PIDPressure      False   Mon, 11 Jun 2018 10:58:31 -0700   Thu, 07 Jun 2018 22:30:59 -0700   KubeletHasSufficientPID      kubelet has sufficient PID available

I notice that PIDPressure has Status False and yet is mentioned in the main Docker log. I'll check the code which waits for the node ready state to see if it is confused.
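To illustrate the distinction, here is a minimal sketch in Python of a readiness check that looks only at the Ready condition. It is not Docker's actual code, which is not shown in this thread; the function name node_is_ready is hypothetical, and the input is assumed to be the list found under status.conditions in kubectl get node -o json.

```python
def node_is_ready(conditions):
    """Return True when the node's Ready condition has status "True".

    The other conditions (OutOfDisk, MemoryPressure, DiskPressure,
    PIDPressure) report a problem when their status is "True", so a
    "False" status there is healthy and must not be read as "not ready".
    """
    for condition in conditions:
        if condition.get("type") == "Ready":
            return condition.get("status") == "True"
    # No Ready condition reported at all: treat the node as not ready.
    return False


# Conditions transcribed from the describe output above.
conditions = [
    {"type": "OutOfDisk", "status": "False"},
    {"type": "MemoryPressure", "status": "False"},
    {"type": "DiskPressure", "status": "False"},
    {"type": "Ready", "status": "True"},
    {"type": "PIDPressure", "status": "False"},
]
print(node_is_ready(conditions))
```

A check that instead required every condition to be "True", or that stopped at the last condition in the list, would report this node as not ready even though kubectl shows it as Ready.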

@djs55

Contributor

commented Jun 20, 2018

I've fixed a bug in the code which failed to notice the node had become Ready. The fix is in the latest development build, which can be downloaded from here: https://download-stage.docker.com/mac/bysha1/98b468326d2c579b87b48c47b2ac7e66c1f0a282/Docker.dmg Note that this build is only suitable for testing, not production. If you get a chance to try it, let me know how it goes. If it still fails, please upload a fresh set of diagnostics.

@djs55 djs55 self-assigned this Jun 20, 2018

@djs55 djs55 referenced this issue Jun 20, 2018

Closed

block in `Kubernetes is starting` #3019

1 of 2 tasks complete
@tjamet

commented Jun 20, 2018

Hi @djs55, I tried your version with my previous Docker.qcow2 image, and I now have "Kubernetes is running" back! Thanks!
Hope it works for the others as well :)

@djs55

Contributor

commented Jun 21, 2018

@tjamet thanks a lot for the speedy confirmation!

@bitmvr

commented Jul 11, 2018

For what it is worth, Kubernetes is now running successfully without your patch @djs55 on version 18.05.0-ce-mac67 (25042)

I have edge installed via Homebrew. Here is what I did step-by-step:

  1. Completely removed docker
  2. Uninstalled it via homebrew
  3. Reinstalled it via homebrew

Not sure why this works and why it doesn't 'require' your patch, but it is something others might try if they need a 'production' build.

@markhilton

commented Jul 26, 2018

@djs55 I still have "Kubernetes is starting..." running your dmg build.
Diagnostic ID: 29A56A58-476F-4D93-B4E0-B518C213C244/20180726-012938

@arafatmohammed

commented Aug 17, 2018

I had the same issue and found out that my /etc/hosts was empty. Once I restored its original values, Kubernetes started successfully.
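A minimal sketch, in Python, of checking a hosts file for a loopback-to-localhost mapping. The comment does not say which missing entry mattered, so treating the localhost line as the relevant one is an assumption, and the function name hosts_maps_localhost is hypothetical.

```python
def hosts_maps_localhost(hosts_text):
    """Return True if a non-comment line maps a loopback IP to localhost."""
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split into IP and host names.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] in ("127.0.0.1", "::1"):
            if "localhost" in fields[1:]:
                return True
    return False


# Sample contents; assumption: a stock file carries entries like these.
stock = "127.0.0.1\tlocalhost\n255.255.255.255\tbroadcasthost\n::1 localhost\n"
print(hosts_maps_localhost(stock))
print(hosts_maps_localhost(""))
```

To check the real file, pass its contents, for example hosts_maps_localhost(open("/etc/hosts").read()).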

@cicorias

commented Aug 29, 2018

What worked for me was a reset of ALL data, then recreating it all.

@MartinEmrich

commented Sep 3, 2018

I also have this issue. Kubernetes worked fine one time, but after I had to reboot, it got stuck in "Kubernetes is starting".

I tried these steps:

  • Reboot again
  • Reset Kubernetes
  • Reset Disk
  • Reset to factory defaults
  • Uninstall/Reinstall
  • Uninstall/Install edge release
  • Uninstall/Reinstall edge with manually removing stuff smelling like docker/kubernetes from my home directory

But it still won't start.

$ kubectl cluster-info 
Kubernetes master is running at https://localhost:6443

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
Unable to connect to the server: EOF
$ kubectl cluster-info dump
Unable to connect to the server: EOF
@avocade

commented Sep 11, 2018

Yep, getting the same thing. Latest release for macOS (stable). Will test edge too.

@avocade

commented Sep 17, 2018

Failing on edge too, tried all means of getting it running (yes I uninstalled homebrew too). Moving back to minikube for now.

@aditzel

commented Sep 20, 2018

Same as above: stuck in starting after updating to latest macOS release on both stable and edge channels.

@keshavgupt

commented Sep 20, 2018

I had it running fine on my Mac, but after upgrading my OS to "High Sierra 10.13.6", I am running into the same issue. I have already tried a few different stable and edge versions.

@thgsn

commented Sep 23, 2018

Same here, testing both versions stable/edge, OS HS 10.13.6

@rdrgmnzs

commented Sep 24, 2018

Same here as well, at first with the latest stable and now with Edge Version 2.0.0.0-beta1-mac75 (27117) and High Sierra 10.13.6.

@davidkarlsen

commented Sep 27, 2018

Same with macOS 10.14 and 18.06.1-ce-mac73. The problem seems to be with etcd:

{"log":"2018-09-27 09:54:08.188030 I | embed: peerTLS: cert = /run/config/pki/etcd/peer.crt, key = /run/config/pki/etcd/peer.key, ca = , trusted-ca = /run/config/pki/etcd/ca.crt, client-cert-auth = true\n","stream":"stderr","time":"2018-09-27T09:54:08.188456442Z"}
{"log":"2018-09-27 09:54:08.188034 W | embed: The scheme of peer url http://localhost:2380 is HTTP while peer key/cert files are presented. Ignored peer key/cert files.\n","stream":"stderr","time":"2018-09-27T09:54:08.188459911Z"}
{"log":"2018-09-27 09:54:08.188038 W | embed: The scheme of peer url http://localhost:2380 is HTTP while client cert auth (--peer-client-cert-auth) is enabled. Ignored client cert auth for this url.\n","stream":"stderr","time":"2018-09-27T09:54:08.188463072Z"}
{"log":"2018-09-27 09:54:08.222240 C | etcdmain: listen tcp 172.30.105.230:2380: bind: cannot assign requested address\n","stream":"stderr","time":"2018-09-27T09:54:08.22338831Z"}

which does not match any of the interfaces:

br-80b3dd052912 Link encap:Ethernet  HWaddr 02:42:52:07:B0:BF  
          inet addr:172.18.0.1  Bcast:172.18.255.255  Mask:255.255.0.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

br-e94472631828 Link encap:Ethernet  HWaddr 02:42:3E:FD:B5:C0  
          inet addr:172.19.0.1  Bcast:172.19.255.255  Mask:255.255.0.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

docker0   Link encap:Ethernet  HWaddr 02:42:65:D8:4C:63  
          inet addr:172.17.0.1  Bcast:172.17.255.255  Mask:255.255.0.0
          inet6 addr: fe80::42:65ff:fed8:4c63/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:12 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 B)  TX bytes:928 (928.0 B)

eth0      Link encap:Ethernet  HWaddr 02:50:00:00:00:01  
          inet addr:192.168.65.3  Bcast:192.168.65.255  Mask:255.255.255.0
          inet6 addr: fe80::50:ff:fe00:1/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:474 errors:0 dropped:0 overruns:0 frame:0
          TX packets:487 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:42732 (41.7 KiB)  TX bytes:41878 (40.8 KiB)

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:61919 errors:0 dropped:0 overruns:0 frame:0
          TX packets:61919 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1 
          RX bytes:4733663 (4.5 MiB)  TX bytes:4733663 (4.5 MiB)
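
The address etcd tried to bind, 172.30.105.230, does not appear in the interface listing above, which is consistent with the "cannot assign requested address" error. A minimal sketch, in Python, of making that comparison mechanically; the function names are hypothetical, and the parsing assumes the "inet addr:" format shown above.

```python
import re


def interface_addresses(ifconfig_text):
    """Extract the IPv4 addresses from the `inet addr:` fields."""
    return set(re.findall(r"inet addr:(\d+\.\d+\.\d+\.\d+)", ifconfig_text))


def address_is_local(address, ifconfig_text):
    """Simplified check: is `address` assigned to one of the interfaces?"""
    return address in interface_addresses(ifconfig_text)


# Abbreviated copy of the interface listing above.
listing = """
docker0   Link encap:Ethernet  HWaddr 02:42:65:D8:4C:63
          inet addr:172.17.0.1  Bcast:172.17.255.255  Mask:255.255.0.0
          inet6 addr: fe80::42:65ff:fed8:4c63/64 Scope:Link
eth0      Link encap:Ethernet  HWaddr 02:50:00:00:00:01
          inet addr:192.168.65.3  Bcast:192.168.65.255  Mask:255.255.255.0
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
"""
print(address_is_local("172.30.105.230", listing))  # address from the etcd log
print(address_is_local("192.168.65.3", listing))    # eth0
```

The sketch ignores wildcard binds and IPv6; it only answers whether a specific IPv4 address is present in the listing.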
@thgsn

commented Sep 29, 2018

I found a solution that worked for me: don't change the default Docker subnet, 192.168.65.0/24.

Version 18.06.1-ce-mac73 (26764)
Version 10.14 (18A391)

@thedanotto

commented Jan 2, 2019

Preferences -> Reset -> Reset to Factory Defaults was the only thing that worked for me.

@Fierozen

commented Jan 27, 2019

Reset to Factory Defaults also worked for me. But I did lose several docker images and containers that I had built.

@cbmcvey

commented Feb 7, 2019

I had the same experience as Fierozen - I had to reset to factory defaults to finally get it to work. Lost all of my images & containers.... bummer. 👎

@PavelSosin

commented Feb 20, 2019

On Docker Desktop for Windows, it simply says:
db39ca38ea95

docker logs db39ca38ea95
Body: service unavailable
, Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]
I0220 11:28:58.944710 1 controller.go:119] OpenAPI AggregationController: action for item v1beta2.compose.docker.com: Rate Limited Requeue.
I0220 11:28:59.945728 1 controller.go:105] OpenAPI AggregationController: Processing item v1beta1.compose.docker.com
E0220 11:28:59.945938 1 controller.go:111] loading OpenAPI spec for "v1beta1.compose.docker.com" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503,

@PavelSosin

commented Feb 20, 2019

ping v1beta1.compose.docker.com
Ping request could not find host v1beta1.compose.docker.com
Wrong service?

@PavelSosin

commented Feb 20, 2019

I just found that Compose is now already on version 3. Do you expect somebody to maintain backward compatibility and keep services on the air three versions back?

@xiaods

commented Feb 22, 2019

This should be fixed by the Aliyun contribution, https://github.com/AliyunContainerService/k8s-for-docker-desktop/blob/master/README_en.md. Please double-check and close it. This issue is out of date now.

@himanshukandwal

commented Feb 26, 2019

My k8s issue got resolved when I cleared the $HOME/Library/Container/com.docker.* and $HOME/.kube directories and then reinstalled Docker Desktop edge.

@fouadroumieh

commented Mar 5, 2019

I'm not sure, but as per this link this was supposed to be fixed, yet many people are still getting it on the stable version for Windows 10: https://docs.docker.com/docker-for-mac/edge-release-notes/

@marcellodesales

commented Mar 5, 2019

I just did the following on my macOS High Sierra, Docker Engine 18.06

Problem

  • Verified the status of the nodes, starting with how old the installation is...
$  /usr/local/bin/kubectl.docker  --context docker-for-desktop get nodes
NAME                 STATUS    ROLES     AGE       VERSION
docker-for-desktop   Ready     master    100d      v1.10.3
  • Since it has been running for quite a long time, I decided to verify the state of the node's resources... It showed something was stuck trying to allocate resources...
$ /usr/local/bin/kubectl.docker  --context docker-for-desktop describe nodes
...
...
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ------------  ----------  ---------------  -------------
  820m (10%)    10m (0%)    130Mi (1%)       190Mi (1%)
Events:
  Type    Reason                   Age              From                         Message
  ----    ------                   ----             ----                         -------
  Normal  Starting                 6m               kubelet, docker-for-desktop  Starting kubelet.
  Normal  NodeAllocatableEnforced  6m               kubelet, docker-for-desktop  Updated Node Allocatable limit across pods
  Normal  NodeHasSufficientDisk    6m (x6 over 6m)  kubelet, docker-for-desktop  Node docker-for-desktop status is now: NodeHasSufficientDisk
  Normal  NodeHasSufficientMemory  6m (x6 over 6m)  kubelet, docker-for-desktop  Node docker-for-desktop status is now: NodeHasSufficientMemory
  Normal  NodeHasNoDiskPressure    6m (x6 over 6m)  kubelet, docker-for-desktop  Node docker-for-desktop status is now: NodeHasNoDiskPressure
  Normal  NodeHasSufficientPID     6m (x5 over 6m)  kubelet, docker-for-desktop  Node docker-for-desktop status is now: NodeHasSufficientPID

This means kubernetes can't allocate resources due to disk space and other reasons...

Solution

  • Opened the Docker Engine for Mac
  • Clicked on Reset -> Reset disk image to clean up resources
  • Ran docker ps to check which containers were running
$ docker ps
CONTAINER ID        IMAGE                             COMMAND                  CREATED             STATUS              PORTS               NAMES
4c9d0c78f302        k8s.gcr.io/kube-apiserver-amd64   "kube-apiserver --ad…"   2 seconds ago       Up 1 second                             k8s_kube-apiserver_kube-apiserver-docker-for-desktop_kube-system_5798ec7497dde3dcd1765273481d34a5_0
f9f292ea8a95        k8s.gcr.io/pause-amd64:3.1        "/pause"                 8 seconds ago       Up 8 seconds                            k8s_POD_kube-scheduler-docker-for-desktop_kube-system_6d5c9cb98205e46b85b941c8a44fc236_0
a7880e68090a        k8s.gcr.io/pause-amd64:3.1        "/pause"                 8 seconds ago       Up 8 seconds                            k8s_POD_etcd-docker-for-desktop_kube-system_6d86b9595269c3d4ae8fbcc3350af89d_0
3d5aa6d21471        k8s.gcr.io/pause-amd64:3.1        "/pause"                 8 seconds ago       Up 8 seconds                            k8s_POD_kube-controller-manager-docker-for-desktop_kube-system_cd0a522274d18909d85f2e6556c84c31_0
30616d409a36        k8s.gcr.io/pause-amd64:3.1        "/pause"                 8 seconds ago       Up 8 seconds                            k8s_POD_kube-apiserver-docker-for-desktop_kube-system_5798ec7497dde3dcd1765273481d34a5_0
...
...
  • Monitored the Docker for Mac status until it showed READY in green
  • Verified the API server to confirm
$ kubectl cluster-info
Kubernetes master is running at https://localhost:6443
KubeDNS is running at https://localhost:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

All good from this state on... My engine was running for 100d

@l0bster

commented Mar 13, 2019

Removing ~/.kube will fix this bullshit Docker and its errors.

this fixed it for me ... thanks!

@marek-obuchowicz

commented Mar 13, 2019

Resetting k8s / removing k8s config files seems to fix the problem, but only for some time. The issue keeps reappearing afterwards, so it does not qualify as a sustainable fix :(

@abhishekagarwal87

commented Mar 21, 2019

For me, even the factory reset was not working. I tried that multiple times. My hosts file was as expected as well. In a bid to troubleshoot this further, I enabled the option "Show System Containers" after the last factory reset and voilà, it worked. I don't know how.

@weipoint

commented Apr 3, 2019

This works for me, since in China downloading k8s takes too long because of the "Wall".
Refer: https://github.com/maguowei/k8s-docker-for-mac
Steps:

  1. Install Docker for Mac
  2. Set the registry mirror "https://registry.docker-cn.com" in Docker Preferences -> Daemon
  3. Run "load_images.sh" from the Git repo
  4. Restart Docker

Bro, this works.

Bro, this no longer seems to work now.

@constantind83

commented Apr 9, 2019

I tried every solution here but it only worked for me after deleting namespaces from /etc/resolv.conf

@rmrfself

commented Apr 12, 2019

I found a solution that worked for me, don't change default Docker subnet 192.168.65.0/24

Version 18.06.1-ce-mac73 (26764)
Version 10.14 (18A391)

Thanks.

@guyeu

commented Apr 12, 2019

I just updated my Docker for Mac, and now Kubernetes is running...

@zuodimiaoyun

commented Apr 18, 2019

Same issue for me. The log shows 'DNS lookup docker-for-desktop.{domain} A: No Such Record' and 'Cannot list nodes: Get https://localhost:6443/api/v1/nodes: EOF'.
I guess the problem was caused by my using the corporate network, so I switched the network to 4G, reinstalled Docker and Kubernetes, and restarted the Mac; then the issue was fixed.

@jpreese

commented Apr 24, 2019

To throw this solution in the hat as I don't see it mentioned here, I deleted the pki folder (C:\ProgramData\DockerDesktop\pki) which contains all of the certs for Kubernetes.

After the folder is deleted, restart Docker and the certs should be regenerated.

In my case, Kubernetes was complaining about vm.internal.host not being an allowed host in the cert.

@devserghini

commented Apr 26, 2019

I had the same issue as @jpreese. On Mac, the pki folder is located here:
~/Library/Group\ Containers/group.com.docker/pki/

@mikeantonelli

commented Jun 18, 2019

I had this issue and after inspecting running containers I was seeing "Connection Refused" errors on port 2379, and the etcd container wasn't being started. My solution:

  • Disable local network (Turn off wifi)
  • Docker > Preferences > Reset > Reset Kubernetes cluster
  • Allow all containers to become healthy (my count was 8 containers)

At this point, the status still said "Kubernetes is starting…", so then I:

  • Re-enabled local network (Enable wifi)
  • Allowed remaining containers to become healthy (new count is 18 running containers)

Note, if you don't have the Docker images cached locally, this solution may not work for you since the Docker daemon will not be able to pull images while the network is disabled. I'm guessing this is what happened between the 8 and 18 containers while my network was disabled.
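A "Connection Refused" error on a port generally means nothing is listening there yet. A minimal sketch, in Python, of probing a TCP port; the function name port_is_open is hypothetical. Where the probe runs matters: 2379 is the port from the errors above and may only be reachable from inside the Docker VM (an assumption), so the example probes port 6443 on the host, the API server address that kubectl cluster-info reports elsewhere in this thread.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False


# Assumption: the Kubernetes API server is published on the host at 6443.
state = "open" if port_is_open("127.0.0.1", 6443) else "closed or refused"
print("127.0.0.1:6443 is", state)
```

An open port only shows that something accepted the connection; it does not prove the API server behind it is healthy.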

@Carmon-Lee

commented Jun 22, 2019

Tried almost every method mentioned above, but failed.

@k7faq

commented Jul 2, 2019

Having this same issue on clean install of Docker for Windows 10 immediately following a clean install of Windows 10. This makes two separate incidents of this for me in 2 weeks time.

@monitorjbl

commented Jul 7, 2019

Documenting my solution for others that it might help. I was getting an error similar to @davidkarlsen in which etcdmain would not start up:

etcdmain: listen tcp 23.202.231.166:2380: bind: cannot assign requested address

After many factory resets, I decided to just try to start the container up manually with the same settings k8s was using. I got the same error, and it turns out that the cause of this issue is that k8s does not specify a -initial-advertise-peer-urls option on etcd. The default behavior of etcd is to bind to the address specified by your hostname.

This was where things got weird. The hostname for my container (5ff8696dd4fe) was resolving to 23.202.231.166 inside the container. The container's /etc/resolv.conf was using the same DNS server as my host, so I tried pinging 5ff8696dd4fe directly from my host (as in, I opened a Terminal window). It resolved to 23.202.231.166 as well.

It turned out that anything would resolve to 23.202.231.166 unless it was a valid domain. Clearly junk values I generated by flailing on my keyboard resolved to that IP. So, I went to the site in my browser and it redirected to a sketchy domain-parked site. Apparently, this is a thing that Spectrum (an infamously terrible American ISP that millions of people are stuck with because there is no free market for internet services in the US) just does for invalid domains.

Here's what http://dgadfasdf/ resolved to in my browser: http://www.dnsrsearch.com/search/?q=http://dgadfasdf/. And for posterity, this is what it looks like:

[screenshot]

I think it's pretty fair to assume Spectrum does this to make a few extra bucks off of people's typos. Once I updated my home network to use a better DNS service (I used Cloudflare's 1.1.1.1 and 1.0.0.1) and factory reset one more time, my cluster started working again.
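The check described above (look up junk names and see whether they resolve anyway) can be sketched in Python. The function name junk_name_answers is hypothetical, and the two stand-in resolvers exist only so the example runs without touching the network; calling the function with no arguments would use the system resolver via socket.gethostbyname.

```python
import socket
import uuid


def junk_name_answers(resolver=socket.gethostbyname, attempts=3):
    """Resolve random junk host names and return the set of IPs received.

    An empty set is the healthy result: nonexistent names should fail.
    A non-empty set suggests the DNS service rewrites failed lookups.
    """
    answers = set()
    for _ in range(attempts):
        junk = "junk-" + uuid.uuid4().hex[:12]
        try:
            answers.add(resolver(junk))
        except OSError:
            pass  # expected: the name does not exist
    return answers


# Stand-ins for illustration only (not real resolvers).
def hijacking_resolver(name):
    return "23.202.231.166"  # the IP reported in the comment above


def honest_resolver(name):
    raise socket.gaierror("name does not resolve")


print(junk_name_answers(hijacking_resolver))
print(junk_name_answers(honest_resolver))
```

Several attempts are made because a single random name could, in principle, exist; seeing the same IP for every junk name is the telltale sign.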

This was one of the most bizarre debugging sessions I've ever had. Seriously, of all the things that could break my Kubernetes cluster, I never would have guessed it would be f*cking capitalism.

@MartinEmrich

commented Jul 8, 2019

Although the cause of my K8s not starting up is something else (and still unsolved), thumbs up to @monitorjbl. Watch out, fellow Germans: Deutsche Telekom does the same thing (redirecting DNS failures to an advertising-filled "search assistant"). Fortunately, it can be disabled in the "Kundencenter" (customer portal) under "Navigationshilfe".

@wjh000123

commented Jul 13, 2019

If you have nginx installed on your Mac and it is running, you need to stop it. In my case, I had installed it with brew, so I stopped the nginx service with brew services stop nginx, and then it was OK.

@MadMango

commented Jul 18, 2019

In my case, the reset would result in that state until I ran kubectl config use-context docker-for-desktop, then it started fine after another 'Reset Kubernetes cluster' in the reset settings.

I thought it wouldn't have to rely on the global context but apparently, it does.
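A minimal sketch, in Python, of reading the active context out of a kubeconfig so it can be compared with docker-for-desktop before resetting. The function name current_context is hypothetical, and the line-based parsing assumes current-context is a top-level key on its own line; a YAML parser, or simply running kubectl config current-context, is the more reliable route.

```python
def current_context(kubeconfig_text):
    """Return the value of the top-level current-context key, or None."""
    for line in kubeconfig_text.splitlines():
        if line.startswith("current-context:"):
            value = line.split(":", 1)[1].strip().strip("\"'")
            return value or None
    return None


# Hypothetical kubeconfig fragment in which another context is active.
sample = """apiVersion: v1
kind: Config
current-context: minikube
"""
active = current_context(sample)
print(active, active == "docker-for-desktop")
```

If the result is not docker-for-desktop, switching with kubectl config use-context docker-for-desktop, as described above, is the step to try.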
