Cluster not starting with DIND setup #625

Closed
fedepaol opened this issue Jun 18, 2019 · 7 comments

@fedepaol

commented Jun 18, 2019

What happened:
A kind 0.3.0 cluster does not start on Prow when using the k8s test images with Docker-in-Docker (DinD) enabled.
kind 0.2.1 works fine.
We run our CI in a Prow cluster, using gcr.io/k8s-testimages/bootstrap:latest as the base image with DinD enabled.

What you expected to happen:
The cluster to be up & running.

How to reproduce it (as minimally and precisely as possible):
docker run --privileged --rm -it -e DOCKER_IN_DOCKER_ENABLED='true' -v $(pwd):/workspace --entrypoint /usr/local/bin/runner.sh gcr.io/k8s-testimages/bootstrap:latest bash -c "wget https://dl.google.com/go/go1.12.6.linux-amd64.tar.gz && tar -C /usr/local -xzf go1.12.6.linux-amd64.tar.gz && GO111MODULE='on' /usr/local/go/bin/go get sigs.k8s.io/kind@v0.3.0 && ~/go/bin/kind create cluster --name=fede"

Anything else we need to know?:
The control plane node container starts correctly. If I open a shell in it and look at the kubelet logs, I see a bunch of errors like these:


Jun 17 12:38:24 fede-control-plane kubelet[216]: E0617 12:38:24.141929     216 kuberuntime_manager.go:693] createPodSandbox for pod "kube-controller-manager-fede-control-plane_kube-system(ced6e7a763e96d1888013e32a44b1066)" failed: rpc error: code = Unknown desc = failed to start sandbox container: failed to create containerd task: failed to mount rootfs component &{overlay overlay [workdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/24/work upperdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/24/fs lowerdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/1/fs]}: invalid argument: unknown
Jun 17 12:38:24 fede-control-plane kubelet[216]: E0617 12:38:24.141977     216 pod_workers.go:190] Error syncing pod ced6e7a763e96d1888013e32a44b1066 ("kube-controller-manager-fede-control-plane_kube-system(ced6e7a763e96d1888013e32a44b1066)"), skipping: failed to "CreatePodSandbox" for "kube-controller-manager-fede-control-plane_kube-system(ced6e7a763e96d1888013e32a44b1066)" with CreatePodSandboxError: "CreatePodSandbox for pod \"kube-controller-manager-fede-control-plane_kube-system(ced6e7a763e96d1888013e32a44b1066)\" failed: rpc error: code = Unknown desc = failed to start sandbox container: failed to create containerd task: failed to mount rootfs component &{overlay overlay [workdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/24/work upperdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/24/fs lowerdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/1/fs]}: invalid argument: unknown"

And also a bunch of failures while hitting the api (which makes sense since the apiserver is still down).
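
A quick way to check whether a node is hitting this particular failure is to count the matching kubelet log lines. The sketch below assumes the log text is piped in on stdin; the function name `count_overlay_mount_errors` is made up for illustration, and the match string is taken from the log excerpt above.

```shell
# Count kubelet log lines that report the overlay rootfs mount failure.
# Reads log text on stdin; the function name is hypothetical.
count_overlay_mount_errors() {
  # grep -c prints the number of matching lines. It exits non-zero when
  # there are none, so "|| true" keeps the function usable under "set -e".
  grep -c 'failed to mount rootfs component' || true
}
```

For example, assuming the node container is named fede-control-plane as in the logs above and that journalctl is available inside it: `docker exec fede-control-plane journalctl -u kubelet --no-pager | count_overlay_mount_errors`.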

Environment:

  • kind version: (use kind version):
  • Kubernetes version: (use kubectl version):
  • Docker version: (use docker info):
  • OS (e.g. from /etc/os-release):

@fedepaol fedepaol added the kind/bug label Jun 18, 2019

@BenTheElder

Member

commented Jun 18, 2019

Note #303, we run our official CI on a Prow cluster with that image 😅

@fedepaol

Author

commented Jun 18, 2019

Nice, I'll try to add the mounts and see if it works! Thanks!

@fedepaol

Author

commented Jun 21, 2019

Not working :-(
If I understood @BenTheElder's suggestion correctly, I should mount /lib/modules and /sys/fs/cgroup in the container.

I ran it as follows but still get the same error:

docker run --privileged --rm -it -e DOCKER_IN_DOCKER_ENABLED="true" -v /lib/modules:/lib/modules -v /sys/fs/cgroup:/sys/fs/cgroup -v $(pwd):/workspace --entrypoint /usr/local/bin/runner.sh gcr.io/k8s-testimages/bootstrap:latest bash -c "wget https://dl.google.com/go/go1.12.6.linux-amd64.tar.gz && tar -C /usr/local -xzf go1.12.6.linux-amd64.tar.gz && GO111MODULE='on' /usr/local/go/bin/go get sigs.k8s.io/kind@v0.3.0 && ~/go/bin/kind create cluster --name=fede"
.
.
.
Error: failed to create cluster: failed to init node with kubeadm: exit status 1

Docker is pretty recent (18.09.6), and the host OS I run the container from is Fedora 30.

@BenTheElder

Member

commented Jun 24, 2019

@fedepaol does this fully match your pod config? You need to have a volume mounted at /docker-graph (a remapped /var/lib/docker, for legacy reasons; the test-infra "bootstrap" image is full of cruft like that).
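
One way to confirm from inside the container that something really is mounted at /docker-graph is to look it up in the mounts table. This is only a sketch: the function name `is_mount_target` is hypothetical, and it assumes a Linux-style mounts file where the second field of each line is the mount point.

```shell
# Succeed if the given path is listed as a mount point in a mounts table.
# Arguments: $1 = path to check, $2 = mounts file (defaults to /proc/self/mounts).
# The function name is hypothetical.
is_mount_target() {
  awk -v target="$1" '
    $2 == target { found = 1 }   # field 2 is the mount point
    END { exit found ? 0 : 1 }   # exit 0 if seen, 1 otherwise
  ' "${2:-/proc/self/mounts}"
}
```

Inside the bootstrap container this could be used as `is_mount_target /docker-graph || echo "no volume mounted at /docker-graph" >&2`.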

I see no issues with the following:

docker run --privileged --rm -it -e DOCKER_IN_DOCKER_ENABLED="true" -v /lib/modules:/lib/modules -v /sys/fs/cgroup:/sys/fs/cgroup -v $(pwd):/workspace --tmpfs /docker-graph --entrypoint /usr/local/bin/runner.sh gcr.io/k8s-testimages/bootstrap:latest bash -c "wget https://github.com/kubernetes-sigs/kind/releases/download/v0.3.0/kind-linux-amd64 && chmod +x ./kind-linux-amd64 && ./kind-linux-amd64 create cluster --name=fede --loglevel=debug"

NOTE:

  • downloading the kind binary instead of installing Go and then running go get, to save time
  • using --tmpfs /docker-graph to lazily simulate an emptyDir volume for docker's storage (NOTE: emptyDir is by default disk backed rather than memory backed, but you get the idea)
  • added --loglevel=debug to help see what happens

@BenTheElder

Member

commented Jun 24, 2019

Similarly, no issues with the following.

To sort of simulate an emptyDir:
docker volume create fede
Simulate the pod:
docker run --privileged --rm -it -e DOCKER_IN_DOCKER_ENABLED="true" -v /lib/modules:/lib/modules -v /sys/fs/cgroup:/sys/fs/cgroup -v $(pwd):/workspace -v fede:/docker-graph --entrypoint /usr/local/bin/runner.sh gcr.io/k8s-testimages/bootstrap:latest bash -c "wget https://github.com/kubernetes-sigs/kind/releases/download/v0.3.0/kind-linux-amd64 && chmod +x ./kind-linux-amd64 && ./kind-linux-amd64 create cluster --name=fede --loglevel=debug" # note the -v fede:/docker-graph
Simulate cleaning up the emptyDir:
docker volume rm fede
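
The create / run / clean up sequence above can be wrapped so the volume is removed even when the run fails. This is a minimal sketch, not something from the thread: the function name `with_docker_volume` is hypothetical, and it relies only on `docker volume create` and `docker volume rm`.

```shell
# Run a command with a throwaway named Docker volume, removing the volume
# afterwards whether or not the command succeeded.
# Arguments: $1 = volume name, remaining arguments = command to run.
# The function name is hypothetical. The body runs in a subshell so its
# variables do not leak into the caller.
with_docker_volume() (
  volume="$1"
  shift
  docker volume create "$volume" >/dev/null || exit 1
  "$@"
  status=$?
  docker volume rm "$volume" >/dev/null
  # Report the wrapped command's exit status, not the cleanup's.
  exit "$status"
)
```

To use it, pass the volume name followed by the full docker run command from above (including -v fede:/docker-graph) as the remaining arguments.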

@fedepaol

Author

commented Jun 25, 2019

Oh, I was missing the docker-graph dir. Works now!
Thanks and sorry for the interruption.

@fedepaol fedepaol closed this Jun 25, 2019

@BenTheElder

Member

commented Jun 25, 2019

Glad it works!! Interesting that it may have worked with kind 0.2 without that ... I would not have expected that to work reliably 😅
