3 control-plane node setup not starting #2744
Comments
Do you have enough resources? That setup will require a fair amount of RAM and CPU.
The node has 2 CPUs and 10 GB RAM. Won't this be sufficient to start the cluster? I have successfully created 1 controller and 3 workers on this same node without any issues. The problem comes when I increase the controllers to 3 and the workers to 1.
I can't say with this data, but try creating a cluster with fewer components so you can rule out a resource problem.
The 1 controller and 3 worker cluster is coming up fine.

kind create cluster --name single --config=single-controller-multi-worker.yaml
kubectl cluster-info --context kind-single
Not sure what to do next? 😅  Check out https://kind.sigs.k8s.io/docs/user/quick-start/
kubectl get nodes

This is the kind config that I used to create this cluster:

cat single-controller-multi-worker.yaml
# a cluster with 1 control-plane node and 3 workers
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
- role: worker
3 control planes = 3 etcd 💣 😄
I tested this with kind version v0.11.1 on the same node and it's working as expected.

./kind-linux-amd64 create cluster --name multi --config multi-controller-worker.yaml
kubectl cluster-info --context kind-multi
Thanks for using kind! 😊
kubectl get nodes
./kind-linux-amd64 version

cat multi-controller-worker.yaml
# a cluster with 3 control-plane nodes and 1 worker
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: control-plane
- role: control-plane
- role: worker
This is most likely a resource issue. Resources have many dimensions besides CPU and RAM, e.g. disk IO for the Kubernetes apiservers (by way of etcd), which are write heavy. If you run kind create cluster with --retain, the node containers are kept around after a failure so you can inspect them and collect diagnostics with kind export logs.
FWIW: I highly recommend single node clusters unless you have a specific use case that absolutely requires the three control planes. |
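A sketch of that debug flow, assuming the reporter's cluster and config names and a working kind/docker install (this is an illustrative transcript, not output from the thread):

```shell
# Keep the node containers around even if cluster creation fails.
kind create cluster --name multi --config multi-controller-worker.yaml --retain

# Dump logs from every node (kubelet, containerd, pod logs) into ./kind-logs.
kind export logs ./kind-logs --name multi

# Or inspect a failing node directly, e.g. the worker's kubelet journal.
docker exec multi-worker journalctl -u kubelet --no-pager | tail -n 50

# Clean up when done.
kind delete cluster --name multi
```

The exported logs include each node's kubelet journal, which is where the "too many open files" error discussed below shows up.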
@BenTheElder thanks, --retain helped me to retain the cluster. Seems like kube-proxy is failing to start with the error "command failed" err="failed complete: too many open files". I will do some more troubleshooting.
I googled and found that the above issue is caused by the inotify sysctl limits. I updated fs.inotify.max_user_watches and fs.inotify.max_user_instances to higher values and now the kind cluster is coming up without any problem.

echo fs.inotify.max_user_watches=655360 | sudo tee -a /etc/sysctl.conf

Closing this issue since the above commands fixed this.
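For reference, making both limits persistent means appending them to /etc/sysctl.conf (or a file under /etc/sysctl.d/) and reloading with sudo sysctl -p. The max_user_instances value below is an illustrative assumption — the comment above only shows the max_user_watches command:

```
# appended to /etc/sysctl.conf — raise inotify limits so that many kubelets and
# kube-proxies sharing one host kernel don't fail with "too many open files"
fs.inotify.max_user_watches = 655360
fs.inotify.max_user_instances = 1280   # illustrative value, not from the thread
```

Every kind node shares the host kernel, so these per-user quotas are divided across all the kubelets, kube-proxies, and container runtimes in the cluster.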
Prompted by seeing failures which could be caused by kubernetes-sigs/kind#2744 Signed-off-by: Andrew Bayer <andrew.bayer@gmail.com>
I ran into this issue when attempting to deploy a large kind cluster with 25 worker nodes, and this also fixed my issue.
This works for me too:
I am trying to launch a 3 controller and 1 worker kind cluster on Ubuntu 22.04. It's failing to start. Below is the config yaml that I used to create the cluster:
# a cluster with 3 control-plane nodes and 1 worker
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: control-plane
- role: control-plane
- role: worker
Cluster create is failing with the below error (logs are trimmed):
kind create cluster --name multi --config multi-controller-worker.yaml
Creating cluster "multi" ...
✓ Ensuring node image (kindest/node:v1.23.4) 🖼
✓ Preparing nodes 📦 📦 📦 📦
✓ Configuring the external load balancer ⚖️
✓ Writing configuration 📜
✓ Starting control-plane 🕹️
✓ Installing CNI 🔌
✓ Installing StorageClass 💾
✓ Joining more control-plane nodes 🎮
✗ Joining worker nodes 🚜
ERROR: failed to create cluster: failed to join node with kubeadm: command "docker exec --privileged multi-worker kubeadm join --config /kind/kubeadm.conf --skip-phases=preflight --v=6" failed with error: exit status 1
Command Output: I0510 14:08:38.916853 132 join.go:413] [preflight] found NodeName empty; using OS hostname as NodeName
I0510 14:08:38.924953 132 joinconfiguration.go:76] loading configuration from "/kind/kubeadm.conf"
I0510 14:08:38.926027 132 controlplaneprepare.go:220] [download-certs] Skipping certs download
I0510 14:08:38.926052 132 join.go:530] [preflight] Discovering cluster-info
I0510 14:08:38.926062 132 token.go:80] [discovery] Created cluster-info discovery client, requesting info from "multi-external-load-balancer:6443"
I0510 14:08:38.941343 132 round_trippers.go:553] GET https://multi-external-load-balancer:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s 200 OK in 14 milliseconds
I0510 14:08:38.942048 132 token.go:105] [discovery] Cluster info signature and contents are valid and no TLS pinning was specified, will use API Server "multi-external-load-balancer:6443"
I0510 14:08:38.942071 132 discovery.go:52] [discovery] Using provided TLSBootstrapToken as authentication credentials for the join process
I0510 14:08:38.942078 132 join.go:544] [preflight] Fetching init configuration
I0510 14:08:38.942081 132 join.go:590] [preflight] Retrieving KubeConfig objects
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
I0510 14:08:38.969023 132 round_trippers.go:553] GET https://multi-external-load-balancer:6443/api/v1/namespaces/kube-system/configmaps/kubeadm-config?timeout=10s 200 OK in 26 milliseconds
I0510 14:08:38.973664 132 round_trippers.go:553] GET https://multi-external-load-balancer:6443/api/v1/namespaces/kube-system/configmaps/kube-proxy?timeout=10s 200 OK in 3 milliseconds
I0510 14:08:38.974834 132 kubelet.go:91] attempting to download the KubeletConfiguration from the new format location (UnversionedKubeletConfigMap=true)
I0510 14:08:38.976505 132 round_trippers.go:553] GET https://multi-external-load-balancer:6443/api/v1/namespaces/kube-system/configmaps/kubelet-config?timeout=10s 403 Forbidden in 1 milliseconds
I0510 14:08:38.989390 132 round_trippers.go:553] GET https://multi-external-load-balancer:6443/api/v1/namespaces/kube-system/configmaps/kubelet-config?timeout=10s 403 Forbidden in 0 milliseconds
I0510 14:08:39.048096 132 round_trippers.go:553] GET https://multi-external-load-balancer:6443/api/v1/namespaces/kube-system/configmaps/kubelet-config?timeout=10s 403 Forbidden in 1 milliseconds
I0510 14:08:39.311348 132 round_trippers.go:553] GET https://multi-external-load-balancer:6443/api/v1/namespaces/kube-system/configmaps/kubelet-config?timeout=10s 403 Forbidden in 2 milliseconds
I0510 14:08:39.311647 132 kubelet.go:94] attempting to download the KubeletConfiguration from the DEPRECATED location (UnversionedKubeletConfigMap=false)
I0510 14:08:39.314451 132 round_trippers.go:553] GET https://multi-external-load-balancer:6443/api/v1/namespaces/kube-system/configmaps/kubelet-config-1.23?timeout=10s 200 OK in 2 milliseconds
I0510 14:08:39.315764 132 interface.go:432] Looking for default routes with IPv4 addresses
I0510 14:08:39.315785 132 interface.go:437] Default route transits interface "eth0"
I0510 14:08:39.315942 132 interface.go:209] Interface eth0 is up
I0510 14:08:39.315996 132 interface.go:257] Interface "eth0" has 3 addresses :[172.18.0.5/16 fc00:f853:ccd:e793::5/64 fe80::42:acff:fe12:5/64].
I0510 14:08:39.316007 132 interface.go:224] Checking addr 172.18.0.5/16.
I0510 14:08:39.316012 132 interface.go:231] IP found 172.18.0.5
I0510 14:08:39.316029 132 interface.go:263] Found valid IPv4 address 172.18.0.5 for interface "eth0".
I0510 14:08:39.316034 132 interface.go:443] Found active IP 172.18.0.5
I0510 14:08:39.321562 132 kubelet.go:119] [kubelet-start] writing bootstrap kubelet config file at /etc/kubernetes/bootstrap-kubelet.conf
I0510 14:08:39.322302 132 kubelet.go:134] [kubelet-start] writing CA certificate at /etc/kubernetes/pki/ca.crt
I0510 14:08:39.322814 132 loader.go:372] Config loaded from file: /etc/kubernetes/bootstrap-kubelet.conf
I0510 14:08:39.323308 132 kubelet.go:155] [kubelet-start] Checking for an existing Node in the cluster with name "multi-worker" and status "Ready"
I0510 14:08:39.326531 132 round_trippers.go:553] GET https://multi-external-load-balancer:6443/api/v1/nodes/multi-worker?timeout=10s 404 Not Found in 3 milliseconds
I0510 14:08:39.326759 132 kubelet.go:170] [kubelet-start] Stopping the kubelet
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
I0510 14:08:44.471944 132 loader.go:372] Config loaded from file: /etc/kubernetes/kubelet.conf
I0510 14:08:44.473636 132 loader.go:372] Config loaded from file: /etc/kubernetes/kubelet.conf
I0510 14:08:44.474330 132 kubelet.go:218] [kubelet-start] preserving the crisocket information for the node
I0510 14:08:44.474383 132 patchnode.go:31] [patchnode] Uploading the CRI Socket information "unix:///run/containerd/containerd.sock" to the Node API object "multi-worker" as an annotation
I0510 14:08:44.474432 132 cert_rotation.go:137] Starting client certificate rotation controller
I0510 14:08:45.125555 132 round_trippers.go:553] GET https://multi-external-load-balancer:6443/api/v1/nodes/multi-worker?timeout=10s 404 Not Found in 150 milliseconds
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
I0510 14:09:19.486825 132 round_trippers.go:553] GET https://multi-external-load-balancer:6443/api/v1/nodes/multi-worker?timeout=10s 404 Not Found in 11 milliseconds
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
I0510 14:09:24.479348 132 round_trippers.go:553] GET https://multi-external-load-balancer:6443/api/v1/nodes/multi-worker?timeout=10s 404 Not Found in 4 milliseconds
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
I0510 14:09:34.480058 132 round_trippers.go:553] GET https://multi-external-load-balancer:6443/api/v1/nodes/multi-worker?timeout=10s 404 Not Found in 5 milliseconds
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
I0510 14:09:54.481436 132 round_trippers.go:553] GET https://multi-external-load-balancer:6443/api/v1/nodes/multi-worker?timeout=10s 404 Not Found in 4 milliseconds
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
I0510 14:10:34.483231 132 round_trippers.go:553] GET https://multi-external-load-balancer:6443/api/v1/nodes/multi-worker?timeout=10s 404 Not Found in 8 milliseconds
nodes "multi-worker" not found
error uploading crisocket
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/join.runKubeletStartJoinPhase
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/join/kubelet.go:220
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:234
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:421
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:207
k8s.io/kubernetes/cmd/kubeadm/app/cmd.newCmdJoin.func1
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/join.go:178
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).execute
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:856
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).ExecuteC
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:974
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).Execute
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:902
k8s.io/kubernetes/cmd/kubeadm/app.Run
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/kubeadm.go:50
main.main
_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/kubeadm.go:25
runtime.main
/usr/local/go/src/runtime/proc.go:255
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1581
error execution phase kubelet-start
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:235
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:421
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:207
k8s.io/kubernetes/cmd/kubeadm/app/cmd.newCmdJoin.func1
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/join.go:178
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).execute
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:856
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).ExecuteC
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:974
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).Execute
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:902
k8s.io/kubernetes/cmd/kubeadm/app.Run
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/kubeadm.go:50
main.main
_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/kubeadm.go:25
runtime.main
/usr/local/go/src/runtime/proc.go:255
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1581
Environment:
- kind version: v0.12.0
- kubectl version: v1.24.0
- docker info: Server Version: 20.10.12
- OS (/etc/os-release): Ubuntu 22.04 LTS

The expectation is that kind should be able to create a cluster with multiple control-plane nodes and multiple workers.