failed to create cluster: failed to init node with kubeadm, and KIND_EXPERIMENTAL_PROVIDER=podman #3581

Closed
KubeKyrie opened this issue Apr 16, 2024 · 4 comments
Labels
kind/support Categorizes issue or PR as a support question.

Comments

@KubeKyrie

KubeKyrie commented Apr 16, 2024

What happened:

kind export logs:
logs.tar.gz

Running kind create cluster with no other config fails. I have checked https://kind.sigs.k8s.io/docs/user/known-issues/, but found no similar issue there.

I have also tried the v1.26.4 and v1.27.3 node images (pinned via --image kindest/node:v1.26.4@sha256xxx), and they fail with the same error.
But v1.26.2 works fine.
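Roughly the command used to test the other node versions (a sketch; the digest below is the same elided placeholder as above, not a real digest — substitute the digests published in the kind release notes):

    # pin a specific node image (v1.26.4 and v1.27.3 fail, v1.26.2 works)
    kind create cluster --image kindest/node:v1.26.4@sha256xxx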

Error logs:

enabling experimental podman provider
Creating cluster "kind" ...
⢎⡀ Ensuring node image (kindest/node:v1.27.1) 🖼
 ✓ Ensuring node image (kindest/node:v1.27.1) 🖼
 ✓ Preparing nodes 📦
 ✓ Writing configuration 📜
 ✗ Starting control-plane 🕹️
Deleted nodes: ["kind-control-plane"]
ERROR: failed to create cluster: failed to init node with kubeadm: command "podman exec --privileged kind-control-plane kubeadm init --skip-phases=preflight --config=/kind/kubeadm.conf --skip-token-print --v=6" failed with error: exit status 1
Command Output: I0416 09:58:19.166054     252 initconfiguration.go:255] loading configuration from "/kind/kubeadm.conf"
W0416 09:58:19.168950     252 initconfiguration.go:332] [config] WARNING: Ignored YAML document with GroupVersionKind kubeadm.k8s.io/v1beta3, Kind=JoinConfiguration
[init] Using Kubernetes version: v1.27.1
[certs] Using certificateDir folder "/etc/kubernetes/pki"
I0416 09:58:19.198741     252 certs.go:112] creating a new certificate authority for ca
[certs] Generating "ca" certificate and key
I0416 09:58:19.624630     252 certs.go:519] validating certificate period for ca certificate
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kind-control-plane kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local localhost] and IPs [10.96.0.1 10.89.0.2 127.0.0.1]
[certs] Generating "apiserver-kubelet-client" certificate and key
I0416 09:58:20.302075     252 certs.go:112] creating a new certificate authority for front-proxy-ca
[certs] Generating "front-proxy-ca" certificate and key
I0416 09:58:20.637068     252 certs.go:519] validating certificate period for front-proxy-ca certificate
[certs] Generating "front-proxy-client" certificate and key
I0416 09:58:21.353022     252 certs.go:112] creating a new certificate authority for etcd-ca
[certs] Generating "etcd/ca" certificate and key
I0416 09:58:21.565145     252 certs.go:519] validating certificate period for etcd/ca certificate
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [kind-control-plane localhost] and IPs [10.89.0.2 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [kind-control-plane localhost] and IPs [10.89.0.2 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
I0416 09:58:23.513008     252 certs.go:78] creating new public/private key files for signing service account users
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
I0416 09:58:23.766844     252 kubeconfig.go:103] creating kubeconfig file for admin.conf
[kubeconfig] Writing "admin.conf" kubeconfig file
I0416 09:58:23.944150     252 kubeconfig.go:103] creating kubeconfig file for kubelet.conf
[kubeconfig] Writing "kubelet.conf" kubeconfig file
I0416 09:58:24.239178     252 kubeconfig.go:103] creating kubeconfig file for controller-manager.conf
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
I0416 09:58:24.634047     252 kubeconfig.go:103] creating kubeconfig file for scheduler.conf
[kubeconfig] Writing "scheduler.conf" kubeconfig file
I0416 09:58:24.944819     252 kubelet.go:67] Stopping the kubelet
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
I0416 09:58:25.419502     252 manifests.go:99] [control-plane] getting StaticPodSpecs
I0416 09:58:25.420944     252 certs.go:519] validating certificate period for CA certificate
I0416 09:58:25.421080     252 manifests.go:125] [control-plane] adding volume "ca-certs" for component "kube-apiserver"
I0416 09:58:25.421092     252 manifests.go:125] [control-plane] adding volume "etc-ca-certificates" for component "kube-apiserver"
I0416 09:58:25.421103     252 manifests.go:125] [control-plane] adding volume "k8s-certs" for component "kube-apiserver"
I0416 09:58:25.421111     252 manifests.go:125] [control-plane] adding volume "usr-local-share-ca-certificates" for component "kube-apiserver"
I0416 09:58:25.421120     252 manifests.go:125] [control-plane] adding volume "usr-share-ca-certificates" for component "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
I0416 09:58:25.429012     252 manifests.go:154] [control-plane] wrote static Pod manifest for component "kube-apiserver" to "/etc/kubernetes/manifests/kube-apiserver.yaml"
I0416 09:58:25.429067     252 manifests.go:99] [control-plane] getting StaticPodSpecs
I0416 09:58:25.429862     252 manifests.go:125] [control-plane] adding volume "ca-certs" for component "kube-controller-manager"
I0416 09:58:25.429881     252 manifests.go:125] [control-plane] adding volume "etc-ca-certificates" for component "kube-controller-manager"
I0416 09:58:25.429890     252 manifests.go:125] [control-plane] adding volume "flexvolume-dir" for component "kube-controller-manager"
I0416 09:58:25.429897     252 manifests.go:125] [control-plane] adding volume "k8s-certs" for component "kube-controller-manager"
I0416 09:58:25.429905     252 manifests.go:125] [control-plane] adding volume "kubeconfig" for component "kube-controller-manager"
I0416 09:58:25.429912     252 manifests.go:125] [control-plane] adding volume "usr-local-share-ca-certificates" for component "kube-controller-manager"
I0416 09:58:25.429919     252 manifests.go:125] [control-plane] adding volume "usr-share-ca-certificates" for component "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
I0416 09:58:25.432116     252 manifests.go:154] [control-plane] wrote static Pod manifest for component "kube-controller-manager" to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
I0416 09:58:25.432157     252 manifests.go:99] [control-plane] getting StaticPodSpecs
I0416 09:58:25.432670     252 manifests.go:125] [control-plane] adding volume "kubeconfig" for component "kube-scheduler"
I0416 09:58:25.434022     252 manifests.go:154] [control-plane] wrote static Pod manifest for component "kube-scheduler" to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
W0416 09:58:25.434418     252 images.go:80] could not find officially supported version of etcd for Kubernetes v1.27.1, falling back to the nearest etcd version (3.5.7-0)
I0416 09:58:25.436095     252 local.go:65] [etcd] wrote Static Pod manifest for a local etcd member to "/etc/kubernetes/manifests/etcd.yaml"
I0416 09:58:25.436189     252 waitcontrolplane.go:83] [wait-control-plane] Waiting for the API server to be healthy
I0416 09:58:25.437110     252 loader.go:373] Config loaded from file:  /etc/kubernetes/admin.conf
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
I0416 09:58:25.448895     252 round_trippers.go:553] GET https://kind-control-plane:6443/healthz?timeout=10s  in 4 milliseconds
I0416 09:58:25.950968     252 round_trippers.go:553] GET https://kind-control-plane:6443/healthz?timeout=10s  in 1 milliseconds
I0416 09:58:26.451138     252 round_trippers.go:553] GET https://kind-control-plane:6443/healthz?timeout=10s  in 1 milliseconds
I0416 09:58:26.951183     252 round_trippers.go:553] GET https://kind-control-plane:6443/healthz?timeout=10s  in 1
I0416 10:02:25.451866     252 round_trippers.go:553] GET https://kind-control-plane:6443/healthz?timeout=10s  in 1 milliseconds

Unfortunately, an error has occurred:
	timed out waiting for the condition

This error is likely caused by:
	- The kubelet is not running
	- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
	- 'systemctl status kubelet'
	- 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
	- 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
	Once you have found the failing container, you can inspect its logs with:
	- 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock logs CONTAINERID'
couldn't initialize a Kubernetes cluster
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/init.runWaitControlPlanePhase
	cmd/kubeadm/app/cmd/phases/init/waitcontrolplane.go:108
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
	cmd/kubeadm/app/cmd/phases/workflow/runner.go:259
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
	cmd/kubeadm/app/cmd/phases/workflow/runner.go:446
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
	cmd/kubeadm/app/cmd/phases/workflow/runner.go:232
k8s.io/kubernetes/cmd/kubeadm/app/cmd.newCmdInit.func1
	cmd/kubeadm/app/cmd/init.go:111
github.com/spf13/cobra.(*Command).execute
	vendor/github.com/spf13/cobra/command.go:916
github.com/spf13/cobra.(*Command).ExecuteC
	vendor/github.com/spf13/cobra/command.go:1040
github.com/spf13/cobra.(*Command).Execute
	vendor/github.com/spf13/cobra/command.go:968
k8s.io/kubernetes/cmd/kubeadm/app.Run
	cmd/kubeadm/app/kubeadm.go:50
main.main
	cmd/kubeadm/kubeadm.go:25
runtime.main
	/usr/local/go/src/runtime/proc.go:250
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1598
error execution phase wait-control-plane
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
	cmd/kubeadm/app/cmd/phases/workflow/runner.go:260
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
	cmd/kubeadm/app/cmd/phases/workflow/runner.go:446
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
	cmd/kubeadm/app/cmd/phases/workflow/runner.go:232
k8s.io/kubernetes/cmd/kubeadm/app/cmd.newCmdInit.func1
	cmd/kubeadm/app/cmd/init.go:111
github.com/spf13/cobra.(*Command).execute
	vendor/github.com/spf13/cobra/command.go:916
github.com/spf13/cobra.(*Command).ExecuteC
	vendor/github.com/spf13/cobra/command.go:1040
github.com/spf13/cobra.(*Command).Execute
	vendor/github.com/spf13/cobra/command.go:968
k8s.io/kubernetes/cmd/kubeadm/app.Run
	cmd/kubeadm/app/kubeadm.go:50
main.main
	cmd/kubeadm/kubeadm.go:25
runtime.main
	/usr/local/go/src/runtime/proc.go:250
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1598

What you expected to happen:
The kind cluster is created successfully.

How to reproduce it (as minimally and precisely as possible):
Just run kind create cluster with the podman provider enabled, as in the sketch below.
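A minimal sketch of the reproduction, assuming the experimental podman provider is selected via the environment variable shown in the error log above:

    # select podman as the node provider and create a default cluster
    export KIND_EXPERIMENTAL_PROVIDER=podman
    kind create cluster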
Anything else we need to know?:

Environment:

  • kind version: (use kind version):
    kind v0.19.0 go1.20.4 linux/amd64
  • Runtime info: (use docker info, podman info or nerdctl info):
host:
  arch: amd64
  buildahVersion: 1.32.0
  cgroupControllers:
  - cpuset
  - cpu
  - cpuacct
  - memory
  - devices
  - freezer
  - net_cls
  - blkio
  - perf_event
  - hugetlb
  - pids
  - net_prio
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: Unknown
    path: /usr/local/lib/podman/conmon
    version: 'conmon version 2.1.8, commit: 00e08f4a9ca5420de733bf542b930ad58e1a7e7d'
  cpuUtilization:
    idlePercent: 99.9
    systemPercent: 0.05
    userPercent: 0.05
  cpus: 8
  databaseBackend: boltdb
  distribution:
    distribution: centos
    version: "7"
  eventLogger: file
  freeLocks: 2048
  hostname: "666"
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 3.10.0-1160.el7.x86_64
  linkmode: dynamic
  logDriver: k8s-file
  memFree: 6349561856
  memTotal: 8200577024
  networkBackend: cni
  networkBackendInfo:
    backend: cni
    dns: {}
  ociRuntime:
    name: runc
    package: Unknown
    path: /usr/local/bin/runc
    version: |-
      runc version 1.1.9
      commit: v1.1.9-0-gccaecfcb
      spec: 1.0.2-dev
      go: go1.20.3
      libseccomp: 2.5.4
  os: linux
  pasta:
    executable: ""
    package: ""
    version: ""
  remoteSocket:
    exists: false
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: ""
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/local/bin/slirp4netns
    package: Unknown
    version: |-
      slirp4netns version 1.2.2
      commit: 0ee2d87523e906518d34a6b423271e4826f71faf
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.4
  swapFree: 0
  swapTotal: 0
  uptime: 16h 36m 25.00s (Approximately 0.67 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - docker.io
  - registry.fedoraproject.org
  - registry.access.redhat.com
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.ignore_chown_errors: "true"
    overlay.mount_program:
      Executable: /usr/local/bin/fuse-overlayfs
      Package: Unknown
      Version: |-
        fuse-overlayfs: version 1.13-dev
        fusermount3 version: 3.16.1
        FUSE library version 3.16.1
        using FUSE kernel interface version 7.38
    overlay.mountopt: nodev,fsync=0
  graphRoot: /var/lib/containers/storage
  graphRootAllocated: 53660876800
  graphRootUsed: 7589314560
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Supports shifting: "true"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 1
  runRoot: /var/run/containers/storage
  transientStore: false
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 4.7.1
  Built: 0
  BuiltTime: Thu Jan  1 08:00:00 1970
  GitCommit: ""
  GoVersion: go1.19.13
  Os: linux
  OsArch: linux/amd64
  Version: 4.7.1
  • OS (e.g. from /etc/os-release):
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
  • Kubernetes version: (use kubectl version): v1.27.1
  • Any proxies or other special environment settings?: None
@KubeKyrie KubeKyrie added the kind/bug Categorizes issue or PR as related to a bug. label Apr 16, 2024
@stmcginnis
Contributor

Can you fill in more of the "Environment" details from the end of the issue template? That information can be useful for understanding what is going on.

You are also using an older version of kind, so the first thing to try would be upgrading to the latest release. Then make sure you are using one of the Kubernetes versions supported by that release.

Once you have upgraded, you can run kind create cluster --retain if creation fails, then kind export logs to collect all of the log output from the creation process and track down what is failing. See https://kind.sigs.k8s.io/docs/user/known-issues/#troubleshooting-kind
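For example, a rough sketch of that flow (the cluster name defaults to "kind"):

    # keep the node containers around even if creation fails
    kind create cluster --retain

    # dump the node logs to a directory (the output path is printed)
    kind export logs

    # clean up the retained nodes afterwards
    kind delete cluster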

/remove-kind bug
/kind support

@k8s-ci-robot k8s-ci-robot added kind/support Categorizes issue or PR as a support question. and removed kind/bug Categorizes issue or PR as related to a bug. labels Apr 16, 2024
@KubeKyrie
Author

Hi @stmcginnis, thanks for your advice.
I have added more information about this error to the issue; could you help figure out what the problem is?

Next I will upgrade kind and try again.

@aojea
Contributor

aojea commented Apr 17, 2024

You need to use the latest stable version, and check that the images you are using match that version as per the release notes.

You are also using cgroups v1, which has several open issues; you should rule that out as the problem as well.
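The podman info above already reports cgroupVersion: v1; one common way to double-check which cgroup version the host kernel is using:

    # prints "cgroup2fs" on a cgroups v2 host, "tmpfs" on a cgroups v1 host
    stat -fc %T /sys/fs/cgroup/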

@BenTheElder
Member

RHEL 7 is covered in #3311 (comment).

I highly recommend using a more recent distro to develop Kubernetes.
