BUG REPORT: kubelet cgroup driver #639

Closed
lavender2020 opened this issue Jan 4, 2018 · 43 comments
Labels: kind/bug, lifecycle/active, priority/important-soon

Comments

@lavender2020

BUG REPORT

Versions

kubeadm version:1.9.0-00 amd64
kubelet version:1.9.0-00 amd64
kubernetes-cni:0.6.0-00 amd64
docker-ce version:17.12.0ce-0ubuntu amd64
system version:Ubuntu 16.04.3 LTS
Physical machine

Problems

Installing a Kubernetes cluster on Ubuntu 16.04. When running kubeadm init, there is an error:
[init] This might take a minute or longer if the control plane images have to be pulled.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp [::1]:10255: getsockopt: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp [::1]:10255: getsockopt: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp [::1]:10255: getsockopt: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz/syncloop' failed with error: Get http://localhost:10255/healthz/syncloop: dial tcp [::1]:10255: getsockopt: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz/syncloop' failed with error: Get http://localhost:10255/healthz/syncloop: dial tcp [::1]:10255: getsockopt: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz/syncloop' failed with error: Get http://localhost:10255/healthz/syncloop: dial tcp [::1]:10255: getsockopt: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.

After checking the syslog (/var/log/syslog), I got the following errors:
Jan 04 16:20:58 master03 kubelet[10360]: W0104 16:20:58.268285 10360 cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
Jan 04 16:20:58 master03 kubelet[10360]: W0104 16:20:58.269487 10360 cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
Jan 04 16:20:58 master03 kubelet[10360]: I0104 16:20:58.269527 10360 docker_service.go:232] Docker cri networking managed by cni
Jan 04 16:20:58 master03 kubelet[10360]: I0104 16:20:58.274386 10360 docker_service.go:237] Docker Info: &{ID:3XXZ:XEDW:ZDQS:A2MI:5AEN:CFEP:44AQ:YDS4:CRME:UBRS:46LI:MXNS Containers:0 ContainersRunning:0 Cont
Jan 04 16:20:58 master03 kubelet[10360]: error: failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"

And I checked the docker cgroup driver: docker info | grep -i cgroup
Cgroup Driver: systemd


@dixudx
Member

dixudx commented Jan 8, 2018

error: failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"

docker info |grep -i cgroup
Cgroup Driver: systemd

I can confirm this.

@lavender2020 You need to manually append --cgroup-driver=systemd to the kubelet startup args, then reload the kubelet unit file and restart the service.

The default driver that the kubelet uses to manipulate cgroups on the host is cgroupfs.
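
For anyone following along, a minimal sketch of that change, assuming the stock kubeadm drop-in (whose ExecStart already expands $KUBELET_EXTRA_ARGS, as shown later in this thread):

# Sketch: point the kubelet at the systemd cgroup driver.
echo 'Environment="KUBELET_EXTRA_ARGS=--cgroup-driver=systemd"' \
    >> /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
# systemd only re-reads unit files on demand:
systemctl daemon-reload
systemctl restart kubelet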

@dixudx
Member

dixudx commented Jan 8, 2018

Most people turn to kubeadm mainly to set up a cluster very quickly. It is simple and practical.

@luxas Shall we add a preflight check on the consistency of the cgroup driver between docker and the kubelet, to give more explicit warnings? Or add another drop-in for kubelet.service? Or just modify /etc/systemd/system/kubelet.service.d/10-kubeadm.conf in place?

But if so, we may need root privileges to make these changes.
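
As a rough sketch of what such a preflight check could do (the kubelet flag extraction here is illustrative, not actual kubeadm code):

#!/bin/sh
# Sketch of a cgroup-driver consistency check between docker and kubelet.
docker_driver=$(docker info 2>/dev/null | awk -F': ' '/Cgroup Driver/ {print $2}')
# Illustrative: read the driver from the running kubelet's args,
# falling back to the kubelet default, cgroupfs.
kubelet_driver=$(ps -eo args | grep '[k]ubelet' | grep -o 'cgroup-driver=[a-z]*' | cut -d= -f2)
kubelet_driver=${kubelet_driver:-cgroupfs}
if [ "$docker_driver" != "$kubelet_driver" ]; then
    echo "WARNING: docker cgroup driver ($docker_driver) differs from kubelet's ($kubelet_driver)"
fi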

@dkirrane

dkirrane commented Feb 6, 2018

I hit this same issue with kubeadm v1.9.2 but I can see kubelet is configured to use systemd cgroup driver.

kubelet is using --cgroup-driver=systemd

cat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_SYSTEM_PODS_ARGS=--pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true"
Environment="KUBELET_NETWORK_ARGS=--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin"
Environment="KUBELET_DNS_ARGS=--cluster-dns=10.96.0.10 --cluster-domain=cluster.local"
Environment="KUBELET_AUTHZ_ARGS=--authorization-mode=Webhook --client-ca-file=/etc/kubernetes/pki/ca.crt"
Environment="KUBELET_CADVISOR_ARGS=--cadvisor-port=0"
Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=systemd"
Environment="KUBELET_CERTIFICATE_ARGS=--rotate-certificates=true --cert-dir=/var/lib/kubelet/pki"
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_NETWORK_ARGS $KUBELET_DNS_ARGS $KUBELET_AUTHZ_ARGS $KUBELET_CADVISOR_ARGS $KUBELET_CGROUP_ARGS $KUBELET_CERTIFICATE_ARGS $KUBELET_EXTRA_ARGS

docker info | grep -i cgroup

 WARNING: Usage of loopback devices is strongly discouraged for production use. Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.
Cgroup Driver: systemd

kubelet logs

I0206 16:20:40.010949    5712 feature_gate.go:220] feature gates: &{{} map[]}
I0206 16:20:40.011054    5712 controller.go:114] kubelet config controller: starting controller
I0206 16:20:40.011061    5712 controller.go:118] kubelet config controller: validating combination of defaults and flags
W0206 16:20:40.015566    5712 cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
I0206 16:20:40.019079    5712 server.go:182] Version: v1.9.2
I0206 16:20:40.019136    5712 feature_gate.go:220] feature gates: &{{} map[]}
I0206 16:20:40.019240    5712 plugins.go:101] No cloud provider specified.
W0206 16:20:40.019273    5712 server.go:328] standalone mode, no API client
W0206 16:20:40.041031    5712 server.go:236] No api server defined - no events will be sent to API server.
I0206 16:20:40.041058    5712 server.go:428] --cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to /
I0206 16:20:40.041295    5712 container_manager_linux.go:242] container manager verified user specified cgroup-root exists: /
I0206 16:20:40.041308    5712 container_manager_linux.go:247] Creating Container Manager object based on Node Config: {RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:docker CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:cgroupfs KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:memory.available Operator:LessThan Value:{Quantity:100Mi Percentage:0} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.1} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.inodesFree Operator:LessThan Value:{Quantity:<nil> Percentage:0.05} GracePeriod:0s MinReclaim:<nil>} {Signal:imagefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.15} GracePeriod:0s MinReclaim:<nil>}]} ExperimentalQOSReserved:map[] ExperimentalCPUManagerPolicy:none ExperimentalCPUManagerReconcilePeriod:10s}
I0206 16:20:40.041412    5712 container_manager_linux.go:266] Creating device plugin manager: false
W0206 16:20:40.043521    5712 kubelet_network.go:139] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
I0206 16:20:40.043541    5712 kubelet.go:571] Hairpin mode set to "hairpin-veth"
I0206 16:20:40.044909    5712 client.go:80] Connecting to docker on unix:///var/run/docker.sock
I0206 16:20:40.044937    5712 client.go:109] Start docker client with request timeout=2m0s
W0206 16:20:40.046785    5712 cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
I0206 16:20:40.049953    5712 docker_service.go:232] Docker cri networking managed by kubernetes.io/no-op
I0206 16:20:40.055138    5712 docker_service.go:237] Docker Info: &{ID:ZXWO:G2FL:QM3S:IAWM:ITQL:XHRH:ZA3T:FJMV:5JDW:IMKI:NIFS:2Z4M Containers:8 ContainersRunning:0 ContainersPaused:0 ContainersStopped:8 Images:11 Driver:devicemapper DriverStatus:[[Pool Name docker-253:0-33593794-pool] [Pool Blocksize 65.54 kB] [Base Device Size 10.74 GB] [Backing Filesystem xfs] [Data file /dev/loop0] [Metadata file /dev/loop1] [Data Space Used 1.775 GB] [Data Space Total 107.4 GB] [Data Space Available 14.72 GB] [Metadata Space Used 2.093 MB] [Metadata Space Total 2.147 GB] [Metadata Space Available 2.145 GB] [Thin Pool Minimum Free Space 10.74 GB] [Udev Sync Supported true] [Deferred Removal Enabled true] [Deferred Deletion Enabled true] [Deferred Deleted Device Count 0] [Data loop file /var/lib/docker/devicemapper/devicemapper/data] [Metadata loop file /var/lib/docker/devicemapper/devicemapper/metadata] [Library Version 1.02.140-RHEL7 (2017-05-03)]] SystemStatus:[] Plugins:{Volume:[local] Network:[overlay host null bridge] Authorization:[] Log:[]} MemoryLimit:true SwapLimit:true KernelMemory:true CPUCfsPeriod:true CPUCfsQuota:true CPUShares:true CPUSet:true IPv4Forwarding:true BridgeNfIptables:true BridgeNfIP6tables:true Debug:true NFd:16 OomKillDisable:true NGoroutines:25 SystemTime:2018-02-06T16:20:40.054685386Z LoggingDriver:journald CgroupDriver:systemd NEventsListener:0 KernelVersion:3.10.0-693.el7.x86_64 OperatingSystem:CentOS Linux 7 (Core) OSType:linux Architecture:x86_64 IndexServerAddress:https://index.docker.io/v1/ RegistryConfig:0xc42021a380 NCPU:2 MemTotal:2097782784 GenericResources:[] DockerRootDir:/var/lib/docker HTTPProxy: HTTPSProxy: NoProxy: Name:master1 Labels:[] ExperimentalBuild:false ServerVersion:1.12.6 ClusterStore: ClusterAdvertise: Runtimes:map[docker-runc:{Path:/usr/libexec/docker/docker-runc-current Args:[]} runc:{Path:docker-runc Args:[]}] DefaultRuntime:docker-runc Swarm:{NodeID: NodeAddr: LocalNodeState:inactive ControlAvailable:false Error: RemoteManagers:[] Nodes:0 Managers:0 Cluster:0xc420472640} LiveRestoreEnabled:false Isolation: InitBinary: ContainerdCommit:{ID: Expected:} RuncCommit:{ID: Expected:} InitCommit:{ID: Expected:} SecurityOptions:[seccomp]}
error: failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"

Version Info:

 kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.2", GitCommit:"5fa2db2bd46ac79e5e00a4e6ed24191080aa463b", GitTreeState:"clean", BuildDate:"2018-01-18T09:42:01Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
kubelet --version
Kubernetes v1.9.2
docker version
Client:
 Version:         1.12.6
 API version:     1.24
 Package version: docker-1.12.6-71.git3e8e77d.el7.centos.1.x86_64
 Go version:      go1.8.3
 Git commit:      3e8e77d/1.12.6
 Built:           Tue Jan 30 09:17:00 2018
 OS/Arch:         linux/amd64

Server:
 Version:         1.12.6
 API version:     1.24
 Package version: docker-1.12.6-71.git3e8e77d.el7.centos.1.x86_64
 Go version:      go1.8.3
 Git commit:      3e8e77d/1.12.6
 Built:           Tue Jan 30 09:17:00 2018
 OS/Arch:         linux/amd64

@dixudx
Member

dixudx commented Feb 8, 2018

@dkirrane Have you reloaded the kubelet.service unit file?

Run systemctl daemon-reload and then systemctl restart kubelet.
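
In full, with a quick way to verify the running kubelet actually picked up the flag (the bracketed grep just avoids matching itself):

systemctl daemon-reload
systemctl restart kubelet
# The kubelet command line should now show --cgroup-driver=systemd:
ps aux | grep '[c]group-driver'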

@gades

gades commented Feb 21, 2018

This issue is still not fixed in 1.9.3.

Version Info:

kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.3", GitCommit:"d2835416544f298c919e2ead3be3d0864b52323b", GitTreeState:"clean", BuildDate:"2018-02-07T11:55:20Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
kubelet --version
Kubernetes v1.9.3
 docker version
Client:
 Version:      1.13.1
 API version:  1.26
 Go version:   go1.6.2
 Git commit:   092cba3
 Built:        Thu Nov  2 20:40:23 2017
 OS/Arch:      linux/amd64

Server:
 Version:      1.13.1
 API version:  1.26 (minimum version 1.12)
 Go version:   go1.6.2
 Git commit:   092cba3
 Built:        Thu Nov  2 20:40:23 2017
 OS/Arch:      linux/amd64
 Experimental: false

@dixudx
Member

dixudx commented Feb 22, 2018

@gades What's your cgroup driver?

$ docker info | grep -i cgroup

@mas-dse-greina

Having the same problem.

docker info | grep -i cgroup
Cgroup Driver: systemd
cat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_SYSTEM_PODS_ARGS=--pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true"
Environment="KUBELET_NETWORK_ARGS=--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin"
Environment="KUBELET_DNS_ARGS=--cluster-dns=10.96.0.10 --cluster-domain=cluster.local"
Environment="KUBELET_AUTHZ_ARGS=--authorization-mode=Webhook --client-ca-file=/etc/kubernetes/pki/ca.crt"
Environment="KUBELET_CADVISOR_ARGS=--cadvisor-port=0"
Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=systemd"
Environment="KUBELET_CERTIFICATE_ARGS=--rotate-certificates=true --cert-dir=/var/lib/kubelet/pki"
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_NETWORK_ARGS $KUBELET_DNS_ARGS $KUBELET_AUTHZ_ARGS $KUBELET_CADVISOR_ARGS $KUBELET_CGROUP_ARGS $KUBELET_CERTIFICATE_ARGS $KUBELET_EXTRA_ARGS
I0227 13:17:43.802942    3493 docker_service.go:237] Docker Info: &{ID:RJUG:6DLB:A4JM:4T6H:JYKO:7JUC:NQCI:SLI2:DC64:ZXOT:DIX6:ASJY Containers:0 ContainersRunning:0 ContainersPaused:0 ContainersStopped:0 Images:0 Driver:overlay DriverStatus:[[Backing Filesystem extfs]] SystemStatus:[] Plugins:{Volume:[local] Network:[bridge overlay null host] Authorization:[] Log:[]} MemoryLimit:true SwapLimit:true KernelMemory:true CPUCfsPeriod:true CPUCfsQuota:true CPUShares:true CPUSet:true IPv4Forwarding:true BridgeNfIptables:true BridgeNfIP6tables:true Debug:false NFd:26 OomKillDisable:true NGoroutines:47 SystemTime:2018-02-27T13:17:43.802488651-08:00 LoggingDriver:journald CgroupDriver:systemd NEventsListener:0 KernelVersion:3.10.0-693.11.6.el7.x86_64 OperatingSystem:CentOS Linux 7 (Core) OSType:linux Architecture:x86_64 IndexServerAddress:https://index.docker.io/v1/ RegistryConfig:0xc42033d7a0 NCPU:64 MemTotal:270186274816 GenericResources:[] DockerRootDir:/var/lib/docker HTTPProxy: HTTPSProxy: NoProxy: Name:param03.lancelot.cluster.bds Labels:[] ExperimentalBuild:false ServerVersion:1.12.6 ClusterStore: ClusterAdvertise: Runtimes:map[docker-runc:{Path:/usr/libexec/docker/docker-runc-current Args:[]} runc:{Path:docker-runc Args:[]}] DefaultRuntime:docker-runc Swarm:{NodeID: NodeAddr: LocalNodeState:inactive ControlAvailable:false Error: RemoteManagers:[] Nodes:0 Managers:0 Cluster:0xc420360640} LiveRestoreEnabled:false Isolation: InitBinary: ContainerdCommit:{ID: Expected:} RuncCommit:{ID: Expected:} InitCommit:{ID: Expected:} SecurityOptions:[seccomp]}
error: failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"

Is there somewhere else that Kubelet is getting the cgroupfs driver directive?

@dixudx
Member

dixudx commented Feb 28, 2018

@mas-dse-greina Please refer to the solution in my comment.

@sri-oc

sri-oc commented Mar 1, 2018

@dixudx Even after appending the --cgroup-driver=systemd to /etc/systemd/system/kubelet.service.d/10-kubeadm.conf the problem still persists.

This is the latest file:
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_SYSTEM_PODS_ARGS=--pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true"
Environment="KUBELET_NETWORK_ARGS=--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin"
Environment="KUBELET_DNS_ARGS=--cluster-dns=10.96.0.10 --cluster-domain=cluster.local"
Environment="KUBELET_AUTHZ_ARGS=--authorization-mode=Webhook --client-ca-file=/etc/kubernetes/pki/ca.crt"
Environment="KUBELET_CADVISOR_ARGS=--cadvisor-port=0"
Environment="KUBELET_CERTIFICATE_ARGS=--rotate-certificates=true --cert-dir=/var/lib/kubelet/pki"
ExecStart=
ExecStart=/usr/bin/kubelet --cgroup-driver=systemd $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_NETWORK_ARGS $KUBELET_DNS_ARGS $KUBELET_AUTHZ_ARGS $KUBELET_CADVISOR_ARGS $KUBELET_CERTIFICATE_ARGS $KUBELET_EXTRA_ARGS

PS: It got fixed. After reloading the daemon and restarting the kubelet, I used kubeadm init --pod-network-cidr=10.244.0.0/16.

@mas-dse-greina

mas-dse-greina commented Mar 1, 2018 via email

@timothysc
Member

After you change the unit file you need to run systemctl daemon-reload in order for the change to take effect.

FWIW this is defaulted in the RPMs but not in the .debs. Is there any current distribution in mainstream support that doesn't default to systemd now?

/assign @detiber

@timothysc timothysc added kind/bug Categorizes issue or PR as related to a bug. triaged labels Mar 7, 2018
@FrostyLeaf

I hit this same issue with kubeadm v1.9.3 and v1.9.4.

Start kubelet with --cgroup-driver=systemd

$ cat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_SYSTEM_PODS_ARGS=--pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true"
Environment="KUBELET_NETWORK_ARGS=--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin"
Environment="KUBELET_DNS_ARGS=--cluster-dns=10.96.0.10 --cluster-domain=cluster.local"
Environment="KUBELET_AUTHZ_ARGS=--authorization-mode=Webhook --client-ca-file=/etc/kubernetes/pki/ca.crt"
Environment="KUBELET_CADVISOR_ARGS=--cadvisor-port=0"
Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=systemd"
Environment="KUBELET_CERTIFICATE_ARGS=--rotate-certificates=true --cert-dir=/var/lib/kubelet/pki"
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_NETWORK_ARGS $KUBELET_DNS_ARGS $KUBELET_AUTHZ_ARGS $KUBELET_CADVISOR_ARGS $KUBELET_CGROUP_ARGS $KUBELET_CERTIFICATE_ARGS $KUBELET_EXTRA_ARGS

Reload service

$ systemctl daemon-reload
$ systemctl restart kubelet

Check docker info

$ docker info |grep -i cgroup
 WARNING: Usage of loopback devices is strongly discouraged for production use. Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.
Cgroup Driver: systemd

kubelet logs

$ kubelet logs
error: failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"

Version Info

$ kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.3", GitCommit:"d2835416544f298c919e2ead3be3d0864b52323b", GitTreeState:"clean", BuildDate:"2018-02-07T11:55:20Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
$ kubelet --version
Kubernetes v1.9.3
$ docker version
Client:
 Version:         1.12.6
 API version:     1.24
 Package version: docker-1.12.6-71.git3e8e77d.el7.centos.1.x86_64
 Go version:      go1.8.3
 Git commit:      3e8e77d/1.12.6
 Built:           Tue Jan 30 09:17:00 2018
 OS/Arch:         linux/amd64

Server:
 Version:         1.12.6
 API version:     1.24
 Package version: docker-1.12.6-71.git3e8e77d.el7.centos.1.x86_64
 Go version:      go1.8.3
 Git commit:      3e8e77d/1.12.6
 Built:           Tue Jan 30 09:17:00 2018
 OS/Arch:         linux/amd64
$ cat /etc/redhat-release 
CentOS Linux release 7.2.1511 (Core) 

@bart0sh

bart0sh commented Mar 20, 2018

@FrostyLeaf Can you look at the command line of the running kubelet to see if the cgroup driver is specified there?

Something like ps aux | grep kubelet or cat /proc/<kubelet pid>/cmdline should help you see that.

@FrostyLeaf

FrostyLeaf commented Mar 20, 2018

@bart0sh This is it:

$  ps aux |grep /bin/kubelet
root     13025  0.0  0.0 112672   980 pts/4    S+   01:49   0:00 grep --color=auto /bin/kubelet
root     30495  4.5  0.6 546152 76924 ?        Ssl  00:14   4:22 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true --network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin --cluster-dns=10.96.0.10 --cluster-domain=cluster.local --authorization-mode=Webhook --client-ca-file=/etc/kubernetes/pki/ca.crt --cadvisor-port=0 --cgroup-driver=systemd --rotate-certificates=true --cert-dir=/var/lib/kubelet/pki --fail-swap-on=false

@bart0sh

bart0sh commented Mar 20, 2018

@FrostyLeaf Thank you! I could reproduce this as well. Seems to be a bug. Looking into it.

As a temporary workaround you can switch both docker and the kubelet to the cgroupfs driver (sketched below). It should work.
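
For completeness, a sketch of that workaround, assuming docker reads /etc/docker/daemon.json for exec-opts (as in the daemon.json example later in this thread) and the kubelet drop-in currently pins systemd:

# Switch docker to the cgroupfs driver:
cat <<'EOF' >/etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=cgroupfs"]
}
EOF
systemctl restart docker

# Switch the kubelet to match:
sed -i 's/--cgroup-driver=systemd/--cgroup-driver=cgroupfs/' \
    /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
systemctl daemon-reload
systemctl restart kubelet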

@FrostyLeaf

@bart0sh Fine. Thanks a lot. I'll try that.

@fredmj

fredmj commented Mar 20, 2018

Same here.

Context: Host = CentOS 7.4, Guest = VirtualBox 5.2.8 r121009 (Qt5.6.1)

[root@kubernetes ~]# cat /etc/redhat-release 
CentOS Linux release 7.4.1708 (Core) 
[root@kubernetes ~]# kubelet --version
Kubernetes v1.9.4
[root@kubernetes ~]# docker version
Client:
 Version:         1.13.1
 API version:     1.26
 Package version: <unknown>
 Go version:      go1.8.3
 Git commit:      774336d/1.13.1
 Built:           Wed Mar  7 17:06:16 2018
 OS/Arch:         linux/amd64

Server:
 Version:         1.13.1
 API version:     1.26 (minimum version 1.12)
 Package version: <unknown>
 Go version:      go1.8.3
 Git commit:      774336d/1.13.1
 Built:           Wed Mar  7 17:06:16 2018
 OS/Arch:         linux/amd64
 Experimental:    false
[root@kubernetes ~]# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.4", GitCommit:"bee2d1505c4fe820744d26d41ecd3fdd4a3d6546", GitTreeState:"clean", BuildDate:"2018-03-12T16:21:35Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

docker cgroup driver is systemd

 [root@kubernetes ~]# docker info | grep Cgroup
  WARNING: You're not using the default seccomp profile
Cgroup Driver: systemd

kubelet.service started with --cgroup-driver=systemd

[root@kubernetes ~]# grep cgroup /etc/systemd/system/kubelet.service.d/10-kubeadm.conf 
Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=systemd --runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice"

systemctl daemon-reload & restart kubelet service

[root@kubernetes ~]# systemctl daemon-reload
[root@kubernetes ~]# systemctl stop kubelet.service
[root@kubernetes ~]# systemctl start kubelet.service

kubelet logs

[root@kubernetes ~]# kubelet logs
I0318 02:07:10.006151   29652 feature_gate.go:226] feature gates: &{{} map[]}
I0318 02:07:10.006310   29652 controller.go:114] kubelet config controller: starting controller
I0318 02:07:10.006315   29652 controller.go:118] kubelet config controller: validating combination of defaults and flags
I0318 02:07:10.018880   29652 server.go:182] Version: v1.9.4
I0318 02:07:10.018986   29652 feature_gate.go:226] feature gates: &{{} map[]}
I0318 02:07:10.019118   29652 plugins.go:101] No cloud provider specified.
W0318 02:07:10.019239   29652 server.go:328] standalone mode, no API client
W0318 02:07:10.068650   29652 server.go:236] No api server defined - no events will be sent to API server.
I0318 02:07:10.068670   29652 server.go:428] --cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to /
I0318 02:07:10.069130   29652 container_manager_linux.go:242] container manager verified user specified cgroup-root exists: /
I0318 02:07:10.069306   29652 container_manager_linux.go:247] Creating Container Manager object based on Node Config: {RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:docker CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:cgroupfs KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:nodefs.inodesFree Operator:LessThan Value:{Quantity:<nil> Percentage:0.05} GracePeriod:0s MinReclaim:<nil>} {Signal:imagefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.15} GracePeriod:0s MinReclaim:<nil>} {Signal:memory.available Operator:LessThan Value:{Quantity:100Mi Percentage:0} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.1} GracePeriod:0s MinReclaim:<nil>}]} ExperimentalQOSReserved:map[] ExperimentalCPUManagerPolicy:none ExperimentalCPUManagerReconcilePeriod:10s}
I0318 02:07:10.069404   29652 container_manager_linux.go:266] Creating device plugin manager: false
W0318 02:07:10.072836   29652 kubelet_network.go:139] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
I0318 02:07:10.072860   29652 kubelet.go:576] Hairpin mode set to "hairpin-veth"
I0318 02:07:10.075139   29652 client.go:80] Connecting to docker on unix:///var/run/docker.sock
I0318 02:07:10.075156   29652 client.go:109] Start docker client with request timeout=2m0s
I0318 02:07:10.080336   29652 docker_service.go:232] Docker cri networking managed by kubernetes.io/no-op
I0318 02:07:10.090943   29652 docker_service.go:237] Docker Info: &{ID:DUEI:P7Y3:JKGP:XJDI:UFXG:NAOX:K7ID:KHCF:PCGW:46QA:TQZB:WEXF Containers:18 ContainersRunning:17 ContainersPaused:0 ContainersStopped:1 Images:11 Driver:overlay2 DriverStatus:[[Backing Filesystem xfs] [Supports d_type true] [Native Overlay Diff true]] SystemStatus:[] Plugins:{Volume:[local] Network:[bridge host macvlan null overlay] Authorization:[] Log:[]} MemoryLimit:true SwapLimit:true KernelMemory:true CPUCfsPeriod:true CPUCfsQuota:true CPUShares:true CPUSet:true IPv4Forwarding:true BridgeNfIptables:true BridgeNfIP6tables:true Debug:false NFd:89 OomKillDisable:true NGoroutines:98 SystemTime:2018-03-18T02:07:10.083543475+01:00 LoggingDriver:journald CgroupDriver:systemd NEventsListener:0 KernelVersion:3.10.0-693.21.1.el7.x86_64 OperatingSystem:CentOS Linux 7 (Core) OSType:linux Architecture:x86_64 IndexServerAddress:https://index.docker.io/v1/ RegistryConfig:0xc42027b810 NCPU:2 MemTotal:2097364992 GenericResources:[] DockerRootDir:/var/lib/docker HTTPProxy: HTTPSProxy: NoProxy: Name:kubernetes.master Labels:[] ExperimentalBuild:false ServerVersion:1.13.1 ClusterStore: ClusterAdvertise: Runtimes:map[runc:{Path:docker-runc Args:[]} docker-runc:{Path:/usr/libexec/docker/docker-runc-current Args:[]}] DefaultRuntime:docker-runc Swarm:{NodeID: NodeAddr: LocalNodeState:inactive ControlAvailable:false Error: RemoteManagers:[] Nodes:0 Managers:0 Cluster:0xc4202a8f00} LiveRestoreEnabled:false Isolation: InitBinary:docker-init ContainerdCommit:{ID: Expected:aa8187dbd3b7ad67d8e5e3a15115d3eef43a7ed1} RuncCommit:{ID:N/A Expected:9df8b306d01f59d3a8029be411de015b7304dd8f} InitCommit:{ID:N/A Expected:949e6facb77383876aeff8a6944dde66b3089574} SecurityOptions:[name=seccomp,profile=/etc/docker/seccomp.json name=selinux]}
error: failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"

kube processes running

[root@kubernetes ~]# ps aux | grep -i kube
root     10182  0.4  1.2  54512 25544 ?        Ssl  mars17   1:10 kube-scheduler --leader-elect=true --kubeconfig=/etc/kubernetes/scheduler.conf --address=127.0.0.1
root     10235  1.8 12.7 438004 261948 ?       Ssl  mars17   4:44 kube-apiserver --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt --admission-control=Initializers,NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,NodeRestriction,ResourceQuota --allow-privileged=true --requestheader-group-headers=X-Remote-Group --requestheader-extra-headers-prefix=X-Remote-Extra- --requestheader-allowed-names=front-proxy-client --service-account-key-file=/etc/kubernetes/pki/sa.pub --client-ca-file=/etc/kubernetes/pki/ca.crt --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key --requestheader-username-headers=X-Remote-User --tls-private-key-file=/etc/kubernetes/pki/apiserver.key --insecure-port=0 --enable-bootstrap-token-auth=true --tls-cert-file=/etc/kubernetes/pki/apiserver.crt --secure-port=6443 --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname --advertise-address=192.168.1.70 --service-cluster-ip-range=10.96.0.0/12 --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt --authorization-mode=Node,RBAC --etcd-servers=http://127.0.0.1:2379
root     10421  0.1  1.0  52464 22052 ?        Ssl  mars17   0:20 /usr/local/bin/kube-proxy --config=/var/lib/kube-proxy/config.conf
root     12199  1.7  8.5 326552 174108 ?       Ssl  mars17   4:11 kube-controller-manager --address=127.0.0.1 --leader-elect=true --controllers=*,bootstrapsigner,tokencleaner --cluster-signing-key-file=/etc/kubernetes/pki/ca.key --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt --use-service-account-credentials=true --kubeconfig=/etc/kubernetes/controller-manager.conf --root-ca-file=/etc/kubernetes/pki/ca.crt --service-account-private-key-file=/etc/kubernetes/pki/sa.key
root     22928  0.0  1.0 279884 20752 ?        Sl   01:10   0:00 /home/weave/weaver --port=6783 --datapath=datapath --name=fe:9b:da:25:e2:b2 --host-root=/host --http-addr=127.0.0.1:6784 --status-addr=0.0.0.0:6782 --docker-api= --no-dns --db-prefix=/weavedb/weave-net --ipalloc-range=10.32.0.0/12 --nickname=kubernetes.master --ipalloc-init consensus=1 --conn-limit=30 --expect-npc 192.168.1.70
root     23308  0.0  0.7  38936 15340 ?        Ssl  01:10   0:01 /kube-dns --domain=cluster.local. --dns-port=10053 --config-dir=/kube-dns-config --v=2
65534    23443  0.0  0.8  37120 18028 ?        Ssl  01:10   0:03 /sidecar --v=2 --logtostderr --probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.cluster.local,5,SRV --probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.cluster.local,5,SRV
root     29547  1.6  2.9 819012 61196 ?        Ssl  02:07   0:22 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true --network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin --cluster-dns=10.96.0.10 --cluster-domain=cluster.local --authorization-mode=Webhook --client-ca-file=/etc/kubernetes/pki/ca.crt --cadvisor-port=0 --cgroup-driver=systemd --runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice --rotate-certificates=true --cert-dir=/var/lib/kubelet/pki

@FrostyLeaf

v1.9.5 fixed this issue, awesome! @bart0sh

@bart0sh

bart0sh commented Mar 21, 2018

@FrostyLeaf I'm still able to reproduce it with 1.9.5:

$ rpm -qa |grep kube
kubeadm-1.9.5-0.x86_64
kubelet-1.9.5-0.x86_64
kubernetes-cni-0.6.0-0.x86_64
kubectl-1.9.5-0.x86_64

$ docker info 2>/dev/null |grep -i cgroup
Cgroup Driver: systemd

$ ps aux |grep cgroup-driver
root 29078 1.9 0.1 1222632 91824 ? Ssl 13:45 0:04 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true --network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin --cluster-dns=10.96.0.10 --cluster-domain=cluster.local --authorization-mode=Webhook --client-ca-file=/etc/kubernetes/pki/ca.crt --cadvisor-port=0 --cgroup-driver=systemd --rotate-certificates=true --cert-dir=/var/lib/kubelet/pki

I0321 13:50:29.901008 30817 container_manager_linux.go:247] Creating Container Manager object based on Node Config: {RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:docker CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:cgroupfs KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:memory.available Operator:LessThan Value:{Quantity:100Mi Percentage:0} GracePeriod:0s MinReclaim:} {Signal:nodefs.available Operator:LessThan Value:{Quantity: Percentage:0.1} GracePeriod:0s MinReclaim:} {Signal:nodefs.inodesFree Operator:LessThan Value:{Quantity: Percentage:0.05} GracePeriod:0s MinReclaim:} {Signal:imagefs.available Operator:LessThan Value:{Quantity: Percentage:0.15} GracePeriod:0s MinReclaim:}]} ExperimentalQOSReserved:map[] ExperimentalCPUManagerPolicy:none ExperimentalCPUManagerReconcilePeriod:10s}
error: failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"

Are you still using the systemd cgroup driver?

@bart0sh

bart0sh commented Mar 21, 2018

I propose to close this issue.

I've observed 2 reasons that cause most of the reports here:

  1. Forgetting to run 'systemctl daemon-reload' after editing systemd drop-ins. Even though --cgroup-driver=systemd was added to /etc/systemd/system/kubelet.service.d/10-kubeadm.conf, it didn't have any effect, and the default driver (or the one previously specified with --cgroup-driver) was used.

  2. Running 'kubelet logs' to see the kubelet logs. A 'logs' subcommand doesn't exist in kubelet, so 'kubelet logs' and 'kubelet' are the same command. 'kubelet logs' runs kubelet with the default cgroup driver 'cgroupfs', and kubelet complains about the inconsistency between the kubelet and docker drivers. 'journalctl -u kubelet' should be used to see the logs (see the snippet at the end of this comment).

I tested the --cgroup-driver=systemd option with kubelet 1.8.0, 1.9.0, 1.9.3 and 1.9.5. There were no "cgroupfs is different from docker cgroup driver: systemd" error messages in the logs.
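
To be explicit about reading the logs on a systemd host:

# Inspect kubelet logs via journald, not via a bogus 'kubelet logs':
journalctl -u kubelet.service        # whole log for the unit
journalctl -u kubelet.service -f     # follow new entries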

@bart0sh

bart0sh commented Mar 22, 2018

@timothysc There are no objections to my last comment. Can you close this issue, please? It's not a bug; it's caused by a lack of knowledge about the kubelet and/or systemd.

2 things that might make sense to do, from my point of view:

  • implement a preflight check for "kubeadm init" to verify that the docker and kubelet cgroup drivers match;
  • make the kubelet fail if it finds an unknown parameter on the command line, e.g. "kubelet logs" should fail with an error message "unrecognised parameter: logs".

We may want to create separate issues for those.

Anyway, this issue can be closed.

@fredmj

fredmj commented Mar 22, 2018

Things look fine for me thanks to v1.9.5.

Agree with @bart0sh about the init checking the cgroup driver consistency between the kubelet and docker.
Maybe `kubelet logs` should point to journalctl -u kubelet.service.

Just my 2cts.

@moqichenle

moqichenle commented Mar 26, 2018

Hi, I'm having the same issue.
CentOS 7
kubeadm version: 1.9.6
docker version: 1.13.1 (API version 1.26)
When I ran docker info | grep -i cgroup,
I got this:
WARNING: You're not using the default seccomp profile
Cgroup Driver: systemd

When I run cat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf,
I can see the setting Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=systemd" in place.

I did run systemctl daemon-reload and systemctl restart kubelet, but it still showed:

misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"

Another weird thing: when I ran sed -i "s/cgroup-driver=systemd/cgroup-driver=cgroupfs/g" /etc/systemd/system/kubelet.service.d/10-kubeadm.conf,
I saw that --cgroup-driver was changed to cgroupfs.
But then the exact same error message was shown when I ran kubelet status again.

misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"

I cannot figure out the problem.
I will try the versions mentioned above. Does anyone know how to install an older version of Kubernetes? Thank you.

@bart0sh

bart0sh commented Mar 26, 2018

@moqichenle That's strange. It should work. Can you show the output of the following commands?

systemctl daemon-reload
systemctl restart kubelet
docker info 2>/dev/null |grep -i group
ps aux |grep group-driver
journalctl -u kubelet.service | grep "is different from docker cgroup driver"

Here is what I see on my system:

# systemctl daemon-reload
# systemctl restart kubelet
# docker info 2>/dev/null |grep -i group
Cgroup Driver: systemd
# ps aux |grep group-driver
root     25062  5.7  0.1 983760 78888 ?        Ssl  15:26   0:00 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true --network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin --cluster-dns=NN.NN.NN.NN --cluster-domain=cluster.local --authorization-mode=Webhook --client-ca-file=/etc/kubernetes/pki/ca.crt --cadvisor-port=0 --cgroup-driver=systemd --rotate-certificates=true --cert-dir=/var/lib/kubelet/pki
root     25520  0.0  0.0   9288  1560 pts/0    R+   15:26   0:00 grep --color=auto group-driver
# journalctl -u kubelet.service | grep "is different from docker cgroup driver"
#

@moqichenle

moqichenle commented Mar 26, 2018

@bart0sh Hi, thank you for the help.
This is what I have (before starting kubeadm init):
[root@localhost bin]# docker info 2>/dev/null |grep -i group
Cgroup Driver: systemd
[root@localhost bin]# ps aux |grep group-driver
root     13472  0.0  0.1  12476   984 pts/0    R+   13:23   0:00 grep --color=auto group-driver

After typing the command kubeadm init,
this is what I have:
[vagrant@localhost ~]$ ps aux |grep group-driver
root     13606  5.1  4.5 605240 22992 ?        Ssl  13:25   0:03 /usr/bin/kubelet --kubeconfig=/etc/kubernetes/kubelet.conf --require-kubeconfig=true --pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true --network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin --cluster-dns=10.96.0.10 --cluster-domain=cluster.local --cgroup-driver=systemd --hostname-override=default
vagrant  13924  0.0  0.1  12476   984 pts/1    R+   13:26   0:00 grep --color=auto group-driver

But then kubeadm init fails because the kubelet is either not healthy or not running.

@bart0sh

bart0sh commented Mar 26, 2018

@moqichenle did you run systemctl daemon-reload and systemctl restart kubelet before running kubeadm init?

Can you run journalctl -u kubelet.service after kubeadm init and show its output here?

@moqichenle

Yes, I did run the two commands before the init.
Strange thing: I didn't see any output when I ran journalctl -u kubelet.service | grep "is different from docker cgroup driver".
I only saw the error when I ran kubelet status.

@bart0sh

bart0sh commented Mar 26, 2018

@moqichenle The kubelet status command doesn't exist. That means you ran kubelet with the default parameters (and the default cgroup driver). That's why you're getting the error. See my earlier messages regarding kubelet logs for more details.

Do you see anything suspicious (errors, warnings) in the output of journalctl -u kubelet.service?

@moqichenle

moqichenle commented Mar 26, 2018

Ah, I see. Thank you. :)
Hmm, there are errors, shown below:
Mar 26 13:39:40 localhost.localdomain kubelet[13606]: E0326 13:39:34.198202 13606 kuberuntime_image.go:140] ListImages failed: rpc error: code = DeadlineExceeded desc = context deadline exceeded
Mar 26 13:39:45 localhost.localdomain kubelet[13606]: E0326 13:39:44.824222 13606 kubelet.go:1259] Container garbage collection failed: rpc error: code = DeadlineExceeded desc = context deadline exceeded
Mar 26 13:39:47 localhost.localdomain kubelet[13606]: W0326 13:39:44.749819 13606 image_gc_manager.go:173] [imageGCManager] Failed to monitor images: rpc error: code = DeadlineExceeded desc = context deadline exceeded
Mar 26 13:39:49 localhost.localdomain kubelet[13606]: E0326 13:39:49.486990 13606 kubelet.go:1281] Image garbage collection failed once. Stats initialization may not have completed yet: failed to get image stats: rpc error: code = DeadlineExceeded desc = context deadline exceeded
Mar 26 13:42:03 localhost.localdomain kubelet[13606]: E0326 13:42:03.934312 13606 remote_runtime.go:169] ListPodSandbox with filter nil from runtime service failed: rpc error: code = DeadlineExceeded desc = context deadline exceeded
Mar 26 13:42:03 localhost.localdomain kubelet[13606]: E0326 13:42:03.934359 13606 kuberuntime_sandbox.go:192] ListPodSandbox failed: rpc error: code = DeadlineExceeded desc = context deadline exceeded
Mar 26 13:42:03 localhost.localdomain kubelet[13606]: E0326 13:42:03.934374 13606 generic.go:197] GenericPLEG: Unable to retrieve pods: rpc error: code = DeadlineExceeded desc = context deadline exceeded
Mar 26 13:42:03 localhost.localdomain kubelet[13606]: E0326 13:42:03.936761 13606 remote_image.go:67] ListImages with filter nil from image service failed: rpc error: code = DeadlineExceeded desc = context deadline exceeded
Mar 26 13:42:03 localhost.localdomain kubelet[13606]: E0326 13:42:03.936788 13606 kuberuntime_image.go:106] ListImages failed: rpc error: code = DeadlineExceeded desc = context deadline exceeded
Mar 26 13:42:03 localhost.localdomain kubelet[13606]: W0326 13:42:03.936795 13606 image_gc_manager.go:184] [imageGCManager] Failed to update image list: rpc error: code = DeadlineExceeded desc = context deadline exceeded
Mar 26 13:42:03 localhost.localdomain kubelet[13606]: E0326 13:42:03.937002 13606 remote_runtime.go:69] Version from runtime service failed: rpc error: code = DeadlineExceeded desc = context deadline exceeded
Mar 26 13:42:03 localhost.localdomain kubelet[13606]: E0326 13:42:03.937020 13606 kuberuntime_manager.go:245] Get remote runtime version failed: rpc error: code = DeadlineExceeded desc = context deadline exceeded

When I ran kubeadm init with the cgroup driver settings different,
it shows:
[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
[init] Waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests".
[init] This might take a minute or longer if the control plane images have to be pulled.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp [::1]:10255: getsockopt: connection refused.

When the cgroup driver settings are the same,
it just hangs at the step of pulling the control plane images and ends up with the kubelet unhealthy or not running.

@bart0sh

bart0sh commented Mar 26, 2018

@moqichenle It looks like a docker issue to me. I believe it's not related to this one.

You can search for "context deadline exceeded" for more info.

@moqichenle

@bart0sh Yep, I don't think it's related to this issue anymore. Will do. Thank you very much :D

@bart0sh

bart0sh commented Mar 28, 2018

This PR should help decrease the confusion caused by running 'kubelet logs', 'kubelet status' and other non-existing kubelet commands: #61833

It makes the kubelet produce an error and exit if it's run with an incorrect command line.

Please review.

@timothysc timothysc added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed triaged labels Apr 6, 2018
@timothysc timothysc added this to the v1.11 milestone Apr 6, 2018
@wshandao

Hi, I can reproduce this issue on 1.10. Just to check: is this a bug, and will it be fixed in v1.11?

@dixudx
Member

dixudx commented Apr 23, 2018

is this a bug, and will it be fixed in v1.11

IMO this is a configuration mismatch between docker and the kubelet, rather than a bug.

Before running kubeadm init, a prerequisite check on the cgroup driver should be done.
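
Until such a check exists, a manual sanity check before kubeadm init can be as simple as eyeballing both sides (plain shell, nothing kubeadm-specific):

docker info 2>/dev/null | grep -i 'cgroup driver'
grep -r 'cgroup-driver' /etc/systemd/system/kubelet.service.d/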

@wshandao

@dixudx I'm trying to install k8s following the installation guide at https://kubernetes.io/docs/setup/independent/install-kubeadm/, and the steps were blocked by this issue. Below are the details of my environment:

OS:

CentOS Linux release 7.4.1708 (Core)

Docker:

Server Version: 1.13.1
API version: 1.26 (minimum version 1.12)
Package version:
Go version: go1.8.3
Git commit: 774336d/1.13.1
Built: Wed Mar 7 17:06:16 2018
OS/Arch: linux/amd64
Experimental: false

K8S:
kubeadm.x86_64                     1.10.1-0
kubectl.x86_64 1.10.1-0
kubelet.x86_64 1.10.1-0
kubernetes-cni.x86_64 0.6.0-0

The cgroup driver between docker and kubelet

docker info | grep -i cgroup
WARNING: You're not using the default seccomp profile
Cgroup Driver: systemd


cat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf | grep -i cgroup
Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=systemd"

It's the same systemd cgroup driver on both sides, hence no need to adjust the kubelet cgroup manually. But when I start the kubelet, it fails with the error message mentioned above:

[root@K8S-Master /]# kubelet logs
I0424 10:41:29.240854   19245 feature_gate.go:226] feature gates: &{{} map[]}
W0424 10:41:29.247770   19245 cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
W0424 10:41:29.253069   19245 hostport_manager.go:68] The binary conntrack is not installed, this can cause failures in network connection cleanup.
I0424 10:41:29.253111   19245 server.go:376] Version: v1.10.1
I0424 10:41:29.253175   19245 feature_gate.go:226] feature gates: &{{} map[]}
I0424 10:41:29.253290   19245 plugins.go:89] No cloud provider specified.
W0424 10:41:29.253327   19245 server.go:517] standalone mode, no API client
W0424 10:41:29.283851   19245 server.go:433] No api server defined - no events will be sent to API server.
I0424 10:41:29.283867   19245 server.go:613] --cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to /
I0424 10:41:29.284091   19245 container_manager_linux.go:242] container manager verified user specified cgroup-root exists: /
I0424 10:41:29.284101   19245 container_manager_linux.go:247] Creating Container Manager object based on Node Config: {RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:docker CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:cgroupfs KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:memory.available Operator:LessThan Value:{Quantity:100Mi Percentage:0} GracePeriod:0s MinReclaim:} {Signal:nodefs.available Operator:LessThan Value:{Quantity: Percentage:0.1} GracePeriod:0s MinReclaim:} {Signal:nodefs.inodesFree Operator:LessThan Value:{Quantity: Percentage:0.05} GracePeriod:0s MinReclaim:} {Signal:imagefs.available Operator:LessThan Value:{Quantity: Percentage:0.15} GracePeriod:0s MinReclaim:}]} ExperimentalQOSReserved:map[] ExperimentalCPUManagerPolicy:none ExperimentalCPUManagerReconcilePeriod:10s ExperimentalPodPidsLimit:-1 EnforceCPULimits:true}
I0424 10:41:29.284195   19245 container_manager_linux.go:266] Creating device plugin manager: true
I0424 10:41:29.284242   19245 state_mem.go:36] [cpumanager] initializing new in-memory state store
I0424 10:41:29.284292   19245 state_mem.go:87] [cpumanager] updated default cpuset: ""
I0424 10:41:29.284326   19245 state_mem.go:95] [cpumanager] updated cpuset assignments: "map[]"
W0424 10:41:29.286890   19245 kubelet_network.go:139] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
I0424 10:41:29.286912   19245 kubelet.go:556] Hairpin mode set to "hairpin-veth"
I0424 10:41:29.288233   19245 client.go:75] Connecting to docker on unix:///var/run/docker.sock
I0424 10:41:29.288268   19245 client.go:104] Start docker client with request timeout=2m0s
W0424 10:41:29.289762   19245 cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
W0424 10:41:29.292669   19245 hostport_manager.go:68] The binary conntrack is not installed, this can cause failures in network connection cleanup.
I0424 10:41:29.293904   19245 docker_service.go:244] Docker cri networking managed by kubernetes.io/no-op
I0424 10:41:29.302849   19245 docker_service.go:249] Docker Info: &{ID:UJ6K:K2AW:HKQY:5MRL:KROX:FTJV:3TKY:GHGI:L7GV:UQFP:AU2Q:AKC6 Containers:0 ContainersRunning:0 ContainersPaused:0 ContainersStopped:0 Images:0 Driver:overlay2 DriverStatus:[[Backing Filesystem xfs] [Supports d_type true] [Native Overlay Diff true]] SystemStatus:[] Plugins:{Volume:[local] Network:[bridge host macvlan null overlay] Authorization:[] Log:[]} MemoryLimit:true SwapLimit:true KernelMemory:true CPUCfsPeriod:true CPUCfsQuota:true CPUShares:true CPUSet:true IPv4Forwarding:true BridgeNfIptables:true BridgeNfIP6tables:true Debug:false NFd:16 OomKillDisable:true NGoroutines:26 SystemTime:2018-04-24T10:41:29.295491971+08:00 LoggingDriver:journald CgroupDriver:systemd NEventsListener:0 KernelVersion:3.10.0-693.5.2.el7.x86_64 OperatingSystem:CentOS Linux 7 (Core) OSType:linux Architecture:x86_64 IndexServerAddress:https://index.docker.io/v1/ RegistryConfig:0xc4203dcbd0 NCPU:4 MemTotal:8371650560 GenericResources:[] DockerRootDir:/var/lib/docker HTTPProxy: HTTPSProxy: NoProxy: Name:K8S-Master Labels:[] ExperimentalBuild:false ServerVersion:1.13.1 ClusterStore: ClusterAdvertise: Runtimes:map[docker-runc:{Path:/usr/libexec/docker/docker-runc-current Args:[]} runc:{Path:docker-runc Args:[]}] DefaultRuntime:docker-runc Swarm:{NodeID: NodeAddr: LocalNodeState:inactive ControlAvailable:false Error: RemoteManagers:[] Nodes:0 Managers:0 Cluster:0xc4201317c0} LiveRestoreEnabled:false Isolation: InitBinary:docker-init ContainerdCommit:{ID: Expected:aa8187dbd3b7ad67d8e5e3a15115d3eef43a7ed1} RuncCommit:{ID:N/A Expected:9df8b306d01f59d3a8029be411de015b7304dd8f} InitCommit:{ID:N/A Expected:949e6facb77383876aeff8a6944dde66b3089574} SecurityOptions:[name=seccomp,profile=/etc/docker/seccomp.json name=selinux]}
F0424 10:41:29.302989   19245 server.go:233] failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"

The key info I see from the log is that CgroupDriver is still cgroupfs. I guess that's what caused the cgroup mismatch issue, but I have no idea how to adjust this default value. Can you help clarify? Thanks!

@dixudx
Member

dixudx commented Apr 24, 2018

@wshandao Please stop using kubelet logs, which is not the right way to see the logs.

The correct way to see the logs is journalctl -f -u kubelet.

@wshandao

Thanks @dixudx, my mistake; this is not actually an issue holding up my installation.

@timothysc timothysc assigned timothysc and unassigned detiber Apr 26, 2018
@timothysc timothysc added the lifecycle/active Indicates that an issue or PR is actively being worked on by a contributor. label Apr 26, 2018
@neolit123
Member

I second the requests to close this one.
The documentation already covers that users need to verify a matching cgroup driver.

This is independent of kubeadm and is more of a kubelet vs docker issue.

similar reports:
kubernetes/kubernetes#59794
openshift/origin#18776
kubernetes/kubernetes#43805

@timstclair

FWIW this is defaulted in the RPMs but not in the .debs. Is there any current distribution in mainstream support that doesn't default to systemd now?

I have tested this on 3 different bare-bones Ubuntu installs (16.04.2, 16.04.0, 17.04), and it appears that the docker driver is cgroupfs, which matches the default argument value of the kubelet.

Unlike the user report in the original post, where docker is using systemd on 16.04.3; so it could be the docker config changing between docker versions. Hard to tell.

Given my tests, I don't see a need to add Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=systemd" in the debs, because that would be wrong for these Ubuntu versions at least.

What the kubelet should probably do for a friendly UX is to always match the docker driver automatically.
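
Until the kubelet (or kubeadm) does that matching itself, one can approximate it by deriving the flag from docker at setup time; a sketch (the drop-in file name 20-cgroup-driver.conf is arbitrary):

# Generate a kubelet drop-in whose cgroup driver matches whatever
# docker currently reports.
driver=$(docker info 2>/dev/null | awk -F': ' '/Cgroup Driver/ {print $2}')
mkdir -p /etc/systemd/system/kubelet.service.d
cat <<EOF >/etc/systemd/system/kubelet.service.d/20-cgroup-driver.conf
[Service]
Environment="KUBELET_EXTRA_ARGS=--cgroup-driver=${driver}"
EOF
systemctl daemon-reload
systemctl restart kubelet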

@timothysc
Member

@neolit123 agreed.

However I do think we should open a troubleshooting doc issue JIC.
Closing this one and I'll start my docs.

@dragosrosculete

dragosrosculete commented Jun 12, 2018

I had this same problem on Ubuntu 16.04, Kube version v1.10.4, Docker version 1.13.1.
Docker was starting with native.cgroupdriver=systemd. This config was set by me in /etc/docker/daemon.json:

{
"exec-opts": ["native.cgroupdriver=systemd"]
}

I modified the config in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf:
added the new line Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=systemd"
and added the parameter $KUBELET_CGROUP_ARGS to ExecStart (see the sketch below).

Then did a systemctl daemon-reload and a service kubelet restart.
Kubelet started correctly.
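
The resulting drop-in lines would look roughly like this (sketched from the description above; the other Environment lines of a stock 10-kubeadm.conf are omitted):

# /etc/systemd/system/kubelet.service.d/10-kubeadm.conf (excerpt, sketch)
[Service]
Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=systemd"
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CGROUP_ARGS $KUBELET_EXTRA_ARGS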

@neolit123
Member

@dragosrosculete

We are improving our troubleshooting docs; also, in 1.11 and later the cgroup driver for docker should be automatically matched by kubeadm.

@iknownothing

I do think it's a bug. I checked the docker version and the kubeadm file; of course the kubeadm script does that check too. However, I still get the mismatch error message. If you read carefully, you can see that some of the reports above hit the issue AFTER correctly setting the parameter.

@Pezhvak

Pezhvak commented Dec 20, 2018

This is still happening; nothing worked!
