
Pods with memory requests/limits set cannot start on 1.23.1 + Ubuntu 20.04 + cri-o 1.23.0 #5527

Closed
glitchcrab opened this issue Jan 3, 2022 · 23 comments

Comments

@glitchcrab

Description

This cluster was originally created with k8s 1.22.2 on Ubuntu 20.04 VMs using kubeadm with no special config. When upgrading to 1.23.1, pods with resource requests and/or limits set fail to start with the following error:

Error: container create failed: time="2022-01-03T10:22:18Z" level=error msg="container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: process_linux.go:508: setting cgroup config for procHooks process caused: open /sys/fs/cgroup/memory/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod2f5254d9_5b91_4987_8ea9_ddf323e3623b.slice/crio-35fa56fb8d2ad4995c85027176ef30ffcddc67f6ec96a8ea34ba4318670b7d99.scope/memory.memsw.limit_in_bytes: no such file or directory"

cri-o logs show the following:

Jan 03 10:41:08 worker0-k8s-mgmt crio[731]: time="2022-01-03 10:41:08.927188409Z" level=info msg="Checking image status: k8s.gcr.io/coredns/coredns:v1.8.6" id=278e27cc-031c-44f5-8773-c5c65133ead0 name=/runtime.v1.ImageService/ImageStatus
Jan 03 10:41:08 worker0-k8s-mgmt crio[731]: time="2022-01-03 10:41:08.928357846Z" level=info msg="Image status: &ImageStatusResponse{Image:&Image{Id:a4ca41631cc7ac19ce1be3ebf0314ac5f47af7c711f17066006db82ee3b75b03,RepoTags:[k8s.gcr.io/coredns/coredns:v1.8.6],RepoDigests:[k8s.gcr.io/coredns/coredns@sha256:5b6ec0d6de9baaf3e92d0f66cd96a25b9edbce8716f5f15dcd1a616b3abd590e k8s.gcr.io/coredns/coredns@sha256:8916c89e1538ea3941b58847e448a2c6d940c01b8e716b20423d2d8b189d3972],Size_:46959895,Uid:nil,Username:,Spec:nil,},Info:map[string]string{},}" id=278e27cc-031c-44f5-8773-c5c65133ead0 name=/runtime.v1.ImageService/ImageStatus
Jan 03 10:41:08 worker0-k8s-mgmt crio[731]: time="2022-01-03 10:41:08.929932099Z" level=info msg="Checking image status: k8s.gcr.io/coredns/coredns:v1.8.6" id=b995fb6f-8820-4d3e-995a-87498db9e66a name=/runtime.v1.ImageService/ImageStatus
Jan 03 10:41:08 worker0-k8s-mgmt crio[731]: time="2022-01-03 10:41:08.930778697Z" level=info msg="Image status: &ImageStatusResponse{Image:&Image{Id:a4ca41631cc7ac19ce1be3ebf0314ac5f47af7c711f17066006db82ee3b75b03,RepoTags:[k8s.gcr.io/coredns/coredns:v1.8.6],RepoDigests:[k8s.gcr.io/coredns/coredns@sha256:5b6ec0d6de9baaf3e92d0f66cd96a25b9edbce8716f5f15dcd1a616b3abd590e k8s.gcr.io/coredns/coredns@sha256:8916c89e1538ea3941b58847e448a2c6d940c01b8e716b20423d2d8b189d3972],Size_:46959895,Uid:nil,Username:,Spec:nil,},Info:map[string]string{},}" id=b995fb6f-8820-4d3e-995a-87498db9e66a name=/runtime.v1.ImageService/ImageStatus
Jan 03 10:41:08 worker0-k8s-mgmt crio[731]: time="2022-01-03 10:41:08.931419905Z" level=info msg="Creating container: kube-system/coredns-8554ccb6dd-tzdcj/coredns" id=68474b21-6263-43c1-a002-50998549538d name=/runtime.v1.RuntimeService/CreateContainer
Jan 03 10:41:08 worker0-k8s-mgmt crio[731]: time="2022-01-03 10:41:08.931482393Z" level=warning msg="Allowed annotations are specified for workload [] "
Jan 03 10:41:08 worker0-k8s-mgmt crio[731]: time="2022-01-03 10:41:08.931494869Z" level=warning msg="Allowed annotations are specified for workload []"
Jan 03 10:41:08 worker0-k8s-mgmt crio[731]: time="2022-01-03 10:41:08.948575637Z" level=warning msg="Failed to open /etc/passwd: open /var/lib/containers/storage/overlay/83c9c9ec5684fa2fd1c943d300507751866efb0be5ae15652cc6a97d2be47571/merged/etc/passwd: no such file or directory"
Jan 03 10:41:08 worker0-k8s-mgmt crio[731]: time="2022-01-03 10:41:08.948617867Z" level=warning msg="Failed to open /etc/group: open /var/lib/containers/storage/overlay/83c9c9ec5684fa2fd1c943d300507751866efb0be5ae15652cc6a97d2be47571/merged/etc/group: no such file or directory"
Jan 03 10:41:09 worker0-k8s-mgmt crio[731]: time="2022-01-03 10:41:09.007934302Z" level=error msg="Container creation error: time=\"2022-01-03T10:41:09Z\" level=error msg=\"container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: process_linux.go:508: setting cgroup config for procHooks process caused: open /sys/fs/cgroup/memory/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod2f5254d9_5b91_4987_8ea9_ddf323e3623b.slice/crio-4894546f2ac322dd116a01ae0da1c05cae1b1e079f8552ea5a3f84ef9a3fa816.scope/memory.memsw.limit_in_bytes: no such file or directory\"\n" id=68474b21-6263-43c1-a002-50998549538d name=/runtime.v1.RuntimeService/CreateContainer
Jan 03 10:41:09 worker0-k8s-mgmt crio[731]: time="2022-01-03 10:41:09.015875129Z" level=info msg="createCtr: deleting container ID 4894546f2ac322dd116a01ae0da1c05cae1b1e079f8552ea5a3f84ef9a3fa816 from idIndex" id=68474b21-6263-43c1-a002-50998549538d name=/runtime.v1.RuntimeService/CreateContainer
Jan 03 10:41:09 worker0-k8s-mgmt crio[731]: time="2022-01-03 10:41:09.016122205Z" level=info msg="createCtr: deleting container ID 4894546f2ac322dd116a01ae0da1c05cae1b1e079f8552ea5a3f84ef9a3fa816 from idIndex" id=68474b21-6263-43c1-a002-50998549538d name=/runtime.v1.RuntimeService/CreateContainer
Jan 03 10:41:09 worker0-k8s-mgmt crio[731]: time="2022-01-03 10:41:09.016293899Z" level=info msg="createCtr: deleting container ID 4894546f2ac322dd116a01ae0da1c05cae1b1e079f8552ea5a3f84ef9a3fa816 from idIndex" id=68474b21-6263-43c1-a002-50998549538d name=/runtime.v1.RuntimeService/CreateContainer
Jan 03 10:41:09 worker0-k8s-mgmt crio[731]: time="2022-01-03 10:41:09.038543960Z" level=info msg="createCtr: deleting container ID 4894546f2ac322dd116a01ae0da1c05cae1b1e079f8552ea5a3f84ef9a3fa816 from idIndex" id=68474b21-6263-43c1-a002-50998549538d name=/runtime.v1.RuntimeService/CreateContainer

Steps to reproduce the issue:

  1. use kubeadm to upgrade the cluster
  2. upgrade cri-o from the suse/libcontainers repo from 1.22 to 1.23 (a rough command sketch for steps 1 and 2 follows this list)
  3. schedule pods with resource limits onto the upgraded node - they fail to start
  4. downgrade cri-o back to 1.22
  5. pods start
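
A rough sketch of steps 1 and 2, assuming the standard kubeadm upgrade flow and the apt-packaged cri-o from the libcontainers repo (repository setup and exact version pins are environment-specific and omitted here):

# upgrade Kubernetes on the node (control-plane example; target version is illustrative)
sudo kubeadm upgrade apply v1.23.1

# after pointing apt at the 1.23 libcontainers repo, upgrade the cri-o package
sudo apt-get update
sudo apt-get install -y cri-o
sudo systemctl restart crio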

Describe the results you received:

Pods with resource limits/requests set fail to start.

Describe the results you expected:

Pods should start.

Additional information you deem important (e.g. issue happens only occasionally):

I note that when running crio manually, I see the following logs:

WARN[2022-01-03 10:59:48.273861323Z] node configuration validation for memoryswap cgroup failed: node not configured with memory swap
INFO[2022-01-03 10:59:48.273886219Z] Node configuration value for memoryswap cgroup is false
INFO[2022-01-03 10:59:48.273900456Z] Node configuration value for cgroup v2 is false

This feels somewhat relevant because the reason the pod cannot start is the lack of memory.memsw.limit_in_bytes; as I understand it, this is related to swap (which is disabled). I'm also puzzled by the log saying the cgroup v2 configuration value is false: crio is configured to use systemd as the cgroup manager, and systemd is using cgroup v2.
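
For reference, a couple of illustrative checks of whether the kernel is accounting swap at all (Ubuntu kernels default to swap accounting off; paths assume the standard systemd cgroup v1 mounts and are examples only):

# was swap accounting requested on the kernel command line?
grep -o 'swapaccount=[01]' /proc/cmdline
# on a cgroup v1 node, the memsw files only appear when swap accounting is enabled
ls /sys/fs/cgroup/memory/kubepods.slice/ | grep memsw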

Downgrading cri-o to 1.22 allows pods to start as normal.

Output of crio --version:

crio version 1.23.0
Version:          1.23.0
GitCommit:        9b7f5ae815c22a1d754abfbc2890d8d4c10e240d
GitTreeState:     clean
BuildDate:        2021-12-21T21:40:34Z
GoVersion:        go1.17.5
Compiler:         gc
Platform:         linux/amd64
Linkmode:         dynamic
BuildTags:        apparmor, exclude_graphdriver_devicemapper, containers_image_ostree_stub, seccomp
SeccompEnabled:   true
AppArmorEnabled:  true

Additional environment details (AWS, VirtualBox, physical, etc.):

  • Ubuntu 20.04
  • kernel 5.4.0
  • cgroupv2 in use by systemd
  • systemd 245

I'm unsure if it's related, but containers-common was also upgraded at the same time from 1-21 to 1-22.

@haircommander
Member

that is bizarre. @giuseppe do you have any insight? we use the podman IsCgroup2UnifiedMode to verify we're on cgroupv2.

@glitchcrab can you run a sanity check stat -f -c%T /sys/fs/cgroup? that is basically the same check the podman function is making

@glitchcrab
Author

@haircommander sure, this is the output:

root@master1-k8s-mgmt:~# stat -f -c%T /sys/fs/cgroup
tmpfs

@haircommander
Member

interesting, where is your systemd cgroup mounted? and what is the output of stat -f -c%T $path for that path?

@haircommander
Member

oh and what kernel args did you use to set cgroupv2?

@giuseppe
Member

giuseppe commented Jan 5, 2022

the system is not using the cgroupv2 unified hierarchy. With cgroupv2 the output should look like:

$ stat -f -c%T /sys/fs/cgroup
cgroup2fs

Please make sure to pass systemd.unified_cgroup_hierarchy=1 on the kernel command line
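
On Ubuntu 20.04 this is typically done via GRUB; a minimal sketch, assuming the stock GRUB setup (append to any parameters already present rather than replacing them):

# /etc/default/grub
GRUB_CMDLINE_LINUX="systemd.unified_cgroup_hierarchy=1"

# regenerate the GRUB config and reboot for the change to take effect
sudo update-grub
sudo reboot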

@glitchcrab
Author

glitchcrab commented Jan 6, 2022

apologies, I didn't realise that I hadn't configured the machines properly. I've now done that:

shw@worker2-k8s-mgmt:~$ stat -f -c%T /sys/fs/cgroup
cgroup2fs

And the same issue occurs, albeit with a slightly different error:

(combined from similar events): Error: container create failed: time="2022-01-06T16:39:11Z" level=error msg="container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: process_linux.go:508: setting cgroup config for procHooks process caused: open /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod4a2781dc_e5e4_470d_94e9_322b094f3584.slice/crio-8b6f94d44527dc6136607fc6878139c01157839a23054bf5d17dfd40ffc64b57.scope/memory.swap.max: no such file or directory"

this is with cri-o 1.23 installed again. node configuration remains the same as mentioned in my original issue details, and this only affects pods with requests/limits set. my coredns deployment is configured like this:

        resources:
          limits:
            memory: 170Mi
          requests:
            cpu: 100m
            memory: 70Mi
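
For context, on a cgroup v2 node that 170Mi limit ends up in the container's scope as memory.max, while the file the runtime fails to open for the swap limit is memory.swap.max in the same directory (paths below are illustrative, mirroring the error above):

# 170Mi -> 178257920 bytes in the container's cgroup
cat /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod<uid>.slice/crio-<ctr>.scope/memory.max
# the swap limit write targets this sibling file, which is the one missing in the error
ls /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod<uid>.slice/crio-<ctr>.scope/memory.swap.max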

@haircommander
Member

can we have the crio logs now? it seems crio is trying to set swap for the container (maybe swap is on) but the memsw cgroup isn't configured. We should gracefully handle this situation but clearly aren't

@glitchcrab
Author

These are the debug logs from about 1-2s before I attempted to schedule a pod on that node: crio-debug.log

@nemat-rakhmatov

Though the issue was opened for an upgrade to 1.23.1, the same problem arises when creating a cluster from scratch. I'm new to the Kubernetes world and am learning how to create a cluster following the guides on kubernetes.io. I was struggling to create a cluster in my test lab with 1.23 until I found this issue; using version 1.22 solved it. First it was related to the cgroups configuration on stock Ubuntu 20.04.3, and later to swap memory, as stated above by the topic starter.

@haircommander
Member

These are the debug logs from about 1-2s before I attempted to schedule a pod on that node: crio-debug.log

sorry, I should have been clearer. I'll need the logs from the beginning of the cri-o run (specifically, I want to see whether we're correctly detecting that memsw is not set up)

@haircommander
Member

Though the issue was opened for an upgrade to 1.23.1, the same problem arises when creating a cluster from scratch. I'm new to the Kubernetes world and am learning how to create a cluster following the guides on kubernetes.io. I was struggling to create a cluster in my test lab with 1.23 until I found this issue; using version 1.22 solved it. First it was related to the cgroups configuration on stock Ubuntu 20.04.3, and later to swap memory, as stated above by the topic starter.

cgroupv1 or cgroupv2?

@nemat-rakhmatov

I guess it is cgroupv2; I've set systemd.unified_cgroup_hierarchy=1 in grub and the command stat -f -c%T /sys/fs/cgroup showed cgroup2fs.
But then came the swap memory error "....scope/memory.swap.max not found",
and I decided to go with 1.22.

@haircommander
Member

I guess it is cgroupv2; I've set systemd.unified_cgroup_hierarchy=1 in grub and the command stat -f -c%T /sys/fs/cgroup showed cgroup2fs. But then came the swap memory error "....scope/memory.swap.max not found", and I decided to go with 1.22.

yeah, that sounds like cgroupv2. I am guessing CRI-O reports a log line like "node configuration validation for memoryswap cgroup failed: node not configured with memory swap". Would you mind adding swapaccount=1 to the kernel cmdline and trying 1.23 again? I'm also in conversation with @ehashman (author of the kubernetes swap work) about what to do in this situation.
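
A sketch of that cmdline change on Ubuntu, again via GRUB (values shown are additive to whatever is already configured):

# /etc/default/grub: enable swap accounting alongside the unified hierarchy
GRUB_CMDLINE_LINUX="systemd.unified_cgroup_hierarchy=1 swapaccount=1"
sudo update-grub && sudo reboot

# verify after the reboot
cat /proc/cmdline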

@haircommander
Member

also just to check, is swap enabled?
swapon -s

@nemat-rakhmatov

also just to check, is swap enabled? swapon -s

I've disabled swap as a first step for kubernetes preparation. ;)

@nemat-rakhmatov

I may try again in a few days. I'll let you know the outcome.

@nemat-rakhmatov

nemat-rakhmatov commented Jan 8, 2022

Ok, it was a quick test :)

kubectl get pods --all-namespaces
NAMESPACE     NAME                                  READY   STATUS                 RESTARTS   AGE
kube-system   coredns-64897985d-5jzt2               0/1     CreateContainerError   0          31s
kube-system   coredns-64897985d-5tmnw               0/1     CreateContainerError   0          31s
kube-system   etcd-k1-123-test                      1/1     Running                0          48s
kube-system   kube-apiserver-k1-123-test            1/1     Running                0          48s
kube-system   kube-controller-manager-k1-123-test   1/1     Running                0          41s
kube-system   kube-proxy-ngbxh                      1/1     Running                0          31s
kube-system   kube-scheduler-k1-123-test            1/1     Running                0          48s

And the relevant logs are:

Jan 8 14:13:03 k1-123-test kubelet[6663]: E0108 14:13:03.754455 6663 kuberuntime_manager.go:918] container &Container{Name:coredns,Image:k8s.gcr.io/coredns/coredns:v1.8.6,Command:[],Args:[-conf /etc/coredns/Corefile],WorkingDir:,Ports:[]ContainerPort{ContainerPort{Name:dns,HostPort:0,ContainerPort:53,Protocol:UDP,HostIP:,},ContainerPort{Name:dns-tcp,HostPort:0,ContainerPort:53,Protocol:TCP,HostIP:,},ContainerPort{Name:metrics,HostPort:0,ContainerPort:9153,Protocol:TCP,HostIP:,},},Env:[]EnvVar{},Resources:ResourceRequirements{Limits:ResourceList{memory: {{178257920 0} {} 170Mi BinarySI},},Requests:ResourceList{cpu: {{100 -3} {} 100m DecimalSI},memory: {{73400320 0} {} 70Mi BinarySI},},},VolumeMounts:[]VolumeMount{VolumeMount{Name:config-volume,ReadOnly:true,MountPath:/etc/coredns,SubPath:,MountPropagation:nil,SubPathExpr:,},VolumeMount{Name:kube-api-access-89hr5,ReadOnly:true,MountPath:/var/run/secrets/kubernetes.io/serviceaccount,SubPath:,MountPropagation:nil,SubPathExpr:,},},LivenessProbe:&Probe{ProbeHandler:ProbeHandler{Exec:nil,HTTPGet:&HTTPGetAction{Path:/health,Port:{0 8080 },Host:,Scheme:HTTP,HTTPHeaders:[]HTTPHeader{},},TCPSocket:nil,GRPC:nil,},InitialDelaySeconds:60,TimeoutSeconds:5,PeriodSeconds:10,SuccessThreshold:1,FailureThreshold:5,TerminationGracePeriodSeconds:nil,},ReadinessProbe:&Probe{ProbeHandler:ProbeHandler{Exec:nil,HTTPGet:&HTTPGetAction{Path:/ready,Port:{0 8181 },Host:,Scheme:HTTP,HTTPHeaders:[]HTTPHeader{},},TCPSocket:nil,GRPC:nil,},InitialDelaySeconds:0,TimeoutSeconds:1,PeriodSeconds:10,SuccessThreshold:1,FailureThreshold:3,TerminationGracePeriodSeconds:nil,},Lifecycle:nil,TerminationMessagePath:/dev/termination-log,ImagePullPolicy:IfNotPresent,SecurityContext:&SecurityContext{Capabilities:&Capabilities{Add:[NET_BIND_SERVICE],Drop:[all],},Privileged:nil,SELinuxOptions:nil,RunAsUser:nil,RunAsNonRoot:nil,ReadOnlyRootFilesystem:*true,AllowPrivilegeEscalation:*false,RunAsGroup:nil,ProcMount:nil,WindowsOptions:nil,SeccompProfile:nil,},Stdin:false,StdinOnce:false,TTY:false,EnvFrom:[]EnvFromSource{},TerminationMessagePolicy:File,VolumeDevices:[]VolumeDevice{},StartupProbe:nil,} start failed in pod coredns-64897985d-5tmnw_kube-system(16bde1e1-471f-48ae-a5f4-90394b5f849b): CreateContainerError: container create failed: time="2022-01-08T14:13:03Z" level=error msg="container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: process_linux.go:508: setting cgroup config for procHooks process caused: open /sys/fs/cgroup/memory/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod16bde1e1_471f_48ae_a5f4_90394b5f849b.slice/crio-b63f599f0e1ba047ed424acae699456ffbf8f245836364f8cb6ebf602701b037.scope/memory.memsw.limit_in_bytes: no such file or directory"

@glitchcrab
Author

These are the debug logs from about 1-2s before I attempted to schedule a pod on that node: crio-debug.log

sorry, I should have been clearer. I'll need the logs from the beginning of the cri-o run (specifically, I want to see whether we're correctly detecting that memsw is not set up)

this debug log is from the node booting with debug logging enabled: debug.log

@haircommander
Member

@glitchcrab what's the output of swapon -s and cat /proc/cmdline

@glitchcrab
Author

@glitchcrab what's the output of swapon -s and cat /proc/cmdline

shw@worker2-k8s-mgmt:~$ sudo swapon -s
shw@worker2-k8s-mgmt:~$ cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-5.4.0-92-generic root=/dev/mapper/ubuntu--vg-ubuntu--lv ro systemd.unified_cgroup_hierarchy=1 autoinstall ds=nocloud-net

@haircommander
Member

haircommander commented Jan 11, 2022

oopsies, this is definitely just a bug in cri-o: #5539 (we used to do this but accidentally dropped it when swap support was added)

@mozTheFuzz

I am building a kubernetes cluster for the first time with the same configuration and was baffled by the error for several days

Extremely lucky to find this post

@haircommander
Member

the fix is merged in the main branch; I'm backporting it to 1.23 and intend to cut a 1.23.1 soon
