unable to start container process: error during init: error setting cgroup config for procHooks process: cpu.max: no such file or directory: unknowncontainer #9651

@fatmanurozdemir

Description

I initialized a Kubernetes cluster on my master node and wanted to add a Jetson AGX Orin as an edge node. Using keadm join, I successfully added the Orin device to the cluster, and it appears as 'Ready' from the master node. However, when I deploy a basic application onto the Orin using kubectl from the master node, the pod ends up in 'CrashLoopBackOff' status.

When I describe the pod for debugging:
Containers:
  nginx:
    Container ID:  containerd://a57da990b1ffe2f3aa1445311d57ed0a99e703d81a496a1bb519f244ce1ce03f
    Image:         nginx:1.14.2
    Image ID:      docker.io/library/nginx@sha256:f7988fb6c02e0ce69257d9bd9cf37ae20a60f1df7563c3a2a6abe24160306b8d
    Port:          80/TCP
    Host Port:     0/TCP
    State:         Waiting
      Reason:      CrashLoopBackOff
    Last State:    Terminated
      Reason:      StartError
      Message:     failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error setting cgroup config for procHooks process: openat2 /sys/fs/cgroup/system.slice/kubepods-besteffort-pod7654d855_494c_47c2_9ff6_f107799958dc.slice:cri-containerd:6b6fc5b2451465631739e98a758a21aa941d10040d36db93c7a5c892f0080ed7/cpu.max: no such file or directory: unknown

There are many cpu files in this path, but cpu.max is missing:
[screenshot: cpu max_issue]
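The observation above (cpu files present, cpu.max absent) can be checked programmatically. On cgroup v2, cpu.stat exists in every cgroup whether or not the cpu controller is enabled, whereas cpu.max only appears in a cgroup when its parent lists cpu in cgroup.subtree_control. The following is a minimal sketch, not a diagnosis: the helper names `read_controllers` and `inspect_cgroup` are hypothetical, and it assumes the node uses the cgroup v2 unified hierarchy mounted at /sys/fs/cgroup.

```python
from pathlib import Path


def read_controllers(path: Path) -> set:
    """Return the space-separated controller names stored in a cgroup v2 file.

    An unreadable or missing file yields an empty set.
    """
    try:
        return set(path.read_text().split())
    except OSError:
        return set()


def inspect_cgroup(cgroup_dir: str) -> dict:
    """Summarise the CPU-related state of one cgroup v2 directory."""
    cg = Path(cgroup_dir)
    return {
        # cpu.* interface files actually present (cpuset.* is not matched)
        "cpu_files": sorted(p.name for p in cg.glob("cpu.*")),
        # the file runc failed to open in the reported error
        "has_cpu_max": (cg / "cpu.max").exists(),
        # controllers available inside this cgroup
        "controllers": sorted(read_controllers(cg / "cgroup.controllers")),
        # controllers the parent enables for its children
        "parent_subtree_control": sorted(
            read_controllers(cg.parent / "cgroup.subtree_control")
        ),
    }
```

Calling `inspect_cgroup()` on the pod's directory from the error message, or on a sibling under /sys/fs/cgroup/system.slice, shows whether "cpu" is missing from `parent_subtree_control`. If it is, the cpu controller is not being delegated to that part of the tree, which would be one possible explanation for the missing file.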

What I tried to resolve the problem:

  1. The default containerd version was 1.6.27. I installed versions 1.6.12 and 1.7.12, but neither resolved the issue.
  2. I installed keadm v1.14.3, v1.14.4, and v1.15.1 on the Orin, but all of them gave the same result.
  3. I am not very familiar with cgroup drivers, so I am not sure whether the issue is related to the cgroup driver on the Orin. I also added a Jetson Nano as an edge device to my cluster and did not come across this issue there (both devices had the 'cgroupfs' driver in their edgecore.yaml files). Anyway, I changed the Orin's cgroup driver from cgroupfs to systemd in edgecore.yaml, as suggested in https://kubeedge.io/docs/faq/setup/ for the 'cgroup driver does not match' issue.
    None of the above adjustments resolved the issue.
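Regarding item 3: the cgroup driver is configured in two places, edgecore.yaml on the KubeEdge side and the runc options in containerd's config on the runtime side, and the two are expected to agree. The colon-separated segment in the failing path (`...slice:cri-containerd:<id>`) resembles the systemd-style `slice:prefix:name` notation, so it may be worth confirming that containerd was switched to the systemd driver as well. This is a rough sketch under stated assumptions: the function names are hypothetical, the regexes only pick up the first uncommented `cgroupDriver:` and `SystemdCgroup =` entries, and the usual default locations (/etc/kubeedge/config/edgecore.yaml and /etc/containerd/config.toml) are assumed rather than taken from the report.

```python
import re
from typing import Optional


def edgecore_cgroup_driver(yaml_text: str) -> Optional[str]:
    """Return the first uncommented cgroupDriver value in edgecore.yaml text."""
    match = re.search(r"^\s*cgroupDriver:\s*[\"']?(\w+)", yaml_text, re.MULTILINE)
    return match.group(1) if match else None


def containerd_systemd_cgroup(toml_text: str) -> Optional[bool]:
    """Return the first uncommented SystemdCgroup value in containerd config text.

    None means the key was not found; containerd then falls back to its
    default, which is assumed here to be the cgroupfs driver.
    """
    match = re.search(r"^\s*SystemdCgroup\s*=\s*(true|false)", toml_text, re.MULTILINE)
    return None if match is None else match.group(1) == "true"


def drivers_agree(yaml_text: str, toml_text: str) -> Optional[bool]:
    """True if both sides pick systemd or both pick cgroupfs; None if unknown."""
    driver = edgecore_cgroup_driver(yaml_text)
    systemd = containerd_systemd_cgroup(toml_text)
    if driver is None or systemd is None:
        return None
    return (driver == "systemd") == systemd
```

Reading both files with `open(path).read()` and passing the contents to `drivers_agree()` returns False when, for example, edgecore.yaml says systemd but containerd still has `SystemdCgroup = false`.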

Steps to reproduce the issue

Describe the results you received and expected

The pod goes into 'CrashLoopBackOff'.
I expected the pod to start successfully and reach 'Running' status.

What version of containerd are you using?

v1.7.12

Any other relevant information

Jetson Orin AGX
Kernel Version: 5.15.122-tegra
OS Image: Ubuntu 22.04.3 LTS
Operating System: linux
Architecture: aarch64
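Since the kernel is a vendor build (5.15.122-tegra), its configuration is also worth a look: cpu.max is the cgroup v2 CPU bandwidth file, and the kernel only provides it when built with CONFIG_CFS_BANDWIDTH. Whether this kernel has that option is not stated above, so the sketch below is only a way to check. The helper names are hypothetical, and reading /proc/config.gz assumes the kernel was built with CONFIG_IKCONFIG_PROC.

```python
import gzip
import re
from typing import Optional


def kernel_option(config_text: str, name: str) -> Optional[str]:
    """Look up one option in kernel config text.

    Returns the value after '=' (for example 'y' or 'm'), 'n' for a
    '# CONFIG_X is not set' line, or None if the option is not mentioned.
    """
    match = re.search(rf"^{re.escape(name)}=(.*)$", config_text, re.MULTILINE)
    if match:
        return match.group(1)
    if re.search(rf"^# {re.escape(name)} is not set$", config_text, re.MULTILINE):
        return "n"
    return None


def load_kernel_config(path: str = "/proc/config.gz") -> str:
    """Read the running kernel's build configuration as text."""
    with gzip.open(path, "rt") as handle:
        return handle.read()
```

On the Orin, `kernel_option(load_kernel_config(), "CONFIG_CFS_BANDWIDTH")` returning anything other than "y" would mean cpu.max cannot exist on that kernel regardless of the containerd or keadm version.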

Show configuration if it is related to CRI plugin.

No response
