
Built-in K3s Containerd doesn't report OOM events for cgroups v2 #4572

Closed · 1 task
ghost opened this issue Nov 24, 2021 · 2 comments


ghost commented Nov 24, 2021

Environmental Info:
K3s Version:
k3s version v1.21.6+k3s1 (df033fa)
go version go1.16.8

Node(s) CPU architecture, OS, and Version:
Linux hostname 5.11.0-1017-aws #18~20.04.1-Ubuntu SMP Fri Aug 27 11:21:54 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Cluster Configuration:
K3s v1.21.6+k3s1 cluster with 3 servers and 5 agents; all servers are using Linux cgroups v2.
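
As a quick sanity check (not from the original report), the cgroup version in use on a node can be confirmed with:

stat -fc %T /sys/fs/cgroup/
# prints "cgroup2fs" on the unified cgroup v2 hierarchy, "tmpfs" on cgroups v1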

Describe the bug:
When a process inside a pod is killed due to OOM, containerd doesn't report OOM events. This affects only systems using cgroups v2; with v1 it works as expected.

Steps To Reproduce:

  • Installed K3s:
    k3s-agent systemd service:
[Unit]
Description=Lightweight Kubernetes
Documentation=https://k3s.io
Wants=network-online.target
After=network-online.target

[Install]
WantedBy=multi-user.target

[Service]
Type=exec
EnvironmentFile=-/etc/default/%N
EnvironmentFile=-/etc/sysconfig/%N
EnvironmentFile=-/etc/systemd/system/k3s-agent.service.env
KillMode=process
Delegate=yes
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=1048576
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
TimeoutStartSec=0
Restart=always
RestartSec=5s
ExecStartPre=/bin/sh -xc '! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service'
ExecStartPre=-/sbin/modprobe br_netfilter
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/k3s \
    agent \
        '-c' \
        '/etc/rancher/k3s/config.yaml' \
        '--server' \
        'https://master:6443' \

/etc/rancher/k3s/config.yaml:

no-flannel: true
node-name: HOSTNAME
kubelet-arg:
- eviction-hard=imagefs.available<5%,nodefs.available<5%,memory.available<5%
- eviction-soft=imagefs.available<10%,nodefs.available<10%,memory.available<10%
- eviction-soft-grace-period=imagefs.available=5m,nodefs.available=5m,memory.available=5m
- cloud-provider=external
- "provider-id=aws:///us-east-1b/i-111111111"

node-label:
- "group-name=worker-group"
- "node-type=worker"
  • Create a deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stress-oom-crasher
  labels:
    app: stress-oom-crasher
spec:
  replicas: 1
  selector:
    matchLabels:
      app: stress-oom-crasher
  template:
    metadata:
      labels:
        app: stress-oom-crasher
    spec:
      containers:
      - name: stress-oom-tester
        image: python:3.9.9
        command: ["/bin/sleep", "3650d"]
        resources:
          limits:
            memory: "123Mi"
            cpu: 50m
          requests:
            memory: "123Mi"
            cpu: 50m
  • Inside the stress-oom-crasher pod, execute the Python code below to cause an OOM event (see the usage sketch after the snippet):
# grow a list until the container exceeds its 123Mi memory limit
l = []
new_list4k = [0] * 4096
while True:
    l.extend(new_list4k)
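
One way to run the snippet without an interactive shell (a hedged usage sketch; it assumes the deployment above and kubectl access to the cluster):

kubectl exec -i deploy/stress-oom-crasher -- python3 - <<'EOF'
l = []
new_list4k = [0] * 4096
while True:
    l.extend(new_list4k)
EOF
# the container should be OOM-killed once it exceeds its 123Mi limit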

Expected behavior:
On the node where the stress-oom-crasher pod is running, ctr events should show OOM events, e.g.

2021-11-24 10:59:04.757581973 +0000 UTC k8s.io /tasks/oom {"container_id":"3166ec37d31ee3089e272d6f3261585786fdcdc41d3cda4a3aac3ebd2b324586"}
2021-11-24 10:59:04.75831734 +0000 UTC k8s.io /tasks/oom {"container_id":"75c684a3665b008f1037324c7511150fe6cfad0b14d79d5030fda0130c59478f"}

Actual behavior:
There are no OOM events in the output of the ctr events command.
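
For reference, on a k3s node the embedded containerd can be watched with the bundled ctr (a hedged note; the socket path assumes k3s's default embedded containerd):

k3s ctr events
# or, with a standalone ctr binary pointed at k3s's containerd socket:
ctr -a /run/k3s/containerd/containerd.sock events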

Additional context / logs:
I noticed that if I run a container manually, e.g.
ctr run -t --memory-limit=126000000 docker.io/library/python:3.9.9 test_oom bash
and trigger an OOM, the expected /tasks/oom event is shown in the output of ctr events.
In this case the corresponding cgroup is created under /sys/fs/cgroup/k8s.io/, whereas when a container is created by k3s its cgroup is created under /sys/fs/cgroup/kubepods/.
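
A quick way to check which cgroup hierarchy a given container landed in (a hedged sketch, not from the original report; <container-id> and <pid> are placeholders taken from the crictl output):

# list the container and find its PID (uses k3s's bundled crictl)
k3s crictl ps --name stress-oom-tester
k3s crictl inspect <container-id> | grep -i '"pid"'
# then inspect the process's cgroup path:
cat /proc/<pid>/cgroup
# per the observation above, kubelet-managed containers show a path under kubepods...,
# while containers started directly with ctr show one under k8s.io/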

Backporting

  • Needs backporting to older releases
@brandond (Member)

Have you tested to see if this behavior is unique to our packaging of containerd? Can you reproduce the same behavior with upstream containerd 1.4 when using cgroupv2?
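
A minimal way to test that (a hedged sketch along the lines of the ctr command above, but run against a stock containerd 1.4 install on a cgroup v2 host rather than k3s's embedded one):

ctr image pull docker.io/library/python:3.9.9
ctr run -t --memory-limit=126000000 docker.io/library/python:3.9.9 test_oom bash
# trigger the OOM inside the container, then in another shell watch for /tasks/oom:
ctr events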

stale bot commented May 23, 2022

This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 180 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.

stale bot added the status/stale label May 23, 2022
stale bot closed this as completed Jun 6, 2022