
/var/run symlink to /run in container image with /run host bidirectional mount causes duplicate host mounts leading to CPU exhaustion #106962

Open
invidian opened this issue Dec 10, 2021 · 13 comments
Labels
kind/bug: Categorizes issue or PR as related to a bug.
lifecycle/stale: Denotes an issue or PR has remained open with no activity and has become stale.
needs-triage: Indicates an issue or PR lacks a `triage/foo` label and requires one.
priority/backlog: Higher priority than priority/awaiting-more-evidence.
sig/node: Categorizes an issue or PR as relevant to SIG Node.

Comments

@invidian
Member

invidian commented Dec 10, 2021

What happened?

It seems that with the specific /run mount configuration described in the reproduction steps below, you can reach a situation where the mounts on the host file system double every time the Pod restarts, which eventually leads to general slowness and CPU exhaustion.

What did you expect to happen?

Created mounts should be cleaned up properly when the Pod container gets recreated.

How can we reproduce it (as minimally and precisely as possible)?

  1. Create a minikube cluster:
minikube start
  2. Create the offending pod:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: invidian-var-run-symlink-bug
spec:
  containers:
  - command:
    - "false"
    # This image must have "ln -s /run /var/run" executed to trigger the bug.
    image: invidian/buxybox:with-var-run-symlink
    name: invidian-var-run-symlink-bug
    securityContext:
      privileged: true
    volumeMounts:
    - mountPath: /run
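      # Together with the /var/run -> /run symlink in the image, bidirectional
      # propagation of the host /run mount is what triggers the duplicated mounts.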
      mountPropagation: Bidirectional
      name: run
  volumes:
  - hostPath:
      path: /run
    name: run
EOF
  3. Log in to the minikube machine:
minikube ssh
  4. Check the number of duplicate mounts in the host namespace. The printed number of duplicate mounts will grow over time, which is the subject of this bug; the expected behavior is a fixed number of mounts until the pod is removed. (A watch-loop variant of this check is sketched after the example output below.)
cat /proc/mounts | awk '{print $2}' | sort | uniq -c | sort -k1 -n | grep -v -E '( 1 |uuid)'

Example output from the reproduction command:

$ cat /proc/mounts | awk '{print $2}' | sort | uniq -c | sort -k1 -n | grep -v -E '( 1 |uuid)'
  16384 /var/lib/kubelet/pods/8528db9a-69c0-4c15-a58e-2cd343ff5b24/volumes/kubernetes.io~secret/default-token-xk9lc
  16389 /run/secrets/kubernetes.io/serviceaccount
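
If it helps, the check from step 4 can be wrapped in a small watch loop to see the count grow while the container keeps getting recreated (a sketch derived from the command above, not part of the original report):

while true; do
  # List mount points that appear more than once in the host mount table,
  # skipping per-uuid mounts, and print how often each one occurs.
  awk '{print $2}' /proc/mounts | sort | uniq -c | sort -k1 -n | grep -v -E '( 1 |uuid)'
  sleep 10
done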

And the Dockerfile for the image used:

FROM busybox

RUN ln -s /run /var/run

It seems removing the symlink fixes the issue.
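
A possible workaround until this is fixed on the kubelet side is to rebuild the image with a real /var/run directory instead of the symlink (a sketch with a hypothetical image tag; the report above only states that removing the symlink avoids the problem):

# Replace the /var/run -> /run symlink with a plain directory, if present.
docker build -t example/busybox:no-var-run-symlink - <<'EOF'
FROM busybox
RUN [ -L /var/run ] && rm /var/run && mkdir /var/run || true
EOF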

Anything else we need to know?

Given that this occurs with different container runtimes and different kernels, I suspect this is a kubelet bug.

CC @alban

Kubernetes version

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.0", GitCommit:"ab69524f795c42094a6630298ff53f3c3ebab7f4", GitTreeState:"archive", BuildDate:"2021-12-09T17:56:21Z", GoVersion:"go1.17.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.3", GitCommit:"c92036820499fedefec0f847e2054d824aea6cd1", GitTreeState:"clean", BuildDate:"2021-10-27T18:35:25Z", GoVersion:"go1.16.9", Compiler:"gc", Platform:"linux/amd64"}

# tested also with
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.16", GitCommit:"46bb3f4471faba8c6a59ea27810ed4c425e44aec", GitTreeState:"clean", BuildDate:"2021-03-10T23:35:58Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}

Cloud provider

OS version

# On Linux:
$ cat /etc/os-release
NAME="Ubuntu"
VERSION="18.04.6 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.6 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic
$ uname -a
Linux test 5.4.0-1063-azure #66~18.04.1-Ubuntu SMP Thu Oct 21 09:59:28 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

# Tested also on:
$ cat /etc/os-release
NAME="Flatcar Container Linux by Kinvolk"
ID=flatcar
ID_LIKE=coreos
VERSION=2983.2.1
VERSION_ID=2983.2.1
BUILD_ID=2021-11-19-2019
PRETTY_NAME="Flatcar Container Linux by Kinvolk 2983.2.1 (Oklo)"
ANSI_COLOR="38;5;75"
HOME_URL="https://flatcar-linux.org/"
BUG_REPORT_URL="https://issues.flatcar-linux.org"
FLATCAR_BOARD="amd64-usr"
$ uname -a
Linux test 5.10.80-flatcar #1 SMP Fri Nov 19 19:28:59 -00 2021 x86_64 Intel Xeon Processor (Skylake, IBRS) GenuineIntel GNU/Linux

Install tools

$ minikube version
minikube version: v1.24.0
commit: 76b94fb3c4e8ac5062daf70d60cf03ddcc0a741b-dirty

Container runtime (CRI) and version (if applicable)

$ sudo docker version
Client: Docker Engine - Community
 Version:           20.10.10
 API version:       1.41
 Go version:        go1.16.9
 Git commit:        b485636
 Built:             Mon Oct 25 07:42:57 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.8
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.6
  Git commit:       75249d8
  Built:            Fri Jul 30 19:52:16 2021
  OS/Arch:          linux/amd64
  Experimental:     true
 containerd:
  Version:          1.4.11
  GitCommit:        5b46e404f6b9f661a205e28d59c982d3634148f8
 runc:
  Version:          1.0.2
  GitCommit:        v1.0.2-0-g52b36a2
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

# Tested also with containerd as runtime
$ containerd --version
containerd github.com/containerd/containerd 1.5.8 cde01e96ed658bc5050abe1bb601b4b4510ba7a2

Related plugins (CNI, CSI, ...) and versions (if applicable)

@invidian invidian added the kind/bug Categorizes issue or PR as related to a bug. label Dec 10, 2021
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Dec 10, 2021
@invidian
Member Author

/sig node

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Dec 10, 2021
@ehashman ehashman added this to Triage in SIG Node Bugs Dec 10, 2021
@SergeyKanzhelev
Member

The description says "Created mounts should be cleaned up properly when the Pod container gets recreated," but the repro steps don't include any "recreation" step. Can you please clarify?

/triage needs-information

@k8s-ci-robot k8s-ci-robot added the triage/needs-information Indicates an issue needs more information in order to work on it. label Dec 15, 2021
@SergeyKanzhelev SergeyKanzhelev moved this from Triage to Needs Information in SIG Node Bugs Dec 15, 2021
@invidian
Member Author

invidian commented Dec 15, 2021

The description says "Created mounts should be cleaned up properly when the Pod container gets recreated," but the repro steps don't include any "recreation" step. Can you please clarify?

What do you mean there is no recreation step? If you follow the reproduction steps, the last step will show you a growing number of mounts over time. There is also an example output in the next section. I'm not sure what needs to be clarified.

EDIT: I've moved the example output and clarified the last step to be more explicit about the problem.

invidian added a commit to inspektor-gadget/inspektor-gadget that referenced this issue Jan 3, 2022
Due to Kubernetes bug
kubernetes/kubernetes#106962, having a
/var/run symlink to /run in a container image may lead to node resource
exhaustion. To mitigate it, we plan to remove this symlink. However, when
we do that, /var/run/docker.sock path will no longer be valid.

To make it work and to align all container runtime socket paths, let's
change default Docker socket path from /var/run to just /run, which
should work on most modern distributions.

See also #433.

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>
invidian added a commit to inspektor-gadget/inspektor-gadget that referenced this issue Jan 5, 2022
Due to Kubernetes bug
kubernetes/kubernetes#106962, having a
/var/run symlink to /run in a container image may lead to node resource
exhaustion. To mitigate it, we plan to remove this symlink. However, when
we do that, /var/run/docker.sock path will no longer be valid.

To make it work and to align all container runtime socket paths, let's
change default Docker socket path from /var/run to just /run, which
should work on most modern distributions.

See also #433.

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>
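
For context, the mitigation described in the commit above boils down to addressing the Docker socket via its canonical /run path instead of going through the /var/run symlink. A minimal hostPath mount illustrating the idea (a sketch with hypothetical names, not taken from the inspektor-gadget change itself) could look like this:

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: docker-sock-example
spec:
  containers:
  - command:
    - sleep
    - "3600"
    image: busybox
    name: docker-sock-example
    volumeMounts:
    # Mount the Docker socket under /run instead of the /var/run symlink.
    - mountPath: /run/docker.sock
      name: docker-sock
  volumes:
  - hostPath:
      path: /run/docker.sock
      type: Socket
    name: docker-sock
EOF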
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 15, 2022
@invidian
Member Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 15, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 13, 2022
@invidian
Member Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 13, 2022
@matthyx
Contributor

matthyx commented Aug 29, 2022

/triage accepted
/priority backlog

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/backlog Higher priority than priority/awaiting-more-evidence. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Aug 29, 2022
@matthyx matthyx moved this from Needs Information to Triaged in SIG Node Bugs Aug 29, 2022
@matthyx
Contributor

matthyx commented Aug 29, 2022

/remove-triage needs-information

@k8s-ci-robot k8s-ci-robot removed the triage/needs-information Indicates an issue needs more information in order to work on it. label Aug 29, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 27, 2022
@invidian
Member Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 27, 2022
@k8s-triage-robot

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

  • Confirm that this issue is still relevant with /triage accepted (org members only)
  • Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. and removed triage/accepted Indicates an issue or PR is ready to be actively worked on. labels Jan 19, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 18, 2024