
/var/run symlink to /run in container image with /run host bidirectional mount causes duplicate host mounts leading to CPU exhaustion #106962

Open
invidian opened this issue Dec 10, 2021 · 13 comments
Labels
kind/bug: Categorizes issue or PR as related to a bug.
lifecycle/stale: Denotes an issue or PR has remained open with no activity and has become stale.
needs-triage: Indicates an issue or PR lacks a `triage/foo` label and requires one.
priority/backlog: Higher priority than priority/awaiting-more-evidence.
sig/node: Categorizes an issue or PR as relevant to SIG Node.

Comments

@invidian
Member

invidian commented Dec 10, 2021

What happened?

It seems that with the specific /run mount configuration described in the reproduction steps below, you can reach a situation where the mounts on the host file system double every time the Pod restarts, which eventually leads to general slowness and CPU exhaustion.

What did you expect to happen?

Created mounts should be cleaned up properly when the Pod container gets recreated.

How can we reproduce it (as minimally and precisely as possible)?

  1. Create a minikube cluster:
minikube start
  2. Create the offending pod:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: invidian-var-run-symlink-bug
spec:
  containers:
  - command:
    - "false"
    # This image must have "ln -s /run /var/run" executed to trigger the bug.
    image: invidian/buxybox:with-var-run-symlink
    name: invidian-var-run-symlink-bug
    securityContext:
      privileged: true
    volumeMounts:
    - mountPath: /run
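      # Together with the /var/run -> /run symlink in the image, bidirectional
      # propagation of the host /run mount is what triggers the duplicated mounts.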
      mountPropagation: Bidirectional
      name: run
  volumes:
  - hostPath:
      path: /run
    name: run
EOF
  3. Log in to the minikube machine:
minikube ssh
  4. Check the number of duplicate mounts in the host namespace. The printed number of duplicate mounts will grow over time, which is the subject of this bug; the expected behavior is a fixed number of mounts until the pod is removed. (A watch-loop variant of this check is sketched after the example output below.)
cat /proc/mounts | awk '{print $2}' | sort | uniq -c | sort -k1 -n | grep -v -E '( 1 |uuid)'

Example output from the reproduction command:

$ cat /proc/mounts | awk '{print $2}' | sort | uniq -c | sort -k1 -n | grep -v -E '( 1 |uuid)'
  16384 /var/lib/kubelet/pods/8528db9a-69c0-4c15-a58e-2cd343ff5b24/volumes/kubernetes.io~secret/default-token-xk9lc
  16389 /run/secrets/kubernetes.io/serviceaccount
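
If it helps, the check from step 4 can be wrapped in a small watch loop to see the count grow while the container keeps getting recreated (a sketch derived from the command above, not part of the original report):

while true; do
  # List mount points that appear more than once in the host mount table,
  # skipping per-uuid mounts, and print how often each one occurs.
  awk '{print $2}' /proc/mounts | sort | uniq -c | sort -k1 -n | grep -v -E '( 1 |uuid)'
  sleep 10
done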

And the Dockerfile for the image used:

FROM busybox

RUN ln -s /run /var/run

It seems removing the symlink fixes the issue.
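
A possible workaround until this is fixed on the kubelet side is to rebuild the image with a real /var/run directory instead of the symlink (a sketch with a hypothetical image tag; the report above only states that removing the symlink avoids the problem):

# Replace the /var/run -> /run symlink with a plain directory, if present.
docker build -t example/busybox:no-var-run-symlink - <<'EOF'
FROM busybox
RUN [ -L /var/run ] && rm /var/run && mkdir /var/run || true
EOF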

Anything else we need to know?

Given that this occurs with different container runtimes and different kernels, I suspect this is a kubelet bug.

CC @alban

Kubernetes version

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.0", GitCommit:"ab69524f795c42094a6630298ff53f3c3ebab7f4", GitTreeState:"archive", BuildDate:"2021-12-09T17:56:21Z", GoVersion:"go1.17.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.3", GitCommit:"c92036820499fedefec0f847e2054d824aea6cd1", GitTreeState:"clean", BuildDate:"2021-10-27T18:35:25Z", GoVersion:"go1.16.9", Compiler:"gc", Platform:"linux/amd64"}

# tested also with
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.16", GitCommit:"46bb3f4471faba8c6a59ea27810ed4c425e44aec", GitTreeState:"clean", BuildDate:"2021-03-10T23:35:58Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}

Cloud provider

OS version

# On Linux:
$ cat /etc/os-release
NAME="Ubuntu"
VERSION="18.04.6 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.6 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic
$ uname -a
Linux test 5.4.0-1063-azure #66~18.04.1-Ubuntu SMP Thu Oct 21 09:59:28 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

# Tested also on:
$ cat /etc/os-release
NAME="Flatcar Container Linux by Kinvolk"
ID=flatcar
ID_LIKE=coreos
VERSION=2983.2.1
VERSION_ID=2983.2.1
BUILD_ID=2021-11-19-2019
PRETTY_NAME="Flatcar Container Linux by Kinvolk 2983.2.1 (Oklo)"
ANSI_COLOR="38;5;75"
HOME_URL="https://flatcar-linux.org/"
BUG_REPORT_URL="https://issues.flatcar-linux.org"
FLATCAR_BOARD="amd64-usr"
$ uname -a
Linux test 5.10.80-flatcar #1 SMP Fri Nov 19 19:28:59 -00 2021 x86_64 Intel Xeon Processor (Skylake, IBRS) GenuineIntel GNU/Linux

Install tools

$ minikube version
minikube version: v1.24.0
commit: 76b94fb3c4e8ac5062daf70d60cf03ddcc0a741b-dirty

Container runtime (CRI) and version (if applicable)

$ sudo docker version
Client: Docker Engine - Community
 Version:           20.10.10
 API version:       1.41
 Go version:        go1.16.9
 Git commit:        b485636
 Built:             Mon Oct 25 07:42:57 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.8
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.6
  Git commit:       75249d8
  Built:            Fri Jul 30 19:52:16 2021
  OS/Arch:          linux/amd64
  Experimental:     true
 containerd:
  Version:          1.4.11
  GitCommit:        5b46e404f6b9f661a205e28d59c982d3634148f8
 runc:
  Version:          1.0.2
  GitCommit:        v1.0.2-0-g52b36a2
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

# Tested also with containerd as runtime
$ containerd --version
containerd github.com/containerd/containerd 1.5.8 cde01e96ed658bc5050abe1bb601b4b4510ba7a2

Related plugins (CNI, CSI, ...) and versions (if applicable)

@invidian invidian added the kind/bug Categorizes issue or PR as related to a bug. label Dec 10, 2021
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Dec 10, 2021
@invidian
Member Author

/sig node

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Dec 10, 2021
@ehashman ehashman added this to Triage in SIG Node Bugs Dec 10, 2021
@SergeyKanzhelev
Member

The description says "Created mounts should be cleaned up properly when the Pod container gets recreated," but the repro steps don't include any "recreation" step. Can you please clarify?

/triage needs-information

@k8s-ci-robot k8s-ci-robot added the triage/needs-information Indicates an issue needs more information in order to work on it. label Dec 15, 2021
@SergeyKanzhelev SergeyKanzhelev moved this from Triage to Needs Information in SIG Node Bugs Dec 15, 2021
@invidian
Member Author

invidian commented Dec 15, 2021

The description says "Created mounts should be cleaned up properly when the Pod container gets recreated," but the repro steps don't include any "recreation" step. Can you please clarify?

What do you mean there is no recreation step? If you follow the reproduction steps, the last step will show you a growing number of mounts over time. There is also an example output in the next section. I'm not sure what needs to be clarified.

EDIT: I've moved the example output and clarified the last step to be more explicit about the problem.

invidian added a commit to inspektor-gadget/inspektor-gadget that referenced this issue Jan 3, 2022
Due to Kubernetes bug
kubernetes/kubernetes#106962, having a
/var/run symlink to /run in a container image may lead to node resource
exhaustion. To mitigate it, we plan to remove this symlink. However, when
we do that, /var/run/docker.sock path will no longer be valid.

To make it work and to align all container runtime socket paths, let's
change default Docker socket path from /var/run to just /run, which
should work on most modern distributions.

See also #433.

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>
invidian added a commit to inspektor-gadget/inspektor-gadget that referenced this issue Jan 5, 2022
Due to Kubernetes bug
kubernetes/kubernetes#106962, having a
/var/run symlink to /run in a container image may lead to node resource
exhaustion. To mitigate it, we plan to remove this symlink. However, when
we do that, /var/run/docker.sock path will no longer be valid.

To make it work and to align all container runtime socket paths, let's
change default Docker socket path from /var/run to just /run, which
should work on most modern distributions.

See also #433.

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>
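
For context, the mitigation described in the commit above boils down to addressing the Docker socket via its canonical /run path instead of going through the /var/run symlink. A minimal hostPath mount illustrating the idea (a sketch with hypothetical names, not taken from the inspektor-gadget change itself) could look like this:

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: docker-sock-example
spec:
  containers:
  - command:
    - sleep
    - "3600"
    image: busybox
    name: docker-sock-example
    volumeMounts:
    # Mount the Docker socket under /run instead of the /var/run symlink.
    - mountPath: /run/docker.sock
      name: docker-sock
  volumes:
  - hostPath:
      path: /run/docker.sock
      type: Socket
    name: docker-sock
EOF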
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 15, 2022
@invidian
Member Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 15, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 13, 2022
@invidian
Member Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 13, 2022
@matthyx
Contributor

matthyx commented Aug 29, 2022

/triage accepted
/priority backlog

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/backlog Higher priority than priority/awaiting-more-evidence. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Aug 29, 2022
@matthyx matthyx moved this from Needs Information to Triaged in SIG Node Bugs Aug 29, 2022
@matthyx
Contributor

matthyx commented Aug 29, 2022

/remove-triage needs-information

@k8s-ci-robot k8s-ci-robot removed the triage/needs-information Indicates an issue needs more information in order to work on it. label Aug 29, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 27, 2022
@invidian
Member Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 27, 2022
@k8s-triage-robot

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

  • Confirm that this issue is still relevant with /triage accepted (org members only)
  • Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. and removed triage/accepted Indicates an issue or PR is ready to be actively worked on. labels Jan 19, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 18, 2024