/var/run symlink to /run in container image with /run host bidirectional mount causes duplicate host mounts leading to CPU exhaustion #106962
Comments
/sig node
the description says: "Created mounts to be cleaned up properly when Pod container gets recreated." but the repro steps don't include any "recreation" step. Can you please clarify? /triage needs-information
What do you mean there is no recreation step? If you follow the reproduction steps, the last step will show you a growing number of mounts over time. There is also an example output in the next section. I'm not sure what needs to be clarified. EDIT: I've moved the example output and clarified the last step to be more explicit about the problem.
Due to Kubernetes bug kubernetes/kubernetes#106962, having a /var/run symlink to /run in a container image may lead to node resource exhaustion. To mitigate it, we plan to remove this symlink. However, when we do that, the /var/run/docker.sock path will no longer be valid. To make it work and to align all container runtime socket paths, let's change the default Docker socket path from /var/run to just /run, which should work on most modern distributions. See also #433. Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale
/remove-lifecycle stale
/triage accepted
/remove-triage needs-information
This issue has not been updated in over 1 year, and should be re-triaged. You can:
For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/ /remove-triage accepted
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale
What happened?
It seems that with the specific configuration of /run mounts described in the reproduction steps below, you can reach a situation where mounts on the host file system double every time the Pod restarts, which eventually leads to general slowness and CPU exhaustion.
What did you expect to happen?
Created mounts to be cleaned up properly when the Pod container gets recreated.
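The title describes the triggering configuration. A minimal sketch of such a Pod spec, assuming an Alpine-style image whose /var/run is a symlink to /run; the Pod name, container name, and image tag here are hypothetical:

```yaml
# Illustrative sketch only; names and image are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: run-symlink-test
spec:
  containers:
  - name: test
    image: alpine:3.15            # image where /var/run is a symlink to /run
    command: ["sleep", "infinity"]
    securityContext:
      privileged: true            # required for Bidirectional propagation
    volumeMounts:
    - name: host-run
      mountPath: /var/run         # resolves to /run via the image symlink
      mountPropagation: Bidirectional
  volumes:
  - name: host-run
    hostPath:
      path: /run
```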
How can we reproduce it (as minimally and precisely as possible)?
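The original reproduction steps are not preserved in this excerpt. One way to observe the symptom is to count duplicate mount points in /proc/mounts after each Pod restart; sketched here with awk over sample lines (the kubelet pod path is illustrative):

```shell
# Count duplicate mount points; repeated entries for the same target are the
# symptom described above. Sample lines stand in for real /proc/mounts data.
printf '%s\n' \
  'tmpfs /run tmpfs rw 0 0' \
  'tmpfs /var/lib/kubelet/pods/x/volumes/host-run tmpfs rw 0 0' \
  'tmpfs /var/lib/kubelet/pods/x/volumes/host-run tmpfs rw 0 0' \
  'tmpfs /var/lib/kubelet/pods/x/volumes/host-run tmpfs rw 0 0' |
awk '{count[$2]++} END {for (m in count) if (count[m] > 1) print m, count[m]}'
# prints: /var/lib/kubelet/pods/x/volumes/host-run 3
```

On a real node you would run the awk program directly against /proc/mounts and watch the count grow across restarts.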
Example output from the reproduction command:
And the Dockerfile for the image used:
It seems removing the symlink fixes the issue.
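The Dockerfile itself is not included in this excerpt. A sketch of the workaround the author describes (removing the symlink), assuming an Alpine-style base where /var/run ships as a symlink to /run:

```dockerfile
# Sketch only: replace the /var/run -> /run symlink with a real directory
# so the bidirectionally mounted /run is not reached through it.
FROM alpine:3.15
RUN rm /var/run && mkdir /var/run
```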
Anything else we need to know?
Given that this occurs using different container runtimes and different kernels, I suspect this is a kubelet bug.
CC @alban
Kubernetes version
Cloud provider
OS version
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, ...) and versions (if applicable)