SystemOOMs not reported for containers #88868

Closed
dashpole opened this issue Mar 5, 2020 · 1 comment · Fixed by #88871
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@dashpole
Contributor

dashpole commented Mar 5, 2020

What happened:

No SystemOOM event was reported, even though the limit that was hit was the root cgroup's memory limit.

What you expected to happen:

A SystemOOM event should have been reported.

How to reproduce it (as minimally and precisely as possible):

Do not set --system-reserved or --kube-reserved, and set --eviction-hard= to an empty value to disable memory eviction on the kubelet. Run a pod without memory limits that slowly consumes memory, for example k8s.gcr.io/stress:v1 from https://github.com/vishh/stress, with the arguments "-mem-alloc-size", "100Mi", "-mem-alloc-sleep", "10s". This should trigger an OOM on the root cgroup, but no event is generated.

Anything else we need to know?:

cAdvisor populates ContainerName and VictimContainerName by matching the regexp: Task in (.*) killed as a result of limit of (.*). A SystemOOM should mean VictimContainerName == "/", since we are looking for OOMs that were "killed as a result of limit of /". However, the kubelet incorrectly checks ContainerName instead.
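
To make the mismatch concrete, here is a minimal, self-contained Go sketch. It is not the kubelet or cAdvisor source. The regexp is the one quoted above; the oomInfo struct, the helper names, and the sample kernel log line are hypothetical and exist only for illustration. It also assumes the first capture group maps to ContainerName and the second to VictimContainerName, in the order they are mentioned above.

```go
package main

import (
	"fmt"
	"regexp"
)

// containerRegexp is the pattern quoted in this issue.
var containerRegexp = regexp.MustCompile(`Task in (.*) killed as a result of limit of (.*)`)

// oomInfo is a hypothetical stand-in for the parsed OOM record.
// Assumption: group 1 -> ContainerName, group 2 -> VictimContainerName.
type oomInfo struct {
	ContainerName       string
	VictimContainerName string
}

// parseOOMLine extracts both cgroup paths from a kernel log line.
// It returns false when the line does not match the pattern.
func parseOOMLine(line string) (oomInfo, bool) {
	m := containerRegexp.FindStringSubmatch(line)
	if m == nil {
		return oomInfo{}, false
	}
	return oomInfo{ContainerName: m[1], VictimContainerName: m[2]}, true
}

// isSystemOOMBuggy mirrors the check described as incorrect in this issue:
// it looks at the cgroup the killed task was running in.
func isSystemOOMBuggy(o oomInfo) bool {
	return o.ContainerName == "/"
}

// isSystemOOM mirrors the check this issue asks for:
// it looks at the cgroup whose limit was hit.
func isSystemOOM(o oomInfo) bool {
	return o.VictimContainerName == "/"
}

func main() {
	// Hypothetical log line: a task in a pod cgroup is killed because
	// the root cgroup ("/") ran out of memory.
	line := "Task in /kubepods/besteffort/pod1234/abcd killed as a result of limit of /"

	info, ok := parseOOMLine(line)
	if !ok {
		fmt.Println("line did not match")
		return
	}
	fmt.Printf("task cgroup=%q limit cgroup=%q\n", info.ContainerName, info.VictimContainerName)
	fmt.Println("buggy check reports SystemOOM:", isSystemOOMBuggy(info))
	fmt.Println("intended check reports SystemOOM:", isSystemOOM(info))
}
```

With the sample line, the check on ContainerName misses the event because the killed task lives in a pod cgroup, while the check on VictimContainerName catches it because the limit that was hit belongs to "/".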

cc @kubernetes/sig-node-bugs @derekwaynecarr @sjenning @dchen1107

@dashpole dashpole added the kind/bug Categorizes issue or PR as related to a bug. label Mar 5, 2020
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Mar 5, 2020
@dashpole
Contributor Author

dashpole commented Mar 5, 2020

/sig node
/priority important-soon

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Mar 5, 2020