containerd fails to detect OOM events in kernel 4.19+ #74
Comments
/cc @crosbymichael
Hi, this is a kernel bug introduced in v4.19. It does not come from the commits Farrukh referenced, but from commit 29ef680a (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=29ef680a), which completely broke cgroups-v1 eventfd notification for OOM events. I have notified the appropriate kernel maintainers.
Thanks @holzman - appreciated!
A patch to fix this was accepted last week: torvalds/linux@7056d3a. Looks like it'll make it into the next release. Thank you, @holzman!
This has been fixed in version 4.20.2-1. The issue can be closed. Thanks!
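For readers unfamiliar with the cgroups-v1 eventfd notification discussed in the comments above, here is a minimal Python sketch of how a process can register for OOM events on a memory cgroup. It is not containerd's actual code. The function names and the cgroup_dir argument are hypothetical, and it assumes Linux with cgroup v1 mounted and Python 3.10+ (for os.eventfd). On an affected 4.19 kernel the read would presumably never return, since the notification is not delivered.

```python
import os


def event_control_line(event_fd: int, target_fd: int) -> str:
    """Format the string that cgroup v1 expects in cgroup.event_control:
    '<event_fd> <fd of the file being watched>'."""
    return f"{event_fd} {target_fd}"


def wait_for_oom(cgroup_dir: str) -> int:
    """Block until the kernel signals an OOM event for a cgroup-v1 memory cgroup.

    cgroup_dir is a hypothetical path such as /sys/fs/cgroup/memory/<group>.
    Returns the eventfd counter value read after the notification.
    """
    efd = os.eventfd(0, os.EFD_CLOEXEC)
    ofd = os.open(os.path.join(cgroup_dir, "memory.oom_control"), os.O_RDONLY)
    try:
        # Register the eventfd against memory.oom_control.
        with open(os.path.join(cgroup_dir, "cgroup.event_control"), "w") as ctl:
            ctl.write(event_control_line(efd, ofd))
        # Blocks until the kernel writes to the eventfd.
        return os.eventfd_read(efd)
    finally:
        os.close(ofd)
        os.close(efd)
```

Calling wait_for_oom on the memory cgroup of a container that exceeds its limit should return once the OOM killer runs, on kernels where the notification works.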
As reported in moby/moby#38352 by @Farrukh-Aftab (thanks!), containerd fails to detect OOM events on kernel 4.19, likely related to the following changes in the kernel. I'll copy the information here:
Description
Hello,
Apologies if the reported bug is a duplicate of another issue. I tried searching through the issues but didn't find anything similar.
There were some improvements made to the OOM killer in kernel 4.19 to make it more 'cgroup aware'. You can find the relevant commits under [1] in the additional information section. After the change went in, Docker isn't setting the .State.OOMKilled flag correctly despite the OOM killer being invoked.
To reproduce this, I have created a sample image named fakhan/sl7:oom-test. The entrypoint of the image is a program that consumes around 1024MB of memory. Creating this container with a memory limit lower than that should trigger the OOM killer. I have provided more information on the bug below.
BUG REPORT INFORMATION
Steps to reproduce the issue:
I have posted the snippets from /var/log/messages for both these instances under the additional information section.
Describe the results you received:
As shown above, OOMKilled evaluates to 'False' on kernel 4.19
Describe the results you expected:
OOMKilled should have evaluated to 'True' on kernel 4.19, just like on the previous versions.
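One way to check the flag programmatically is to parse the JSON that docker inspect prints for a container. The sketch below is illustrative only. The helper name is hypothetical, and the sample string is a hand-written stand-in, not output captured from the machines in this report.

```python
import json


def oom_killed(inspect_output: str) -> bool:
    """Return the .State.OOMKilled flag from `docker inspect <container>` output.

    docker inspect prints a JSON array with one object per container;
    this helper reads the first entry.
    """
    containers = json.loads(inspect_output)
    return bool(containers[0]["State"]["OOMKilled"])


# Hand-written stand-in for inspect output (not real data from this report).
sample = '[{"State": {"OOMKilled": false}}]'
print(oom_killed(sample))
```

With this report's bug, the helper would return False on kernel 4.19 even though the OOM killer had been invoked.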
Additional information you deem important (e.g. issue happens only occasionally):
[1]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=dc0b58643aff8b378086f25cce6789ccba68cbcb
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5989ad7b5ede38d605c588981f634c08252abfc3
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3d8b38eb81cac81395f6a823f6bf401b327268e6
[2]
/var/log/messages snippets are as follows. First for the older kernel:
Then for the newer kernel:
Output of docker version:
Docker version on both machines is the same (18.06)
Output of docker info:
Additional environment details (AWS, VirtualBox, physical, etc.):
Nothing special about the environment. I am running these commands on two bare metal boxes with different kernels.