Server silently fails to set memory cgroup, does not report to user or fail to start container #24559
Comments
So the kernel issue means that if you have ever created 64k containers with memory bounds, even if they are no longer running, the system will no longer be able to create memory-confined containers (and were it not for this bug, creation would fail rather than produce unconfined containers)?
We found that using Docker 1.8.3 with kernels 4.2, 4.4, and 4.5, containers created while fewer than 64k memory cgroups exist get memory isolation as normal; once the 64k limit is reached, containers are still created, but with no isolation. We did not test newer versions. Our tests used CoreOS 835, CoreOS 1010, and Ubuntu 16.04. We've reported this to CoreOS and kernel.org for upstream attention.
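The 64k threshold described above can be watched for by reading `/proc/cgroups`. A minimal sketch of such a check, not taken from the thread; the sample text, function name, and warning margin are illustrative:

```python
# Sketch: parse /proc/cgroups text and flag when the memory controller's
# cgroup count approaches the 16-bit ID space that affected pre-4.6 kernels.
# The sample string below stands in for the real file's contents.

MEM_CGROUP_ID_MAX = 65535  # assumed limit on affected kernels

def memory_cgroup_count(proc_cgroups_text: str) -> int:
    """Return num_cgroups for the memory controller.

    /proc/cgroups columns: subsys_name  hierarchy  num_cgroups  enabled
    """
    for line in proc_cgroups_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "memory":
            return int(fields[2])
    raise ValueError("memory controller not found in /proc/cgroups")

# Illustrative sample; on a real host, read open("/proc/cgroups").read().
sample = """#subsys_name\thierarchy\tnum_cgroups\tenabled
cpu\t3\t42\t1
memory\t7\t65530\t1
"""

count = memory_cgroup_count(sample)
if count >= MEM_CGROUP_ID_MAX - 100:  # arbitrary safety margin
    print(f"warning: {count} memory cgroups, nearing the 64k limit")
```

On an affected kernel, a count near 65535 would mean newly created containers may silently lack memory isolation.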
I tested kernel 4.6 (Arch), which carries the patch noted above, and it fixes this. Still a valuable advisory for folks with unpatched kernels. I leave it to the project to decide whether this should be closed.
Also tracked here: https://bugzilla.kernel.org/show_bug.cgi?id=124641
The fix is also going to be backported to 4.4: https://lkml.org/lkml/2016/7/13/864 I think this can be closed.
Thanks so much for the detailed report and links, @jgarcia-mesosphere, really appreciated. I think it's okay to keep it open for a short time to make it easier to find, but we can close after that (and when we know the "mainstream" distros carry this patch)
On Tue, Jul 12, 2016 at 03:16:40PM -0700, John Garcia wrote:
I'm interested in this part of the issue. Can you reproduce this using the latest
So I can't reproduce this issue on a
FWIW, the result is identical when no memory confinement is specified. Will try to reproduce on
Thanks for the repro @brauner; based on that, it looks to me like everything is good in the new version.
@justincormack wdyt, should we have a look at that error (for unpatched kernels)?
I also can't reproduce this with the latest Docker version on an unpatched
Yeah, all of the versions of runC that I can recall will hard-fail if you ask it to set up a cgroup and it then can't set it up. It might've been some weirdness happening with
@cyphar using the same procedure and kernel
Thanks all! Based on the feedback above, I'm closing this; the underlying issue will be fixed in the kernel, and recent versions of Docker won't silently ignore the problem, so looks like we're good to go 👍
I believe the 4.4.19 stable kernel has the fix, so this is no longer an issue (finally).
Environment details (AWS, VirtualBox, physical, etc.):
Using Docker 1.8.3 on CoreOS 835.6, we noticed that under conditions that make it impossible to set a memory cgroup, Docker permits containers to be created without memory isolation (instead of failing), resulting in a container with unbounded memory capacity. Note the cgroup structure created:
Steps to reproduce the issue:

1. Create 65535 memory cgroups (e.g., by launching containers with the `-m` or `--memory` option)
2. Check `/proc/cgroups` to verify there are 65535 memory cgroups
3. Launch a container with the `-m` or `--memory` option
4. Note that the container runs, but without the expected `/sys/fs/cgroup/memory/system.slice/docker-*` folder

Describe the results you received:
We saw what appeared to be the Docker daemon silently failing to set memory isolation.
Describe the results you expected:
We expected the daemon to refuse to run the container because isolation failed.
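The expected behavior can be checked after the fact by looking for the container's memory cgroup directory. A minimal sketch, assuming the systemd cgroup layout from the steps above; the function name and container ID are hypothetical:

```python
# Sketch: verify that a memory-limited container actually received a memory
# cgroup. The base path follows the systemd v1 layout mentioned in the repro
# steps; adjust for other cgroup drivers.
import glob

def has_memory_cgroup(container_id: str,
                      base: str = "/sys/fs/cgroup/memory/system.slice") -> bool:
    """Return True if a docker-<id>* memory cgroup directory exists under base."""
    return bool(glob.glob(f"{base}/docker-{container_id}*"))

# Hypothetical container ID; on an affected host this would print False for
# a container created after the 64k cgroup limit was hit.
print(has_memory_cgroup("deadbeef"))
```

If this returns False for a container started with `-m`, the daemon set no memory bound and the container is effectively unconfined.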
Additional information you deem important (e.g. issue happens only occasionally):
This is also tracked in MESOS-5836, and a patch has been suggested for the kernel (patch 9184539).