-
Notifications
You must be signed in to change notification settings - Fork 18.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make container shm parent unbindable #35830
Conversation
f04418d
to
14d75c5
Compare
I tried to test it, but I can't reproduce the "shm: device or resource busy" error on latest moby/moby even without this patch. My reproducer is something like [root@kir-ce73-gd 34672]# cat repro21.sh
#!/bin/bash
set -e
set -u
set -o pipefail
TOTAL_CTS=300 # this depends on RAM, about 100 CT per 1GB
IMG=busybox
CMD="sleep 0.2"
#CLI="./docker-1706"
CLI="docker"
RUN="$CLI run --rm --net=none $IMG $CMD"
# Make it fail even on RHEL 7.4
sysctl fs.may_detach_mounts=0
# Show which docker-cli we're using
echo RUN=$RUN
# Show client and server versions
$CLI version
echo
# load image, preheat
$RUN
time for i in $(seq 1 "$TOTAL_CTS"); do
printf "Iter: %5d Jobs: %5d \r" "$i" "$(jobs -p | wc -l)"
$RUN && echo -n . || echo -e \nERROR $? &
done
echo "Waiting..."
time wait On older dockerd I am getting the error, but on moby/moby compiled from current git I do not. |
Yes, this is why for the test I am actually checking the mount table in the container. |
The main problem with unmounts that prevent ebusy is the resources are never cleaned up until everything is released in all namespaces... Which could be a very long time. |
c998b30
to
7f1cb0f
Compare
712767a
to
411d0c7
Compare
daemon/container_operations_unix.go
Outdated
@@ -307,8 +327,13 @@ func (daemon *Daemon) setupConfigDir(c *container.Container) (setupErr error) { | |||
if err := os.Chown(fPath, rootIDs.UID+uid, rootIDs.GID+gid); err != nil { | |||
return errors.Wrap(err, "error setting ownership for config") | |||
} | |||
} | |||
|
|||
label.Relabel(localMountPath, c.MountLabel, false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to verify that this change actually works. It's the same as secrets, but could be the implementation is different further down the stack.
ba45ff5
to
d7b15ad
Compare
This is ready to go. |
At this point, commit message does not reflect what the patch does; can you please amend it? |
d7b15ad
to
f16b37f
Compare
Yep, thanks for pointing that out. Updated. |
It's a common scenario for admins and/or monitoring applications to mount in the daemon root dir into a container. When doing so all mounts get coppied into the container, often with private references. This can prevent removal of a container due to the various mounts that must be configured before a container is started (for example, for shared /dev/shm, or secrets) being leaked into another namespace, usually with private references. This is particularly problematic on older kernels (e.g. RHEL < 7.4) where a mount may be active in another namespace and attempting to remove a mountpoint which is active in another namespace fails. This change moves all container resource mounts into a common directory so that the directory can be made unbindable. What this does is prevents sub-mounts of this new directory from leaking into other namespaces when mounted with `rbind`... which is how all binds are handled for containers. Signed-off-by: Brian Goff <cpuguy83@gmail.com>
f16b37f
to
eaa5192
Compare
Ping |
SGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🐯
LGTM |
This prevents the container shm/secrets mount from leaking into other
containers, particularly when the daemon root is mounted into other
containers.
Kind of hackily implemented now since there's some assumption on the location of the shm mount path between packages...