Docker bails on systems with btrfs + SELinux w/o regard for SELinux status #7952
Comments
Note that issue #7318 seems to be related. The "fix" proposed there was to just disable SELinux. This doesn't fit the use case of folks who would like to use Docker in production (likely okay for the Docker *-dev versions they were using at the time). |
This section on file system support from the following article published on 9/3 may be helpful (although it doesn't fix your problem): https://opensource.com/business/14/9/security-for-docker
File system support
SELinux currently will only work with the device mapper back end. SELinux does not work with BTRFS. BTRFS does not support context mount labeling yet, which prevents SELinux from relabeling all content when the container starts via the mount command. Kernel engineers are working on a fix for this and potentially Overlayfs if it gets merged into the container. |
That's great info, thank you @estesp. I found the relevant bug on RHEL7's Bugzilla. It looks like Daniel Walsh has recently assigned someone to work on it, so there should be a fix coming from upstream. #6452 seems to be where the issue was introduced. I'm up for closing this, now that it's clear that it needs to be addressed upstream (wasn't clear to me until this morning) and it's linked to the primary bug report. |
It'd be awesome if we could document this issue a bit more clearly, maybe on the install pages for distros that support btrfs. I burned ~12 hours on it yesterday, so the "duplicate no effort" part of me is screaming to help others avoid that pain. |
Pull request #7956 includes doc updates for the issue described here. |
After updating to 1.2.0, I'm seeing #7709.
This is actually worse than the error at 1.1.2, as it fails without regard to the status of SELinux.
The pull request that introduced this behavior, #6452, claims that it is checking for the status of SELinux. Based on what I'm seeing, that doesn't appear to be the case. |
Now that #7956 and #7709 are closed, this is the correct place to talk about the SELinux + btrfs issue. As a recap, the 1.2.0 failure behavior is that Docker bails on btrfs systems without regard for the status of SELinux. The expected behavior is that if SELinux is in Permissive mode, Docker should continue. |
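The expected behavior in this recap can be written down as a small decision function. This is only a sketch of what the thread asks for, not Docker's actual startup code: the function name `startup_decision` and its three arguments are hypothetical, and the only real command assumed is `getenforce`, which prints Enforcing, Permissive, or Disabled.

```shell
# Hypothetical helper, NOT Docker's code. It only encodes the behavior
# requested in this thread.
# $1: SELinux mode as printed by getenforce (Enforcing, Permissive, Disabled)
# $2: Docker storage driver name (e.g. btrfs, devicemapper)
# $3: "yes" if the daemon was started with --selinux-enabled, else "no"
# Prints "refuse" or "start".
startup_decision() {
    mode=$1
    driver=$2
    flag=$3
    if [ "$driver" = "btrfs" ] && [ "$flag" = "yes" ] && [ "$mode" = "Enforcing" ]; then
        echo refuse
    else
        echo start
    fi
}
```

On a live system it could be called as `startup_decision "$(getenforce)" btrfs yes`. The point of the sketch is that only the Enforcing case on btrfs with the flag set would block startup.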
I believe that I'm hitting this bug as well. I'm running a fresh Fedora 20 install on btrfs. Are there _any_ workarounds aside from running Docker in a VM (which kind of defeats the purpose of Docker)? OS: Fedora 20 (fresh install)
|
Same bug (same error: "SELinux is not supported with the BTRFS graph driver!") on Fedora 21 (fresh install). I have tried SELinux in permissive and disabled states, with the same error. Are there any workarounds or fixes forecast, or should I reinstall using XFS or similar? |
The only workaround that I've found is to run something like boot2docker or CoreOS in a VM. |
Just adding myself to the list of users that are suffering from this issue: Fedora 20 (on BTRFS) Currently have SELinux disabled. |
Does removing --selinux-enabled from /etc/sysconfig/docker not fix the problem? If you have SELinux disabled and you are seeing this problem, then this is a bug. |
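For anyone who wants to script that change, here is a minimal sketch. It assumes the flag appears literally in the file (for example inside an `OPTIONS='...'` line, which is an assumption about the file layout, not something stated in this thread). The function name `remove_selinux_flag` is hypothetical. It keeps a backup copy next to the original.

```shell
# Remove --selinux-enabled from a sysconfig-style file.
# $1: path to the file (defaults to /etc/sysconfig/docker, the path named
#     in this thread).
# Assumption: the flag appears literally, e.g. in an OPTIONS='...' line.
remove_selinux_flag() {
    conf=${1:-/etc/sysconfig/docker}
    # Keep a backup so the change is easy to revert.
    cp -p "$conf" "$conf.bak" || return 1
    # Drop the flag plus one preceding space if there is one. Writing from
    # the backup avoids relying on the non-POSIX "sed -i".
    sed 's/ \{0,1\}--selinux-enabled//g' "$conf.bak" > "$conf"
}
```

The daemon has to be restarted afterwards for the change to take effect. As noted further down the thread, this trades away SELinux confinement for containers.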
Dear RHatDan, this "fix" works on my machine (Fedora 21 etc). Thanks for the nudge to this simple solution. I probably should have read (somewhere) that this file exists and needs configuration for SELinux! |
Thank you @rhatdan I hadn't realised that was there. Removing |
If you are on an SELinux-disabled system and you have the --selinux-enabled flag in the config, Docker is not supposed to do SELinux stuff, so it should work on BTRFS. If this is not the case, please open a bugzilla. There is work going on within the kernel team to get SELinux and BTRFS and Docker to play better together. |
Does anyone have a link to somewhere we can follow the status of context mount labelling on btrfs in the kernel? |
So far the discussions have happened on private emails within Red Hat. I will contact the people talking about this to make it public on the selinux mailing list. |
Including a summation, at least, of what's going on and what is being done. Thanks.
|
In the meantime, could we add a flag to docker to allow it to attempt starting anyway? Many of us aren't affected by the bug, but have been blocked from upgrading past 1.1.2 for some time now. |
Sammcj? Why are you blocked? Just remove selinux-enabled from /etc/sysconfig/docker and it should work. |
But then Docker won't run with SELinux extensions right?
|
Yes. You cannot use SELinux and the BTRFS back end at the same time. BTRFS does not support labels on the mount point yet. |
But it works on 1.1.2? |
Can it write to the container filesystem? |
|
Are you sure SELinux is enabled within the container, i.e. --selinux-enabled on your docker daemon?
|
D'oh! I totally forgot that the latest docker-engine no longer reads Ok, so with
Which suggests to me that Docker is starting and running on btrfs with SELinux, now it's just a policy issue. The second command:
|
If I run
|
Though, I'm still getting |
This means that the fix did not fix the fundamental problem with BTRFS. We need a mechanism to change the SELinux labels on all content based on the mount command. With devicemapper we can do this since they are full mounts: mount -o context="system_u:object_r:svirt_sandbox_file_t:s0:c1,c2" causes the kernel to treat all inodes within the mount point as having this label. In order for us to get BTRFS to work with SELinux in a docker image, we need this support. |
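To make the context-mount mechanism concrete, here is a small sketch that only prints the mount command rather than running it. The label type and level format come from the comment above. The function names, the device path `/dev/mapper/example`, and the mount point `/mnt/container` are placeholders made up for illustration.

```shell
# Build the value for mount's -o option so that every inode under the
# mount point is treated as having one SELinux label.
# $1, $2: MCS category numbers (chosen by the caller).
context_mount_opt() {
    printf 'context="system_u:object_r:svirt_sandbox_file_t:s0:c%s,c%s"\n' "$1" "$2"
}

# Print (do not run) the mount command.
# $1: device   $2: mount point   $3, $4: MCS category numbers
print_context_mount() {
    printf 'mount -o %s %s %s\n' "$(context_mount_opt "$3" "$4")" "$1" "$2"
}

# Example with placeholder names:
print_context_mount /dev/mapper/example /mnt/container 1 2
```

This works with devicemapper because each container gets its own filesystem and superblock. Later comments in this thread show the same option being rejected for a second btrfs subvolume of an already mounted filesystem.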
Well, it solved the OP by allowing Docker to start on a machine with SELinux enabled and with That's assuming that the Docker btrfs-based graph driver actually does a subvolume mount for containers like the devicemapper driver does. |
Note that my test system is now showing exactly the same problem as the OP, so I retract my previous statements. I have to do further research, obviously. |
Disregard the previous retraction. I had broken my VM in other ways. My Docker is now starting and running fine with SELinux enabled and btrfs, and can even start containers. It will then throw SELinux errors when trying to perform certain actions, I assume because there is no context sent to the subvolume mount. |
Correct, so the underlying problem is we have to get a way to label subvolumes differently based on mount -o context commands, or we have to solve the problem differently. We are investigating different ways of solving this for OverlayFS, where we label the upper writable content differently than the lower level, but we don't have this working either. :^( |
@rhatdan If OverlayFS isn't working with docker + SELinux, is the only way to run them together on top of devicemapper/thinp? I'd like to have SELinux enforcement around Docker. Thanks. |
Yes, currently the only solution is devicemapper/thinp. We have a patch to support SELinux and BTRFS. We are real close to getting overlayfs with SELinux support (although we have been real close for months). :^( |
I got a problem like this
and fixed by |
Are you using the new version of docker with btrfs/SELinux support (docker-1.10)? Otherwise this is expected, although the daemon should have blocked you from executing the command. If you are using docker-1.10, please attach the output of ausearch -m avc -ts recent after the failure. |
Still a problem in kernel-4.4.0-0.rc3.git0.1.fc24.x86_64. Given a Btrfs volume with subvolumes A, B, C in the top level, where A and B are each mounted with -o subvol=A and -o subvol=B (i.e. the top level itself is not mounted), when I try to mount subvolume C using -o subvol=C,context="system_u:object_r:svirt_sandbox_file_t:s0:c1,c2" I get:
[13301.295040] SELinux: mount invalid. Same superblock, different security settings for (dev sda7, type btrfs)
So the gist of the problem is that even though I'm asking for a separate fs tree and a mount-time context for it, it fails because apparently a given superblock can only have one context. With dmthinp, by contrast, a snapshot of a volume results in a completely independent filesystem copy with its own superblock. A possible workaround (untested) is to make a snapshot of the desired subvolume and recursively relabel its contents, rather than depending on mount -o context=. Not exactly ideal, of course.
$ mount /dev/sda7 /mnt
And now root2 can be mounted wherever and it'll have the desired labeling. On an SSD this took about 9 seconds with -v, and 6 seconds without. |
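Only the first command of that workaround survives in this copy of the thread, so the following is a reconstruction under stated assumptions rather than the commenter's exact steps. It prints a plan (mount the top level, snapshot, relabel with `chcon -R`) without executing anything. The source subvolume name `root` is a guess. `root2`, `/dev/sda7`, and `/mnt` come from the comment above, and `relabel_snapshot_plan` is a hypothetical name.

```shell
# Print the commands for the snapshot-and-relabel workaround.
# Nothing is executed here; review the output before running it as root.
# $1: block device                 (e.g. /dev/sda7)
# $2: mount point for the top level (e.g. /mnt)
# $3: source subvolume             (assumed name, e.g. root)
# $4: snapshot name                (e.g. root2)
# $5: SELinux context to apply recursively
relabel_snapshot_plan() {
    dev=$1
    top=$2
    src=$3
    snap=$4
    ctx=$5
    echo "mount $dev $top"
    echo "btrfs subvolume snapshot $top/$src $top/$snap"
    echo "chcon -R \"$ctx\" $top/$snap"
}

relabel_snapshot_plan /dev/sda7 /mnt root root2 \
    "system_u:object_r:svirt_sandbox_file_t:s0:c1,c2"
```

Because the labels are written to the files themselves, the snapshot can then be mounted with a plain `-o subvol=` and no `context=` option, which sidesteps the "same superblock" rejection.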
We have patches to do exactly this in docker now. It would be nice to fix the kernel though so we did not need to do the chcon -R change. |
Mounting with -o subvol= is just a bind mount behind the scenes. I'd expect -B -o context= to fail on XFS or ext4 with a similar message, for the same reason. But with overlayfs usage, maybe there's an advantage to fixing this problem for all filesystems. |
@cmurf I followed your chcon steps above, but I found that even if I use chcon -Rv to change the snapshot directory's SELinux label, mount -o subvol=xx,context="system_u:object_r:svirt_sandbox_file_t:s0:c1,c2" doesn't work. |
When using chcon -R on a snapshot, I don't use mount -o context at all. The = applies to the context mount option, it doesn't apply to chcon. |
@cmurf I see, so the steps would be only correct? |
Yes, except that chcon command needs a context specified in quotes. I used "system_u:object_r:svirt_sandbox_file_t:s0:c1,c2" only because I found it in some other example. I don't know if it's always c1,c2, I'm willing to bet that one of those values has to be unique in order for contexts to actually separately protect containers. If two containers have the same context, it'll probably still work but then there's a security hole I'd think. |
Correct. If you want BTRFS to work without the latest relabel patch, which I think is in docker-1.10, then you would need to change the labels to "system_u:object_r:svirt_sandbox_file_t:s0". This is a label all containers can read and write, which means that if a container broke out, it would be allowed to attack the file systems of other running containers. If you are running docker-1.10, BTRFS should just work with SELinux, although you will pay a startup penalty when you create a new container, usually 1-2 seconds, while docker relabels the content in an image. |
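To illustrate the difference between the shared label and a per-container one, here is a sketch that prints a label with two distinct random MCS categories. It is not how Docker picks labels. The function name is hypothetical, and the c0..c1023 range and the lower-number-first ordering are assumptions based on the usual targeted policy.

```shell
# Print an SELinux file label with two distinct random MCS categories.
# Assumptions: categories range over c0..c1023 and the lower number is
# written first. Illustration only, not Docker's algorithm.
# Caveat: awk's srand() seeds from the clock, so two calls within the same
# second can return the same pair. Real tooling would also have to track
# which pairs are already in use.
random_container_label() {
    awk 'BEGIN {
        srand()
        a = int(rand() * 1024)
        b = int(rand() * 1023)      # choose among the 1023 remaining values
        if (b >= a) b++             # skip over a so that a != b
        if (a > b) { t = a; a = b; b = t }
        printf "system_u:object_r:svirt_sandbox_file_t:s0:c%d,c%d\n", a, b
    }'
}

random_container_label
```

A label of this form could be passed to `chcon -R` on one container's snapshot, whereas the bare `s0` label above is shared by every container.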
@rhatdan Do I need to install docker-selinux to enable this? This doesn't look like it's been relabeled. Am I missing something? |
I am pretty sure the policy for Ubuntu is out of date. You need the latest virt.* policy and you need docker-selinux installed. I think you also need the lxc_contexts file installed in the contexts directory. On Fedora/RHEL/CentOS this is in /etc/selinux/targeted/contexts/lxc_contexts
|
@rhatdan
However, I got the error,
It looks like a security issue because if I use --privileged or run without '--selinux-enabled', ls can find all the shared libraries it needs, but I don't see why it can end up with such an error, do you have any ideas? |
I think this was fixed in 1.10 by #16452, and anyone who is still having issues after that should open a new issue. The fix is not perfect, as there is a startup cost, but it works and any other fix would need to be upstream in Linux. |
I'm seeing "Permission denied" errors from shared library code on Fedora 20 systems using btrfs.
I've replicated this on two bare metal up-to-date Fedora 20 systems running btrfs, including one that was totally fresh. As a control, everything works fine on an up-to-date Fedora 20 system where Docker uses the devicemapper storage driver (running on an OpenStack deployment, also a very fresh install).
Steps to reproduce:
The exact result on btrfs storage driver systems varies with the command I attempt to run, but generally some shared lib fails to load with a permission denied error.
Echo (sudo docker run fedora echo "hello world") causes:
A Bash shell (sudo docker run fedora /bin/bash) causes:
Results on devicemapper storage driver system are as expected.
The output of the three systems is identical for docker version:
Output of docker info for the (failing) fresh btrfs system:
Output of docker info for the (working) nearly-fresh devicemapper system:
The (working) devicemapper install and the (failing) totally fresh btrfs install have the same kernel:
The (failing) not-fresh btrfs install has a newer kernel:
Note that I can prevent this error, even on the btrfs systems, by either putting SELinux in Permissive mode (sudo setenforce 0, as described by the OP here) or passing the --privileged flag (sudo docker run --privileged fedora echo "hello world", as described here).
My libselinux version is the same on all three systems:
This Docker GitHub Issue comment mentioned switching from aufs to devicemapper as a fix for a seemingly similar issue. That may support the theory that this is a btrfs-related issue.
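A quick way to check whether a system matches the failing combination in this report is to compare the storage driver from `docker info` with the SELinux mode from `getenforce`. The sketch below assumes `docker info` prints a line beginning with "Storage Driver:". The function names and the warning text are made up for illustration.

```shell
# Read `docker info` output on stdin and print the storage driver name.
storage_driver() {
    sed -n 's/^ *Storage Driver: *//p'
}

# Report whether the combination discussed in this thread is present.
# $1: storage driver   $2: SELinux mode as printed by getenforce
check_combo() {
    if [ "$1" = "btrfs" ] && [ "$2" = "Enforcing" ]; then
        echo "warning: btrfs storage driver with SELinux enforcing"
    else
        echo "ok"
    fi
}
```

On a live system this could be run as `check_combo "$(sudo docker info | storage_driver)" "$(getenforce)"`.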