Docker bails on systems with btrfs + SELinux w/o regard for SELinux status #7952

Closed
dfarrell07 opened this Issue Sep 9, 2014 · 83 comments

@dfarrell07
Contributor

dfarrell07 commented Sep 9, 2014

I'm seeing "Permission denied" errors from shared library code on Fedora 20 systems using btrfs.

I've replicated this on two bare metal up-to-date Fedora 20 systems running btrfs, including one that was totally fresh. As a control, everything works fine on an up-to-date Fedora 20 system where Docker uses the devicemapper storage driver (running on an OpenStack deployment, also a very fresh install).

Steps to reproduce:

sudo yum update -y
sudo yum install docker-io -y
sudo systemctl start docker
sudo docker pull fedora
sudo docker run fedora <some command>

The exact result on btrfs storage driver systems varies with the command I attempt to run, but generally some shared lib fails to load with a permission denied error.

Echo (sudo docker run fedora echo "hello world") causes:

echo: error while loading shared libraries: libc.so.6: cannot open shared object file: Permission denied

A Bash shell (sudo docker run fedora /bin/bash) causes:

/bin/bash: error while loading shared libraries: libtinfo.so.5: cannot open shared object file: Permission denied

Results on devicemapper storage driver system are as expected.

The output of the three systems is identical for docker version:

[~]$ sudo docker version
Client version: 1.1.2
Client API version: 1.13
Go version (client): go1.2.2
Git commit (client): d84a070/1.1.2
Server version: 1.1.2
Server API version: 1.13
Go version (server): go1.2.2
Git commit (server): d84a070/1.1.2

Output of docker info for the (failing) fresh btrfs system:

[~]$ sudo docker info
Containers: 4
Images: 3
Storage Driver: btrfs
Execution Driver: native-0.2
Kernel Version: 3.11.10-301.fc20.x86_64

Output of docker info for the (working) nearly-fresh devicemapper system:

[~]$ sudo docker info
Containers: 3
Images: 5
Storage Driver: devicemapper
 Pool Name: docker-252:1-131261-pool
 Data file: /var/lib/docker/devicemapper/devicemapper/data
 Metadata file: /var/lib/docker/devicemapper/devicemapper/metadata
 Data Space Used: 975.2 Mb
 Data Space Total: 102400.0 Mb
 Metadata Space Used: 1.3 Mb
 Metadata Space Total: 2048.0 Mb
Execution Driver: native-0.2
Kernel Version: 3.11.10-301.fc20.x86_64

The (working) devicemapper install and the (failing) totally fresh btrfs install have the same kernel:

# Fresh (failing) btrfs F20 install
[~]$ uname -a
Linux localhost.localdomain 3.11.10-301.fc20.x86_64 #1 SMP Thu Dec 5 14:01:17 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
# Nearly-fresh (working) devicemapper F20 install
[~]$ uname -a
Linux dfarrell-odl-docker 3.11.10-301.fc20.x86_64 #1 SMP Thu Dec 5 14:01:17 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

The (failing) not-fresh btrfs install has a newer kernel:

# Not-fresh (failing) btrfs F20 install
[~]$ uname -a
Linux localhost.localdomain 3.15.10-201.fc20.x86_64 #1 SMP Wed Aug 27 21:10:06 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Note that I can prevent this error, even on the btrfs systems, by either putting SELinux in Permissive mode (sudo setenforce 0, as described by the OP here) or passing the --privileged flag (sudo docker run --privileged fedora echo "hello world", as described here).

# Example of proper behavior on fresh btrfs system when SELinux is in Permissive mode
[~]$ getenforce
Enforcing
[~]$ sudo setenforce 0
[~]$ getenforce
Permissive
[~]$ sudo docker run fedora echo "hello world"
hello world
[~]$ sudo setenforce 1
[~]$ sudo docker run fedora echo "hello world"
echo: error while loading shared libraries: libc.so.6: cannot open shared object file: Permission denied
# Example of proper behavior on not-fresh btrfs system when --privileged flag passed
[~]$ sudo getenforce
Enforcing
[~]$ sudo docker run --privileged fedora echo "hello world"
hello world

My libselinux version is the same on all three systems:

[~]$ sudo yum info libselinux
Installed Packages
Name        : libselinux
Arch        : x86_64
Version     : 2.2.1
Release     : 6.fc20
<snip>

This Docker GitHub Issue comment mentioned switching from aufs to devicemapper as a fix for a seemingly similar issue. That may support the theory that this is a btrfs-related issue.

@dfarrell07

Contributor

dfarrell07 commented Sep 9, 2014

Note that issue #7318 seems to be related.

The "fix" proposed there was to just disable SELinux. This doesn't fit the use case of folks who would like to use Docker in production (likely okay for the Docker *-dev versions they were using at the time).

@estesp

Contributor

estesp commented Sep 9, 2014

This section on file system support from the following article published on 9/3 may be helpful (although it doesn't fix your problem): https://opensource.com/business/14/9/security-for-docker

File system support

SELinux currently will only work with the device mapper back end. SELinux does not work with BTRFS. BTRFS does not support context mount labeling yet, which prevents SELinux from relabeling all content when the container starts via the mount command. Kernel engineers are working on a fix for this and potentially Overlayfs if it gets merged into the container.

@dfarrell07

Contributor

dfarrell07 commented Sep 9, 2014

SELinux currently will only work with the device mapper back end. SELinux does not work with BTRFS. BTRFS does not support context mount labeling yet, which prevents SELinux from relabeling all content when the container starts via the mount command. Kernel engineers are working on a fix for this and potentially Overlayfs if it gets merged into the container.

That's great info, thank you @estesp.

I found the relevant bug on RHEL7's Bugzilla. It looks like Daniel Walsh has recently assigned someone to work on it, so there should be a fix coming from upstream.

#6452 seems to be where the issue was introduced.

I'm up for closing this, now that it's clear that it needs to be addressed upstream (wasn't clear to me until this morning) and it's linked to the primary bug report.

@dfarrell07

Contributor

dfarrell07 commented Sep 9, 2014

It'd be awesome if we could document this issue a bit more clearly, maybe on the install pages for distros that support btrfs. I burned ~12 hours on it yesterday, so the "duplicate no effort" part of me is screaming to help others avoid that pain.

@dfarrell07

Contributor

dfarrell07 commented Sep 9, 2014

Pull request #7956 includes doc updates for the issue described here.

@dfarrell07

Contributor

dfarrell07 commented Sep 14, 2014

After updating to 1.2.0, I'm seeing #7709.

[~]$ sudo docker version
Client version: 1.2.0
Client API version: 1.14
Go version (client): go1.2.2
Git commit (client): fa7b24f/1.2.0
OS/Arch (client): linux/amd64
2014/09/14 16:42:45 Cannot connect to the Docker daemon. Is 'docker -d' running on this host?
[~]$ sudo systemctl start docker
Job for docker.service failed. See 'systemctl status docker.service' and 'journalctl -xn' for details.

journalctl -xn gives this relevant info:

Sep 14 16:43:20 localhost.localdomain docker[6368]: 2014/09/14 16:43:20 SELinux is not supported with the BTRFS graph driver!

This is actually worse than the error at 1.1.2, as it fails without regard to the status of SELinux.

[~]$ getenforce
Permissive

The pull request that introduced this behavior, #6452, claims that it is checking for the status of SELinux. Based on what I'm seeing, that doesn't appear to be the case.

@dfarrell07

Contributor

dfarrell07 commented Sep 16, 2014

Now that #7956 and #7709 are closed, this is the correct place to talk about the SELinux + btrfs issue.

As a recap, the 1.2.0 failure behavior is that Docker bails on btrfs systems without regard for the status of SELinux. The expected behavior is that if SELinux is in Permissive mode, Docker should continue.
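The expected behavior described above can be sketched as a small decision function. This is a hypothetical illustration, not Docker's actual code: the names `selinux_mode` and `should_refuse_btrfs` are invented here, and the sketch assumes the kernel exposes the enforcing state at `/sys/fs/selinux/enforce` (`1` for Enforcing, `0` for Permissive) whenever SELinux is enabled.

```python
import os


def selinux_mode(enforce_path="/sys/fs/selinux/enforce"):
    """Return 'Disabled', 'Permissive' or 'Enforcing' (assumed selinuxfs layout)."""
    if not os.path.exists(enforce_path):
        # No selinuxfs entry: treat SELinux as disabled.
        return "Disabled"
    with open(enforce_path) as f:
        return "Enforcing" if f.read().strip() == "1" else "Permissive"


def should_refuse_btrfs(graphdriver, selinux_flag, mode):
    """Hypothetical check: refuse to start only when SELinux would really deny access."""
    return graphdriver == "btrfs" and selinux_flag and mode == "Enforcing"
```

With a check shaped like this, a Permissive or Disabled host would start the daemon on btrfs, which is the behavior requested in this issue.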

@dfarrell07 dfarrell07 changed the title from "SELinux + btrfs causes shared lib permission errors" to "Docker bails on systems with SELinux + btrfs" on Sep 16, 2014

@voxadam

voxadam commented Sep 19, 2014

I believe that I'm hitting this bug as well. I'm running a fresh Fedora 20 install on btrfs. Are there _any_ workarounds aside from running Docker in a VM (which kind of defeats the purpose of Docker)?

OS: Fedora 20 (fresh install)
Kernel: 3.16.2-200.fc20.x86_64 #1 SMP
Docker: 1.2.0 (2.fc20)
libselinux: 2.2.1 (6.fc20)

adam@dekatron ~/bin $ sudo setenforce 0
adam@dekatron ~/bin $ sudo getenforce
Permissive
adam@dekatron ~/bin $ sudo systemctl start docker
Job for docker.service failed. See 'systemctl status docker.service' and 'journalctl -xn' for details.
adam@dekatron ~/bin $ sudo journalctl -xn
-- Logs begin at Wed 2014-09-17 14:41:46 PDT, end at Fri 2014-09-19 16:16:52 PDT. --
Sep 19 16:16:49 dekatron.voxadam.com sudo[14285]: adam : TTY=pts/2 ; PWD=/home/adam/bin ; USER=root ; COMMAND=/bin/systemctl start docker
Sep 19 16:16:49 dekatron.voxadam.com systemd[1]: Starting Docker Application Container Engine...
-- Subject: Unit docker.service has begun with start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit docker.service has begun starting up.
Sep 19 16:16:49 dekatron.voxadam.com docker[14288]: 2014/09/19 16:16:49 docker daemon: 1.2.0 fa7b24f/1.2.0; execdriver: native; graphdriver:
Sep 19 16:16:49 dekatron.voxadam.com docker[14288]: [072bebee] +job serveapi(fd://)
Sep 19 16:16:49 dekatron.voxadam.com docker[14288]: [info] Listening for HTTP on fd ()
Sep 19 16:16:49 dekatron.voxadam.com docker[14288]: 2014/09/19 16:16:49 SELinux is not supported with the BTRFS graph driver!
Sep 19 16:16:49 dekatron.voxadam.com systemd[1]: docker.service: main process exited, code=exited, status=1/FAILURE
Sep 19 16:16:49 dekatron.voxadam.com systemd[1]: Failed to start Docker Application Container Engine.
-- Subject: Unit docker.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit docker.service has failed.
-- 
-- The result is failed.
Sep 19 16:16:49 dekatron.voxadam.com systemd[1]: Unit docker.service entered failed state.
Sep 19 16:16:52 dekatron.voxadam.com sudo[14306]: adam : TTY=pts/2 ; PWD=/home/adam/bin ; USER=root ; COMMAND=/bin/journalctl -xn
adam@dekatron ~/bin $ uname -a
Linux dekatron.voxadam.com 3.16.2-200.fc20.x86_64 #1 SMP Mon Sep 8 11:54:45 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

@brightonbob

brightonbob commented Sep 20, 2014

Same bug (same error - SELinux is not supported with the BTRFS graph driver!) on Fedora 21 (fresh install)
Kernel: 3.16.3-300.fc21.x86_64 #1 SMP
Docker: 1.2.0-2.fc21
libselinux: 2.3-4.fc21

I have tried SELinux in permissive and disabled states - same error.

Are there any workarounds or fixes forecast, or should I reinstall using XFS or similar?

@voxadam

voxadam commented Sep 20, 2014

The only workaround that I've found is to run something like boot2docker or CoreOS in a VM.

@rosstimson

rosstimson commented Sep 21, 2014

Just adding myself to the list of users that are suffering from this issue:

Fedora 20 (on BTRFS)
Kernel: 3.16.2-201.fc20.x86_64 #1 SMP
Docker: 1.2.0-2.fc20
libselinux: 2.2.1-6.fc20

Currently have SELinux disabled.

@voxadam

voxadam commented Sep 22, 2014

I suppose that I should mention that there are a couple related bugs in the Red Hat Bugzilla system. There's one for RHEL, and one for Fedora.

@dfarrell07 dfarrell07 changed the title from "Docker bails on systems with SELinux + btrfs" to "Docker bails on systems with btrfs" on Sep 22, 2014

@dfarrell07 dfarrell07 changed the title from "Docker bails on systems with btrfs" to "Docker bails on systems with btrfs + SELinux w/o regard for SELinux status" on Sep 22, 2014

@rhatdan

Contributor

rhatdan commented Sep 22, 2014

Does removing --selinux-enabled from /etc/sysconfig/docker not fix the problem?

If you have SELinux disabled and you are seeing this problem, then this is a bug.
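The edit suggested above can be sketched as a line filter. This is only an illustration under stated assumptions: the helper name `drop_selinux_flag` is made up, and it assumes the packaged /etc/sysconfig/docker carries the daemon flags in a single `OPTIONS=...` line, optionally wrapped in quotes. It does not handle a `--selinux-enabled=true` spelling.

```python
def drop_selinux_flag(line):
    """Remove --selinux-enabled from an OPTIONS=... line; return other lines unchanged.

    Expects a single line without its trailing newline.
    """
    key, sep, value = line.partition("=")
    if not sep or key.strip() != "OPTIONS":
        return line
    # Strip surrounding shell quotes, then split the flags on whitespace.
    flags = value.strip().strip("'\"").split()
    kept = [flag for flag in flags if flag != "--selinux-enabled"]
    return "OPTIONS='{}'".format(" ".join(kept))
```

The daemon has to be restarted (for example with systemctl) before the changed options take effect.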

@brightonbob

brightonbob commented Sep 22, 2014

Dear RHatDan

This "fix" works on my machine (Fedora 21 etc). Thanks for the nudge to this simple solution. Probably should have read (somewhere) that this file exists & needs configuration for SE linux!

@rosstimson

rosstimson commented Sep 22, 2014

Thank you @rhatdan I hadn't realised that was there. Removing --selinux-enabled from /etc/sysconfig/docker does indeed get things working again.

@rhatdan

Contributor

rhatdan commented Sep 22, 2014

If you are on an SELinux-disabled system and you have the --selinux-enabled flag in the config, docker is not supposed to do SELinux stuff, so it should work on BTRFS.

If this is not the case, please open a bugzilla.

There is work going on within the kernel team to get SELinux and BTRFS and Docker to play better together.

@ross-w

ross-w commented Nov 5, 2014

Does anyone have a link to somewhere we can follow the status of context mount labelling on btrfs in the kernel?

@rhatdan

Contributor

rhatdan commented Nov 6, 2014

So far the discussions have happened on private emails within Red Hat. I will contact the people talking about this to make it public on the selinux mailing list.

@voxadam

voxadam commented Nov 6, 2014

Including at least a summary of what's going on, and of what is being done to mitigate this rather large and important issue (at least for those of us with affected configs), in this and related bug reports would be greatly appreciated. While I'd love to follow the SELinux list, I simply don't have the mental bandwidth to subscribe to yet another list.

Thanks.

@sammcj

sammcj commented Nov 13, 2014

In the meantime, could we add a flag to docker to allow it to attempt starting anyway?

Many of us aren't affected by the bug, but have been blocked from upgrading past 1.1.2 for some time now.

@rhatdan

Contributor

rhatdan commented Nov 13, 2014

Sammcj? Why are you blocked? Just remove --selinux-enabled from /etc/sysconfig/docker and it should work.

@sammcj

sammcj commented Nov 13, 2014

But then Docker won't run with SELinux extensions right?

@rhatdan

Contributor

rhatdan commented Nov 13, 2014

Yes. You cannot use SELinux and the BTRFS back end at the same time. BTRFS does not support labels on the mount point yet.

@sammcj

sammcj commented Nov 13, 2014

But it works on 1.1.2?

@stephensmalley

stephensmalley commented Sep 2, 2015

Can it write to the container filesystem?
What does 'docker run fedora grep context /proc/self/mounts' show?

@Djelibeybi

Contributor

Djelibeybi commented Sep 2, 2015

  1. Yes, it can:
$ docker run fedora echo "Hello World" >> test && cat test
Hello World
  2. It shows nothing, i.e. no result. The entire content of /proc/self/mounts is:
$ docker run fedora cat /proc/self/mounts
/dev/sdb1 / btrfs rw,seclabel,relatime,space_cache 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /dev tmpfs rw,seclabel,nosuid,mode=755 0 0
devpts /dev/pts devpts rw,seclabel,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=666 0 0
shm /dev/shm tmpfs rw,seclabel,nosuid,nodev,noexec,relatime,size=65536k 0 0
mqueue /dev/mqueue mqueue rw,seclabel,nosuid,nodev,noexec,relatime 0 0
sysfs /sys sysfs ro,seclabel,nosuid,nodev,noexec,relatime 0 0
tmpfs /sys/fs/cgroup tmpfs ro,seclabel,nosuid,nodev,noexec,relatime,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup ro,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
cgroup /sys/fs/cgroup/cpuset cgroup ro,nosuid,nodev,noexec,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup ro,nosuid,nodev,noexec,relatime,cpu,cpuacct 0 0
cgroup /sys/fs/cgroup/blkio cgroup ro,nosuid,nodev,noexec,relatime,blkio 0 0
cgroup /sys/fs/cgroup/memory cgroup ro,nosuid,nodev,noexec,relatime,memory 0 0
cgroup /sys/fs/cgroup/devices cgroup ro,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/freezer cgroup ro,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/net_cls,net_prio cgroup ro,nosuid,nodev,noexec,relatime,net_cls,net_prio 0 0
cgroup /sys/fs/cgroup/perf_event cgroup ro,nosuid,nodev,noexec,relatime,perf_event 0 0
cgroup /sys/fs/cgroup/hugetlb cgroup ro,nosuid,nodev,noexec,relatime,hugetlb 0 0
/dev/sdb1 /etc/resolv.conf btrfs rw,seclabel,relatime,space_cache 0 0
/dev/sdb1 /etc/hostname btrfs rw,seclabel,relatime,space_cache 0 0
/dev/sdb1 /etc/hosts btrfs rw,seclabel,relatime,space_cache 0 0
proc /proc/asound proc ro,nosuid,nodev,noexec,relatime 0 0
proc /proc/bus proc ro,nosuid,nodev,noexec,relatime 0 0
proc /proc/fs proc ro,nosuid,nodev,noexec,relatime 0 0
proc /proc/irq proc ro,nosuid,nodev,noexec,relatime 0 0
proc /proc/sys proc ro,nosuid,nodev,noexec,relatime 0 0
proc /proc/sysrq-trigger proc ro,nosuid,nodev,noexec,relatime 0 0
tmpfs /proc/kcore tmpfs rw,seclabel,nosuid,mode=755 0 0
tmpfs /proc/timer_stats tmpfs rw,seclabel,nosuid,mode=755 0 0
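The `grep context` check requested above can also be done programmatically. The following is a minimal sketch with an invented helper name, `mounts_with_context`. It assumes the usual /proc/self/mounts layout (device, mount point, filesystem type, options, two numeric fields) and looks only for the `context=`, `fscontext=`, `defcontext=` and `rootcontext=` mount options. The `seclabel` flag seen in the listing is deliberately not matched, since it is a different option from a mount-time context.

```python
import re

# Matches a context-style option at the start of the option string or after a comma.
_CONTEXT_OPT = re.compile(r"(?:^|,)(?:fs|def|root)?context=")


def mounts_with_context(mounts_text):
    """Return the mount points whose options include an SELinux context option."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and _CONTEXT_OPT.search(fields[3]):
            found.append(fields[1])
    return found
```

Fed the listing above, this returns an empty list, which matches the "no result" answer: none of the container's mounts carries a context option.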
@rhatdan

Contributor

rhatdan commented Sep 3, 2015

Are you sure SELinux is enabled within the container, i.e. that --selinux-enabled is set on your docker daemon?

docker run fedora ls -lZ /
docker run fedora cat /proc/self/attr/current
@Djelibeybi

Contributor

Djelibeybi commented Sep 3, 2015

D'oh! I totally forgot that the latest docker-engine no longer reads /etc/sysconfig/docker on OL7. My bad.

Ok, so with --selinux-enabled the container still runs, I just get SELinux exceptions:

$ docker run fedora ls -lZ /
ls: cannot open directory /: Permission denied

Which suggests to me that Docker is starting and running on btrfs with SELinux; now it's just a policy issue.

The second command:

$ docker run fedora cat /proc/self/attr/current
system_u:system_r:svirt_lxc_net_t:s0:c56,c894
@Djelibeybi

Contributor

Djelibeybi commented Sep 3, 2015

If I run restorecon on /var/lib/docker then things start to work again:

# docker run --rm -t -i fedora ls -lZ /
total 16
lrwxrwxrwx.   1 root root system_u:object_r:docker_var_lib_t:s0                  7 Aug 16  2014 bin -> usr/bin
dr-xr-xr-x.   1 root root system_u:object_r:docker_var_lib_t:s0                  0 Aug 16  2014 boot
drwxr-xr-x.   5 root root system_u:object_r:svirt_sandbox_file_t:s0:c406,c977  380 Sep  3 21:00 dev
drwxr-xr-x.   1 root root system_u:object_r:docker_var_lib_t:s0               1856 Sep  3 21:00 etc
drwxr-xr-x.   1 root root system_u:object_r:docker_var_lib_t:s0                  0 May 21 23:59 home
lrwxrwxrwx.   1 root root system_u:object_r:docker_var_lib_t:s0                  7 Aug 16  2014 lib -> usr/lib
lrwxrwxrwx.   1 root root system_u:object_r:docker_var_lib_t:s0                  9 Aug 16  2014 lib64 -> usr/lib64
drwx------.   1 root root system_u:object_r:docker_var_lib_t:s0                  0 May 21 23:58 lost+found
drwxr-xr-x.   1 root root system_u:object_r:docker_var_lib_t:s0                  0 Aug 16  2014 media
drwxr-xr-x.   1 root root system_u:object_r:docker_var_lib_t:s0                  0 Aug 16  2014 mnt
drwxr-xr-x.   1 root root system_u:object_r:docker_var_lib_t:s0                  0 Aug 16  2014 opt
dr-xr-xr-x. 269 root root system_u:object_r:proc_t:s0                            0 Sep  3 21:00 proc
dr-xr-x---.   1 root root system_u:object_r:docker_var_lib_t:s0                120 May 21 23:59 root
drwxr-xr-x.   1 root root system_u:object_r:docker_var_lib_t:s0                  0 May 21 23:58 run
lrwxrwxrwx.   1 root root system_u:object_r:docker_var_lib_t:s0                  8 Aug 16  2014 sbin -> usr/sbin
drwxr-xr-x.   1 root root system_u:object_r:docker_var_lib_t:s0                  0 Aug 16  2014 srv
dr-xr-xr-x.  13 root root system_u:object_r:sysfs_t:s0                           0 Sep  3 20:52 sys
drwxrwxrwt.   1 root root system_u:object_r:docker_var_lib_t:s0                160 May 21 23:59 tmp
drwxr-xr-x.   1 root root system_u:object_r:docker_var_lib_t:s0                100 May 21 23:58 usr
drwxr-xr-x.   1 root root system_u:object_r:docker_var_lib_t:s0                160 May 21 23:58 var
@Djelibeybi

Contributor

Djelibeybi commented Sep 3, 2015

Though, I'm still getting Permission Denied errors trying to create files, so something's not working with labelling of the filesystem. So, progress?

@rhatdan

Contributor

rhatdan commented Sep 4, 2015

This means that the fix did not fix the fundamental problem with BTRFS. We need a mechanism to change the SELinux labels on all content based on the mount command. With devicemapper we can do this since they are full mounts.

mount -o context="system_u:object_r:svirt_sandbox_file_t:s0:c1,c2"

This causes the kernel to treat all inodes within the mount point as having this label. In order for us to get BTRFS to work with SELinux in a docker image, we need this support.
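To make the shape of that option concrete, here is a small sketch that assembles the `context=` string for one container. The function name `context_mount_option` is hypothetical. The user, role, type and sensitivity parts are copied from the example above, and the category numbers are whatever pair the caller assigns to the container.

```python
def context_mount_option(mcs_categories, file_type="svirt_sandbox_file_t"):
    """Build the context= mount option for one container's MCS categories."""
    categories = ",".join("c{}".format(c) for c in sorted(mcs_categories))
    return 'context="system_u:object_r:{}:s0:{}"'.format(file_type, categories)
```

For example, `context_mount_option((1, 2))` reproduces the option string shown above, ready to be passed to `mount -o`.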

@Djelibeybi

Contributor

Djelibeybi commented Sep 4, 2015

Well, it solved the OP by allowing Docker to start on a machine with SELinux enabled and with --selinux-enabled for the Docker daemon. Now we have to fix the other problem by getting Docker to add the SELinux context to btrfs-based subvolume mounts.

That's assuming that the Docker btrfs-based graph driver actually does a subvolume mount for containers like the devicemapper driver does.

@Djelibeybi

Contributor

Djelibeybi commented Sep 5, 2015

Note that my test system is now showing exactly the same problem as the OP, so I retract my previous statements. I have to do further research, obviously.

@Djelibeybi

Contributor

Djelibeybi commented Sep 6, 2015

Disregard the previous retraction. I had broken my VM in other ways. My Docker is now starting and running fine with SELinux enabled and btrfs, and can even start containers. It will then throw SELinux errors when trying to perform certain actions, I assume because there is no context sent to the subvolume mount.

@rhatdan

Contributor

rhatdan commented Sep 7, 2015

Correct, so the underlying problem is that we need a way to label subvolumes differently based on mount -o context commands, or we have to solve the problem differently. We are investigating different ways of solving this for OverlayFS, where we label the upper writable content differently than the lower level, but we don't have this working either. :^(

@dperson

dperson commented Oct 30, 2015

@rhatdan If OverlayFS isn't working with docker + SELinux, is the only way to run them together on top of devicemapper/thinp? I'd like to have SELinux enforcement around Docker. Thanks.

@rhatdan

Contributor

rhatdan commented Oct 30, 2015

Yes currently the only solution is devicemapper/thinp. We have a patch to support SELinux and BTRFS

#16452

We are real close to getting overlayfs with SELinux support (although we have been real close for months). :^(

@kk580kk

kk580kk commented Nov 23, 2015

I ran into a problem like this:

➜  ~  kubectl logs wordpress-controller-jvnij   
warning: both WORDPRESS_DB_HOST and MYSQL_PORT_3306_TCP found
  Connecting to WORDPRESS_DB_HOST (10.254.18.220)
  instead of the linked mysql container
sed: couldn't open temporary file ./sedFEDbAR: Permission denied

and fixed it by removing --selinux-enabled from /etc/sysconfig/docker

@rhatdan

Contributor

rhatdan commented Nov 23, 2015

Are you using the new version of docker with btrfs/SELinux support (docker-1.10)? Otherwise this is expected, although the daemon should have blocked you from executing the command.

If you are using docker-1.10, please attach the output of

ausearch -m avc -ts recent

After the failure.

@cmurf

cmurf commented Dec 4, 2015

Still a problem in kernel-4.4.0-0.rc3.git0.1.fc24.x86_64

Given a Btrfs volume with subvolumes A, B, C in the top level, and A and B are each mounted with -o subvol=A and -o subvol=B (i.e. the top level itself is not mounted), when I try to mount subvolume C using -o subvol=C,context"system_u:object_r:svirt_sandbox_file_t:s0:c1,c2" I get:

[13301.295040] SELinux: mount invalid. Same superblock, different security settings for (dev sda7, type btrfs)

So that's the gist of the problem: even though I'm asking for a separate fs tree and a mount-time context for it, the mount fails because apparently a given superblock can only have one context. With dmthinp, by contrast, a snapshot of a volume results in a completely independent filesystem copy with its own superblock.

A possible workaround (untested) is to make a snapshot of the desired subvolume, and recursively relabel its contents, rather than depending on mount -o context=. Not exactly ideal of course.

$ mount /dev/sda7 /mnt
$ cd /mnt
$ btrfs sub snap root root2
$ chcon -Rv "system_u:object_r:svirt_sandbox_file_t:s0:c1,c2" root2

And now root2 can be mounted wherever and it'll have the desired labeling. On an SSD this took about 9 seconds with -v, and 6 seconds without.
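The snapshot-and-relabel workaround can be scripted. The sketch below only builds the two command lines and runs nothing. The helper name `snapshot_relabel_commands` is made up for illustration, and the paths and label are supplied by the caller.

```python
def snapshot_relabel_commands(source, snapshot, label):
    """Return argv lists for the snapshot + relabel workaround; nothing is executed here."""
    return [
        # Create a snapshot of the source subvolume.
        ["btrfs", "subvolume", "snapshot", source, snapshot],
        # Recursively relabel the snapshot instead of relying on mount -o context=.
        ["chcon", "-R", label, snapshot],
    ]
```

Each list could then be passed to `subprocess.run(cmd, check=True)` as root. Like the manual steps, this is untested against a real Docker setup.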

@rhatdan

Contributor

rhatdan commented Dec 7, 2015

We have patches to do exactly this in docker now. It would be nice to fix the kernel, though, so that we did not need to do the chcon -R change.

@cmurf

cmurf commented Dec 7, 2015

Mounting with -o subvol= is just a bind mount behind the scenes. I'd expect -B -o context= to fail on XFS or ext4 with a similar message, for the same reason. But with overlayfs usage, maybe there's an advantage to fixing this problem for all filesystems.

@liubogithub

liubogithub commented Jan 21, 2016

@cmurf I followed your above chcon steps, but I found that even if I use chcon -Rv to change snapshot directory's selinux value, mount -osubvol=xx,context="system_u:object_r:svirt_sandbox_file_t:s0:c1,c2" doesn't work.
I noticed that you're using context"system_u:object_r:svirt_sandbox_file_t:s0:c1,c2", which is without a '=', is it a typo?

@cmurf

cmurf commented Jan 21, 2016

When using chcon -R on a snapshot, I don't use mount -o context at all. The = applies to the context mount option, it doesn't apply to chcon.

@liubogithub

liubogithub commented Jan 21, 2016

@cmurf I see, so the steps would be only
$ chcon -R snapshot
$ mount -osubvol=snapshot /dev /mnt

correct?

@cmurf

cmurf commented Jan 21, 2016

Yes, except that chcon command needs a context specified in quotes. I used "system_u:object_r:svirt_sandbox_file_t:s0:c1,c2" only because I found it in some other example. I don't know if it's always c1,c2, I'm willing to bet that one of those values has to be unique in order for contexts to actually separately protect containers. If two containers have the same context, it'll probably still work but then there's a security hole I'd think.
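The point about uniqueness can be illustrated with a toy allocator that hands each container its own category pair. This is a sketch, not how Docker is confirmed to do it: the names `allocate_mcs_pair` and `file_label` are invented, and the range of 1024 categories (c0 through c1023) is an assumption about the loaded policy.

```python
import random


def allocate_mcs_pair(in_use, rng=random, max_category=1024):
    """Pick two distinct categories not handed out before; record and return them.

    Loops forever if every possible pair is already in `in_use`.
    """
    while True:
        pair = tuple(sorted(rng.sample(range(max_category), 2)))
        if pair not in in_use:
            in_use.add(pair)
            return pair


def file_label(pair, base="system_u:object_r:svirt_sandbox_file_t:s0"):
    """Append the category pair to the shared base label."""
    return "{}:c{},c{}".format(base, *pair)
```

Two containers that receive different pairs get different labels, so a process confined to one pair should not be granted access to files carrying the other.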

@rhatdan

Contributor

rhatdan commented Jan 22, 2016

Correct. If you want BTRFS to work without the latest relabel patch (which I think is in docker-1.10), you would need to change the labels to "system_u:object_r:svirt_sandbox_file_t:s0". This is a label all containers can read and write, which means that if a container broke out, it would be allowed to attack the file systems of other running containers.

If you are running docker-1.10, BTRFS should just work with SELinux, although you will pay a startup penalty, usually 1-2 seconds, when you create a new container while docker relabels the content in an image.
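The startup penalty comes from touching every file in the image. A rough sketch of such a recursive relabel is shown below. It is not Docker's implementation: `relabel_tree` is a made-up name, and the default setter assumes a Linux host where the file context lives in the `security.selinux` extended attribute and the caller is privileged enough to change it.

```python
import os


def relabel_tree(root, label, set_label=None):
    """Apply `label` to `root` and everything below it; return the number of paths touched."""
    if set_label is None:
        def set_label(path, value):
            # Assumption: the SELinux context is stored in the security.selinux xattr.
            # Symlinks are labeled themselves rather than followed.
            os.setxattr(path, "security.selinux", value.encode(), follow_symlinks=False)

    set_label(root, label)
    count = 1
    for dirpath, dirnames, filenames in os.walk(root):
        # One call per directory entry, so the cost grows with the size of the image.
        for name in dirnames + filenames:
            set_label(os.path.join(dirpath, name), label)
            count += 1
    return count
```

The returned count gives a feel for why the delay scales with the number of files in the image, whereas a context mount option would be applied once per mount.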

@liubogithub

liubogithub commented Jan 23, 2016

@rhatdan Do I need to install docker-selinux to enable this?
Right now, I'm using my own docker binary compiled from the latest docker GitHub repo; the version is 1.10.1.
I used 'getenforce' to make sure SELinux is enabled, and ran 'docker daemon --selinux-enabled'.
However, after 'docker run -it ubuntu bash', I ran 'secon --file' on /var/lib/docker/btrfs/subvolumes/
and got this:
user: system_u
role: object_r
type: var_lib_t
sensitivity: s0
clearance: s0
mls-range: s0

This doesn't look like it's been relabeled. Am I missing something?

@rhatdan

Contributor

rhatdan commented Jan 25, 2016

I am pretty sure the policy for ubuntu is out of date. You need the latest virt.* policy and you need docker-selinux installed.

I think you also need the lxc_contexts file installed in the contexts directory.

On Fedora/RHEL/CentOS this is in /etc/selinux/targeted/contexts/lxc_contexts

 python -c "import selinux; print(selinux.selinux_lxc_contexts_path())" 
/etc/selinux/targeted/contexts/lxc_contexts
# cat /etc/selinux/targeted/contexts/lxc_contexts 
process = "system_u:system_r:svirt_lxc_net_t:s0"
content = "system_u:object_r:virt_var_lib_t:s0"
file = "system_u:object_r:svirt_sandbox_file_t:s0"
sandbox_kvm_process = "system_u:system_r:svirt_qemu_net_t:s0"
sandbox_kvm_process = "system_u:system_r:svirt_qemu_net_t:s0"
sandbox_lxc_process = "system_u:system_r:svirt_lxc_net_t:s0"
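For checking whether a host has the expected entries, the file shown above is simple enough to parse by hand. The following is a minimal sketch with a hypothetical helper name, `parse_lxc_contexts`, assuming every meaningful line has the `key = "value"` form seen in the listing.

```python
def parse_lxc_contexts(text):
    """Parse `key = "value"` lines into a dict; a repeated key keeps its last value."""
    contexts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            contexts[key.strip()] = value.strip().strip('"')
    return contexts
```

Reading the path printed by `selinux.selinux_lxc_contexts_path()` and looking up the `file` and `process` keys would show which base labels a container's files and processes are expected to get on that host.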
@liubogithub

liubogithub commented Feb 9, 2016

@rhatdan
I added mount label support for btrfs subvolumes and removed the label.relabel() part in d.Create(). Now with '--selinux-enabled', by checking manually while running a container, I can see that files inside the container have the desired security label, like this:

[root@localhost ~]# ls -lZ /var/lib/docker/btrfs/mnt/5824ea0015a9695954c3d67888caf1b4e59f1504b9b6e6064470be8285dd07fa/lib/x86_64-linux-gnu/*selinux*
-rw-r--r--. 1 root root system_u:object_r:svirt_sandbox_file_t:s0:c568,c1009 134296 Apr 29 2014 /var/lib/docker/btrfs/mnt/5824ea0015a9695954c3d67888caf1b4e59f1504b9b6e6064470be8285dd07fa/lib/x86_64-linux-gnu/libselinux.so.1

However, I got the error,

ls: error while loading shared libraries: libselinux.so.1: cannot open shared object file: No such file or directory

It looks like a security issue, because if I use --privileged or run without '--selinux-enabled', ls can find all the shared libraries it needs. But I don't see why it would end up with such an error. Do you have any ideas?

@justincormack

Contributor

justincormack commented Apr 28, 2016

I think this was fixed in 1.10 by #16452, and anyone who is still having issues after that should open a new issue. The fix is not perfect, as there is a startup cost, but it works and any other fix would need to be upstream in Linux.
