Subfolders of read-only volume are writeable #788

Open
2 of 3 tasks
Uatschitchun opened this issue Sep 23, 2019 · 14 comments

Comments

@Uatschitchun

Uatschitchun commented Sep 23, 2019

  • This is a bug report
  • This is a feature request
  • I searched existing issues before opening this one

I'm running a container and bind-mounting my /media directory read-only:
-v /media/:/media/:ro
Within that /media folder I've mounted my drives containing media:

$ lsblk  
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda           8:0    0 931,5G  0 disk  
├─sda1        8:1    0 372,5G  0 part /media/vdr
└─sda2        8:2    0   559G  0 part /media/tank
sdb           8:16   0   7,3T  0 disk  
└─sdb1        8:17   0   7,3T  0 part /media/drive
sdc           8:32   0 931,5G  0 disk  
└─sdc1        8:33   0 931,5G  0 part /media/mp3
sdd           8:48   0   3,7T  0 disk  
└─sdd1        8:49   0   3,7T  0 part /media/WD-4TB
sde           8:64   0 149,1G  0 disk  
└─sde1        8:65   0 149,1G  0 part /media/Old

Expected behavior

The expected behavior would be that all subfolders within /media are read-only, but they aren't! The read-only flag isn't inherited by the subfolders.

Actual behavior

The subfolders, like drive, can be written to, and the process running within the container did write to them. No other containers or processes were actively writing those files.

Steps to reproduce the behavior

Run a container like alpine:
docker run -i -v /media/:/media/:ro -t alpine /bin/sh
where at least one drive is mounted into a subfolder (/media/drive/) and do
mkdir /media/drive/test
which works, despite:

           {
               "Type": "bind",
               "Source": "/media",
               "Destination": "/media",
               "Mode": "ro",
               "RW": false,
               "Propagation": "rprivate"
           }

A re-check with:
docker run -i -v /media/drive/:/media/:ro -t alpine /bin/sh
(note the direct bind mount of a subfolder) gives:

mkdir /media/drive/test
mkdir: cannot create directory '/media/drive/test': Read-only file system

although "docker inspect" states:

           {
               "Type": "bind",
               "Source": "/media/drive",
               "Destination": "/media",
               "Mode": "ro",
               "RW": false,
               "Propagation": "rprivate"
           }

Output of docker version:

Client: Docker Engine - Community
 Version:           19.03.2
 API version:       1.40
 Go version:        go1.12.8
 Git commit:        6a30dfc
 Built:             Thu Aug 29 05:29:11 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.2
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.12.8
  Git commit:       6a30dfc
  Built:            Thu Aug 29 05:27:45 2019
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.2.6
  GitCommit:        894b81a4b802e4eb2a91d1ce216b8817763c29fb
 runc:
  Version:          1.0.0-rc8
  GitCommit:        425e105d5a03fabd737a126ad93d62a9eeede87f
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

Output of docker info:

Client:
 Debug Mode: false

Server:
 Containers: 11
  Running: 6
  Paused: 0
  Stopped: 5
 Images: 8
 Server Version: 19.03.2
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 894b81a4b802e4eb2a91d1ce216b8817763c29fb
 runc version: 425e105d5a03fabd737a126ad93d62a9eeede87f
 init version: fec3683
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 5.0.0-27-generic
 Operating System: Ubuntu 18.04.3 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 6
 Total Memory: 7.674GiB
 Name: atom2
 ID: QOGX:WCW3:Y34I:3CTT:6IQ6:AUA4:DXTM:T2KN:7M36:QUP6:SJF6:JXLE
 Docker Root Dir: /home/user/Docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No swap limit support

I've googled around and read the docs, and found no mention of the :ro flag not being inherited in special use cases!?

@thaJeztah
Member

thaJeztah commented Sep 23, 2019

Looks similar to moby/moby#37838; basically, the ro / read-only is applied to the mount for which you specified ro, but is not applied to nested mounts.

moby/moby#38003 and the associated CLI changes in docker/cli#1430 added a bind-nonrecursive option to the --mount flag (the more advanced counterpart of the -v flag).
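
As a rough sketch of how that option might be combined with read-only, the helper below builds a --mount value. The function name ro_nonrecursive_mount is hypothetical, and the sketch assumes a Docker version that already includes the changes referenced above. Note that with a non-recursive bind the nested mounts are not carried into the container at all, so directories such as /media/drive would show whatever lies underneath the mount point on the parent filesystem (usually nothing).

```sh
# Hypothetical helper: print a --mount value for a read-only,
# non-recursive bind mount of source $1 at destination $2.
ro_nonrecursive_mount() {
    printf 'type=bind,source=%s,destination=%s,readonly,bind-nonrecursive=true\n' "$1" "$2"
}

# Possible usage (not executed here):
#   docker run -it --mount "$(ro_nonrecursive_mount /media /media)" alpine /bin/sh
```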

I'm not sure though if that option works if there's an implicit volume defined in the image that you're running.

@AkihiroSuda ptal

@AkihiroSuda

AkihiroSuda commented Sep 23, 2019

Using apparmor seems the best approach here

Maybe we should consider automating these steps like --mount type=bind,bind-readonly-apparmor

@Uatschitchun
Author

Uatschitchun commented Sep 23, 2019

Ok, understand ...
There's nothing in the documentation about the ro flag not working on nested mounts!? (or maybe I'm just not seeing it ;)

@AkihiroSuda

AkihiroSuda commented Sep 23, 2019

It is mentioned in https://github.com/docker/cli/blob/master/man/docker-run.1.md , but I agree this should also be mentioned somewhere in https://docs.docker.com/
(EDIT: seems documented in https://docs.docker.com/engine/reference/commandline/service_create/#options-for-bind-mounts )

@AkihiroSuda

AkihiroSuda commented Sep 23, 2019

(The apparmor tips should also be documented, maybe after moby/moby#39923 gets merged)

@Uatschitchun
Author

Uatschitchun commented Sep 23, 2019

Ok, this should be documented more clearly, esp. here: https://docs.docker.com/storage/volumes/#use-a-read-only-volume
Having to dive "deep" into bind mounts, submounts, etc., just to learn that read-only isn't always read-only (like a hardware switch) could have dangerous consequences for the integrity of one's own data. No offense here, just a statement.
Most popular docker containers (media servers like plex, jellyfin, etc., or others) simply recommend adding volumes like I did, and users probably rely on the ro flag to prevent data loss or garbage.

@thaJeztah
Member

thaJeztah commented Sep 23, 2019

It's tricky, because having writable mounts nested in a read-only mount is also a common use case (e.g. mount /media read-only, but mount /media/writable-foo/ inside it for only those parts that should be writable)

@Uatschitchun
Author

Uatschitchun commented Sep 24, 2019

The best way to go, if I get it right, is simply to not bind mount top-level folders that contain nested mounts, but to bind mount each host mount separately if read-only is wanted.
During my investigations, I stumbled upon more people who were unaware of this than people who were aware, especially in non-professional communities that just like docker for its simplicity of running services without knowledge of ACLs, user/group management, apparmor, etc.
My suggestion would be to not recursively bind mount submounts by default!
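
The per-mount approach described above could be scripted roughly like this. It is only a sketch: the function name ro_flags_for_submounts is made up for illustration, it expects one mount point per line on stdin, and it assumes mount points contain no whitespace (the generated flags are meant to be word-split by the shell).

```sh
# Hypothetical helper: for every mount point read from stdin that lies
# strictly below directory $1, print a "-v <path>:<path>:ro" flag pair.
ro_flags_for_submounts() {
    parent=${1%/}    # drop one trailing slash so "/media/" behaves like "/media"
    while IFS= read -r mp; do
        case "$mp" in
            "$parent"/?*) printf '%s %s:%s:ro\n' -v "$mp" "$mp" ;;
        esac
    done
}

# Possible usage on a Linux host with findmnt (not executed here):
#   docker run -i $(findmnt -rn -o TARGET | ro_flags_for_submounts /media) -t alpine /bin/sh
```

With the layout from the report, this would yield one read-only bind mount each for /media/vdr, /media/tank, /media/drive and so on, instead of a single recursive mount of /media.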

@thaJeztah
Member

thaJeztah commented Sep 24, 2019

@AkihiroSuda do you think we could detect this situation, and return a warning if there are nested mounts inside a read-only mount?
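
Until something like that exists, a user-side check could look like the sketch below. It is an illustration only, not how Docker does or would implement such a warning: the function names nested_mounts and warn_nested_ro are hypothetical, and the code assumes the Linux /proc/self/mountinfo format, in which the fifth field of each line is the mount point.

```sh
# Hypothetical helper: list mount points nested below directory $1.
# $2 optionally names a mountinfo-style file (default: /proc/self/mountinfo).
# Exit status is 0 if at least one nested mount was found, 1 otherwise.
nested_mounts() {
    awk -v parent="${1%/}" '
        index($5, parent "/") == 1 && length($5) > length(parent) + 1 { print $5; n++ }
        END { exit (n > 0 ? 0 : 1) }
    ' "${2:-/proc/self/mountinfo}"
}

# Hypothetical helper: print a warning on stderr if $1 has nested mounts.
warn_nested_ro() {
    if subs=$(nested_mounts "$@"); then
        printf 'warning: %s has nested mounts that a read-only bind mount would not cover:\n%s\n' "$1" "$subs" >&2
    fi
}

# Possible usage before starting a container (not executed here):
#   warn_nested_ro /media
```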

@szaimen

szaimen commented Jan 1, 2021

Any update here? Would appreciate any advice for a workaround for this issue :)

@brgirgis

brgirgis commented Jul 7, 2021

Just noticed this issue. This seems like a major security flaw.

@Uatschitchun
Author

Uatschitchun commented Jul 31, 2021

> Any update here? Would appreciate any advice for a workaround for this issue :)

The workaround is to not bind mount a parent directory that contains nested mounts, but to bind mount each nested mount separately.

@cpuguy83
Collaborator

cpuguy83 commented Aug 2, 2021

You can specify a mount as non-recursive, e.g. on the CLI: --mount type=bind,source=/foo,destination=/bar,bind-nonrecursive=true.

A recursive bind mount works exactly the same way as it does on plain Linux without docker (because docker is actually just making a bind mount system call with rbind).
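
For comparison, the behavior can be reproduced without docker using mount(8). The sketch below only prints the two commands rather than running them, since they need root. The helper name rbind_ro_commands and the target path /mnt/media-ro are made up for illustration. The read-only remount in the second command applies to the top-level bind mount only, so submounts carried over by --rbind keep their original flags, which matches the behavior reported in this issue.

```sh
# Hypothetical helper: print the plain-Linux commands that roughly mirror
# a recursive bind mount of $1 at $2 followed by a read-only remount.
rbind_ro_commands() {
    printf 'mount --rbind %s %s\n' "$1" "$2"
    # Remounting with bind,ro changes only the top-level mount at $2;
    # nested mounts below it are left as they were.
    printf 'mount -o remount,bind,ro %s\n' "$2"
}

# Possible usage as root on a test machine (not executed here):
#   mkdir -p /mnt/media-ro && rbind_ro_commands /media /mnt/media-ro | sh
```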

@allan4229

allan4229 commented Feb 23, 2022

Just wondering here... is it serious that this issue is a "feature" because in SOME RANDOM AND VERY SPECIFIC CASES one could want to write a nested dir in a read only volume/bind-mount ???????

Are you serious? For real??????? For an application as large as docker ????? It seems like a lack of competitors here.

What about an optional scenario being just another option and the default being just the default? Am I mad? Am I dumb for saying that?


7 participants