New implementation of /run support #8478
This patch was previously written by Alex Larsson, and I have updated it to work with current docker. We need this in order to get systemd to run properly within a docker container. systemd insists on /run being mounted on a tmpfs, and its developers refuse to change this, since they claim it is a standard now.
And it's signed off threefold! This must be good 😸 (Just a funny note)
My 2 cents:
How about TmpfsRun? Or MountRunFs?
What about specifying a generic flag like I think this would solve your issue and allow generic tmpfs mounts.
Well no, unless these fixes also copied the content off of the image. If we could make this a daemon option as well then I would be fine, so that customers using systemd-based images would not blow up when they forget to pass the option. We would ship our distributions with docker -d --tmpfs /run --tmpfs /tmp
If it's not enabled manually during "docker run" (and is thus a conscious
(I mean specifically if it's a new flag on the daemon -- I'm actually
Well, that is one of the reasons to copy the image's /run or /tmp onto the tmpfs before the container runs. The patch disables tmpfs for docker build.
6700536 to 5b35150
@crosbymichael @tianon @shykes Any additional comments on this one?
After meeting with @crosbymichael and working on a Read/Only image format, I believe this pull request becomes more important. For Read/Only images you still need to be able to mount tmpfs on /run, /tmp and /var/tmp.
@crosbymichael Any movement? If we could get to the point of /tmp, /var/tmp and /run using similar technology to the /run patch, we could make them all tmpfs and then run the underlying file systems as Read/Only. Processes could then write to these directories as well as /dev. This would more closely match the defaults in RHEL and Fedora, and I think Ubuntu, for tmpfs on /run and /tmp. I would be willing to make this a daemon option, if necessary.
Hi, we have systemd-container for RHEL7 base images, which is a patched version of systemd made to work well in Docker containers. There have been a lot of discussions with Lennart Poettering (and other systemd developers) about modifying systemd to work within Docker containers, and they provided strong arguments against doing so. This patch is the last thing that prevents us from running systemd in a Fedora Docker container. This also applies to CentOS containers and indeed to RHEL7 containers, where we wouldn't need patches to work around the missing /run mount... That said, could this please get more attention?
So while I'm +1 for sure on (but, standard disclaimer that I'm not the maintainer here)
@tianon I agree somewhat; the only question was @crosbymichael's use case of a Read/Only /. But I guess we could just tell apps that use Read/Only to only use /tmp, /run and /dev/shm for temporary content. /var/tmp would be Read/Only.
/var/tmp should probably stay untouched, but I'd like to see /tmp and /run become tmpfs mounts. Could we proceed with this and push this change further? (@tianon @crosbymichael?)
I updated the patch and added a second patch to support /tmp on tmpfs. This second patch would allow @crosbymichael to do his Read/Only file systems patch. The question I have is whether we should allow users to specify this support in the docker run/create commands or in docker -d. Either is easy enough to do, since we have the MountRun and MountTmp hostconfig options.
127dc55 to 8c73c0b
@crosbymichael @shykes @tianon Any update on this? Comments?
I agree with @crosbymichael that we should look for a more general solution here. I see the benefit of allowing tmpfs directories to be mounted, and I understand that systemd requires /run to be mounted that way. I don't think making this specific to /run is the way to go, though. A more general solution like Mike proposes with So, I'm 👍 on allowing tmpfs directories to be mounted and I'll be glad to push it forward with a general solution.
Well, I am now thinking of moving the systemd requirements into a different patch set and adding a --systemd switch. There are lots of requirements on which systemd and docker do not agree, and I don't see either upstream agreeing. Therefore I think we need docker run to support a --systemd flag, which would tell it that the container will run with systemd as its init program and set up the container correctly.
@rhatdan docker already knows how to detect that systemd is running via libcontainer, see https://github.com/docker/docker/blob/53bef64804c6dae6662a7d55c3bb3e48b3e5dfdf/daemon/execdriver/native/driver.go#L62 for instance. If we add
It can tell if systemd is running on the host, but not if it will be running as PID 1 in the container. systemd expects things like SIGTERM to have a different meaning than what docker expects. It expects to have /run and /sys/fs/cgroup mounted in the container. I also want it to be able to write journal data to the host OS. There are a few other features that are also required to fully support systemd as PID 1 in a container.
6a335bc to 7aea7c8
The implementation makes sense to me, but the tests are failing hard.
4b444d3 to e9511e1
This patch will use the new PreMount and PostMount hooks to "tar" up the contents of the base image on top of tmpfs mount points.
Docker-DCO-1.1-Signed-off-by: Dan Walsh <dwalsh@redhat.com> (github: rhatdan)
Conflicts: daemon/execdriver/native/create.go
Docker-DCO-1.1-Signed-off-by: Dan Walsh <dwalsh@redhat.com> (github: rhatdan)
Docker-DCO-1.1-Signed-off-by: Dan Walsh <dwalsh@redhat.com> (github: rhatdan)
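A minimal sketch of the tar-based copy that the first commit message above describes, written in Python for brevity rather than in Docker's Go. It is not the patch's code: `copy_tree_via_tar` and its parameters are hypothetical names, and a plain directory stands in for the freshly mounted tmpfs, since mounting a real tmpfs needs root.

```python
import os
import tarfile
import tempfile


def copy_tree_via_tar(src_dir: str, dst_dir: str) -> None:
    """Archive the contents of src_dir and unpack them into dst_dir.

    src_dir stands in for the image's directory (for example its /run);
    dst_dir stands in for the tmpfs mounted over that path (assumption).
    """
    with tempfile.TemporaryFile() as buf:
        # Pack the directory's contents, not the directory itself.
        with tarfile.open(fileobj=buf, mode="w") as tar:
            for name in sorted(os.listdir(src_dir)):
                tar.add(os.path.join(src_dir, name), arcname=name)
        buf.seek(0)
        # Unpack onto the destination so it starts with the image's files.
        with tarfile.open(fileobj=buf, mode="r") as tar:
            tar.extractall(dst_dir)
```

With this in place, a directory such as /run/httpd that exists in the image would also exist on the tmpfs when the container starts, which is the behaviour the thread asks for.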
Just coming up to speed on this issue, but I'm not following something. Based on what I've read so far (in this PR and http://thread.gmane.org/gmane.linux.redhat.fedora.devel/146976) it appears that Also, if we did end up with a flag, I'm not in favor of one on the daemon because if we think some people may not always want it then I'd prefer for it to be on a per-container basis. Or at least allow it for both so that a container can override what the daemon's default is.
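One way to read the override suggestion in the comment above, as a sketch only: the daemon supplies a default and a container may override it. `resolve_mount_run` is a hypothetical helper, and the three-state container setting (unset, on, off) is an assumption; only the MountRun option name comes from this thread.

```python
from typing import Optional


def resolve_mount_run(daemon_default: bool, container_setting: Optional[bool]) -> bool:
    """Return whether /run should be a tmpfs for one container.

    container_setting is None when the user passed nothing on the
    per-container command line; in that case the daemon default applies.
    """
    if container_setting is None:
        return daemon_default
    return container_setting
```

Keeping "unset" distinct from "off" is what lets a container opt out even when the daemon default is on.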
After playing with and shipping this patch for a while, we are seeing other problems with it. The biggest one is that docker commit does not work the way one would expect. docker commit only saves the underlying image, not anything mounted on top of the container image. This means someone doing docker run -ti -n myhttpd image /bin/sh; yum install httpd; mkdir /run/httpd; ^d would not end up getting /run/httpd in the resulting image by default. I am now thinking the best way to handle this is just to have a big --systemd flag: docker run --systemd ... This would set the container up in the mode that systemd expects and would mount /run and /sys/fs/cgroup the way systemd wants, as well as generate journald content on the host and send the proper signals to systemd when users do a docker stop ID. But for rank and file containers, we leave /run on the image.
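The commit problem described above can be illustrated with a toy model. This is not how docker commit is implemented; it only encodes the stated rule that anything written under a mount point is left out of the committed image. `is_under` and `committed_paths` are hypothetical names.

```python
import posixpath
from typing import Iterable, List


def is_under(path: str, mount_point: str) -> bool:
    """Return True if path is the mount point itself or lies below it."""
    path = posixpath.normpath(path)
    mount_point = posixpath.normpath(mount_point)
    return path == mount_point or path.startswith(mount_point.rstrip("/") + "/")


def committed_paths(written: Iterable[str], mount_points: Iterable[str]) -> List[str]:
    """Paths that survive a commit in this toy model: those not under any mount."""
    mounts = list(mount_points)
    return [p for p in written if not any(is_under(p, m) for m in mounts)]
```

In the httpd example from the comment, a container with a tmpfs on /run would keep /usr/sbin/httpd after a commit but lose /run/httpd, because that directory lives on the tmpfs rather than in the container's image layer.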
We've been running systemd in Docker in production for a few months and it mostly works and has helped us do many things where multiple Dockers wouldn't have made sense. Big +1 on --systemd. We still need --tmpfs to be able to use --read-only more widely, without breaking apps that need small scratch directories where -v or VOLUME would be overkill (or where having the scratch space on physical disk would be bad for performance).
I agree. I like the idea of --tmpfs, although I think -v tmpfs:/PATH would be more consistent. If docker upstream decides which to do, I could have a patch available in a couple of hours. I will push to get the --systemd pull request together by next week.
But remember, with --systemd or --tmpfs we need to document that docker commit will not save any content that is stored on tmpfs, or for that matter any volume-mounted content, onto the new image, which might surprise some users.
Closing this and replacing it with