Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New implementation of /run support #8478

Closed
wants to merge 4 commits into from
Closed

New implementation of /run support #8478

wants to merge 4 commits into from

Conversation

rhatdan
Copy link
Contributor

@rhatdan rhatdan commented Oct 8, 2014

This mounts a /run tmpfs into the container, with the initial contents
copies from the /run in the base image, unless NoRunFs is set in the
HostConfig.

Additionally NoRunFs is always set during a docker build, which means
any setup of /run in a Dockerfile is saved in the image to be copied
into the final /run tmpfs when a container is started.

Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)

Docker-DCO-1.1-Signed-off-by: Dan Walsh dwalsh@redhat.com (github: rhatdan)

Docker-DCO-1.1-Signed-off-by: Dan Walsh dwalsh@redhat.com (github: rhatdan)

@rhatdan
Copy link
Contributor Author

rhatdan commented Oct 8, 2014

This patch was previously written by Alex Larsson, and I have updated it to work with current docker.
There was a previous /run patch which was rejected because it did not copy /run off the image onto /run in a /tmpfs. This patch fixes this issue.

We need this in order to get systemd to run properly within a docker container. systemd insists on /run being mounted on a tmpfs, and refuse to fix this, since they claim this is a standard now.

@thaJeztah
Copy link
Member

And it's signed-off in threefold! This must be good 😸

(Just a funny note)

@haraldh
Copy link

haraldh commented Oct 10, 2014

My 2 cents:
"NoRunFs=no" sounds awful. It's already a boolean. Why does it have to have "No" in front?
"MountRunFS=yes" as a default sounds more sane.

@rhatdan
Copy link
Contributor Author

rhatdan commented Oct 10, 2014

How about TmpfsRun? Or MountRunFs

@crosbymichael
Copy link
Contributor

What about specifying a generic flag like
docker run --tmpfs /run --tmpfs /tmp ubuntu bash?

I think this would solve your issue and allow generic tmpfs mounts.

@rhatdan
Copy link
Contributor Author

rhatdan commented Oct 10, 2014

Well no, unless these fixes also copied the content off of the image.

If we could make this a daemon option as well then I would fine, so that customers using systemd based images would not blow up when they forgot to do the command.

We would ship our distributsions with

docker -d --tmpfs /run --tmpfs /tmp

@tianon
Copy link
Member

tianon commented Oct 10, 2014

If it's not enabled manually during "docker run" (and is thus a conscious
choice) or unilaterally for all containers on every engine, I think this
would have some huge potential to cause issues with image portability.

@tianon
Copy link
Member

tianon commented Oct 10, 2014

(I mean specifically if it's a new flag on the daemon -- I'm actually
+1 to having it enabled everywhere, but having it as a "docker run"
flag would be very interesting as well.)

@rhatdan
Copy link
Contributor Author

rhatdan commented Oct 10, 2014

Well that is one of the reasons to copy the images /run or /tmp onto the tmpfs before the container runs.
Then have it saved on docker commit.

The patch disables tmpfs for docker build.

@rhatdan
Copy link
Contributor Author

rhatdan commented Oct 23, 2014

@crosbymichael @tianon @shykes Any additional comment on this one?

@rhatdan
Copy link
Contributor Author

rhatdan commented Oct 30, 2014

After meeting with @crosbymichael and working on a Read/Only image format, I believe this pull request becomes more important. For Read/Only images you need to still be able to mount tmpfs on /run, /tmp and /var/tmp
I think we could extend this patch so that /tmp and /var/tmp work the same way. But /run should be default tmpfs, just like it is on Ubunto, RHEL7, Fedora, Centos.

@rhatdan
Copy link
Contributor Author

rhatdan commented Nov 11, 2014

@crosbymichael Any movement? If we could get to the point of /tmp, /var/tmp, /run using similar technology to the /run patch we could make them all tmpfs and then could run the underlying file systems as Read/Only. Then processes could write to these directories as well as /dev.

This would more closely match the defaults in RHEL and Fedora and I think Ubunto. For tmpfs on /run and /tmp.

I would be willing to make this a daemon option, if necessary.

@vpavlin
Copy link

vpavlin commented Nov 12, 2014

Hi, we have systemd-container for RHEL7 base images which is patched version of systemd to work well in Docker containers. There has been a lot of discussions with Lennart Poettering (and other systemd developers) about modifying systemd to work within Docker containers and they provided strong arguments against doing so. This patch is the last thing which prevents us from running systemd in Fedora Docker container. This also applies to CentOS containers and indeed to RHEL7 containers where we wouldn't need patches to workaround missing /run mount...

That said, could this, please, get more attention?

@tianon
Copy link
Member

tianon commented Nov 13, 2014

/run as tmpfs is definitely very standard at this point (across almost all distributions), and /tmp as tmpfs is very, very common, but /var/tmp is defined as "Temporary files preserved between system reboots".

So while I'm +1 for sure on /run and could personally be persuaded that /tmp by default makes sense, I'm strongly -1 on /var/tmp. 👍

(but, standard disclaimer that I'm not the maintainer here)

@rhatdan
Copy link
Contributor Author

rhatdan commented Nov 13, 2014

@tianon I agree somewhat, the only question was for @crosbymichael use case of a Read/Only /. But I guess we could just tell apps that use Read/Only to only use /tmp, /run and /dev/shm for temporary content. /var/tmp would be read/only.

@vpavlin
Copy link

vpavlin commented Nov 24, 2014

/var/tmp should probably stay untouched, but I'd like to see /tmp and /run to be tmpfs mounts - could we proceed with this and push this change further? ( @tianon @crosbymichael ?)

@rhatdan
Copy link
Contributor Author

rhatdan commented Nov 24, 2014

I updated the patch and added a second patch to support /tmp on tmpfs.

This second patch would allow @crosbymichael to do his Read/Only file systems patch.

The question I have is whether we should either allow users to specify this support in the docker run/create commands or in the docker -d. Easy enough to do since we have the MountRun and MountTmp hostconfig options.

@rhatdan
Copy link
Contributor Author

rhatdan commented Dec 2, 2014

@crosbymichael @shykes @tianon Any update on this? Comment?

@calavera
Copy link
Contributor

I agree with @crosbymichael that we should look for a more general solution here. I see the benefit of allowing to mount tmpfs directories and I understand that systemd requires /run to be mounted that way.

I don't think making this specific to /run is the way to go, though. A more general solution like Mike proposes with --tmpfs allows to fulfill systemd's requirements and is not coupled to it at the same time. Besides, this is a hard linux implementation, using --tmpfs would allow us to abstract the implementation by OS. Windows is coming.

So, I'm 👍 on allowing to mount tmpfs directories and I'll be glad to push it forward with a general solution.

@rhatdan
Copy link
Contributor Author

rhatdan commented Apr 29, 2015

Well I am now thinking of moving the systemd requirements into a different patch set and adding a --systemd switch. There are lots of requirements that systemd and docker do not agree with, and I don't see getting either upstream to agree. Therefore I think we need docker run to support --systemd flag which would tell it that the container will run with a systemd as an init program and set up the container correctly.
Then container that do not want to run in systemd mode can continue to run as they currently do. Only problem with doing this would be users confusion in running systemd based containers failing because they forget to run with the --systemd flag.

@calavera
Copy link
Contributor

@rhatdan docker already knows how to detect that systemd is running via libcontainer, see https://github.com/docker/docker/blob/53bef64804c6dae6662a7d55c3bb3e48b3e5dfdf/daemon/execdriver/native/driver.go#L62 for instance.

If we add --tmpfs as a flag, there is nothing stopping us to set --tmpfs /run by default when systemd is running. That way we get both, a generic implementation and a systemd sane default.

@rhatdan
Copy link
Contributor Author

rhatdan commented Apr 29, 2015

It can tell if systemd is running on the host ,but not if it will be running as PID 1 in the container. systemd expects things like the SIGTERM to have different meaning then what docker expects. It expects to have /run and /sys/fs/cgroup mounted in the container, I want it to be able to write journal data to the host OS. There are a few other features that are also required to fully support systemd as pid1 in a container.

@rhatdan rhatdan force-pushed the run branch 3 times, most recently from 6a335bc to 7aea7c8 Compare May 12, 2015 20:56
@LK4D4
Copy link
Contributor

LK4D4 commented May 12, 2015

Implementation makes sense for me. But tests failing hard.
Also this doing this by default, so I moving to design-review.
ping @docker/core-maintainers

This patch will use the new PreMount and PostMount hooks to "tar"
up the contents of the base image on top of tmpfs mount points.

Docker-DCO-1.1-Signed-off-by: Dan Walsh <dwalsh@redhat.com> (github: rhatdan)

Conflicts:
	daemon/execdriver/native/create.go
Docker-DCO-1.1-Signed-off-by: Dan Walsh <dwalsh@redhat.com> (github: rhatdan)
Docker-DCO-1.1-Signed-off-by: Dan Walsh <dwalsh@redhat.com> (github: rhatdan)
Docker-DCO-1.1-Signed-off-by: Dan Walsh <dwalsh@redhat.com> (github: rhatdan)
@duglin
Copy link
Contributor

duglin commented May 25, 2015

Just coming up to speed on this issue, but I'm not following something. Based on what I've read so far (in this PR and http://thread.gmane.org/gmane.linux.redhat.fedora.devel/146976) it appears that /run mounted as tmpfs is standard, right? If so, why wouldn't Docker do the same thing by default? Why would we want to require people to add a flag to enable something that would be enabled by default if they were running natively, outside of Docker?

Also, if we did end up with a flag, I'm not in favor of one on the daemon because if we think some people may not always want it then I'd prefer for it to be on a per-container basis. Or at least allow it for both so that a container can override what the daemon's default is.

@icecrime icecrime removed this from the 1.7.0 milestone May 26, 2015
@icecrime icecrime added the status/needs-attention Calls for a collective discussion during a review session label May 26, 2015
@rhatdan
Copy link
Contributor Author

rhatdan commented May 27, 2015

After playing and shipping this patch for a while. We are seeing other problems with it. Biggest one being docker commit does not work the way one would expect. docker commit only saves the underlying image, not anything mounted on top of the container image. This means someone doing a

docker run -ti -n myhttpd image /bin/sh; yum install httpd; mkdir /run/httpd; ^d
docker commit myhttpd httpd

Would not end up getting /run/httpd by default.

I am now thinking the best way to handle this is just to have a big --systemd flag.

docker run --systemd ...

Which would set the container up in the mode that systemd would expect and would mount /run and /sys/fc/cgroup the way systemd would want, as well as generate journald content on the host and send the proper signals to systemd when users do a docker stop ID.

But for rank and file containers, we leave /run on the image.

@alberts
Copy link

alberts commented May 27, 2015

We've been running systemd in Docker in production for a few months and it mostly works and has helped us do many things where multiple Dockers wouldn't have made sense. Big +1 on --systemd. We still need --tmpfs to be able to use --read-only more widely, without breaking apps that need small scratch directories where -v or VOLUME would be overkill (or having the scratch disk on physical disk would be bad for performance).

@rhatdan
Copy link
Contributor Author

rhatdan commented May 27, 2015

I agree I like the idea of --tmpfs although I think -v tmpfs:/PATH would be more consistent. If docker upstream would decide which to do, I could have a patch available in a couple of hours.

I will push to get --systemd pull request together by next week.

@rhatdan
Copy link
Contributor Author

rhatdan commented May 27, 2015

But remember with --systemd or --tmpfs, we need to document that docker commit will not save any content that is stored on tmpfs, or for that matter any volume mounted content onto the new image, which might surprise some users.

@rhatdan
Copy link
Contributor Author

rhatdan commented May 28, 2015

Closing this and replacing it with

#13525

@rhatdan rhatdan closed this May 28, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status/needs-attention Calls for a collective discussion during a review session status/1-design-review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet