Implement foreground execution mode #4469
Conversation
@alexlarsson you need to write some documentation, and it needs to explain to the user why and when they would want this (and when not). I'm guessing the next question users will have is why not make it truly daemon-less.
Does foreground mode work for non-root users?
This implementation is very clever, but I can't help but feel like it's butting heads with the current design of docker. It seems either docker or this PR needs to budge a bit to make this acceptable. For example, this is implemented as a new execdriver, but the instantiation and management of the driver is all one-off. It seems like it would be more elegant if docker moved away from the strict one-execdriver-per-daemon model and towards dynamically choosing the correct execdriver at runtime. Then the lifecycle of this execdriver could be more in line with the others, and there could be fewer one-line if statements added everywhere.

Having said that, I can't help but think that there have got to be enough tricks you can do with systemd that you could still have the main daemon run the execdriver but work with systemd to accomplish your objectives. So while this implementation is fairly generic, maybe there's something more systemd-specific you could do that would be closer in line with the docker design. What that is I have no clue; I really don't know enough about this topic to suggest anything.

Finally, one last point. This PR introduces the idea that there will be multiple instances of an execdriver running in parallel across processes. That could cause some hang-ups in the future, I would assume. Something as simple as initialization logic, which is currently assumed to run once and sequentially, will need to give consideration to concurrency. For example, libcontainer now installs an AppArmor profile on startup; while it's probably safe to do that concurrently, when that change was made there was no thought that that code would be run on every single container start.
Not sure if this will be a generic approach for all commands in the future, but I just saw @shykes mention a … If I understand that comment correctly, the idea is to use a …

(Edit: please excuse me if I use the term "flag" incorrectly here; read 'flag/switch/option', I don't really know what the appropriate term is, LOL.)
Here's what I propose. What if we changed docker such that on start you can specify multiple allowed execdrivers such as
Then add … Then on the client side we add exec options that are arbitrary key/value pairs that can be used to control the execdrivers. For example: …
Those options will be added to Command.Config (which I think should be a map, not a []string, but that's not a big deal). So instead of adding a first-class … I think that will make the server-side implementation a bit cleaner. Now I think there needs to be something a little more elegant/generic on the client side so that you don't have to do 'if fg' to determine how to handle the stdin/stdout stuff. That I haven't thought much about.

Side note, the ghosts "feature" has been removed in 0.9, so your very elegant approach to ghosts is all for naught. @alexlarsson, I think you should fight the good fight to get reattachment of old containers into 0.10. What that could mean is that we add Driver.Reattach(). If the driver can't/won't support reattaching, then it returns false/err and docker will do a restart of the container. If it does support reattachment, the driver reattaches and the state is now 'Up'. So the Ghost state will never exist any more.
I pushed a new version of this based on latest git master. |
@ibuildthecloud The "foreground" driver is a very special kind of driver. For one, it has a one-to-one mapping with the individual docker client instance, so you have to create a new one each time. I'm not sure trying to hook it into some kind of "generic" interface actually makes things simpler. As for an execdriver being launched multiple times, that can already happen if you run multiple dockers with different /var/lib/docker directories, as well as in the testsuite. That's not really new.

I have some other ideas on different approaches to foreground mode that I want to look into as well. I'll be back later with the results of that.
Yeah, I understand it's a very specific use case. I was merely looking for an approach that didn't have to make the implementation of this so first-class in docker, where there are specific fields in the structs just for this. I really think the --foreground option will be very confusing to users (that's why I mentioned some approach of making it a driver-specific flag, if such an approach were possible). I'm particularly interested in this PR being accepted; that's why I'm commenting. I do think the foreground execdriver could be made into a generic RPC driver, though, such that other people could use it to implement execdrivers outside of the docker executable.
Here is a different approach that doesn't use a proxying execdriver:
This way we can have different (non-default) exec drivers for some containers.

Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
This is a mode of "docker run" that uses direct forking from the docker client, meaning that the container is a direct child of the docker run command (rather than the daemon), and that it directly inherits stdout/err (and stdin if requested) from the parent of the "docker run" command.

This is very useful in cases where you e.g. want to run a docker container from systemd (or another process manager), as we can use the process manager to e.g. set up cgroups and limits, monitor and restart, pass in fds for socket activation, etc.

The way this works is that there is a new execdriver, which is used for foreground containers. This driver just makes RPC calls back to the "docker run" command, which in turn uses a "real" execdriver and feeds it whatever data the "fake" execdriver got, but in the context of the docker run command. We feed information about the process (exit status, pid, etc.) to the daemon, so it is visible just like any other docker container via the docker APIs.

Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
This causes docker to not create a new transient unit for cgroup containment and resource limits. Instead it will reuse the current cgroup and apply resource limitations to it using SetUnitProperties. This is interesting mostly in combination with --foreground, because it allows you to have a systemd unit file like:

[Service]
ExecStart=docker run --foreground --opt unit.ReuseCurrent=true run image
MemoryLimit=100M

With the following properties:

* Any limits and other properties set up in the service file apply to both the docker client and the container "as one unit".
* There is only one unit representing the service.
* You can kill the service via systemd, which will kill both client and container.
* If the client dies, systemd will (first nicely, then violently) kill the other processes in the cgroup (i.e. the container).

Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
Hi guys, this PR generated a lot of very useful design discussions, here and on IRC. I think at this point that discussion is advanced enough that we can close this PR. We all acknowledge that it should be possible to "embed" a docker container in an existing process tree. But instead of tacking that on top of the current client-server design (which we're already trying to simplify), we should bake that into the deeper internals of Docker.
I agree, but for short term use we need this, so I will keep carrying this branch forward. |
This implements a "-f" switch for docker run, so you can do something like:
docker run -f fedora echo hello
This will run the echo command as a direct descendant of the docker client process, rather than a child of the docker daemon. It will also pass stdin/out/err directly from the docker client to the container.
This is particularly interesting when you're integrating docker containers with some other form of execution manager, such as systemd. Systemd can, via cgroups and the normal process hierarchy, completely control the container processes, and implement things like restart-on-exit, stdout logging, cgroup containment, and integration with systemd units (so you can e.g. systemctl stop the container).
The way it works is by calling out to the docker daemon, which fully sets up the container but uses a custom execdriver that, when started, calls into the docker client via golang rpc calls. The client then creates a real execdriver and proxies the request to that, which starts the container. The client then stays around monitoring the container and exits with the right exit status, as well as telling the docker daemon that the container exited.

Additionally, the client accepts a few other RPC calls so that the other daemon-side execdriver requests (docker stop, docker ps, docker top, etc.) still work. We also implement Restore by reconnecting to the client, so the docker containers survive the daemon restarting.