
Implement foreground execution mode #4469

Closed
wants to merge 3 commits

Conversation

alexlarsson
Contributor

This implements a "-f" switch for docker run, so you can do something like:
docker run -f fedora echo hello

This will run the echo command as a direct descendant of the docker client process, rather than a child of the docker daemon. It will also pass stdin/out/err directly from the docker client to the container.

This is particularly interesting when you're integrating docker containers with some other form of execution manager, such as systemd. Systemd can, via cgroups and the normal process hierarchy, completely control the container processes and implement things like restart-on-exit, stdout logging, cgroup containment, and integration with a systemd unit (so you can e.g. systemctl stop the container).

The way it works is by calling out to the docker daemon, which fully sets up the container but uses a custom execdriver that, when started, calls into the docker client via Go RPC calls. The client then creates a real execdriver and proxies the request to that, which starts the container. The client then stays around monitoring the container and exits with the right exit status, as well as telling the docker daemon that the container exited.

Additionally, the client accepts a few other RPC calls so that the other daemon-side execdriver requests (docker stop, docker ps, docker top, etc.) still work. We also implement Restore by reconnecting to the client, so the docker containers survive the daemon restarting.
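
For illustration, here is a rough, self-contained Go sketch of the proxy pattern described above, built on the standard library's net/rpc: the daemon-side "foreground" driver forwards the start request over RPC to the client process, which would run the real execdriver and report the exit status back. All type and method names below (ClientDriver, StartArgs, and so on) are invented for this sketch and are not taken from the actual PR.

// Hypothetical sketch of the proxy pattern: the daemon-side "foreground"
// driver calls into the docker client over Go's net/rpc, and the client
// answers with the container's exit status.
package main

import (
	"fmt"
	"net"
	"net/rpc"
)

// StartArgs is a stand-in for whatever container setup the daemon has done.
type StartArgs struct {
	ContainerID string
	Cmd         []string
}

// StartReply carries the exit status back to the daemon.
type StartReply struct {
	ExitCode int
}

// ClientDriver lives in the "docker run" process; the daemon calls into it.
type ClientDriver struct{}

// Start would hand the request to a real execdriver running inside the
// client process; here it just pretends the command exited successfully.
func (d *ClientDriver) Start(args StartArgs, reply *StartReply) error {
	fmt.Printf("client: starting %v in container %s\n", args.Cmd, args.ContainerID)
	reply.ExitCode = 0
	return nil
}

func main() {
	// The client exposes an RPC endpoint...
	rpc.Register(&ClientDriver{})
	ln, _ := net.Listen("tcp", "127.0.0.1:0")
	go rpc.Accept(ln)

	// ...and the daemon-side "foreground" execdriver proxies the start
	// request to it, then reports the exit status as if it had run the
	// container itself.
	conn, _ := rpc.Dial("tcp", ln.Addr().String())
	var reply StartReply
	conn.Call("ClientDriver.Start", StartArgs{"c1", []string{"echo", "hello"}}, &reply)
	fmt.Println("daemon: container exited with status", reply.ExitCode)
}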

@SvenDowideit
Contributor

@alexlarsson you need to write some documentation, and it needs to explain to the user why and when they would want this (and when not)

I'm guessing the next question users will have is why not make it truly daemon-less - so docker --foreground -H unix://./thiscontainer run -name thiscontainer ubuntu bash

@ibuildthecloud
Contributor

Does foreground mode work for non-root users?

@ibuildthecloud
Contributor

This implementation is very clever, but I can't help but feel like it's butting heads with the current design of docker. It seems like either docker or this PR needs to budge a bit to make this acceptable.

For example, this is implemented as a new execdriver, but the instantiation and management of the driver is all one-off. It seems like it would be more elegant if docker moved away from the strict one-execdriver-per-daemon model and towards dynamically choosing the correct execdriver at runtime. Then the lifecycle of this execdriver could be more in line with the others, and there could be fewer one-line if statements added everywhere.

Having said that, I can't help but think that there have got to be enough tricks you can do with systemd that you could still have the main daemon run the execdriver but work with systemd to accomplish your objectives. So while this implementation is fairly generic, maybe there's something more systemd-specific you could do that would be closer in line with the docker design. What that is I have no clue; I really don't know enough about this topic to suggest anything.

Finally, last point. This PR now introduces the idea that there are going to be multiple instances of an execdriver running in parallel across processes. That could cause some hangups in the future, I would assume. Something as simple as initialization logic, which is currently assumed to be done once and sequentially, will need to take concurrency into account. For example, libcontainer now installs an AppArmor profile on startup; while it's probably safe to do that concurrently, when that change was made there was no thought that that code would be run on every single container start.

@thaJeztah
Member

Not sure if this will be a generic approach for all commands in the future, but I just saw @shykes mention a -f flag for docker images in another issue here: #4430 (comment)

If I understand that comment correctly, the idea is to use a -f flag in combination with a keyword as a 'generic' flag. Although this PR will not directly conflict (it's a flag for another command), it would of course be neat if the -f flag were used consistently for all commands.

(Edit: please excuse me if I use the term flag incorrectly here; read 'flag/switch/option', I don't really know what the appropriate term is, LOL)

@ibuildthecloud
Contributor

Here's what I propose. What if we changed docker such that on start you can specify multiple allowed execdrivers such as

docker -d -e foreground -e native

Then add Supports(c *Command) bool to execdriver.Driver. At runtime all execdrivers in the list are instantiated and kept in a list of available drivers. When it's time to launch a container, you loop through the list looking for the first driver where Supports() == true.

Then on the client side we add exec options that are arbitrary key/value pairs that can be used to control the execdrivers. For example

docker run --exec-conf FOO=BAR --exec-conf FOREGROUND=true

Those options will be added to Command.Config (which I think should be a map, not a []string, but that's not a big deal).

So instead of adding a first-class -f arg, which I think is confusing to the user, you have a driver-specific config/option to get this very specific behaviour you want. The foreground driver would then see FOREGROUND=true in the Config, return true for Supports(), and then, when it determines the "realDriver", just take the list of all active drivers, strip out itself, and send the list to the client.

I think that will make the server side implementation a bit cleaner. Now I think there needs to be something a little more elegant/generic on the client side so that you don't have to do 'if fg' to determine how to handle the stdin/stdout stuff. That I haven't thought much about.
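
As a concrete illustration of the selection idea, here is a minimal Go sketch of the Supports()-based loop with a map-based Command.Config. All names here (Driver, Command, pickDriver, the driver types) are hypothetical stand-ins for this sketch, not the real execdriver API.

// Hypothetical sketch: pick the first configured driver that claims support
// for a command, based on driver-specific key/value options.
package main

import (
	"errors"
	"fmt"
)

// Command stands in for execdriver.Command with a map-based Config.
type Command struct {
	ID     string
	Config map[string]string
}

// Driver is the proposed interface: each driver says whether it can run a command.
type Driver interface {
	Name() string
	Supports(c *Command) bool
	Run(c *Command) error
}

type foregroundDriver struct{}

func (foregroundDriver) Name() string { return "foreground" }
func (foregroundDriver) Supports(c *Command) bool {
	return c.Config["FOREGROUND"] == "true"
}
func (foregroundDriver) Run(c *Command) error {
	fmt.Println("proxying", c.ID, "back to the docker client")
	return nil
}

type nativeDriver struct{}

func (nativeDriver) Name() string             { return "native" }
func (nativeDriver) Supports(c *Command) bool { return true } // default fallback
func (nativeDriver) Run(c *Command) error {
	fmt.Println("running", c.ID, "inside the daemon")
	return nil
}

// pickDriver walks the configured drivers (e.g. "docker -d -e foreground -e native")
// and returns the first one that claims support for the command.
func pickDriver(drivers []Driver, c *Command) (Driver, error) {
	for _, d := range drivers {
		if d.Supports(c) {
			return d, nil
		}
	}
	return nil, errors.New("no execdriver supports this command")
}

func main() {
	drivers := []Driver{foregroundDriver{}, nativeDriver{}}
	// e.g. set by "docker run --exec-conf FOREGROUND=true"
	cmd := &Command{ID: "c1", Config: map[string]string{"FOREGROUND": "true"}}
	if d, err := pickDriver(drivers, cmd); err == nil {
		fmt.Println("selected driver:", d.Name())
		d.Run(cmd)
	}
}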

Side note: the ghosts "feature" has been removed in 0.9, so your very elegant approach to ghosts is all for naught. @alexlarsson, I think you should fight the good fight to get reattachment of old containers into 0.10. What that could mean is that we add Driver.Reattach(). If the Driver can't/won't support reattaching, then it returns false/err and docker will do a restart of the container. If it does support reattachment, the driver reattaches and the state is now 'Up'. So the Ghost state will never exist any more.

@alexlarsson
Contributor Author

I pushed a new version of this based on latest git master.
Additionally, it removes the "-f" flag in favour of just using --foreground.

@alexlarsson
Contributor Author

@ibuildthecloud The "foreground" driver is a very special kind of driver. For one it has a one-to-one mapping with the individual docker client instance, so you have to create a new one each time. I'm not sure trying to hook it into some kind of "generic" interface actually makes things simpler.

As for an execdriver being launched multiple times, that can already happen if you run multiple dockers with different versions of /var/lib/docker, as well as in the testsuite. That's not really new.

I've some other ideas on different approaches to foreground mode that I want to look into also. I'll be back later with results of that.

@ibuildthecloud
Contributor

Yeah, I understand it's a very specific use case. I was merely looking for an approach that didn't make the implementation of this quite so first class in docker, with specific fields in the structs just for this. I really think the --foreground option will be very confusing to users (that's why I mentioned some approach of making it a driver-specific flag, if such an approach were possible).

I'm particularly interested in this PR being accepted; that's why I'm commenting.

I do think the foreground execdriver could be made into a generic RPC driver, though, such that other people could use it to implement execdrivers outside of the docker executable.


@alexlarsson
Contributor Author

Here is a different approach that doesn't use a proxying execdriver:
https://github.com/alexlarsson/docker/tree/fork
It doesn't integrate as fully with commands like "docker stop" and "docker kill", but it's much less intrusive.

This way we can have different (non-default) exec drivers for some
containers.

Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
This is a mode of "docker run" that uses direct forking from the
docker client, meaning that the container is a direct child of the
docker run command (rather than the daemon), and that it directly
inherits stdout/err (and stdin if requested) from the parent of the
"docker run" command.

This is very useful in cases where you e.g. want to run a docker
container from systemd (or other process manager), as we can use the
process manager to e.g. set up cgroups and limits, monitor and
restart, pass in fds for socket activation, etc.

The way this works is that there is a new execdriver, which is used
for foreground containers. This driver just does RPC calls back to
the "docker run" command, which in turn uses a "real" execdriver and
feeds it whatever data the "fake" execdriver got, but in the context
of the docker run command.

We feed information about the process (exit status, pid, etc) to the
daemon, so it is visible just like any other docker container via the
docker apis.

Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
This causes docker to not create a new transient unit for cgroup
containment and resource limits. Instead it will reuse the current
cgroup and apply resource limitations to it using SetUnitProperties.

This is interesting mostly in combination with --foreground, because
it allows you to have a systemd unit file like:

[Service]
ExecStart=docker run --foreground --opt unit.ReuseCurrent=true image
MemoryLimit=100M

With the following properties:

* Any limits and other properties set up in the service file apply
  to both the docker client and the container "as one unit".
* There is only one unit representing the service.
* You can kill the service via systemd, which will kill both the client
  and the container.
* If the client dies, systemd will (first nicely, then violently) kill the
  other processes in the cgroup (i.e. the container).

Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
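
To make the SetUnitProperties idea concrete, here is a minimal, hypothetical Go sketch that applies a memory limit to an existing unit via the go-systemd D-Bus bindings instead of creating a new transient unit. The unit name and limit are invented for the example; this is not the code from the commit.

// Hypothetical sketch: set resource limits on the unit we already run in,
// rather than spawning a transient unit for the container.
package main

import (
	"log"

	sd "github.com/coreos/go-systemd/dbus"
	godbus "github.com/godbus/dbus"
)

func main() {
	conn, err := sd.New() // connect to the systemd manager on the system bus
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// Apply a 100M memory limit to an assumed existing unit, analogous to
	// MemoryLimit=100M in the service file above.
	props := []sd.Property{
		{Name: "MemoryLimit", Value: godbus.MakeVariant(uint64(100 * 1024 * 1024))},
	}
	// runtime=true changes the running unit without persisting config to disk.
	if err := conn.SetUnitProperties("mycontainer.service", true, props...); err != nil {
		log.Fatal(err)
	}
}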
@shykes
Contributor

shykes commented Apr 7, 2014

Hi guys, this PR generated a lot of very useful design discussions, here and on IRC. I think at this point that discussion is advanced enough that we can close this PR. We all acknowledge that it should be possible to "embed" a docker container in an existing process tree. But instead of tacking that on top of the current client-server design (which we're already trying to simplify), we should bake that into the deeper internals of Docker.

@shykes closed this Apr 7, 2014
@alexlarsson
Contributor Author

I agree, but for short term use we need this, so I will keep carrying this branch forward.
