Auto-restart processes #26

Closed
shykes opened this Issue Mar 9, 2013 · 52 comments

Collaborator

shykes commented Mar 9, 2013

Docker should be capable of restarting its child processes when they exit or are killed. This behavior should be optional: sometimes you don't want the process to be restarted (singleton jobs, interactive shells...)

Collaborator

shykes commented Jul 26, 2013

The current -r flag is insufficient for a production setup. Docker should out of the box auto-detect which containers need to be restarted, without having to start it in a special "recovery" mode.

Collaborator

shykes commented Sep 7, 2013

This is scheduled for 0.7.

Contributor

unclejack commented Nov 2, 2013

@vieux Isn't this fixed by #1832?

Contributor

bfirsh commented Nov 2, 2013

@unclejack As far as I know, #1832 is for starting containers that were previously running when starting the Docker daemon. This issue is about restarting containers when they quit.

crosbymichael referenced this issue in crosbymichael/docker Nov 14, 2013

Contributor

discordianfish commented Jan 22, 2014

👍
I've given this some thought, and I think docker should restart a container at least if the command returned != 0.
To avoid crash looping, I would suggest adding a restart counter with some limit. This limit should be shown in the ps output and exposed via the API so it can be used for monitoring.
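
For illustration only, the restart-on-non-zero-exit idea with a restart limit could be sketched as below. This is not docker code: `supervise` and `max_restarts` are hypothetical names, and the sketch simply re-runs a local command with Python's `subprocess` module instead of managing a container.

```python
import subprocess
import sys


def supervise(cmd, max_restarts=3):
    """Run cmd, restarting it on a non-zero exit up to max_restarts times.

    Returns (last_exit_code, restart_count); the restart count is the value
    a monitoring system would want to see.
    """
    restarts = 0
    while True:
        code = subprocess.call(cmd)
        # Stop on a clean exit, or once the restart limit is used up.
        if code == 0 or restarts >= max_restarts:
            return code, restarts
        restarts += 1


if __name__ == "__main__":
    # A command that always fails, to exercise the limit.
    failing = [sys.executable, "-c", "import sys; sys.exit(3)"]
    code, restarts = supervise(failing, max_restarts=2)
    print(f"exit code {code} after {restarts} restarts")
```

A clean exit is never restarted here, which matches the "at least if the command returned != 0" wording; whether a clean exit should also be restarted is the open question in the rest of this thread.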

Contributor

crosbymichael commented Jan 22, 2014

sreuter commented Jan 25, 2014

+1,000 @discordianfish ... This would be much more convenient than using supervisor for doing the job!

Collaborator

shykes commented Jan 25, 2014

@discordianfish one thing I was wondering is how to prevent that for one-shot commands that should not be restarted (and perhaps have side effects which even make it harmful to restart them).

Contributor

bfirsh commented Jan 25, 2014

Some possible scenarios:

  1. running a one-off command in the foreground
  2. running a long-running command in the foreground
  3. running a one-off command as a daemon
  4. running a long-running command as a daemon

For (1), you almost definitely don't want it to auto-restart. For (2) you probably don't want to - you are probably running the command inside another process manager such as supervisord or upstart which will manage that for you.

For (3) you might want it to auto restart if it failed, but you would want it to be opt-in so you can confirm that the command is safe to run twice. For (4) you probably want it to auto-restart.

It feels like it should be an optional switch to docker run. I don't think it should be the default because there's no way to tell if a daemon is a long running process or not, and restarting a one-off process could potentially be dangerous. It probably shouldn't be tied to the -d flag because you might want to use it either in the foreground or as a daemon.

Contributor

discordianfish commented Jan 27, 2014

I think we have the following options:

  1. restart every command returning != 0
  2. restart every command returning != 0 if running in background
  3. restart every command if flag X is given (maybe have another flag to restart only if returning !=0 )

If we want to prevent one-shot commands in the background from being restarted by default even if they return != 0, we need to go for 3).
I don't like that very much since usually you have long running jobs that should be restarted and forgetting about that option could be harmful as well. I guess the question is what's more likely:

  • People forgetting to explicitly make their containers restart and harm caused by them being down or
  • People forgetting that by default every command which exits != 0 will get restarted and harm caused by that

But either way this would be a big improvement.

Contributor

bfirsh commented Jan 27, 2014

My mental model of using docker run without -d is as if I were just running a command in a shell. Automatically restarting would certainly confuse that understanding. These sorts of one-off commands are often repetitively run by a person at a command line and make no sense to automatically restart. They are going to forget to pass a flag.

Automatically restarting when running with -d and exit code != 0 could work. Though it'd need an extra flag to disable that behaviour.

It's also worth thinking about how containers automatically restarting will affect the initial experience of using Docker ("why does docker run ubuntu echo hello world never stop?"). Even for exit code != 0 ("why does my process infinitely print out errors?").

Contributor

crosbymichael commented Jan 27, 2014

The problem is that on the server side, where we do the restart, we don't know if you passed -d on the cli or not. Attach/detach is a cli only feature and does not change anything on the server side. It all looks the same.

If we were to say only restart processes that are meant to be daemonized and return with a non-zero exit code, and that the user did not request to be stopped/killed/whatever, we would have to send a flag to the daemon saying whether this container is meant to be attached or not.

Then we have the problem that you run a container without -d and detach and expect it to keep running and get restarted if it dies.

My vote is to have an --autorestart flag, similar to how you have to tell supervisor to autorestart your process if it dies, to tell docker to manage the container.

The default value for this will need to be decided; I think it should be false. I don't think it is a big deal to force a user to specify that they want this container to be auto-restarted, because that will cause far fewer issues than defaulting to true and leaving you wondering why this container keeps on living forever.

Contributor

smarterclayton commented Jan 27, 2014

For auto restarting, I have to wonder whether relying on upstart/systemd to manage this wouldn't be a better option, and a docker command like "docker install -as <system.unit.filename>" wouldn't end up being the better path. Deciding to auto restart feels more like a system service option than a command line option - after all, isn't this what these system processes were designed to do?

Contributor

crosbymichael commented Jan 28, 2014

@smarterclayton We already have the ability to let process managers monitor the process. It was part of the host integration PR.

I am all for keeping docker as small as possible and integrating with external tools, loggers, process monitors, but I also can see the need for docker to provide good defaults and a feature set to users.

As for external process managers I'm using supervisor just fine and we also have a script to auto generate init and systemd configs in the contrib part of the repo.

[program:skydock]
command=docker start -a skydock
autostart=true
autorestart=true
stdout_logfile=/var/log/docker/skydock.log
redirect_stderr=true
numprocs=1

Contributor

smarterclayton commented Jan 28, 2014

I guess I was thinking along a different line - allowing the client or daemon to translate an image definition directly into a systemd unit, such that at "start" time of the service docker is used solely to initialize the container namespace, but not be resident or active for the remainder of the service call. Some of the complexity I'm thinking of is systemd socket activation where docker acting as a network proxy adds a layer of indirection that isn't truly necessary, or needing the daemon to be active all the time (vs being only running on demand to create / commit containers).

mbonano commented Feb 5, 2014

I'd like to +1 Michael Crosby's suggestion of adding an --autorestart option to the run command.

I currently use supervisor to monitor a container and restart that container if an exit code other than 0 is returned. I accomplish this by creating the container (docker run -n test-container some-image), stopping the container (docker stop test-container) and then starting the container via supervisor:

[program:test-container]
command=docker start -a test-container
autostart=true
autorestart=unexpected

The initial work of starting and stopping the container is the necessary evil that accompanies the use of a process manager in conjunction with docker. While this solution "works", I would argue that "container management" is a sufficiently unique animal that merits its own set of tooling.

The ideal solution would be the ability to create a container that auto-restarts. The container should be able to auto-restart if an exit code other than 0 is supplied as follows:

docker run -autorestart unexpected -name test-container my-image

The container should be able to always auto-restart by supplying "-autorestart true", and the container should default to "-autorestart false". This would not only eliminate the need to introduce a process manager for container management, it would also be a much more efficient and elegant solution.
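
As a rough sketch of the three proposed values, the restart decision might look like the function below. The "-autorestart" option and its false/true/unexpected values are only the proposal from this comment, not an existing docker flag, and `should_restart` is a hypothetical helper name.

```python
def should_restart(policy: str, exit_code: int) -> bool:
    """Decide whether to restart, given the proposed policy value.

    "false"      -> never restart (the proposed default)
    "true"       -> always restart, whatever the exit code
    "unexpected" -> restart only when the exit code is not 0
    """
    if policy == "false":
        return False
    if policy == "true":
        return True
    if policy == "unexpected":
        return exit_code != 0
    raise ValueError(f"unknown autorestart policy: {policy!r}")


if __name__ == "__main__":
    for policy in ("false", "true", "unexpected"):
        for exit_code in (0, 1):
            decision = should_restart(policy, exit_code)
            print(f"{policy:<10} exit={exit_code} restart={decision}")
```

Rejecting unknown values outright, instead of silently treating them as "false", is a design choice of this sketch: a typo in the policy name fails loudly rather than leaving a container unmanaged.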

Contributor

crosbymichael commented Feb 5, 2014

@mbonano

If docker were to do this for simple cases I think that flag would work but I really am in favor of letting a supervisor do this because that is what they are made for.

Contributor

discordianfish commented Feb 5, 2014

@crosbymichael To me docker is (besides other things) a container supervisor. If we do things like restarting containers after reboot, we should also restart them on container crashes.

Besides that, I imagine the docker API to be the only interaction point with a system:
Bootstrapping a system and installing docker is trivial and can be easily reproduced ( -> all systems are the same). But managing supervisor config (container-dependent state outside of a container) on a per-host basis (on this host, those supervisor configs need to be written) causes exactly the configuration management nightmare I hope to escape :)

mbonano commented Feb 5, 2014

As docker evolves, I believe container management will become an increasing concern, especially if functionality like cluster support is to ever exist in the technology. The day where a cluster of containers running across several docker host machines is supported, the supervisor solution becomes completely unmanageable. Until then, I will have to write a home-grown container management solution and hope that the tooling one day renders my solution obsolete.

Contributor

crosbymichael commented Feb 18, 2014

Any proposal for a flag name? --autorestart ?

mbonano commented Feb 18, 2014

+1 for --autorestart

sreuter commented Feb 18, 2014

+1 from the other side of the table ;-)

Contributor

jsdir commented Feb 18, 2014

+1 for --autorestart. Since docker already has the ability to restart containers on machine reboot, restarting containers on unexpected termination fits nicely under the existing management responsibilities of the daemon.

tt commented Feb 19, 2014

Overly pedantic, perhaps, but how about --auto-restart?

Member

tianon commented Feb 19, 2014

The pedant in me also says +1 to that :)

@mahnunchik

+1

garthk commented Feb 23, 2014

+1 to --auto-?restart; I'm more worried about ensuring it defaults to false, even if -d was given, than whether or not it has a dash in the name.

Can we roll in contrib/host-integration/manager somehow?

--auto-restart=false # default
--auto-restart=true # docker re-starts the container if it exits without docker being involved
--auto-restart=supervisord # generate supervisord script and rely on that instead
--auto-restart=upstart # …
--auto-restart=initd # …

ttahmouch Apr 18, 2014

Supervisord is great for process management. However, I can see auto restart being useful for something Supervisord can't assist with.

I'd really like to be able to limit the memory of a container, and have the container restart if the limit has been reached. Currently, when the memory limit is reached, the containers just exit with a non-zero status code. Supervisord wouldn't be able to restart processes in this case because Supervisord itself stops running.

Perhaps I'm doing something wrong?

ttahmouch Apr 18, 2014

I just read some people's posts about managing individual containers outside of Docker using Supervisord on the host. I wasn't aware that was possible. Even so, that seems "hacky."

pikeas commented Apr 29, 2014

+1 to this. I'm evaluating Docker and tooling/monitoring around crashing containers is a requirement for any serious production use.

Contributor

cpuguy83 commented Jun 3, 2014

+1
--auto-restart=on-failure/always/never

sirwolfgang Jun 16, 2014

+1 --auto-restart

NitroPye Jun 16, 2014

+1 --auto-restart

fx commented Jun 17, 2014

+1 --auto-restart --recover?

Contributor

ewindisch commented Jun 20, 2014

Implementation detail: I believe it should restart on top of a new layer.

tomislacker Jun 23, 2014

Agreed with @ewindisch that it should restart on top of a new layer.

I feel like this issue would be kind of going down a slippery slope though. We need to have sentinel values to control how many times it can restart within a defined time range so that we don't let our hosts go nuking themselves.

Contributor

thockin commented Jun 23, 2014

I think there's a valid reason to do a simple restart and to do a clean new layer. We use 'restart' for our restart loops and it has the nice property of not littering the disk with old layers.

And yes, it is a bit of a slippery slope in the longer term, though I think we could get away with a very naive implementation for now.

tleyden commented Jun 23, 2014

I think you'd want to take a close look at Erlang supervisors and try to crib as much as possible.

The Erlang folks have done a ton of work on pretty much exactly this same problem. (Fault tolerance)

The supervisor config allows full control over the restart behavior -- for example the max number of attempts within specified time window, and giving up once those attempts have been exhausted.
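
To make the "max number of attempts within a specified time window" idea concrete, here is a minimal sketch in the spirit of an Erlang supervisor's restart intensity and period. The class and parameter names (`RestartWindow`, `max_restarts`, `period_seconds`) are made up for this example, and it is not how any supervisor is actually implemented.

```python
import time
from collections import deque


class RestartWindow:
    """Allow at most max_restarts restarts within any period_seconds window."""

    def __init__(self, max_restarts, period_seconds, clock=time.monotonic):
        self.max_restarts = max_restarts
        self.period = period_seconds
        self.clock = clock  # injectable so the logic can be tested without sleeping
        self._stamps = deque()  # times of the restarts that were allowed

    def allow_restart(self):
        now = self.clock()
        # Forget restarts that have fallen out of the window.
        while self._stamps and now - self._stamps[0] > self.period:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_restarts:
            return False  # attempts exhausted: give up rather than crash-loop
        self._stamps.append(now)
        return True


if __name__ == "__main__":
    window = RestartWindow(max_restarts=2, period_seconds=10)
    print([window.allow_restart() for _ in range(3)])
```

Once `allow_restart` returns False the caller is expected to stop restarting and surface the failure, which is the "giving up" behaviour described above.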

@sirwolfgang

sirwolfgang Jun 24, 2014

At this point, I would say let's get some basic auto restarting going. Then make it better in iterations. No point in trying to improve something that doesn't exist.

@discordianfish

discordianfish Jun 24, 2014

Contributor

@crosbymichael: Do you agree that we want this feature? And do you already have a rough idea how to implement it? I can execute on that, but some pointers on how you would like to have it implemented will save us some back-and-forth, I guess :)

@crosbymichael

crosbymichael Jun 24, 2014

Contributor

We are still working on what a single runtime per container will look like, but I think that if we were to implement this type of functionality, it should go in the runtime. This is because you just want to restart the process inside the container without having to go through the entire create phase and rootfs setup.

@discordianfish

discordianfish Jun 24, 2014

Contributor

@crosbymichael Let me know if you want me to work on that. Given how fast things move, I hesitate to work on something that will be obsolete and unmergeable a day later :)

@crosbymichael

crosbymichael Jun 24, 2014

Contributor

@discordianfish I think the best thing would be to write up a spec on how the runtime should work and what operations it will have. Let me look at my gists and see if I can find my notes.

@bgrant0607

bgrant0607 Jun 24, 2014

I agree with those advocating 3 restart policies: always (services), on failure (batch tasks), never (tests, commands). More complex logic (e.g., backoff, giving up after N tries, killing/restarting groups of containers) should be punted to a higher layer.
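
The three policies above can be sketched as a single decision function. This is only an illustration of the proposal, not Docker's implementation; the enum `RestartPolicy`, the function `should_restart`, and the convention that a non-zero exit code means failure are assumptions.

```python
from enum import Enum


class RestartPolicy(Enum):
    """Hypothetical names for the three policies proposed in the thread."""
    ALWAYS = "always"          # services
    ON_FAILURE = "on-failure"  # batch tasks
    NEVER = "never"            # tests, one-off commands


def should_restart(policy, exit_code):
    """Decide whether to restart a process that exited with exit_code.
    Assumes a non-zero exit code means the process failed."""
    if policy is RestartPolicy.ALWAYS:
        return True
    if policy is RestartPolicy.ON_FAILURE:
        return exit_code != 0
    return False
```

Backoff, retry limits, and group restarts are deliberately left out of this sketch, in line with the suggestion that they belong in a higher layer.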

@bgrant0607 bgrant0607 referenced this issue in kubernetes/kubernetes Jun 27, 2014

Closed

Configurable restart behavior #127

@bfirsh bfirsh referenced this issue in docker/compose Jun 30, 2014

Closed

feature request: high availability mode #275

crosbymichael added a commit that referenced this issue Jul 8, 2014

@jpanganiban

jpanganiban Jul 18, 2014

I'm working with Ubuntu 14.04 and Docker 1.1.1 hosted on AWS.

I have the --restart=false flag in my daemon (configured under /etc/defaults/docker)

DOCKER_OPTS="--restart=false ${DOCKER_OPTS}"

I then ran this command

docker run --rm=true --name test ubuntu:14.04 /bin/sh -c "while true; do echo hello world; sleep 1; done"

and then issued a reboot from the AWS panel. I then ran $ docker ps -a and got this:

bde1fa320748        ubuntu:14.04        /bin/sh -c 'while tr   7 minutes ago       Exited (0) 2 minutes ago                       test

My questions are:

  1. Should the container have persisted after the restart (even with the --restart=false flag)? What's the expected behavior for this?
  2. @crosbymichael: I'm also using supervisor to manage my containers. I usually delete containers that have exited and spin up a new one from an image, which is why I have docker run <image> <command> instead of docker start -a <container> in my command parameter. Is this a bad approach? Right now it's not working for me, since the containers still persist even after a reboot.

@mkb

mkb Jul 25, 2014

Should process monitoring and restarting be part of Docker's duties or does that mean reinventing the wheel? Might it be better to have Docker play nicely with existing tools like Monit/Systemd/God/etc rather than creating one more way to do things?

@crosbymichael

crosbymichael Jul 25, 2014

Contributor

Closing this as we are doing a proposal for auto restart in #7226

@tobegit3hub tobegit3hub referenced this issue in tobegit3hub/seagull Dec 12, 2014

Closed

Restart containers like supervisord #28

dm0- pushed a commit to dm0-/docker that referenced this issue Sep 21, 2016