Proposal: Auto restart containers #7226

Closed
crosbymichael opened this Issue Jul 24, 2014 · 68 comments

crosbymichael commented Jul 24, 2014

Auto restart of containers

This should be a fairly simple and quick change to make for docker. We introduce a flag and a simple restart policy for containers based on their error codes. We will not do complex relationship or restart strategies. If you need those, integrate with other process supervisors.

--restart flag

The --restart flag will be passed on docker run and be included for a container in its hostconfig. This flag will default to a policy of no.

Restart Policies

no - Do not restart the container if it dies.

on-failure - Restart the container if it exits with a non-zero exit code.

always - Always restart the container no matter what exit code is returned.

For policies that restart a container we will use a 1sec backoff for the restarts. By default the on-failure and always strategies will try to restart the container forever unless the user specifies a maximum limit of retries. The restart count can be stored in the container.
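
The three policies and the optional retry limit described above can be sketched as a small decision function. This is only an illustration of the proposal's rules, not Docker's implementation; the names `RestartPolicy`, `max_retries`, and `should_restart` are hypothetical, and treating a limit of 0 as "no limit" is an assumption made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class RestartPolicy:
    # One of "no", "on-failure", "always" (the names listed in the proposal).
    name: str = "no"
    # Assumption for this sketch: 0 means "retry forever".
    max_retries: int = 0


def should_restart(policy: RestartPolicy, exit_code: int, restart_count: int) -> bool:
    """Decide whether a container that just exited should be started again."""
    if policy.name == "no":
        return False
    # Stop once the user-specified limit has been reached.
    if policy.max_retries and restart_count >= policy.max_retries:
        return False
    if policy.name == "on-failure":
        return exit_code != 0
    if policy.name == "always":
        return True
    raise ValueError(f"unknown restart policy: {policy.name!r}")


print(should_restart(RestartPolicy("on-failure"), exit_code=1, restart_count=0))
print(should_restart(RestartPolicy("on-failure"), exit_code=0, restart_count=0))
```

The `restart_count` argument stands in for the restart count that the proposal suggests storing in the container.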

jamtur01 commented Jul 24, 2014

Just ideas.

  1. Flag available for docker build to hardcode restart behaviour?
  2. Flag available at daemon level?

crosbymichael commented Jul 24, 2014

@jamtur01 definitely not for build. This seems like a very deployment specific issue and if it's hard coded into the image then anyone wanting to use supervisor, upstart, or systemd cannot use the image.

How would the daemon flag look?

jamtur01 commented Jul 25, 2014

Like the docker run flag but specify an overall default.

cpuguy83 commented Jul 25, 2014

+1 this is exactly what I want.

pandrew commented Jul 25, 2014

+1

tianon commented Jul 25, 2014

non-zero - Restart the container if it exits with a non-zero exit code.

So, if I docker stop my-container and it shuts down cleanly, it won't be restarted, but if I docker kill my-container it will be restarted? Are you thinking a special case for user-initiated (via Docker obviously) container stopping events so that they don't get the autorestart?

mahnunchik commented Jul 25, 2014

+1

crosbymichael commented Jul 25, 2014

@tianon explicit stops and kills from the docker API do not count

vishh commented Jul 25, 2014

@crosbymichael Why isn't the number of restarts configurable?

tomislacker commented Jul 25, 2014

+1

tianon commented Jul 25, 2014

@crosbymichael SGTM ❤️

radicalray commented Jul 25, 2014

👍

discordianfish commented Jul 27, 2014

Awesome! 👍

aminjam commented Jul 28, 2014

👍

crosbymichael commented Jul 28, 2014

ping @vieux @shykes

what do you think? if it LGTY i'll start implementation

vieux commented Jul 28, 2014

@crosbymichael really cool SGTM

bfirsh commented Jul 28, 2014

Generally, 👍

I'm a bit concerned by having a fixed max of 10 retries. If this is a production system running a process I know doesn't have any side effects (e.g. a web server connecting to a database that has gone down), I want it to keep on retrying forever. Just a thought though, this can easily be added later.

crosbymichael commented Jul 28, 2014

@bfirsh what would you think is better for this?

--restart always:20
or
--restart always --restart-retry 20
--restart always --restart-max-retry 20

Any other ideas?
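
For comparison, the combined `policy:limit` form could be parsed roughly as below. This is a minimal sketch of the syntax being discussed here, not the parser Docker ships; `parse_restart_spec` and `VALID_POLICIES` are hypothetical names, and the policy names are taken from the proposal at the top of this issue.

```python
from typing import Optional, Tuple

# Policy names as listed in the proposal.
VALID_POLICIES = ("no", "on-failure", "always")


def parse_restart_spec(spec: str) -> Tuple[str, Optional[int]]:
    """Split a value such as "always:20" into (policy, max_retries).

    A missing limit is returned as None, meaning "no limit" in this sketch.
    """
    name, sep, limit = spec.partition(":")
    if name not in VALID_POLICIES:
        raise ValueError(f"unknown restart policy: {name!r}")
    if not sep:
        return name, None
    if not limit.isdigit():
        raise ValueError(f"invalid retry limit: {limit!r}")
    return name, int(limit)


print(parse_restart_spec("always:20"))
print(parse_restart_spec("on-failure"))
```

With separate flags the same two values would simply arrive as two arguments, so the choice is mostly about how the command line reads.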

vishh commented Jul 28, 2014

@crosbymichael: +1 for --restart always:20


bfirsh commented Jul 28, 2014

I'm pretty sure that in most cases if you don't mind a process being restarted, you don't mind it being restarted indefinitely with a sensible decay. The design would be much simpler then. If we do get users requesting being able to specify the number of times it was restarted, we could add a --restart-limit option or something.

To bikeshed the names, I prefer the systemd options:

--restart=(no|on-failure|always)

none is not the opposite of always and non-zero feels like a leaky abstraction.
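
The "sensible decay" is not specified anywhere in this thread. One possible reading is a delay that grows after every restart until it reaches a cap, sketched below. The 1 second starting delay matches the backoff in the proposal; the doubling factor, the 60 second cap, and the name `restart_delays` are assumptions made only for illustration.

```python
from itertools import islice
from typing import Iterator


def restart_delays(base: float = 1.0, factor: float = 2.0, cap: float = 60.0) -> Iterator[float]:
    """Yield the wait (in seconds) before each successive restart attempt."""
    delay = base
    while True:
        yield delay
        # Grow the delay, but never beyond the cap.
        delay = min(delay * factor, cap)


# First eight delays for the assumed defaults.
print(list(islice(restart_delays(), 8)))
# prints [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
```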

crosbymichael commented Jul 28, 2014

--restart=(no|on-failure|always) sounds good to me, i'll make the change to the proposal.

I'm not sure I feel comfortable with a restart forever, it could be an easy way to ddos a machine if you have many containers and something in the underlying host changes.

bfirsh commented Jul 28, 2014

Okay. I think I prefer a separate --restart-limit option like upstart, with an integer argument or unlimited. It seems clearer than mushing it into one option.

For reference: systemd restarts an unlimited number of times with 100ms delay by default, upstart is 10 times with 5s delay.

paimpozhil commented Jul 28, 2014

+1 👍

discordianfish commented Jul 29, 2014

@bfirsh 👍 on that. I would want unlimited restarts and monitor the restart counts so we should make that available too.

@crosbymichael Does a restart counter sound reasonable to you? It could be included in the container config so I can monitor it. If so, could you add it to the proposal as well?

crosbymichael commented Jul 29, 2014

I really want to keep the scope small. That is why I have been leery about implementing this in docker because everyone wants more and more.

I think unlimited is ok for a default, and a counter saved in the container should be a very small change. I'll update the proposal with both.

cpuguy83 commented Aug 14, 2014

@therealprologic It's in! #7414

prologic commented Aug 14, 2014

Thanks :)


stuartpb commented Aug 28, 2014

So was the daemon -r flag removed? It's still referenced in https://docs.docker.com/articles/host_integration/

prologic commented Aug 28, 2014

It's been deprecated in favor of restart policies via the docker run --restart option.

This was brought in in Docker 1.2.0 afaik

cheers
James


discordianfish commented Aug 28, 2014

Right, but the docs are outdated. The API docs also don't mention how to set the restart policy.

discordianfish commented Aug 28, 2014

@crosbymichael I also can't find the restart counter. Is it somewhere or did we miss that?

stela5 commented Sep 27, 2014

discordianfish commented Sep 27, 2014

@stela5 Guess you mean me? That's a restart limit; I'm talking about a counter to figure out how often a given container was (auto)restarted. That's an important metric you should monitor when using that feature. Not sure if it was a misunderstanding or it was just forgotten, but I can't find such a counter anywhere. @crosbymichael can you confirm?

stela5 commented Sep 30, 2014

@discordianfish Sorry, my mistake.

revett commented Jan 15, 2015

API docs for docker run still don't include any reference to --restart.

https://github.com/docker/docker/blob/master/docs/sources/reference/run.md

yatskevich commented Jan 18, 2015

@revett man page has it described.

ldesi commented Jul 15, 2016

Just a question:

I need to autorestart a docker container without restarting the docker daemon. Is it possible?

prologic commented Jul 15, 2016

autodock-cron1 will automatically restart containers on a schedule based on a cron expression.

cheers
James


ldesi commented Jul 15, 2016

Sorry, maybe I didn't make that clear.
With the --restart policies we can decide to restart a container on failure, for example.
If I kill a container, I must restart the docker daemon to trigger a container restart.
I want to restart the failed container without restarting the docker daemon.
Moreover, the cron daemon has a resolution of 1 minute, but I need at least one-second resolution.
Are there any solutions?

cpuguy83 commented Jul 15, 2016

If you kill a container you just call "docker start" on it again.

ldesi commented Jul 15, 2016

Hi @cpuguy83, unfortunately I need to automatically restart the killed container, but without restarting the docker daemon.

cpuguy83 commented Jul 15, 2016

@ldesi You'd need to give your full use-case.
But if you kill a container, it's up to you to bring it back up... I'm not sure why docker would auto-restart in this case, especially since you have full control.

If you instead kill the process inside the container, the restart policy will kick in.

ldesi commented Jul 15, 2016

@cpuguy83 and @prologic thanks for responses.
I've discovered that with the SIGKILL signal the container will not be restarted until the docker daemon restarts. With SIGSEGV the container is restarted normally.
I was expecting that with docker kill, which uses SIGKILL by default, the container would be restarted. I don't know if this is a deliberate policy adopted by the docker daemon and the docker developers.
Thank you again.

cpuguy83 commented Jul 15, 2016

@ldesi Doing docker kill or docker stop effectively disables the restart policy. You've called docker kill, you can just as easily start the container back up... docker kill foo && docker start foo.
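
The distinction can be modelled with a flag that is set whenever the stop request goes through the API. The sketch below is a toy model for illustration only, not Docker's code; the class `Container`, the flag `stopped_by_user`, and the method names are hypothetical, and 137 is used as the conventional exit status of a process ended by SIGKILL.

```python
class Container:
    """Toy model of a container with a restart policy (not Docker's code)."""

    def __init__(self, policy: str = "always") -> None:
        self.policy = policy            # "no", "on-failure", or "always"
        self.running = True
        self.stopped_by_user = False    # hypothetical flag set by API stop/kill

    def on_process_exit(self, exit_code: int) -> bool:
        """Handle the main process dying; return True if the container restarts."""
        self.running = False
        if self.stopped_by_user:
            # An explicit stop or kill through the API disables the policy.
            return False
        if self.policy == "always" or (self.policy == "on-failure" and exit_code != 0):
            self.running = True
            return True
        return False

    def api_kill(self, exit_code: int = 137) -> bool:
        """Stand-in for a kill request that goes through the daemon."""
        self.stopped_by_user = True
        return self.on_process_exit(exit_code)

    def api_start(self) -> None:
        """Stand-in for starting the container again by hand."""
        self.stopped_by_user = False
        self.running = True


crashed = Container("always")
print(crashed.on_process_exit(137))   # process killed outside the API: prints True
killed = Container("always")
print(killed.api_kill())              # kill requested through the API: prints False
```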

ldesi commented Jul 15, 2016

If you use docker kill to simulate a container crash (one process within the container crashed and may crash the container as a whole), one expects the restart policies to work. I agree for docker stop, which uses SIGTERM, but I don't understand why sending SIGKILL is not considered a "container crash".

cpuguy83 commented Jul 15, 2016

@ldesi docker kill does not simulate a container crash, this is a request going through Docker (and not even necessarily SIGKILL).
You can send SIGKILL directly to the process to simulate a crash.

ldesi commented Jul 15, 2016

@cpuguy83 from docker kill doc (https://docs.docker.com/engine/reference/commandline/kill/):

The main process inside the container will be sent SIGKILL, or any signal specified with option --signal.

which is the same as running: kill -9 $CONTAINER_INIT_PID

cpuguy83 commented Jul 15, 2016

@ldesi It's not the same, one is a docker command, the other is a system command.

Again, if you've told Docker to stop the container in some way, it assumes that you can start it back up yourself. I believe this is the correct behavior, it's also not an API that can really be changed without breaking people.

ldesi commented Jul 15, 2016

@cpuguy83 yes of course, but I suppose that the docker command will eventually call the kill command or a syscall. But I agree with you, maybe it is better to separate them conceptually :)
