Auto Restarting crashed apps #216

jwarzech · 2013-09-16T10:57:54Z

Is there a proposed solution for monitoring dokku apps and auto-restarting them if they crash? Would setting up something like http://godrb.com/ work?

statianzo · 2013-09-16T18:26:30Z

When putting together the dokku-shoreman plugin, I was thinking that supervisord would be a good candidate for monitoring apps. There's no ruby requirement and it's available in the ubuntu repos. Translating from a Procfile to a supervisord conf, like foreman does, wouldn't be bad either.
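
As a rough illustration of the Procfile-to-supervisord translation described above, here is a minimal sketch. The function name `procfile_to_supervisord` is hypothetical and is not taken from dokku-shoreman or any other plugin. It assumes every Procfile entry has the form `name: command` and emits one `[program:name]` section per entry with `autorestart=true`.

```shell
# Hypothetical helper (an assumption, not part of any dokku plugin):
# reads a Procfile on stdin and prints supervisord program sections.
procfile_to_supervisord() {
  local line name cmd
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      ''|\#*) continue ;;  # skip blank lines and comments
      *:*) ;;              # looks like "name: command", keep going
      *) continue ;;       # anything else is ignored
    esac
    name=${line%%:*}
    # Everything after the first colon, with leading whitespace removed.
    cmd=$(printf '%s' "${line#*:}" | sed 's/^[[:space:]]*//')
    # Note: supervisord does not run commands through a shell, so a
    # command that relies on shell syntax would need wrapping.
    printf '[program:%s]\ncommand=%s\nautorestart=true\n\n' "$name" "$cmd"
  done
}
```

It could be used as `procfile_to_supervisord < Procfile > supervisord.conf`, where the output file name is only an example.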

ghost · 2013-09-16T20:31:36Z

I've been toying with runit. It already works well with Docker, but I haven't gotten very far yet with integrating it into Dokku.

jwarzech · 2013-09-19T13:27:10Z

Thanks for the suggestions. I'm starting to look into supervisord and runit. Are there any good tutorials or examples out there that could point me in the right direction for having one of them monitor a dokku/docker container?

statianzo · 2013-09-23T22:32:56Z

@jwarzech I put together a supervisord runner plugin. Try it out if you'd like. https://github.com/statianzo/dokku-supervisord

alexbeletsky · 2013-09-25T05:27:58Z

That's a great plugin, @statianzo. I'm currently thinking about how to integrate pm2. It's specialised for node.js apps, but it does the job well. I'd be happy to hear any suggestions on how to build that plugin.

statianzo · 2013-09-25T15:30:09Z

@alexanderbeletsky Thanks. You could take a similar approach of generating a processes.json from a Procfile. If you need more complicated behavior, you could detect a processes.json that already exists in the source and use that instead of generating one (I was thinking about adding that to the supervisord plugin).

alexbeletsky · 2013-09-25T17:01:31Z

Awesome suggestions! I'm just playing with pm2 locally for now, to understand it well. As soon as I'm there, I'll try to package the plugin.

progrium · 2013-09-25T20:51:05Z

My initial thought is that it should be managed with Upstart. When an app is deployed, an Upstart job is created for each process. Then it's a matter of exposing the management of them via dokku commands (or at least the common operations)
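
To make the "one Upstart job per process" idea concrete, here is a minimal sketch that prints a job definition using Upstart's `respawn` stanza. The function name `upstart_job_for`, the job description text, the runlevel conditions, and the use of `docker start -a` as the supervised command are all assumptions for illustration; this is not how dokku generates jobs.

```shell
# Hypothetical helper (an assumption, not dokku code): prints an Upstart
# job that re-attaches to an existing container and respawns it on exit.
upstart_job_for() {
  local app="$1" proc="$2" container="$3"
  cat <<EOF
description "dokku app ${app} (${proc})"
start on runlevel [2345]
stop on runlevel [!2345]
respawn
respawn limit 10 60
exec docker start -a ${container}
EOF
}
```

The output would be written to a file under /etc/init/, for example `upstart_job_for myapp web "$container_id" > /etc/init/dokku-myapp-web.conf` (the file name is made up). With `respawn limit 10 60`, Upstart gives up if the job respawns more than 10 times within 60 seconds.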

ghost · 2013-09-25T21:15:59Z

I'm personally more in favor of runit, but that's mainly because I have experience with it. What I'd love to see is a refactoring of the app creation/restarting that caters for different process management plugins.

progrium · 2013-09-25T21:25:08Z

Well, there is an assumption of Ubuntu, and Upstart is the Ubuntu way to do this. And we're already using it for Nginx. But you're right that we should try to do it as a plugin so that people can make their own if they wish.

andreypopp · 2013-09-27T00:10:25Z

I've integrated upstart with my fork of dokku — μPaaS (basically dokku with plain Dockerfiles and Makefiles instead of Heroku buildpacks to specify stack and build operations, everything else is just Dokku). All integration is in a single commit — andreypopp/upaas@54569cc

It's a little sophisticated because I don't use the PID-1 upstart for security reasons and instead spawn a new upstart session specifically for the git user.

As a bonus, I get easy start/stop/restart commands:

sudo -i -u git initctl --user start|stop|restart <myapp>

I think it would be easy to backport to dokku if necessary.

asm89 · 2013-10-24T21:00:31Z

Added this as potential improvement for 0.3.0 for now. :)

pnegahdar · 2013-10-24T22:54:23Z

Can we get a discussion on why supervisor shouldn't be the default deploy mechanism for all dokku apps? See my pull request on buildstep to solve this here: progrium/buildstep#43

A couple reasons for it:

It runs bash so should make no difference in any env/language
Autorestart to keep apps alive
Multiple processes (web, workers, tasks, schedulers, etc) similar to heroku.

progrium · 2013-10-24T22:58:13Z

First, upstart is preferred because we're targeting Ubuntu. Second, Heroku
doesn't actually automatically start non web processes. You have to start
them manually afaik.

But autorestarting is great.

pnegahdar · 2013-10-24T23:08:16Z

@progrium Hmm, I'll look into upstart.

I think that's simply an issue of billing (first dyno free, worker processes = 2nd dyno), which is why they force you to start them manually: you acknowledge that you're going to be paying for that second dyno. Perhaps we're going to keep things herokuish (see #225), but this seems way out of that scope.

progrium · 2013-10-24T23:13:29Z

What's out of scope?

ElliotChong · 2014-04-28T20:27:10Z

Was there ever any official progress on the issue of auto-restarting crashed apps?

pnegahdar · 2014-04-28T22:11:02Z

dokku-supervisord is a great solution and actively maintained. https://github.com/statianzo/dokku-supervisord

ElliotChong · 2014-05-17T03:35:29Z

Thanks @pnegahdar!

jaouadk · 2014-05-17T17:39:35Z

Here is a much more useful supervisor plugin that provides individual app logs on the host: https://github.com/sehrope/dokku-logging-supervisord
It's based on the plugin mentioned by @pnegahdar.

alanjds · 2015-09-11T20:16:19Z

I remember Heroku stopping your app if it crashes and restarts too often. I kind of liked this behavior, in fact...

tmikoss · 2015-09-17T04:46:46Z

What about Docker's own --restart=always run option? Would that interfere in any way with how dokku handles the containers?

josegonzalez · 2015-09-17T04:53:40Z

@michaelshobbs We've definitely discussed this before, but I don't remember why we haven't added --restart=always. I assume that's because it conflicts with the zero-downtime checks, but I can't be sure. It seems as though we verify that the same container is still up, so maybe we can safely add it. It doesn't appear that we terminate the container on failure though, which we might consider a bug... Thoughts? From the Docker docs:

Always restart the container regardless of the exit status. When you specify always, the Docker daemon will try to restart the container indefinitely.

michaelshobbs · 2015-09-17T05:06:50Z

Yeah the default zero downtime check relies on containers exiting on error.

josegonzalez · 2015-09-17T05:15:09Z

We can probably get around that by checking the number of restarts?

# Treat any automatic restart during the deploy check as a failed start.
restarts=$(docker inspect -f "{{ .RestartCount }}" "$DOKKU_APP_CONTAINER_ID")
[[ $restarts -ne 0 ]] && dokku_log_fail "App container failed to start!!"

Thoughts?

josegonzalez · 2015-09-17T05:16:46Z

Note: that would leave the container restarting 5ever, so we probably want to at least call docker stop as well.

josegonzalez · 2015-09-17T05:17:46Z

This could work:

container_restarts=$(docker inspect -f "{{ .RestartCount }}" "$DOKKU_APP_CONTAINER_ID")
if [[ $container_restarts -ne 0 ]]; then
  # Stop the container so it does not keep restarting after a failed deploy.
  docker stop "$DOKKU_APP_CONTAINER_ID" || true
  dokku_log_fail "App container failed to start!!"
fi

tmikoss · 2015-09-17T05:19:31Z

Maybe --restart=on-failure:XX would work, where XX is a reasonable limit for retries? Though I was not able to find out whether the limit gets reset after a container enters a 'stable' state.

My use case is an app that dies maybe once every couple of days due to extraordinary coincidences.

Edit: restart=always plus check for container restarts seems to be a better solution, as it covers more crash situations (like app repeatedly dying due to network errors until connectivity gets restored).

josegonzalez · 2015-09-17T05:25:10Z

Dokku doesn't have an agent, so unfortunately we can't do smart crash restart policies like heroku does.

Perhaps the --restart=on-failure:N is a good alternative. I'll see what I can come up with.

Previously a crashed container would stay down, regardless of exit status. In some cases, it may be useful to restart the container. For example, an application may not be correctly implementing their error handling, or the crash may be caused by a transient error.

By setting the restart policy to `on-failure:N` - where N is a number of max restarts - we can help developers guard against crashing applications. Note that this is not a replacement for proper error handling, nor does this include notifications to a developer when a container is restarted. Those patterns should be implemented application side, or via a feature request to docker.

The value is configurable at the app-level by setting DOKKU_RESTART_LIMIT to a number. By default, containers will be restarted a max of 10 times. If a container crashes during the check-deploy plugin trigger, then the deploy will be marked as a failure.

Closes #216
Closes #398
Closes #1327
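
The behavior described in that commit message can be sketched roughly as follows. This is not the code from #1473; the function name `restart_policy_flag` is hypothetical, and the fallback to 10 for a non-numeric value is an assumption. Only the `DOKKU_RESTART_LIMIT` variable, the default of 10, and the `on-failure:N` policy come from the text above.

```shell
# Hypothetical helper (an assumption, not the code merged in #1473):
# builds the docker restart flag from DOKKU_RESTART_LIMIT, defaulting to 10.
restart_policy_flag() {
  local limit
  limit=${DOKKU_RESTART_LIMIT:-10}
  case "$limit" in
    # Fall back to the default when the value is empty or not a number.
    ''|*[!0-9]*) limit=10 ;;
  esac
  printf '%s\n' "--restart=on-failure:${limit}"
}
```

The result would be passed along to `docker run`, for example `docker run -d "$(restart_policy_flag)" "$IMAGE"`, where `$IMAGE` stands in for whatever image is being deployed.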

christiangenco · 2016-02-04T01:15:44Z

Is the current best practice to use dokku-logging-supervisord (which looks like it's had a bug for the last year that makes restarting and deploying a lot slower), or is crashed job restarting now built into dokku via #1473?

i.e.: can I sleep soundly with the default dokku and this in my Procfile:

web:    bundle exec puma -C config/puma.rb
worker: bundle exec sidekiq -C ./config/sidekiq.yml

christiangenco · 2016-02-26T19:58:36Z

Well, I guess not. This morning I had two workers with the above configuration crash without restarting.

The supervisord plugins look like they aren't being maintained anymore ("Not compatible with dokku 0.4+"), so what's the best practice for this?

Would there be any downsides to doing something like:

worker: while true; do bundle exec sidekiq -C ./config/sidekiq.yml; sleep 60; done

?

Edit: Per @josegonzalez's and @beverku's recommendation I'm manually enabling the --restart=unless-stopped option with:

dokku docker-options:add test-app deploy --restart=unless-stopped

I have a multiple server deployment, so in the name of science 🔬 I've only enabled it on one of the servers. I'll wait until the next crash and report back what happens (expecting the server I enabled --restart=unless-stopped to restart the worker and the other one to just crash again).
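
One way to confirm that the option actually reached a running container is to read the restart policy back with `docker inspect`. The sketch below is an assumption about how one might check it, not something from this thread; the function names are hypothetical, and the container ID has to be looked up separately (for example with `docker ps`).

```shell
# Hypothetical helpers (assumptions, not dokku code).

# Print the restart policy name of a container, e.g. "unless-stopped".
container_restart_policy() {
  docker inspect -f '{{ .HostConfig.RestartPolicy.Name }}' "$1"
}

# Succeed only if the container's restart policy matches the expected name.
restart_policy_is() {
  [ "$(container_restart_policy "$1")" = "$2" ]
}
```

For example, `restart_policy_is "$container_id" unless-stopped || echo "restart policy not applied"` would flag a container that was started without the option. The `.RestartCount` field used earlier in this thread can be read the same way to see how often Docker has restarted it.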

epixa · 2016-03-30T21:14:02Z

@christiangenco Have you encountered any crashes since your comment?

josegonzalez · 2016-03-30T21:16:09Z

@epixa please follow along in #1734. Thanks.

asm89 mentioned this issue Oct 2, 2013

Rails app, Procfile is used only for the web entry #181

Closed

asm89 closed this as completed Oct 24, 2013

josegonzalez reopened this Sep 17, 2015

josegonzalez mentioned this issue Sep 17, 2015

Handle crashing containers by using restart=on-failure policy #1473

Merged

josegonzalez closed this as completed in #1473 Sep 17, 2015

christiangenco mentioned this issue Feb 26, 2016

[RFC] Revisit docker restart policy #1734

Closed

christiangenco mentioned this issue Mar 16, 2016

Who uses dokku and for what purposes? #1878

Open

dokku locked and limited conversation to collaborators Mar 30, 2016
