Auto Restarting crashed apps #216

Closed
jwarzech opened this issue Sep 16, 2013 · 33 comments

@jwarzech

Is there a proposed solution for monitoring dokku apps and auto-restarting them if they crash? Would just setting up something like http://godrb.com/ work?

@statianzo
Contributor

When putting together the dokku-shoreman plugin, I was thinking that supervisord would be a good candidate for monitoring apps. It has no Ruby requirement and it's available in the Ubuntu repos. Translating from a Procfile to a supervisord conf, like foreman does, wouldn't be bad either.
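
As a rough illustration of the Procfile-to-supervisord translation mentioned above, the following shell sketch emits one `[program:...]` section per Procfile entry. The function name `procfile_to_supervisord` and the `<app>-<process>` naming scheme are assumptions made for this example and are not taken from dokku-shoreman or any existing plugin; `command`, `autostart`, and `autorestart` are standard supervisord program settings.

```shell
# Sketch only (POSIX sh): convert Procfile entries ("name: command") read
# from stdin into supervisord [program:...] sections written to stdout.
# The function name and the "<app>-<process>" naming are hypothetical.
procfile_to_supervisord() {
  app=$1
  while IFS= read -r line || [ -n "$line" ]; do
    case $line in
      '#'*) continue ;;   # skip comment lines
      *:*) ;;             # looks like "name: command"
      *) continue ;;      # skip blank or malformed lines
    esac
    name=${line%%:*}      # text before the first colon
    cmd=${line#*:}        # text after the first colon
    # Strip leading whitespace from the command part.
    cmd=${cmd#"${cmd%%[![:space:]]*}"}
    printf '[program:%s-%s]\n' "$app" "$name"
    # supervisord runs the command directly, without a shell.
    printf 'command=%s\n' "$cmd"
    printf 'autostart=true\n'
    # autorestart=true restarts the process whenever it exits.
    printf 'autorestart=true\n\n'
  done
}
```

It could be used as `procfile_to_supervisord myapp < Procfile > myapp.conf`, with the output placed wherever supervisord loads its configuration from.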

@ghost

ghost commented Sep 16, 2013

I've been toying with runit. It already works well with Docker, but I haven't gotten very far yet with integrating it into Dokku.

@jwarzech
Author

Thanks for the suggestions. I'm starting to look into supervisord and runit. Are there any good tutorials or examples out there to point me in the right direction for having one of them monitor a dokku/docker container?

@statianzo
Contributor

@jwarzech I put together a supervisord runner plugin. Try it out if you'd like. https://github.com/statianzo/dokku-supervisord

@alexbeletsky
Contributor

That's a great plugin, @statianzo. I'm currently thinking about how to integrate pm2; it's specialised for node.js apps, but it does the job well. I'd be happy to hear any suggestions on how to build that plugin.

@statianzo
Contributor

@alexanderbeletsky Thanks. You could take a similar approach of generating a processes.json from a Procfile. If you need more complicated behavior, you could detect a processes.json that already exists in the source and use that instead of generating one (I was thinking about adding that to the supervisord plugin).

@alexbeletsky
Contributor

Awesome suggestions! I'm just playing with pm2 locally now to understand it well. As soon as I'm there, I'll try to package the plugin.

@progrium
Contributor

My initial thought is that it should be managed with Upstart. When an app is deployed, an Upstart job is created for each process. Then it's a matter of exposing the management of them via dokku commands (or at least the common operations)
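
A minimal sketch of what generating one Upstart job per process could look like, assuming each process runs in an already-created Docker container. The function name `upstart_job_conf`, its arguments, and the runlevel events are assumptions for illustration only; `respawn` and `respawn limit` are the Upstart stanzas that would provide the auto-restart behavior asked about in this issue.

```shell
# Sketch only (POSIX sh): print an Upstart job definition for one app process.
# Arguments (all hypothetical): app name, process name, container ID.
upstart_job_conf() {
  app=$1
  proc=$2
  container=$3
  cat <<EOF
description "$app $proc"
start on runlevel [2345]
stop on runlevel [!2345]
# Restart the job if it exits unexpectedly...
respawn
# ...but give up after 10 respawns within 5 seconds.
respawn limit 10 5
# Attach to the container so Upstart tracks its lifetime.
exec docker start -a $container
EOF
}
```

The output would typically be written to a file under `/etc/init/`, for example `upstart_job_conf myapp web "$container_id" > /etc/init/myapp-web.conf`, after which the job can be managed with `initctl`.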

@ghost

ghost commented Sep 25, 2013

> My initial thought is that it should be managed with Upstart. When an app is deployed, an Upstart job is created for each process. Then it's a matter of exposing the management of them via dokku commands (or at least the common operations)

I'm personally more in favor of runit, but that's mainly because I have experience with it. What I'd love to see is a refactoring of the app creation/restarting that caters for different process management plugins.

@progrium
Contributor

Well, there is an assumption of Ubuntu, and Upstart is the Ubuntu way to do this. And we're already using it for Nginx. But you're right that we should try to do it as a plugin so that people can make their own if they wish.

@andreypopp

I've integrated upstart with my fork of dokku — μPaaS (basically dokku with plain Dockerfiles and Makefiles instead of Heroku buildpacks to specify stack and build operations, everything else is just Dokku). All integration is in a single commit — andreypopp/upaas@54569cc

It's a little sophisticated because I don't use the PID-1 upstart for security reasons and instead spawn a new upstart session specifically for the git user.

As a bonus, I get easy start/stop/restart commands:

sudo -i -u git initctl --user start|stop|restart <myapp>

I think it would be easy to backport to dokku if necessary.

@asm89
Contributor

asm89 commented Oct 24, 2013

Added this as potential improvement for 0.3.0 for now. :)

asm89 closed this as completed Oct 24, 2013

@pnegahdar

Can we get a discussion on why supervisor shouldn't be the default deploy mechanism for all dokku apps? See my pull request on buildstep to solve this here: progrium/buildstep#43

A couple reasons for it:

  1. It runs bash so should make no difference in any env/language
  2. Autorestart to keep apps alive
  3. Multiple processes (web, workers, tasks, schedulers, etc) similar to heroku.

@progrium
Contributor

First, upstart is preferred because we're targeting Ubuntu. Second, Heroku doesn't actually automatically start non-web processes. You have to start them manually, afaik.

But autorestarting is great.

@pnegahdar

@progrium hmm, I'll look into upstart.

I think that's simply an issue of billing (first dyno free, worker processes = 2nd dyno), which is why they force you to start them yourself, so that you acknowledge you're going to be paying for that second dyno. Perhaps we're going to keep things herokuish (see #225), but this seems way out of that scope.

@progrium
Contributor

What's out of scope?

@ElliotChong
Contributor

Was there ever any official progress on the issue of auto-restarting crashed apps?

@pnegahdar

dokku-supervisord is a great solution and actively maintained. https://github.com/statianzo/dokku-supervisord

@ElliotChong
Contributor

Thanks @pnegahdar!

@jaouadk

jaouadk commented May 17, 2014

Here is a much more useful supervisor plugin that provides individual app logs on the host: https://github.com/sehrope/dokku-logging-supervisord
It's based on the plugin mentioned by pnegahdar.

@alanjds
Contributor

alanjds commented Sep 11, 2015

I remember Heroku ceasing to restart your app if it crashes and restarts too often. I kind of liked this behavior, in fact...

@tmikoss

tmikoss commented Sep 17, 2015

What about Docker's own --restart=always run option? Would that interfere in any way with how dokku handles the containers?

@josegonzalez
Member

@michaelshobbs We've definitely discussed this before, but I don't remember why we haven't added --restart=always. I assume that's because it conflicts with the zero-downtime checks, but I can't be sure. It seems as though we verify that the same container is still up, so maybe we can safely add it. It doesn't appear that we terminate the container on failure though, which we might consider a bug... Thoughts?

> Always restart the container regardless of the exit status. When you specify always, the Docker daemon will try to restart the container indefinitely.

@michaelshobbs
Member

Yeah the default zero downtime check relies on containers exiting on error.

@josegonzalez
Member

We can probably get around that by checking the number of restarts?

restarts=$(docker inspect -f "{{ .RestartCount }}" $DOKKU_APP_CONTAINER_ID)
[[ $restarts -ne 0 ]] && dokku_log_fail "App container failed to start!!"

Thoughts?

@josegonzalez
Member

Note: that would leave the container restarting 5ever, so we probably want to at least call docker stop as well.

@josegonzalez
Member

This could work:

container_restarts=$(docker inspect -f "{{ .RestartCount }}" $DOKKU_APP_CONTAINER_ID)
if [[ $container_restarts -ne 0 ]]; then
  docker stop "$DOKKU_APP_CONTAINER_ID" || true
  dokku_log_fail "App container failed to start!!"
fi

@tmikoss

tmikoss commented Sep 17, 2015

Maybe --restart=on-failure:XX would work, where XX is a reasonable limit for retries? Though I was not able to find out whether the limit gets reset after a container enters a 'stable' state.

My use case is an app that dies maybe once in a couple of days due to extraordinary coincidences.

Edit: restart=always plus check for container restarts seems to be a better solution, as it covers more crash situations (like app repeatedly dying due to network errors until connectivity gets restored).

@josegonzalez
Member

Dokku doesn't have an agent, so unfortunately we can't do smart crash restart policies like heroku does.

Perhaps the --restart=on-failure:N is a good alternative. I'll see what I can come up with.

josegonzalez reopened this Sep 17, 2015

josegonzalez added a commit that referenced this issue Sep 17, 2015
Previously a crashed container would stay down, regardless of exit status. In some cases, it may be useful to restart the container. For example, an application may not be correctly implementing their error handling, or the crash may be caused by a transient error.

By setting the restart policy to `on-failure:N` - where N is a number of max restarts - we can help developers guard against crashing applications. Note that this is not a replacement for proper error handling, nor does this include notifications to a developer when a container is restarted. Those patterns should be implemented application side, or via a feature request to docker.

The value is configurable at the app-level by setting DOKKU_RESTART_LIMIT to a number. By default, containers will be restarted a max of 10 times. If a container crashes during the check-deploy plugin trigger, then the deploy will be marked as a failure.

Closes #216
Closes #398
Closes #1327
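
To make the option concrete, here is a small sketch of how the restart flag described in this commit message could be assembled. It is not Dokku's actual implementation; the helper name `restart_policy_flag` is hypothetical, while `DOKKU_RESTART_LIMIT`, the default of 10, and Docker's `--restart=on-failure:N` policy come from the text above.

```shell
# Sketch only (POSIX sh): build Docker's restart flag from a restart limit.
# restart_policy_flag is a hypothetical helper; the default of 10 mirrors
# the default stated in the commit message.
restart_policy_flag() {
  limit=${1:-10}
  case $limit in
    *[!0-9]*)
      echo "restart limit must be a non-negative integer: $limit" >&2
      return 1
      ;;
  esac
  printf '%s\n' "--restart=on-failure:$limit"
}

# Example use (the image name is a placeholder; shown as a comment so the
# sketch does not require Docker to run):
#   docker run -d $(restart_policy_flag "$DOKKU_RESTART_LIMIT") my-image
```

Rejecting non-numeric values up front is a design choice for this sketch, so that a mistyped limit fails early instead of producing an invalid Docker option.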

@christiangenco
Contributor

Is the current best practice to use dokku-logging-supervisord (which looks like it's had a bug for the last year that makes restarting and deploying a lot slower), or is crashed job restarting now built into dokku via #1473?

i.e.: can I sleep soundly with the default dokku and this in my Procfile:

web:    bundle exec puma -C config/puma.rb
worker: bundle exec sidekiq -C ./config/sidekiq.yml

@christiangenco
Contributor

Well, I guess not. This morning I had two workers with the above configuration crash without restarting.

The supervisord plugins look like they aren't being maintained anymore ("Not compatible with dokku 0.4+"), so what's the best practice for this?

Would there be any downsides to doing something like:

worker: while true; do bundle exec sidekiq -C ./config/sidekiq.yml; sleep 60; done

?

Edit: Per @josegonzalez's and @beverku's recommendation I'm manually enabling the --restart=unless-stopped option with:

dokku docker-options:add test-app deploy --restart=unless-stopped

I have a multi-server deployment, so in the name of science 🔬 I've only enabled it on one of the servers. I'll wait until the next crash and report back what happens (expecting the server where I enabled --restart=unless-stopped to restart the worker and the other one to just crash again).
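
One thing to keep in mind about the `while true` loop proposed above is that the shell loop becomes the container's long-running process, so the container itself would not exit when the worker crashes and a Docker restart policy would not see the failure. For comparison, here is a sketch of a bounded variant that gives up after a fixed number of failed runs; the function name `run_with_restarts` and its arguments are made up for this example.

```shell
# Sketch only (POSIX sh): run a command, restarting it after failures, but
# give up after MAX failed runs instead of looping forever.
# Usage: run_with_restarts MAX DELAY_SECONDS COMMAND [ARGS...]
run_with_restarts() {
  max=$1
  delay=$2
  shift 2
  attempt=0
  while [ "$attempt" -lt "$max" ]; do
    attempt=$((attempt + 1))
    if "$@"; then
      return 0   # clean exit: nothing to restart
    fi
    echo "run $attempt of $max exited with an error" >&2
    if [ "$attempt" -lt "$max" ]; then
      sleep "$delay"
    fi
  done
  return 1       # still failing after MAX runs: report failure
}

# Example (commented out so the sketch has no side effects):
#   run_with_restarts 10 60 bundle exec sidekiq -C ./config/sidekiq.yml
```

Unlike the original loop, this variant stops on a clean exit and returns a non-zero status once the limit is reached, which lets the process (and therefore the container) actually exit so that whatever supervises it can notice.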

@epixa

epixa commented Mar 30, 2016

@christiangenco Have you encountered any crashes since your comment?

@josegonzalez
Member

@epixa please follow along in #1734. Thanks.

dokku locked and limited conversation to collaborators Mar 30, 2016