Add reload_worker_directory to pumactl #478

Merged
merged 1 commit into from Jan 20, 2015

Conversation

Projects
None yet
4 participants
Contributor

rubencaro commented Feb 25, 2014

This allows to renew the code directory for newly spawn workers from the master process, all from pumactl.

Provides a harmless workaround to #469 and #468.

This way when I perform a deploy with a phased-restart (using symlink deployment), and only after I ensure this is a good release, then I do pumactl reload-worker-directory. All workers automatically spawn by the master process will use the new code.

I perform two checks to ensure this is a good release. First I poll pumactl stats until I see the number of workers equal the number of booted_workers. Then I make a call to some private \release path to get the code git release it's running. Then I'm sure every worker is up and they are running the new code.

My current restart output looks like this:

...
Command phased-restart sent success
Wait to let puma start its thing...
[2014-02-25 15:03:11] I saw all puma's workers up! 
[2014-02-25 15:03:11] I saw current release on server! 
Command reload-worker-directory sent success
...
Contributor

rubencaro commented Feb 26, 2014

Be aware that you may need to configure some other things too:

  • rackup '/path/to/current/config.ru' on config/puma.rb to force puma to resolve current each time it invokes rackup file.
  • BUNDLE_GEMFILE=/path/to/current/Gemfile on start script to force bundler to resolve current each time puma restarts.

seuros commented Feb 26, 2014

Workers should reload directory automatically, it make no sense to add a extra command to reload the path.

I think this behavior should be chain-linked to phased-restart.

Contributor

rubencaro commented Feb 26, 2014

@seuros If directory is reloaded automatically then you can end up with a dead app, as newly spawned workers could try to load broken code. Take a look at #468 and #469.

seuros commented Feb 26, 2014

People who care about phased-restart should have at least tested if their app is booting before sending it to production.
It will be more hard to notice the error if the application still using the old code base but your change are not reflected.

Contributor

rubencaro commented Feb 26, 2014

@seuros 3 quick points come to me:

  • You will come to see one day that you can try your code, test it, probe it, make simulations, make alpha releases, beta releases, release it out of balancers and give it a controlled load, or whatever strategy you learn with time. No matter what. There's no environment like production environment. Therefore you can limit the number and size of your code breakages with all this stuff, but they will eventually happen. I would like to be prepared for it.
  • It is hard to notice the error (it's not, read this request's description!) if the application still using the old code base, but that's only because the application is still up! I guess that no downtime is a good thing. So maybe we should learn how to diagnose our app's status by other means than simply watching it die. I feel like I'm going that way. I would love to see no downtime ever for my apps. Even for those rare after beta bugs.
  • This is just a harmless option added to pumactl. Simply do not use it if you do no want to. This is an easy way (for me at least) to workaround one possible situation I have seen with puma deploys using symlink strategy and phased-restart.

I hope that helps you understand this pull request's motivation. Anyway, I'm open to suggestions about how to solve the problem in #468 and #469 .

Thanks.

seuros commented Feb 26, 2014

Thank you.

Contributor

rubencaro commented Feb 26, 2014

@seuros It's ok. Let's solve things.

Maybe we can provide puma with a way to be aware of the app's health, and so be able to automatically decide if it should change the worker's code base... It would then be able to notify when everything is not ok too.

Maybe a block passed on configuration that gets run at some specific point after restart?

Not sure about the whole restart process and where to put that control point. And then what kind of notifications could be sent. Should we wait for it to end? Should it be async and then be polled?

Any ideas?

seuros commented Feb 26, 2014

Sorry, i have difficulties explaining myself.
What i wanted to do is that instead of executing 2 command, Puma should do it automatically do this if a phased-restart is called.

It should restart and only show a error message if it fail :

Failure scenario :

Command phased-restart sent success
[2014-02-25 15:03:11] Error: Unable to boot with new codebase, using old one.
 Phased-restart failed

Success scenario :

Command phased-restart sent success
Contributor

rubencaro commented Feb 26, 2014

@seuros I'm sorry too then. I misunderstood your comments. Sorry.

Your two scenarios sound perfect to me. It would be great to get that from puma.

The main problem is how we get to that, because puma itself doesn't know how to tell that a worker is not only booted, but healthy enough to be considered stable code for future workers to be spawned.

If we were able to pass a block of application-specific code to puma, then it could run it to get the health level of each worker. This way decide if it is good enough to consider it stable code. If it's not good enough, then throw something like New codebase did not pass health checks, using old one. Phased-restart failed.

By getting the health level of a worker I mean nothing fancy. Just some plain vital signs gathered from within the beast itself. Although once you are there, any kind of test code can be run. Either way, it's app-specific code, it's up to you.

What do you think?

seuros commented Feb 26, 2014

Check #483

Contributor

rubencaro commented Feb 26, 2014

Nice! We already have the callback!

Then the only thing left to do is to put the health checking code in there, and then setup some way to tell puma if the new code is stable or not. Any way to do this?

seuros commented Feb 26, 2014

I'm wrong, this callback is not usefull since it will never be called if the application fail to boot.
also it will be run under another thread/context.

seuros commented Feb 26, 2014

@evanphx Any help here ?

Contributor

rubencaro commented Feb 26, 2014

@seuros I think you had at least some good point that put us closer to a nice solution.

The current on_worker_boot is useless for this. That I agree. It will run before the app is booted.

And the proposed after_worker_boot on #483 is also somewhat useless for this, as runs in a different context. That I agree too.

We should find a way to tell puma that the code is good from inside our app. Maybe there is already a reference to the master process, or maybe we need to call the proposed pumactl reload-worker-directory directly from our app's code once it checks it's own health.

@evanphx will know better, I'm ok?

Owner

evanphx commented Feb 26, 2014

Don't worry about the hooks, this is something that should be built-in to puma.

Probably the right way to handle it is, when doing a worker restart:

  • Attempt to boot as a worker
  • Rescue any exceptions seen while booting
  • Report the exception to the master
  • At this point, the master sees that no other workers will be boot properly either, so the phased restart is aborted and reported.
Contributor

rubencaro commented Feb 26, 2014

+1 for that

What's the status of this? We are having issues with phased restarts and symlinked deploys also.

evanphx added a commit that referenced this pull request Jan 20, 2015

Merge pull request #478 from rubencaro/master
Add reload_worker_directory to pumactl

@evanphx evanphx merged commit 85a2f5c into puma:master Jan 20, 2015

1 check failed

default The Travis CI build could not complete due to an error
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment