
Puma not restarting (or at least not loading new current) when notified with SIGUSR2 #416

Closed
gamafranco opened this issue Nov 24, 2013 · 33 comments

Comments

@gamafranco

When my Capistrano deploy finishes, it sends a SIGUSR2 to Puma. I've also tried this manually with kill and with pumactl.

pumactl says the server will restart (the message is 'Command restart sent success'), but Puma doesn't run the new code.

To get the new code served, Puma needs a hard restart (i.e., I have to send it a SIGINT and then start it again).

I can't tell whether, on SIGUSR2, Puma fails to restart at all or simply doesn't change its working directory to the new deploy. What is clear is that it keeps running the old code.

Setup is Ubuntu 12.04, Rails 4 and Ruby 1.9.3. Puma is 2.6.0.

Thanks.

@cknoxrun

cknoxrun commented Dec 6, 2013

I am having the same problem across different servers. Sometimes it doesn't restart, other times it seems that the process is stopped but never started again. This is on 2.7.1.

Edit: Actually, this may be because I deployed in the middle of the 2.6 -> 2.7.1 update. I'll deploy a bunch more today and see whether the problem continues with 2.7.1.

@totallymike

I've experienced something like this.

I run bundle exec pumactl -F config/puma.rb restart, where my puma.rb contains the pidfile, working directory, binding, etc.

It reports "Command restart sent success", and then it just kills the puma workers and the master process ends. Here's what the log says:

[17058] - Gracefully shutting down workers...
[17058] * Restarting...
Available commands: halt, restart, phased-restart, start, stats, status, stop

@simsicon

Same here; I sometimes have to stop and then start the Puma server. I've noticed the PID actually changes when running restart or phased-restart, but it still serves the old code.

@evanphx
Member

evanphx commented Jan 25, 2014

What version of ruby are you using?

@simsicon

@evanphx Development is on 2.0.0-p353 and production is on 2.0.0-p195, and the Puma version is 2.7.1. Does that help? Thanks.

@gamafranco
Author

ruby 1.9.3p0 (2011-10-30 revision 33570) [x86_64-linux]

@paustin01

I'm seeing the same thing. Running "pumactl -S /my_app/path/shared/pids/puma.state restart" (or phased-restart) keeps the master process PID, and the 2 cluster workers restart with new PIDs, but the master and workers continue to run out of the previous release path.

I'm running puma (2.7.1) on ruby 1.9.3p448 on ubuntu.

Excerpt from ps:

$ sudo ps -ef |grep 29171
puma     27146 29171  0 18:32 ?        00:00:02 puma: cluster worker: 29171
puma     27150 29171  0 18:32 ?        00:00:06 puma: cluster worker: 29171
puma     29171     1  0 Jan13 ?        00:05:11 /usr/local/ruby/1.9.3-p448/bin/ruby /my_app/path/current/bin/puma -C /my_app/path/shared/config/puma.rb

Excerpt from lsof (note that the current working directory shown for the process is correct, but the gems are loaded from the old path):

$ sudo lsof |grep releases
ruby      27146       puma  cwd       DIR              202,1      4096     401852 /my_app/path/releases/f732c7...
ruby      27146       puma  mem       REG              202,1     85394     279105 /my_app/path/releases/c81ea3.../vendor/bundle/ruby/1.9.1/gems/bcrypt-ruby-3.1.2/lib/bcrypt_ext.so
ruby      27146       puma  mem       REG              202,1   1013965     289051 /my_app/path/releases/c81ea3.../vendor/bundle/ruby/1.9.1/gems/eventmachine-1.0.3/lib/rubyeventmachine.so
...

ruby      27150       puma  cwd       DIR              202,1      4096     401852 /my_app/path/releases/f732c7...
ruby      27150       puma  mem       REG              202,1     85394     279105 /my_app/path/releases/c81ea3.../vendor/bundle/ruby/1.9.1/gems/bcrypt-ruby-3.1.2/lib/bcrypt_ext.so
ruby      27150       puma  mem       REG              202,1   1013965     289051 /my_app/path/releases/c81ea3.../vendor/bundle/ruby/1.9.1/gems/eventmachine-1.0.3/lib/rubyeventmachine.so
...

ruby      29171       puma  cwd       DIR              202,1      4096     401852 /my_app/path/releases/f732c7...
ruby      29171       puma  mem       REG              202,1     85394     279105 /my_app/path/releases/c81ea3.../vendor/bundle/ruby/1.9.1/gems/bcrypt-ruby-3.1.2/lib/bcrypt_ext.so
ruby      29171       puma  mem       REG              202,1   1013965     289051 /my_app/path/releases/c81ea3.../vendor/bundle/ruby/1.9.1/gems/eventmachine-1.0.3/lib/rubyeventmachine.so
...

Any ideas??

@paustin01

Update: We ran strace on the Puma pid and watched it during a pumactl restart. Everything ran out of the correct path until the standard Rails boot.rb (/my_app/path/[new_release]/config/boot.rb) fired and loaded the Gemfile from the previous release's path. Looking at boot.rb, it was clear that if the BUNDLE_GEMFILE environment variable is already set, it is used as-is rather than being re-evaluated. That variable had already been set in the Puma process by /my_app/path/[old_release]/bin/puma (the binstub created by bundler), which uses .realpath and therefore points at the resolved old release directory rather than "/my_app/path/current/Gemfile". In short, the final value of BUNDLE_GEMFILE was '/my_app/path/[old_release]/Gemfile' rather than the desired '/my_app/path/current/Gemfile', so all gems kept loading from that path. Our solution is to set the BUNDLE_GEMFILE environment variable to "/my_app/path/current/Gemfile" in our Puma upstart script.
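
For reference, the guard in a stock Rails 4 boot.rb looks roughly like this (paths are the Rails defaults, not specific to this app); the ||= is why a stale BUNDLE_GEMFILE from the old binstub survives an in-place restart:

# config/boot.rb (standard Rails template, roughly)
ENV['BUNDLE_GEMFILE'] ||= File.expand_path('../../Gemfile', __FILE__)  # kept as-is if already set by the old binstub
require 'bundler/setup' if File.exist?(ENV['BUNDLE_GEMFILE'])          # so Bundler keeps using the old release's Gemfile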

@jwreagor

I can confirm @paustin01's findings: my Puma process's /proc environ file holds the previous release's BUNDLE_GEMFILE, not the new release directory that Puma is running from (as mentioned previously, observable via lsof).

cat /proc/$(cat /your/puma.pid)/environ
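
Since environ is NUL-separated, something like this is easier to read (pidfile path illustrative):

sudo cat /proc/$(cat /your/puma.pid)/environ | tr '\0' '\n' | grep BUNDLE_GEMFILE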

@schneems
Contributor

Can anyone provide a simple script or program that can be used to reproduce the behavior?

@jcoleman

Is there a reason this got closed? This is still very much an issue, and it's trivial to reproduce: take any project running Puma from within a symlinked directory (e.g., deployed by Capistrano), install new gems in a new revision, and reload Puma with SIGUSR2. It will re-use the BUNDLE_GEMFILE env var in Rails' boot.rb and thus not load the new gems.

@sorentwo
Contributor

This is definitely still an issue. It's the only reason our apps are still using Unicorn right now.

@jcoleman

@sorentwo I worked around this issue by adding BUNDLE_GEMFILE=/deploy_path/current/Gemfile to my init.d script for Puma. If you're using Capistrano, you could similarly add that env var to the commands it runs.
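
For example, a minimal sketch of the relevant part of such a start script (paths illustrative, not our exact script):

# force Bundler to resolve gems through the deploy symlink
export BUNDLE_GEMFILE=/deploy_path/current/Gemfile
cd /deploy_path/current
exec bundle exec puma -C config/puma.rb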

@sorentwo
Contributor

@jcoleman: thanks, that is really helpful to know about. I wasn't aware of the specific point of failure here (the gems/Gemfile), it just seemed like we had random restart failures.

@evanphx
Member

evanphx commented Aug 22, 2014

@sorentwo Are you using the prune_bundler option? It's documented here: https://github.com/puma/puma/blob/master/DEPLOYMENT.md#restarting
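
A minimal cluster config using it would look roughly like this (paths illustrative):

# config/puma.rb (sketch)
workers 2
threads 1, 16
directory '/my_app/path/current'  # boot workers from the deploy symlink
prune_bundler                     # keep Bundler out of the master so restarts pick up the new Gemfile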

@sorentwo
Contributor

@evanphx: Nope, I wasn't. In fact, I'd never seen that wiki page. The last time I was setting things up was in November 2013, a little before the page was available. That's an excellent tip though. I'll try an experiment next week and let you know how things go.

@jcoleman

@evanphx I didn't realize that the prune_bundler option existed either. My test case isn't easily reproducible now though because I had to restart puma with the updated config.

@cjbottaro

We're seeing this issue also using Puma 2.9.0 deployed via Chef (it uses the same current symlink strategy as Capistrano). Puma is not using newly deployed code after being issued a SIGUSR2. Doing a stop/start gets Puma to use the new code.

@pencilcheck

Phased restart still doesn't work even with @jcoleman's solution. Running puma 2.9.1.

But with hot restart, it seemed to work without setting the env variable.

@jcoleman

jcoleman commented Nov 1, 2014

My solution ended up not fixing it for me either. The prune_bundler option frankly just doesn't work.

@evanphx
Member

evanphx commented Nov 2, 2014

prune_bundler had a bad bug related to restarts that was fixed in 2.9.2. Did you try it?

@pencilcheck

Hot restart stops loading the new context in puma 2.9.2.

@evanphx
Member

evanphx commented Nov 2, 2014

Tell me more: what do you mean by new context? What does your config look like? Could you provide some output from puma?

@pencilcheck

What I mean by context is the line "Gemfile in context: /path/to/Gemfile" in puma.log

threads 0, 16
workers 2

bind "unix:///var/run/wwm.sock"
daemonize false
activate_control_app 'unix:///var/run/pumactl.sock'
stdout_redirect 'log/puma.log', 'log/puma_error.log', true
pidfile 'tmp/pids/puma.pid'
state_path 'tmp/pids/puma.state'
prune_bundler

@mcfadden

I'm seeing this same issue as well (phased restart doesn't load new code)

Running Puma 2.11.2 on Ruby 2.2.2

We're deploying with Capistrano.

Here's my puma.rb:

workers 2
threads 5, 5

prune_bundler

rackup      DefaultRackup
environment production

directory '/www/my-app/current'
bind "unix:///var/run/puma.sock"
state_path "/var/run/puma.state"
pidfile "/var/run/puma.pid"

We're issuing this command to start the phased restart:

sudo pumactl --config-file /www/my-app/current/config/puma.rb phased-restart

When we issue the phased-restart the worker processes do restart (new PIDs), but they don't get the new code.

Where should I look for more information to figure out what's happening?

@mcfadden

After coming back to this on a fresh day, I'm really not sure what I was experiencing. At one point in time my restarts definitely weren't working properly (as in, the new code definitely wasn't being loaded). This could have easily been a misconfiguration on my part.

After I noticed the problem I started testing by looking at the output from ps aux, which shows the release it had loaded in the process tag. Ex:

puma: cluster worker 1: 918 [20150429172950]

When I was deploying new code I could see that worker restart, but the tag (20150429172950) wasn't changing, so I assumed it still wasn't loading the new code. I have no idea what I did to fix it, but it's loading the new code properly now (with the configuration in my previous comment).

Perhaps I should open a new issue about updating the Process tag on the workers to match the new directory.

At the moment I'm not even sure where it's pulling the current release from, since I'm running this out of the /www/my-app/current directory.
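
For anyone else trying to verify this, the process tag aside, you can check which release a worker is actually running from on Linux (substitute a real worker PID):

sudo readlink /proc/<worker_pid>/cwd        # the worker's actual working directory
sudo lsof -p <worker_pid> | grep releases   # which release path the loaded gem extensions come from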

@Epigene

Epigene commented Feb 25, 2016

To my surprise, I got this to work.

Using puma v2.15.3

# in /config/puma.rb
app = "my_app"
root = "/home/deployer/apps/#{app}"

workers  4
threads  1, 2

rackup      DefaultRackup
environment ENV['RAILS_ENV']
daemonize   true

directory "#{root}/current"
pidfile "#{root}/shared/tmp/pids/puma.pid"
stdout_redirect "#{root}/shared/log/#{Time.now.to_i}_production.log", "#{root}/shared/log/#{Time.now.to_i}_production_errors.log"
bind "unix:/tmp/puma.socket"

prune_bundler

It is crucial to stop and start Puma with these settings so that the master process picks up the :directory and, possibly, :prune_bundler directives.

Then you can bundle exec pumactl -P {pid_path} phased-restart

NB: it may take a few moments for workers to pick up the new code, and ps aux will incorrectly report the workers' directory as the master process's original directory, even though the new code is actually being served from the newly symlinked current folder.

@treenewbee

This issue seems to recur in 3.9.0 and 3.9.1.

3.8.2 works totally fine.

@pavelvasev

In our case (ruby 2.3.3p222):

puma (3.10.0)
puma_worker_killer (0.1.0)

Puma reloads ok on SIGUSR2 and doesn't reload on SIGUSR1.

It feels like preload_app gets automatically turned on somehow.
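
If preload really is getting switched on, that would explain SIGUSR1 doing nothing: as far as I know, Puma's phased restart is not available when the app is preloaded, so it's worth double-checking the effective config. The combination that allows phased restarts looks roughly like:

# config/puma.rb -- phased restart (SIGUSR1) only works without app preloading
# preload_app!   # must stay disabled for phased restarts to work
prune_bundler    # use this instead so restarts pick up the new Gemfile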

@bf4
Contributor

bf4 commented Oct 3, 2017

This may be related: we use prune_bundler and SIGUSR2 via foreman.

I just observed something interesting with puma's phased restarts: old code was running against new code in the same process. I think it has to do with how long our app code takes to load (about 3-4 seconds) and then boot (about 6-7 seconds).

03/27/2017 11:17:19.000 -0500	 
  [32187] - Starting phased worker restart, phase: 4
  [32187] + Changing to /srv/rails_server
03/27/2017 11:17:20.000 -0500
  [32187] - Stopping 22094 for phased upgrade...
  [32187] - TERM sent to 22094...
  [1292] + Gemfile in context: /srv/rails_server/Gemfile

timestamp=2017-03-27T16:17:23Z env=production frontend=www level=error thread_id=9400300 
  got a NoMethodError on a model that had that method added in the new code

timestamp=2017-03-27T16:17:23Z env=production frontend=www level=info thread_id=9400300 
  next request is the same controller and method, but works

03/27/2017 11:17:33.000 -0500	
[32187] - Worker 0 (pid: 1292) booted, phase: 4
			   
timestamp=2017-03-27T16:17:33Z env=production frontend=www level=info thread_id=7420240  tag=rails message={"app_info":{"pid":1292,"load_time":"4.446515 seconds","boot_time":"7.682576 seconds","config":{"mysql":{"pool":64},"pg":{"pool":64},"dm":{"pool":64},"puma":{"workers":"2","threads":{"min":"64","max":"64"}},"haproxy":{"max_puma_connections":128},"sidekiq":{"concurrency":25,"timeout":8},"machine":{"processor_count":2,"physical_processor_count":1}}

			   
[32187] - Worker 0 (pid: 1292) booted, phase: 4
[32187] - Stopping 22148 for phased upgrade...
[32187] - TERM sent to 22148...
			   

03/27/2017 11:17:43.000 -0500	
[32187] - Worker 1 (pid: 1397) booted, phase: 4

timestamp=2017-03-27T16:17:43Z env=production frontend=www level=info thread_id=7420240  tag=rails message={"app_info":{"pid":1397,"load_time":"3.273302 seconds","boot_time":"6.384623 seconds","config":{"mysql":{"pool":64},"pg":{"pool":64},"dm":{"pool":64},"puma":{"workers":"2","threads":{"min":"64","max":"64"}},"haproxy":{"max_puma_connections":128},"sidekiq":{"concurrency":25,"timeout":8},"machine":{"processor_count":2,"physical_processor_count":1}},

[32187] - Worker 1 (pid: 1397) booted, phase: 4

In our config/environment.rb, tl;dr we have:

require 'benchmark'
require 'logger'

app_stats = {pid: Process.pid}

time = Benchmark.realtime {
  # Load the Rails application.
  require File.expand_path('../application', __FILE__)
}
app_stats[:load_time] = "#{format('%f', time)} seconds"

time = Benchmark.realtime {
  # Initialize the Rails application.
  Rails.application.initialize!
}
app_stats[:boot_time] = "#{format('%f', time)} seconds"

require File.expand_path('../../lib/app_formatter', __FILE__)
logger = Logger.new(STDOUT)
logger.formatter = AppFormatter.new
logger.info({app_info: app_stats}.to_json)

In our config/puma.rb, we have, in part:

workers 2
threads 32, 32
# AWS ELB "Idle timeout" is set to 30s. In order to prevent ELB 500s from unexpected connection closes, ensure that the backend persistent connection timeout is greater than this idle timeout. https://docs.aws.amazon.com/ElasticLoadBalancing/latest/DeveloperGuide/config-idle-timeout.html
worker_shutdown_timeout 32
prune_bundler

If I flatten out the timeline:

03/27/2017 11:17:19.000 -0500 Starting phased worker restart
~~ 11:17:22 app code is being loaded
03/27/2017 11:17:23.000 -0500 got exception
~~ 11:17:26 app starts booting
03/27/2017 11:17:33.000 -0500  app booted, new worker booted, old worker sent TERM

https://github.com/puma/puma/blob/master/DEPLOYMENT.md#restarting

Use phased-restart (SIGUSR1 or pumactl phased-restart). 
This tells the master to kill off one worker at a time and restart them in your new code. 
This minimizes downtime and staggers the restart nicely. 

WARNING This means that both your old code and your new code will be running concurrently. 
Most deployment solutions already cause that, but it's worth warning you about it again. 
Be careful with your migrations, etc!

@x-yuri

x-yuri commented May 4, 2018

I experienced a similar issue with puma-3.8.2, where it fails to restart after adding more gems. But with puma >= 10.0.0 it works.

https://gist.github.com/x-yuri/52024f512bed39a2a0bd4c6e82d04c9f

P.S. Well, not that simple a script, but what did you expect?...

@gamafranco
Author

Yeah. I'm closing this one and letting people create separate ones. This is so old it probably has a 1m-long beard...

@lukel97

lukel97 commented Feb 4, 2022

(Quoting @Epigene's comment from Feb 25, 2016 above: set directory "#{root}/current" and prune_bundler, stop and start Puma once so the master picks them up, then use phased-restart.)

That was the trick for me too: we were using Capistrano and so had symlinked deploys. We needed to specify directory '/var/www/app/current' for Puma to actually exec our Rails app from the new directory.
