Puma hangs on shutting down workers when received SIGINT #1674

atitan · 2018-11-08T09:22:03Z

Steps to reproduce

bundle exec puma -C config/puma.rb

# in another terminal
kill -SIGINT {puma pid}

Expected behavior

[72484] Use Ctrl-C to stop
[72484] - Worker 0 (pid: 72516) booted, phase: 0
[72484] - Worker 1 (pid: 72517) booted, phase: 0
[72484] - Gracefully shutting down workers...
[72484] === puma shutdown: 2018-11-08 17:19:54 +0800 ===
[72484] - Goodbye!

Actual behavior

Workers are already shut down, but the cluster master hanged on Process.waitpid.

However, if I press ctrl+c after puma hanged , it terminated correctly.

[72484] Use Ctrl-C to stop
[72484] - Worker 0 (pid: 72516) booted, phase: 0
[72484] - Worker 1 (pid: 72517) booted, phase: 0
[72484] - Gracefully shutting down workers...

System configuration

Operating System: macOS 10.14.1
Ruby version: 2.5.3
Rails version: 5.2.1
Puma version: 3.12.0
Bundle version: 1.16.6

The text was updated successfully, but these errors were encountered:

ioquatix · 2018-12-08T23:47:05Z

I seem to start experiencing this with 2.6.0-rc1 on travis - maybe an odd coincidence?

https://travis-ci.com/socketry/falcon/jobs/163846367 - scroll to the end.

dentarg · 2019-01-07T15:34:06Z

@ioquatix I think you are seeing https://bugs.ruby-lang.org/issues/15499

We see it for our app using Puma cluster mode, it does not exit on SIGTERM. On Heroku, we are getting R12 - Exit timeout errors.

MSP-Greg · 2019-01-07T15:39:37Z

I believe this issue may have also been causing failing tests on Travis when ruby-head was 2.6.0dev, the tests are skipped on Windows/Appveyor due to lack of fork...

guilleiguaran · 2019-03-10T18:14:43Z

I think this is fixed by #1738 also

evanphx · 2019-03-12T00:02:09Z

This sounds like a bug in ruby 2.6.

ioquatix · 2019-03-12T00:12:56Z

Let me see if I can reproduce the issue in 2.6.1

MSP-Greg · 2019-03-12T00:45:26Z

I believe this issue has been around since 2018-Aug, between r64316 and r64376. Both trunk and 2.6.1 are failing currently, and have been since that time.

Some travis build info:

Failed builds
2999 ruby 2.6.0dev (2018-08-15 trunk 64376) [x86_64-linux] Address in Use
2998 ruby 2.6.0dev (2018-08-15 trunk 64376) [x86_64-linux] Address in Use

Passing builds
2997 ruby 2.6.0dev (2018-08-12 trunk 64316) [x86_64-linux] Cron
2993 ruby 2.6.0dev (2018-08-07 trunk 64208) [x86_64-linux] Cron
2981 ruby 2.6.0dev (2018-07-24 trunk 64026) [x86_64-linux]

dentarg · 2019-03-12T17:16:47Z

This sounds like a bug in ruby 2.6.

I think so too, 2.6 changed the behaviour of our specs, that started and stopped our application using Puma in cluster mode. I've extracted the code to https://github.com/dentarg/gists/tree/master/gists/ruby-bug-15499 to reproduce the problem. Before 2.6 our specs was able to stop Puma without leaving any Ruby processes behind.

I made some comments about this in https://bugs.ruby-lang.org/issues/15499 but it hasn't gone anywhere since. I've been thinking about opening a new bug, but we switched our application to use Puma in single mode so the issue hasn't been pressing. I can chime in if anyone here opens a new bug.

(Also, the Heroku R12 errors went away when switching to single mode.)

MSP-Greg · 2019-03-12T17:22:05Z

@dentarg

Is it a bug or a breaking change?

I think #1741 seems to be working, I don't know how hard it would be for you to test it...

dentarg · 2019-03-13T12:40:55Z

@MSP-Greg I have tested #1741, see https://github.com/dentarg/gists/tree/master/gists/ruby-bug-15499#261--httpsgithubcompumapumapull1741. Seems like it behaves as 2.5.3 does.

MSP-Greg · 2019-03-13T13:13:52Z

@dentarg

Thanks for testing.

Seems like it behaves as 2.5.3 does.

To clarify, that means that the PR fixes the issue you've had with Ruby 2.6/trunk?

dentarg · 2019-03-13T14:37:42Z

Yes, I think so, at least some of it. Something I didn't do but could be done: try to reproduce the problem on Heroku (but it looks like the original post is doing something similar)

MSP-Greg · 2019-03-13T15:03:18Z

try to reproduce the problem on Heroku

Maybe @schneems has seen the issue with 2.6/trunk somewhere?

AlexWayfer · 2019-03-14T10:00:22Z

It seems that this issue is also in Ruby 2.5.4 now.

mltsy · 2019-03-18T20:16:41Z

Not only that - puma doesn't work in clustered mode at all with this ruby update! I'll put in another ticket for that though.

guilleiguaran · 2019-03-18T20:23:50Z

@mltsy have you tested 2.5.5? afaik Ruby 2.5.5 was released to handle the regression in 2.5.4

mltsy · 2019-03-18T20:44:23Z

Ah! Indeed. I just saw, as I was going to put in a ticket, that there was actually ticket submitted and resolved for that very reason that I had missed previously because it was already closed :) Thanks! And it looks like that update also resolves the issue in this ticket - although perhaps it still exists in the Ruby 2.6 branch?

MSP-Greg · 2019-03-18T21:56:40Z

If anyone wants to try a PR that may solve the clustered issue, the following in a gemfile:

gem 'puma', :git => 'https://github.com/puma/puma.git', :ref => 'refs/pull/1741/head'

mltsy · 2019-03-18T22:39:10Z

That does seem to allow puma to shutdown gracefully in ruby 2.5.4 with that branch. 👍

(However, requests still don't get through in clustered mode in ruby 2.5.4 even with that branch)

MSP-Greg · 2019-03-18T22:44:45Z

Thanks for checking.

However, requests still don't get through in clustered mode in ruby 2.5.4

Not good. Any idea if 2.5.5 fixes it (or using 2.6.2)?

mltsy · 2019-03-18T22:47:38Z

2.5.5 does fix both issues, yeah!

MSP-Greg · 2019-03-18T22:49:16Z

Good. Is that using standard 3.12.0 or the git version with PR 1741?

mltsy · 2019-03-18T22:52:45Z

Yeah, ruby 2.5.5 fixes it even with standard 3.12.0 puma branch 😄

(And I'm not sure about ruby 2.6 - I can't seem to reproduce the problem with 2.6, and am not sure which version it exists in exactly)

MSP-Greg · 2019-03-18T22:56:47Z

Thanks again.

Yeah, ruby 2.5.5 fixes it even with standard 3.12.0 puma branch

I don't recall any issues with Ruby 2.5.x < 2.5.4, it was only 2.5.4 that broke.

Ruby trunk started failing on clustered shutdown/restart last year some time, so the patch was for compatibility with Ruby 2.6 and current trunk (2.7.0).

mltsy · 2019-03-18T23:06:26Z

Interesting... well when I ran it with Ruby 2.6.2, everything worked fine for me (with the standard 3.12.0 puma branch - did they fix it in 2.6.2?)

MSP-Greg · 2019-03-18T23:16:00Z

PR #1747 updated the Travis matrix from 2.6.1 to 2.6.2, and it & trunk seem to be failing (2.5.5 is ok):

https://travis-ci.org/puma/puma/builds/507190071

mltsy · 2019-03-18T23:38:01Z

Huh... I setup a docker-compose stack with ruby 2.6.2-p47 and puma 3.12.0 in clustered mode (-w 2 flag) and it exits fine in every way I've tested it... that's odd. I must not be hitting exactly the right test case?

guilleiguaran · 2019-03-20T03:06:54Z

@atitan you're right, the fix seems to work on Linux but fails on macOS, I just confirmed it in a machine with macOS.

/cc @MSP-Greg

evanphx · 2019-03-20T03:24:36Z

@guilleiguaran Can you give me more info about your macOS setup? I can't replicate the issue here.

evanphx · 2019-03-20T03:26:03Z

@atitan I need to get more information about your setup. Can you provide your config.rb and any info about your rails app?

atitan · 2019-03-20T03:56:26Z

@evanphx
I just created a new rails app to test with. Found out adding preload_app! in puma config hanged the cluster master.

Besides, if I don't preload the app, ActiveSupport constant will not be defined in on_worker_boot block.

My config looks like this

threads_count = ENV.fetch("RAILS_MAX_THREADS") { 5 }
threads threads_count, threads_count

port        ENV.fetch("PORT") { 3000 }
environment ENV.fetch("RAILS_ENV") { "development" }
workers ENV.fetch("WEB_CONCURRENCY") { 2 }
preload_app!

before_fork do
  if defined?(ActiveRecord::Base)
    ActiveRecord::Base.connection_pool.disconnect!
  end
end
on_worker_boot do
  ActiveSupport.on_load(:active_record) do
    ActiveRecord::Base.establish_connection
  end
end
plugin :tmp_restart
persistent_timeout 620

evanphx · 2019-03-20T04:11:52Z

@atitan Sorry, still unable to reproduce it with an empty rails app. Can you send your Gemfile?

atitan · 2019-03-20T04:28:54Z

@evanphx

Here you are.
I noticed a strange situation that puma does not hang if I started it in production mode.

source 'https://rubygems.org'
git_source(:github) { |repo| "https://github.com/#{repo}.git" }

ruby '2.6.2'

gem 'rails', '~> 5.2.2', '>= 5.2.2.1'
gem 'sqlite3', '~> 1.3.6'
gem 'puma', '~> 3.11', github: 'puma/puma', branch: 'master'

gem 'bootsnap', '>= 1.1.0', require: false

group :development do
  gem 'listen', '>= 3.0.5', '< 3.2'
end

evanphx · 2019-03-20T04:38:30Z

Still unable to repro. Maybe can you push the rails app up to github and I can try it exactly as you have it?

evanphx · 2019-03-20T04:39:52Z

Here is what mine looks like:

~/tmp/puma-test master* 9s
❯ bundle exec puma -C config/puma.rb
[38384] Puma starting in cluster mode...
[38384] * Version 3.12.1 (ruby 2.6.2-p47), codename: Llamas in Pajamas
[38384] * Min threads: 5, max threads: 5
[38384] * Environment: development
[38384] * Process workers: 2
[38384] * Preloading application
[38384] * Listening on tcp://0.0.0.0:3000
[38384] ! WARNING: Detected 4 Thread(s) started in app boot:
[38384] ! #<Thread:0x00007fb93049b7e0@/Users/evan/.rbenv/versions/2.6.2/lib/ruby/gems/2.6.0/gems/listen-3.1.5/lib/listen/internals/thread_pool.rb:6 sleep> - /Users/evan/.rbenv/versions/2.6.2/lib/ruby/gems/2.6.0/gems/rb-fsevent-0.10.3/lib/rb-fsevent/fsevent.rb:44:in `select'
[38384] ! #<Thread:0x00007fb93049b380@/Users/evan/.rbenv/versions/2.6.2/lib/ruby/gems/2.6.0/gems/listen-3.1.5/lib/listen/internals/thread_pool.rb:6 sleep> - /Users/evan/.rbenv/versions/2.6.2/lib/ruby/gems/2.6.0/gems/listen-3.1.5/lib/listen/event/config.rb:19:in `sleep'
[38384] ! #<Thread:0x00007fb93032faf0@/Users/evan/.rbenv/versions/2.6.2/lib/ruby/gems/2.6.0/gems/listen-3.1.5/lib/listen/internals/thread_pool.rb:6 sleep> - /Users/evan/.rbenv/versions/2.6.2/lib/ruby/gems/2.6.0/gems/listen-3.1.5/lib/listen/record/entry.rb:42:in `realpath'
[38384] ! #<Thread:0x00007fb93032f848@/Users/evan/.rbenv/versions/2.6.2/lib/ruby/gems/2.6.0/gems/listen-3.1.5/lib/listen/internals/thread_pool.rb:6 sleep> - /Users/evan/.rbenv/versions/2.6.2/lib/ruby/gems/2.6.0/gems/listen-3.1.5/lib/listen/event/config.rb:19:in `sleep'
[38384] Use Ctrl-C to stop
[38384] - Worker 0 (pid: 38395) booted, phase: 0
[38384] - Worker 1 (pid: 38396) booted, phase: 0
^C[38384] - Gracefully shutting down workers...
[38384]     worker status: pid 38395 exit 0
[38384]     worker status: pid 38396 exit 0
[38384]     worker shutdown time:   0.51
[38384] === puma shutdown: 2019-03-19 21:37:58 -0700 ===
[38384] - Goodbye!

atitan · 2019-03-20T05:46:27Z

@evanphx
The key point here is don't use ctrl-c.
Use kill to send SIGINT to the process.
The Ctrl-C will always work even after it hanged.

atitan · 2019-03-20T06:35:16Z

To sum up, reproducing the issue requires:

macOS
puma in development and cluster mode
preload_app!
use kill to send signal instead of ctrl-c

guilleiguaran · 2019-03-20T07:12:35Z

I can reproduce in a new rails app using the same configuration that @atitan described but I'm not able to reproduce under test conditions or using a bare rack app:

# config.ru
run lambda { |env| [200, {'Content-Type'=>'text/plain'}, ["Hello Rack!"]] }

# puma_config.rb
threads 4, 4
workers 2
environment "development"
preload_app!

# Gemfile
source 'https://rubygems.org'

gem 'puma', github: 'puma/puma'
gem 'rack'

greggilbert · 2019-05-30T04:07:39Z

This issue hasn't been touched in awhile, but I'm seeing the same issue (or very similar).

Context:

Docker container, Debian 9.9 (stretch)
Ruby 2.6.3p62
Puma 3.12.1
Hanami 1.3.1

I'm invoking Puma using puma -C puma.rb. The process is being controlled via s6.

# puma.rb
bind 'tcp://127.0.0.1:9292'
workers 3

preload_app!

What I'm seeing now is that, if i do kill -SIGINT (or have my process manager send a SIGINT), then Puma shuts down gracefully. kill -SIGTERM, however, causes the shutdown to freeze, and I get a bunch of [ruby] <defunct> threads.

It seems like using SIGINT is a workaround for using SIGTERM, but I'm wondering if that will have any adverse effects on things.

Anyway, hopefully this helps. Happy to try out any potential fixes if need be.

dentarg · 2019-05-30T08:28:12Z

@greggilbert do you mind trying your repro with Puma from the master branch? related PR #1741 has been merged but not released yet

greggilbert · 2019-05-30T12:08:03Z

I updated my Gemfile:

gem 'puma', github: 'puma/puma', branch: 'master'

$ bundle list puma
/usr/local/bundle/bundler/gems/puma-d39fe92a2dd1

I ran bundle exec puma -C puma.rb and then brought up the process. In a separate terminal, kill -TERM $PID did, in fact, kill the process.

This is interesting, though: when I rebuilt my Docker container, during the bundle install step, it threw this warning:

Using puma 3.12.1 from git://github.com/puma/puma.git (at master@d39fe92)
/usr/local/lib/ruby/2.6.0/rubygems/ext/builder.rb:183: warning: conflicting chdir during another chdir block
/usr/local/lib/ruby/2.6.0/rubygems/ext/builder.rb:191: warning: conflicting chdir during another chdir block

And then when trying to run puma -C puma.rb, it threw this fatal error:

$ puma -C puma.rb
Traceback (most recent call last):
	2: from /usr/local/bundle/bin/puma:23:in `<main>'
	1: from /usr/local/lib/ruby/2.6.0/rubygems.rb:302:in `activate_bin_path'
/usr/local/lib/ruby/2.6.0/rubygems.rb:283:in `find_spec_for_exe': can't find gem puma (>= 0.a) with executable puma (Gem::GemNotFoundException)

Running bundle exec puma -C puma.rb worked fine, though. Is that the correct way to run Puma in this case?

nateberkopec · 2019-05-30T13:56:23Z

Nice, can anyone see if #1674 (comment) still reproduces?

hrparty · 2019-06-12T11:08:01Z

Nice, can anyone see if #1674 (comment) still reproduces?

I've had this problem for months (since updating to ruby 2.6) on my machine, running under the conditions in #1674 (comment).

3.12.1 does not fix the problem for me but I'm running my rails app on master now (with this in my gemfile: gem 'puma', github: 'puma/puma') and I don't see this anymore. I believe it got fixed in this commit c637ff0, because I started running on master with good results just after that was committed.

nateberkopec · 2019-06-12T18:02:04Z

Closing as fixed on master b/c I agree.

This will be in the next release after 3.12.1.

AlexWayfer · 2019-06-13T11:18:44Z

This will be in the next release after 3.12.1.

Do you have plans for date of next release?

Because the current stable version of puma is incompatible with the current stable version of Ruby (2.6).

nateberkopec · 2019-06-13T18:29:08Z

Maintainers are meeting tomorrow, I would expect a release Soon (TM).

Linking them all with this commit. puma#1741 puma#1755 puma#1674 puma#1730 puma#1720

Linking them all with this commit. #1741 #1755 #1674 #1730 #1720

morgler · 2021-05-04T16:28:09Z

I have a similar problem with puma 5.2.2 and ruby 2.7.1. Starting rails (6.0.3.6) with bundle exec rails s starts puma. Trying to shut down the server with Ctrl-C leaves puma hanging with the message Gracefully shutting down workers....

The problem does NOT appear, if I start the server using puma.

I tried with one worker and with two workers – same problem.

What can I do to fix this? What is the odd thing that rails s adds in comparison to just running puma?

nateberkopec · 2021-05-04T16:42:03Z

@morgler please start a new issue

dentarg mentioned this issue Jan 7, 2019

Travis - Ruby head/trunk test failures #1703

Closed

guilleiguaran mentioned this issue Mar 10, 2019

Use Process::WNOHANG flag when stoping child workers processes. #1738

Closed

MSP-Greg mentioned this issue Mar 13, 2019

cluster - worker shutdown - use WNOHANG with nil return tests #1741

Merged

calmyournerves mentioned this issue Mar 29, 2019

Hang on SIGTERM with ruby 2.6 in clustered mode #1755

Closed

nateberkopec added the bug label Apr 20, 2019

nateberkopec closed this as completed Jun 12, 2019

mfa777 mentioned this issue Jul 21, 2019

Asset path won't change after recompile mastodon/mastodon#11367

Closed

dentarg mentioned this issue Jul 26, 2019

Using at_exit and Signal.trap possibly creates a race-condition jruby/jruby#5437

Closed

dentarg added a commit to dentarg/puma that referenced this issue Aug 24, 2019

Note the Ruby 2.6 SIGTERM issue in the changelog

c0b5f53

Linking them all with this commit. puma#1741 puma#1755 puma#1674 puma#1730 puma#1720

dentarg added a commit to dentarg/puma that referenced this issue Aug 24, 2019

Note the Ruby 2.6 SIGTERM issue in the changelog

4e8751d

Linking them all with this commit. puma#1741 puma#1755 puma#1674 puma#1730 puma#1720

nateberkopec pushed a commit that referenced this issue Aug 28, 2019

Note the Ruby 2.6 SIGTERM issue in the changelog (#1928)

079dff5

Linking them all with this commit. #1741 #1755 #1674 #1730 #1720

puma locked as resolved and limited conversation to collaborators May 4, 2021

Puma hangs on shutting down workers when received SIGINT #1674

Puma hangs on shutting down workers when received SIGINT #1674

Comments

atitan commented Nov 8, 2018 • edited Loading

Steps to reproduce

Expected behavior

Actual behavior

System configuration

ioquatix commented Dec 8, 2018

dentarg commented Jan 7, 2019

MSP-Greg commented Jan 7, 2019

guilleiguaran commented Mar 10, 2019

evanphx commented Mar 12, 2019

ioquatix commented Mar 12, 2019

MSP-Greg commented Mar 12, 2019 • edited Loading

dentarg commented Mar 12, 2019 • edited Loading

MSP-Greg commented Mar 12, 2019

dentarg commented Mar 13, 2019

MSP-Greg commented Mar 13, 2019

dentarg commented Mar 13, 2019

MSP-Greg commented Mar 13, 2019

AlexWayfer commented Mar 14, 2019

mltsy commented Mar 18, 2019

guilleiguaran commented Mar 18, 2019

mltsy commented Mar 18, 2019

MSP-Greg commented Mar 18, 2019

mltsy commented Mar 18, 2019

MSP-Greg commented Mar 18, 2019

mltsy commented Mar 18, 2019

MSP-Greg commented Mar 18, 2019

mltsy commented Mar 18, 2019

MSP-Greg commented Mar 18, 2019

mltsy commented Mar 18, 2019

MSP-Greg commented Mar 18, 2019

mltsy commented Mar 18, 2019

guilleiguaran commented Mar 20, 2019

evanphx commented Mar 20, 2019

evanphx commented Mar 20, 2019

atitan commented Mar 20, 2019

evanphx commented Mar 20, 2019

atitan commented Mar 20, 2019

evanphx commented Mar 20, 2019

evanphx commented Mar 20, 2019

atitan commented Mar 20, 2019 • edited Loading

atitan commented Mar 20, 2019

guilleiguaran commented Mar 20, 2019

greggilbert commented May 30, 2019

dentarg commented May 30, 2019

greggilbert commented May 30, 2019 • edited Loading

nateberkopec commented May 30, 2019

hrparty commented Jun 12, 2019

nateberkopec commented Jun 12, 2019

AlexWayfer commented Jun 13, 2019

nateberkopec commented Jun 13, 2019

morgler commented May 4, 2021 • edited Loading

nateberkopec commented May 4, 2021

atitan commented Nov 8, 2018 •

edited

Loading

MSP-Greg commented Mar 12, 2019 •

edited

Loading

dentarg commented Mar 12, 2019 •

edited

Loading

atitan commented Mar 20, 2019 •

edited

Loading

greggilbert commented May 30, 2019 •

edited

Loading

morgler commented May 4, 2021 •

edited

Loading