Add thread reaping to thread pool #702

Merged: 1 commit merged into puma:master on Jun 10, 2015
Conversation

OleMchls (Contributor)

Hi there.

The Problem

As already discussed in #691 and #681, Puma does not recover from dead threads, even though this is a fairly uncommon state and might only happen in edge cases. One of these edge cases seems to be using rack-timeout, as you can see in zombocom/rack-timeout#76 and rails/rails#1627 (comment).

To reproduce this behavior, use the following example Rack app:

# dummy.ru
class SimpleRacktimeout # stand-in for a more complex Rails app using all kinds of non-Ruby-code resources
  def initialize(app)
    @app = app
  end

  def call(env)
    p '########### HELLO FROM APP ###########'
    begin
      app_thread     = Thread.current
      timeout_thread = Thread.start do
        app_thread.raise(Exception) # inject an exception into the request thread, as rack-timeout does on timeout
      end

      `sleep 1` # shell out to simulate slow, non-Ruby work being interrupted
    ensure
      timeout_thread.kill
      timeout_thread.join
    end
  end
end

app = Rack::Builder.new do |b|
  b.use SimpleRacktimeout
  b.run lambda { |_env| [200, {}, ['ok']] }
end.to_app

run app

Start it with puma -w 1 -b tcp://localhost -p 9292 -t 3:3 dummy.ru, then force Puma to serve more concurrent requests than it has (healthy) threads available by using Apache Bench: ab -n 8 -c 4 http://127.0.0.1:9292/.

Solution

This PR implements a thread reaping mechanism modeled on the existing auto trim functionality. It removes dead threads and decrements the @spawned counter in the ThreadPool so that the pool is allowed to spawn new threads in their place.
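
Roughly, the idea looks like the following sketch (assuming, as in Puma's ThreadPool, that live threads are tracked in @workers, the count in @spawned, and both are guarded by @mutex; this is an illustration, not the exact code in this PR):

class ThreadPool
  # Remove threads that died (e.g. from an exception injected by rack-timeout)
  # and decrement the counter so the pool may grow back to its configured size.
  def reap
    @mutex.synchronize do
      dead = @workers.reject(&:alive?)
      dead.each { @spawned -= 1 }
      @workers -= dead
    end
  end

  # Companion to AutoTrim: periodically calls reap on the pool.
  class Reaper
    def initialize(pool, timeout = 5)
      @pool    = pool
      @timeout = timeout
      @running = false
    end

    def start!
      @running = true
      @thread = Thread.new do
        while @running
          @pool.reap
          sleep @timeout
        end
      end
    end

    def stop
      @running = false
      @thread.wakeup
    end
  end
end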

Alternative Solution

Instead you can set th.abort_on_exception = true in https://github.com/puma/puma/blob/master/lib/puma/thread_pool.rb#L112, which also solves the problem described above. We decided on the former solution because removing dead threads from the pool seemed like the cleaner approach.
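
For illustration only (plain Ruby, independent of Puma): with abort_on_exception set on a thread, an unhandled exception in that thread is re-raised in the main thread instead of silently killing just that one thread.

worker = Thread.new do
  sleep 0.1
  raise 'boom' # stand-in for the exception injected by rack-timeout
end
worker.abort_on_exception = true

sleep 1 # RuntimeError 'boom' is re-raised here, in the main thread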

As mentioned in #691, we are not sure if and how Active Record connections held by dead threads will be cleaned up.

  • any ideas on how to release the Active Record connections of dead threads?

/cc @Partyschaum

@evanphx (Member) commented May 20, 2015

Seems like we should attack the initial problem, i.e. why threads in the pool are dying. I guess an exception is bubbling up to the top level?

@OleMchls (Contributor, Author)

I'm not completely sure what is killing the threads; my educated guess is that the exception is raised from another thread, bubbles up unhandled, and kills the worker thread.
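
That guess is easy to check in plain Ruby, independent of Puma (a small illustrative snippet, not Puma code):

worker = Thread.new do
  loop { sleep 0.1 } # stand-in for request handling
end

sleep 0.2 # let the worker start
worker.raise(Exception, 'injected from another thread')

sleep 0.2
p worker.alive? # => false; a pool that only counts spawned threads still treats this slot as occupied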

All the issues we've linked above, as well as our own initial problem, trace back to Heroku recently switching their recommended web server from Unicorn to Puma (ref). Because Puma does not have a request timeout of its own, rack-timeout is recommended alongside it. From what I've seen in our app, and in a bunch of issues across different Ruby repos (rails/puma/rack-timeout), this combination is not working very well, and I'm not sure exactly why. It seems the ActiveRecord connection pool has a lot of trouble with it, as does the Puma thread pool. From our investigations, all of the reported problems above are caused by rack-timeout not playing well with Puma.

Pulling in @schneems and @kch might be a good idea.

@OleMchls (Contributor, Author)

@evanphx we are wondering what exactly you mean by top level?

@kch (Contributor) commented May 21, 2015

rack-timeout is kind of terrible and currently doesn't play well with puma. I'm working on a bunch of changes to it which should remedy that.

But it seems to me threads could die for a number of valid reasons and puma should reap them when that happens.

@OleMchls (Contributor, Author)

@kch I agree that dead threads should be reaped; that is why we proposed this PR. On the other hand, could you maybe be a bit more transparent (in the rack-timeout repo) about what kind of changes you are working on and whether you might need help? Happy to help :) Because of the issue mentioned, we have switched back to unicorn for now.

@knightq commented May 28, 2015

+1 for merge and release

@sgranata82
+1

@SebastianEdwards
+1

@deepj (Contributor) commented Jun 3, 2015

+1

@knightq commented Jun 8, 2015

We'll switch back to unicorn too until this is resolved: it's a source of dangerous rises in request queueing during high request burst periods.

@schneems (Contributor) commented Jun 8, 2015

Or you could run with 1 thread and multiple Puma workers, which is essentially doing the same thing as Unicorn.
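
For anyone who wants to try that, a minimal sketch of such a configuration (file name, env vars, and numbers are just examples, not a recommendation from this thread):

# config/puma.rb: process-based concurrency only, one thread per worker,
# which mirrors Unicorn's model.
workers Integer(ENV.fetch('WEB_CONCURRENCY', 4))
threads 1, 1

preload_app!
port ENV.fetch('PORT', 3000)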

evanphx added a commit that referenced this pull request on Jun 10, 2015: Add thread reaping to thread pool
@evanphx merged commit cd82b02 into puma:master on Jun 10, 2015
@acrolink
I encountered the issue of request timeouts when I was testing my app with the httperf tool (conns 500, rate 50, timeout 10). Puma failed badly: out of 500 requests, 181 returned status 200 OK and 319 timed out.

Any new updates on this?

@emerleite
@acrolink same here. Sometimes I have only one thread working :(

@acrolink
@emerleite .. If you don't mind trying a new framework, I highly recommend Phoenix (built on Elixir). It is super fast compared to Ruby's Rails (5 or 6 times faster). As for Puma, I have kept using Unicorn in the meantime.

@emerleite
We're currently doing that, but the Rails app must be running for now :)

@evanphx (Member) commented Aug 3, 2016

@acrolink Can you tell me more about your setup?
@emerleite Can you tell me more about your setup?
