Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ruby 2.1: Out-of-Band GC #450

Closed
thehappycoder opened this issue Jan 30, 2014 · 13 comments
Closed

Ruby 2.1: Out-of-Band GC #450

thehappycoder opened this issue Jan 30, 2014 · 13 comments

Comments

@thehappycoder
Copy link

@thehappycoder thehappycoder commented Jan 30, 2014

http://tmm1.net/ruby21-oobgc/?utm_source=rubyweekly&utm_medium=email

Is it possible to do with puma?

@evanphx
Copy link
Member

@evanphx evanphx commented Feb 4, 2014

Yep, you can use actually the same code, though you'll need to wire it into puma differently because @tmm1 patches Unicorn on the fly to inject the code.

That being said, puma's use of threads makes the definition of "out-of-band" differently than Unicorn. Because when one thread finishes with a request, another thread might be in the middle of handling a request.

Now that I talk through it, I should see about adding an idle handler to puma. That could be used to trigger @tmm1's OOBGC when there are actually no other requests to handle.

Loading

@evanphx
Copy link
Member

@evanphx evanphx commented Feb 4, 2014

Ruby 2.1's GC improves throughput by deferring work, which makes the need for OOBGC less important.

Loading

@schneems
Copy link
Contributor

@schneems schneems commented Feb 19, 2014

@thehappycoder how about doing this on a fork and maybe showing some benchmarks. I agree with 2.1.0 it's less important though there could still be some speed benefits.

Loading

@tmm1
Copy link

@tmm1 tmm1 commented Feb 19, 2014

Loading

@akshayrawat
Copy link

@akshayrawat akshayrawat commented May 29, 2014

@evanphx : When using MRI Ruby 2.1.2 with Puma (say 1 worker with 8 threads), when is the GC run? Is it run by the parent worker process when all those threads become idle, or would it be run by the parent process as needed, even when those threads are busy processing requests?

And how would this behaviour be different in Ruby 2.0 (without deferred GC).

Also asked here.

Loading

@evanphx
Copy link
Member

@evanphx evanphx commented May 31, 2014

It runs whenever the VM decides to run it. Puma does nothing to control that nor can it really.

Loading

@evanphx evanphx closed this Jul 13, 2014
@ketan
Copy link

@ketan ketan commented Aug 2, 2014

I noticed that puma uses a rack.after_reply rack environment variable to perform stuff after a request is served out. We're using puma in worker mode for some of our legacy apps - and this rack middleware (adopted from passenger) seems to do fine for us.

Would the maintainers be interested in a pull request?

Be warned - this is a bad idea if you're using puma in threaded mode.

# config.ru
...
use PumaOOBGC, 10, Rails.logger
...
class PumaOOBGC
  def initialize(app, frequency, logger=nil)
    @app           = app
    @frequency     = frequency
    @logger        = logger
    @request_count = 0
    @mutex         = Mutex.new
  end

  def call(env)
    status, header, body = @app.call(env)

    if ary = env['rack.after_reply']
      ary << lambda {maybe_perform_gc}
    end
    [status, header, body]
  end

  def maybe_perform_gc
    @mutex.synchronize do
      @request_count += 1
      if @request_count == @frequency
        @request_count = 0
        t0 = Time.now
        disabled = GC.enable
        GC.start
        GC.disable if disabled
        @logger.debug "[OOBGC] [#{Process.pid}] Finished GC in #{Time.now - t0} sec" if @logger
      end
    end
  end
end

Loading

@schneems
Copy link
Contributor

@schneems schneems commented Aug 2, 2014

Be warned - this is a bad idea if you're using puma in threaded mode.

We probably wouldn't take something that doesn't work for everyone. You would need to introduce some kind of lock across all threads to let them know when you want to do a GC, wait till they're done responding, run GC. Then remove the lock to let them start receiving requests again.

Loading

@wjessop
Copy link

@wjessop wjessop commented Aug 17, 2018

I'd be really interested in Puma OOB GC functionality. Much as I like threads I really don't see much shift towards thread safety in a lot of Ruby gems which makes most larger apps with dependencies pretty much only safe to run single threaded, so a non-threaded only implementation would still be helpful.

Loading

@schneems
Copy link
Contributor

@schneems schneems commented Aug 17, 2018

Much as I like threads I really don't see much shift towards thread safety in a lot of Ruby gems

Between puma and sidekiq and passenger almost any mainstream gem out there is threadsafe.

Not sure if you saw but GitHub actually turned off their OOBGC and got something a 10% decrease in overall CPU utilization. While GC pauses might add some time to a request it is likely minuscule between incremental GC, lazy sweep, and generational GC an individual GC run shouldn't be taking a significant portion of a request.

If you're really gung-ho about adding OOBGC support let me know what kind of hooks Puma could introduce to allow you to build it as a third party gem.

Loading

@wjessop
Copy link

@wjessop wjessop commented Aug 17, 2018

I have no doubt that Puma, Sidekiq etc. are threadsafe, it's going to be a really basic Rails app that relies on just the big names though.

The app I'm working on right now has 374 dependencies (direct and indirect). A relatively small personal project has 112. I picked the first five dependencies at random of that larger project (that wasn't a Rails or a Rails dep, or a test dep) and checked the repos for mention of thread safety. They don't mention it.

I suspect (and I'd be interested to see real stats rather than my gut feelings) that most non-trivial applications will depend on a fairly large number of gems, very few of which will be either written with thread-safety in mind, or will have the thread-safety documented.

Not sure if you saw but GitHub actually turned off their OOBGC and got something a 10% decrease in overall CPU utilization.

I did see that just today, it was very interesting. The biggest thing I'm interested in there was the 25% reduction in response times. If that's really the case then it's a pretty big reason not to implement OOB GC, but I think I'd like more information, for instance, were they queueing requests behind the GC in a process rather than routing requests round it?

In terms of CPU though, we already trade CPU for user experience with when we gzip responses. There's a tradeoff there, and it's something we need to balance, but we're currently averaging about 14ms GC time in our responses (Ruby 2.3.4, yes we need to upgrade) so we'd be willing to trade at least some CPU time to drop that figure.

I got @ketan's middleware working and I'll do some testing to see what the tradeoffs are in our environment and I'll try to report back.

Thanks for the response!

Loading

@schneems
Copy link
Contributor

@schneems schneems commented Aug 17, 2018

I wasn’t saying that puma and sidekiq are threadsafe but rather apps that use puma and sidekiq in production are threadsafe. The dependencies that those apps use are either already threadsafe or have been found to have threading bugs in them and then been fixed.

If you don’t want to use threads with puma I would recommend using unicorn which will give you better performance as long as it is running behind nginx then you’re protected from slow clients. As a bonus then you can use the already written OOBGC.

Loading

@wjordan
Copy link
Contributor

@wjordan wjordan commented Sep 13, 2018

I've added PR #1648 that implements an out_of_band hook that can be used for implementing out-of-band GC in Puma.

OOBGC still provides my own production workload ~10% faster response times on Ruby 2.5 (see tmm1/gctools#16 (comment) for some details/discussion). Whether OOBGC ends up being useful or not will depend a lot on workload, exact tuning of GC variables, and also the exact OOBGC algorithm used.

Loading

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
8 participants