-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add out_of_band hook #1648
add out_of_band hook #1648
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the impact of this feature, namely that it enables folks to do work out of band from requests but doesn't explicitly do any of that work, I'm inclined to accept this.
I'd like some thoughts from @schneems and @nateberkopec too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall seems like an good experiment. Have you tried running with this patch in production? Any numbers to share?
@@ -201,7 +204,8 @@ def wait_until_not_full | |||
# is work queued that cannot be handled by waiting | |||
# threads, then accept more work until we would | |||
# spin up the max number of threads. | |||
return if @todo.size - @waiting < @max - @spawned | |||
busy_threads = @spawned - @waiting + @todo.size | |||
return busy_threads if @max > busy_threads |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm really jumpy around this line of code. This controls a disproportionate amount of what puma does compared to it's size. While the method is only a few lines of code there's 30 lines of docs (most of which I wrote).
Original code
@todo.size - @waiting < @max - @spawned
Can be re-written as
@todo.size - @waiting + @spawned < @max
or flipping it
@max > @todo.size - @waiting + @spawned
re-aranging
@max > @spawned - @waiting + @todo.size
And finally
busy_threads = @spawned - @waiting + @todo.size
@max > busy_threads
So that looks fine. Just makes me nervous.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other question to ask: is this the correct logic.
@waiting
is the count of threads that have been spawned but are idle. So busy count would be the spawned count minus the idle count.
We then need to handle for the case where a request has been processed but not picked up yet so we subtract the backlog.
So that makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm less confident the logic is correct to begin with, only that it's unchanged by the algebraic refactoring.
In fact, during some local benchmarking I noticed some weird anomalies where busy_threads
was logged as 2
after only a single request was added to the worker-process, so I do suspect there might be some concurrency edge-case not handled perfectly by the current logic.
Anyway, it's certainly possible to leave the equation untouched by this PR if that makes it any less nerve-wracking. I figured making the 'busy threads' count more explicit in this method makes the logic easier to follow, but the change isn't strictly necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after only a single request was added to the worker-process
Was this a web page load or did you make the request via curl? if it's a webpage don't forget there are assets that each require a different request.
there might be some concurrency edge-case not handled perfectly by the current logic.
Yes there's a known race condition between the time where a request is added to the @todo array and when a new thread picks it up and starts working on it. So it's possible that @todo.size might be greater than 0 but all threads be idle.
Anyway, it's certainly possible to leave the equation untouched by this PR if that makes it any less nerve-wracking. I figured making the 'busy threads' count more explicit in this method makes the logic easier to follow, but the change isn't strictly necessary.
I'm fine with the change just wanted to be sure I talked through it out loud and it preserved behavior.
I just ran a ~12-hour side-by-side comparison of two Puma servers running Ruby 2.5, with/without the OOBGC from tmm1/gctools#17 along with this PR. The difference on my current production workload is slight but still noticeable.
So both response duration and memory utilization are slightly improved by about ~7-8% running out-of-band GC on my workload. |
This PR adds an
out_of_band
hook that is invoked when the worker is idle.The hook runs immediately after a request has finished processing and there are no busy threads on the worker. The worker doesn't accept new requests until this code finishes.
This hook can be used for running out-of-band garbage collection as a solution to #450, e.g.:
It could also be useful for scheduling/deferring other asynchronous tasks to be run during idle periods on the server.