Skip to content

Reset ThreadPoolExecutor on fork. #501

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 29, 2016
Merged

Reset ThreadPoolExecutor on fork. #501

merged 2 commits into from
Jan 29, 2016

Conversation

jdantonio
Copy link
Member

When a ThreadPoolExecutor in the forked process detects that a fork has occurred it immediately takes the following actions:

  • Clears all pending jobs from its queue (assuming they will be handled by the forking process).
  • Deletes all worker threads (they will have died during the fork).
  • Resets all job counters (these counts will be reflected in the forking process).
  • Begins posting new jobs as normal.

@jdantonio
Copy link
Member Author

This latest update actually clears the pool if it detects a fork. I've updated the test script to perform a fork using the example given here. The new implementation survives the fork and posts the new job. I ran the test script on my work computer (2.8 GHz Intel Core i7, 16 GB 1600 MHz DDR3) and the difference in performance is still minimal.

@matthewd
Copy link
Contributor

We cannot use Thread.main to determine if our process has forked because Ruby copies all attributes of Thread.main on the new fork.

FWIW, my imagined Thread.main-using solution would look something like this: https://gist.github.com/matthewd/aeb876b9fad8e189bcff (modulo a method on Worker to avoid the ivar get, of course).

But yeah.. my results with that script match yours: they're so close I can't even get the original, clearly-less-code, version to consistently bench as the fastest... and when they do, $$ seems to be faster than the thread check anyway.

@queue.clear
@ready.clear
@pool.clear
@ruby_pid == $$
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

=

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ty. Fixed.

@jdantonio
Copy link
Member Author

@matthewd @pitr-ch @chrisseaton Please review.


Some Ruby versions allow the Ruby process to be [forked](http://ruby-doc.org/core-2.3.0/Process.html#method-c-fork). Generally, mixing threading and forking is an [anti-pattern](https://en.wikipedia.org/wiki/Anti-pattern). Threading and forking are both concurrency techniques and mixing the two rarely provides a benefit. Moreover, Ruby does not copy any threads execept the main thread when forking. Thus threads created before the fork become unusable ("dead") in the forked process. This aspect of forking is a significant issue for any application or library which spawns threads, including but not limited to Concurrent Ruby. It is strongly advised that applications using `ThreadPoolExecutor` either directly or indirectly (via high-level concurrency abstractions like `Future` and `Actor`) do **not** also fork. Since Concurrent Ruby is a foundational library often used by gems which are in turn used by other applications it is impossible to prevent forking upstream. Concurrent Ruby therefore makes a few guaratnees about the behavior of `ThreadPoolExecutor` after forking.

*Concurrent Ruby guarantees that any thread pools created on by the forking process before the fork remain functional on the forked process after the fork. It also guarantees that jobs post to a thread pool before a fork remain on the thread pool in the forking process after the fork.*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"created on by"


Maybe "remain on the thread pool in (only) the forking process"?

@jdantonio jdantonio changed the title [DO NOT MERGE] Testing threads and forking. Reset ThreadPoolExecutor on fork. Jan 27, 2016

Some Ruby versions allow the Ruby process to be [forked](http://ruby-doc.org/core-2.3.0/Process.html#method-c-fork). Generally, mixing threading and forking is an [anti-pattern](https://en.wikipedia.org/wiki/Anti-pattern). Threading and forking are both concurrency techniques and mixing the two is rarely beneficial. Moreover, Ruby does not copy any threads execept the main thread when forking. Thus threads created before the fork become unusable ("dead") in the forked process. This aspect of forking is a significant issue for any application or library which spawns threads. It is strongly advised that applications using `ThreadPoolExecutor`, either directly or indirectly (via high-level concurrency abstractions like `Future` and `Actor`), do **not** also fork. Since Concurrent Ruby is a foundational library often used by gems which are in turn used by other applications, it is impossible to predict or prevent upstream forking. Concurrent Ruby therefore makes a few guaratnees about the behavior of `ThreadPoolExecutor` after forking.

*Concurrent Ruby guarantees that all threads created by thread pools in the forking process before the fork remain only in the forking process after the fork. It also guarantees that jobs post to a thread pool before a fork remain on the thread pool in the forking process after the fork. Finally, it guarantees that thread pools copied to a forked process will continue to function normally on the forked process.*
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but the terminology here confused me a couple times.
Maybe parent/child would already help to figure which process we are talking about.
Also, the first guarantee is not a Concurrent Ruby one but one of the semantics of fork(2) at the OS level, right?

Maybe something simpler along the lines of
"Current jobs will be processed by the parent process. The child process does not inherit any job." would help.

@schneems
Copy link
Contributor

Another angle of attack is to ask the the executor if it has a thread that is alive with Thread#alive? I benchmarked speed and it's in the same ballpark as $$ (though slightly slower), we could either loop through workers via Enumerable#detect or set up a "watchdog" thread that we would check every time to see if it is alive. I don't know if there's other occasions when threads might die unexpectedly other than via forking, or if it makes sense to turn this special failure case into a general failure case.


## Forking

Some Ruby versions allow the Ruby process to be [forked](http://ruby-doc.org/core-2.3.0/Process.html#method-c-fork). Generally, mixing threading and forking is an [anti-pattern](https://en.wikipedia.org/wiki/Anti-pattern). Threading and forking are both concurrency techniques and mixing the two is rarely beneficial. Moreover, Ruby does not copy any threads execept the main thread when forking. Thus threads created before the fork become unusable ("dead") in the forked process. This aspect of forking is a significant issue for any application or library which spawns threads. It is strongly advised that applications using `ThreadPoolExecutor`, either directly or indirectly (via high-level concurrency abstractions like `Future` and `Actor`), do **not** also fork. Since Concurrent Ruby is a foundational library often used by gems which are in turn used by other applications, it is impossible to predict or prevent upstream forking. Concurrent Ruby therefore makes a few guaratnees about the behavior of `ThreadPoolExecutor` after forking.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Moreover, Ruby does not copy any threads execept the main thread when forking."
s/Ruby/the Operating System/ I believe.
Also, it's not necessarily the main thread:
ruby -e 'p Thread.main; Thread.new { fork {p Thread.main; p Thread.current}}.join'

Maybe use the example of locks taken by threads (other than the one forking) as why it's an anti-pattern (the lock would remained locked forever and unlock-able in the child process).

@jdantonio jdantonio force-pushed the fork-in-the-road branch 2 times, most recently from 6b66cca to ab2825e Compare January 28, 2016 12:50
@jdantonio
Copy link
Member Author

Another update to the docs.

jdantonio added a commit that referenced this pull request Jan 29, 2016
@jdantonio jdantonio merged commit ffc3322 into master Jan 29, 2016
@jdantonio jdantonio deleted the fork-in-the-road branch January 29, 2016 13:23
@schneems
Copy link
Contributor

Can we get a release cut with this fix? Any major issues to date?

@jdantonio
Copy link
Member Author

I'll cut a release this week. I was just waiting to see if other bugs were reported.

@schneems
Copy link
Contributor

Great, thanks!

@jdantonio
Copy link
Member Author

@schneems v1.0.1 has been released. With this update you should now also be able to safely use require 'concurrent/future' and not experience problems on JRuby. Emphasis on "should." :-)

@schneems
Copy link
Contributor

Awesome! Thanks ❤️ we need to get sprockets-rails back in the green. The threads dying on fork was causing failures

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants