Fibers don't seem to reuse threads #1443
I'm working on speeding up a Celluloid app, and I've noticed that jruby spends an awful lot of time in
This is jruby-1.7.9 and the behavior happens on both openjdk-1.6.0 and openjdk-1.7.0.
Minimum reproduction case: https://gist.github.com/cheald/8626163
Output is something like:
You'll notice that the thread IDs are monotonically increasing. This is consistent.
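For anyone who wants to check this without the gist, here is a rough probe along the same lines. It is only a sketch and not the contents of the linked gist: the method name `thread_ids_seen_by_fibers` is made up for illustration, and it assumes that `Thread.current.object_id` inside a fiber is a reasonable stand-in for the thread IDs shown in the output.

```ruby
# Hypothetical probe, NOT the code from the linked gist.
# Runs `count` short-lived fibers and records the object_id of the
# Thread object that each fiber sees as Thread.current.
def thread_ids_seen_by_fibers(count)
  ids = []
  count.times do
    # Each fiber runs to completion immediately and is never reused.
    Fiber.new { ids << Thread.current.object_id }.resume
  end
  ids
end

ids = thread_ids_seen_by_fibers(1_000)
puts "fibers run: #{ids.size}, distinct thread objects: #{ids.uniq.size}"
```

On a runtime where fibers share the calling thread, this reports a single distinct thread object. A distinct count close to the number of fibers would be consistent with the behavior described in this issue.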
VisualVM CPU sampling shows that a tremendous amount of time is being spent in createThread():
VM stats show a ton of threads created, even though the live threads are stable:
This feels an awful lot like Fibers are not correctly using a thread pool, and are instead creating new short-lived threads for every fiber, which is resulting in inordinate amounts of overhead.
Per the referenced Celluloid issue, I've implemented a Fiber Pool in Celluloid that improved the performance of my app (in terms of jobs/sec) by ~35x; it's completely eliminated the thread creation thrashing. This seems pretty conclusive to me that there's an issue with JRuby's fiber implementation.
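To illustrate the general idea of a fiber pool, here is a minimal sketch. It is not the implementation that went into Celluloid; the `FiberPool` class and its `run` and `created` methods are hypothetical names. It assumes each job runs to completion without calling `Fiber.yield` itself, and that the pool is only used from the thread that created it, since a fiber can only be resumed from its owning thread.

```ruby
# Hypothetical minimal fiber pool; NOT Celluloid's actual implementation.
# Not thread-safe: use one pool per thread.
class FiberPool
  attr_reader :created

  def initialize
    @idle = []    # fibers parked and waiting for the next job
    @created = 0  # how many fibers this pool has ever created
  end

  # Run the block inside a pooled fiber and return the block's value.
  # Assumes the block does not call Fiber.yield on its own.
  def run(&block)
    fiber = @idle.pop || new_worker
    result = fiber.resume(block)
    @idle.push(fiber) # only reached if the job did not raise
    result
  end

  private

  def new_worker
    @created += 1
    Fiber.new do |job|
      loop do
        # Hand the result back to the caller, then park here until
        # the next resume delivers another job.
        job = Fiber.yield(job.call)
      end
    end
  end
end

pool = FiberPool.new
results = 1_000.times.map { |i| pool.run { i * 2 } }
puts "jobs run: #{results.size}, fibers created: #{pool.created}"
```

Because each job finishes before the next one starts, the same worker fiber is handed out every time, so fibers are only created when all existing workers are busy (for example, when `run` is called from inside another job). A job that raises kills its worker fiber, which is then simply not returned to the pool.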
One issue I've run into in the pooled implementation is that if you end up with a suspended fiber that belongs to a dead thread, there's no way to resume the fiber and let it finish executing. I think that the fix here is probably for JRuby to go ahead and terminate any fibers left outstanding when an owning thread terminates. Otherwise, you end up with Java threads that seem to never go away.
I have a patch to again use a thread pool built into JRuby. I'm not sure if it should improve perf more than this, but here's my numbers after about a minute running your script:
Patch is here: https://gist.github.com/headius/40ed4a4435e0a4e0ba44
Note that there's several levels of re-use possible for our native-threaded fibers. This patch implements only the first level, reusing native threads.
Try out my patch and see how much improvement you see in thrashing. This is not a final patch; we should probably have a separate thread pool for fibers so we can label them appropriately.
A small clarification: JRuby 1.7.4 and earlier did reuse the native thread, ThreadContext, and RubyThread for rapid-fire creation of fibers, but did not do any of the scrubbing necessary to ensure data from one Fiber did not leak into the next Fiber. My patch re-enables reuse of the native thread, but also explicitly dumps the ThreadContext and RubyThread. That's why you see the Ruby Thread object_id still increasing with my patch.
Using your patch and this crappy benchmark I'm getting the following:
The thread pooling definitely makes a big difference, but it looks like fiber pooling is still a big win. The combination of the two is even better.