-
-
Notifications
You must be signed in to change notification settings - Fork 15.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Faster Recycler implementation #2570
Conversation
@trustinthis is a WIP please not pull in yet. |
@belliottsmith this is a port of your work but it seems like this cause some test failures. It seems like this is caused by how it handles multi-threaded uses. Maybe you can spot what the problem is. |
Sure. I'm flying back to the UK later today, but will try to figure out what's gone wrong in the meantime |
Just run the testsuite and you will see the failure
|
The problem appears to be that WriteTask and WriteAndFlushTask both extend OneTimeTask and are recycled. This worked until now because recycle simply discarded them in multi-threaded scenarios, but now they're being reused whilst still in use elsewhere. I have a relatively simple fix that only recycles once they have been unlinked in MpscLinkedQueue, but I want to run all of the tests again to check that's the only problem. I'm a little concerned about ChannelOutboundBuffer.close() as well, as it is unguarded and could result in multiple recycles. I'm also slightly concerned that the bitset isn't sufficient to ensure we don't recycle multiple times - it is possible for something to get recycled by both the original thread and a different thread and to not notice because the bitset is updated lazily - I think I have a relatively simple fix using a (non-atomic) counter of times it has been recycled by a different thread, and erroring if this goes up after recycling by the owning thread. It occurs to me that this might be a generally superior approach to avoiding multiple recycles than the bitset even on the owning thread. |
@belliottsmith sounds good... Please let me know once you have something for me to test! |
@belliottsmith will you fix it or should I do ? |
@normanmaurer if you can wait a couple of days I'll fix it :) |
@belliottsmith sure ... I'm busy anyway :) |
@belliottsmith I may work on this myself as we want to release 4.0.21.Final pretty soon. Can you just tell me how you planned to implement the "unlinked check" ? I'm not sure I really understand how you would like to fix this. |
Benchmark for the complete PR.. 4.0 branch
This branch:
|
@belliottsmith thanks again for your help... Please review a last time. @trustin please also review :) |
A few very minor comments:
|
@belliottsmith thanks for review... just addressed your comments :) |
LGTM |
@@ -928,15 +927,17 @@ public final void run() { | |||
ctx = null; | |||
msg = null; | |||
promise = null; | |||
recycle(handle); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ctx=null; msg=null; promise = null;
could be moved to the overridden clearMaybe()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will not work as clearMaybe() is called before run() is executed.
LGTM. Please merge once the two comments are addressed. Don't forget to mention @belliottsmith's handle in the comment message. :-) |
Motivation: Recycler is used in many places to reduce GC-pressure but is still not as fast as possible because of the internal datastructures used. Modification: - Rewrite Recycler to use a WeakOrderQueue which makes minimal guaranteer about order and visibility for max performance. - Recycling of the same object multiple times without acquire it will fail. - Introduce a RecyclableMpscLinkedQueueNode which can be used for MpscLinkedQueueNodes that use Recycler These changes are based on @belliottsmith 's work that was part of #2504. Result: Huge increase in performance. 4.0 branch without this commit: Benchmark (size) Mode Samples Score Score error Units i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 00000 thrpt 20 116026994.130 2763381.305 ops/s i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 00256 thrpt 20 110823170.627 3007221.464 ops/s i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 01024 thrpt 20 118290272.413 7143962.304 ops/s i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 04096 thrpt 20 120560396.523 6483323.228 ops/s i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 16384 thrpt 20 114726607.428 2960013.108 ops/s i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 65536 thrpt 20 119385917.899 3172913.684 ops/s Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 297.617 sec - in io.netty.microbench.internal.RecyclableArrayListBenchmark 4.0 branch with this commit: Benchmark (size) Mode Samples Score Score error Units i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 00000 thrpt 20 204158855.315 5031432.145 ops/s i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 00256 thrpt 20 205179685.861 1934137.841 ops/s i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 01024 thrpt 20 209906801.437 8007811.254 ops/s i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 04096 thrpt 20 214288320.053 6413126.689 ops/s i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 16384 thrpt 20 215940902.649 7837706.133 ops/s i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 65536 thrpt 20 211141994.206 5017868.542 ops/s Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 297.648 sec - in io.netty.microbench.internal.RecyclableArrayListBenchmark
cherry-picked into 4.0, 4.1 and master... Thanks again to @belliottsmith and @trustin! |
No description provided.