
Adding benchmark for thread-local object pool use case #374

Merged
merged 3 commits into JCTools:master on Mar 2, 2023

Conversation

franz1981
Collaborator

@franz1981 commented Feb 25, 2023

This is a naive attempt to emulate a silly (but soon to be quite frequent, believe me) use case: object pooling without letting the acquired object escape the consumer thread.

This pattern is very frequent in HFT, but not only there: as mentioned before, with Virtual Threads and Loom we need alternatives to cover for the lack of per-carrier-thread locals (see https://github.com/FasterXML/jackson-core/blob/390472bbdf4722fe058f48bb0eff5865c8d20f73/src/main/java/com/fasterxml/jackson/core/util/BufferRecyclers.java#L36).
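To make the Loom angle concrete, here is a minimal sketch (my own naming, in the spirit of the linked BufferRecyclers, not its actual code) of the ThreadLocal-based recycling that stops paying off: a long-lived platform thread reuses its cached buffer across many requests, while a typically short-lived virtual thread gets its own copy that is rarely reused, so we are back to plain allocation - hence the interest in a shared, queue-backed pool.

```java
// Illustrative sketch only (names are mine): ThreadLocal-based buffer recycling
// in the spirit of the linked BufferRecyclers. Each thread caches its own buffer,
// which works well for long-lived platform threads but not for virtual threads,
// where every task tends to run on a fresh virtual thread and the cached value
// is rarely reused.
final class ThreadLocalRecycler {

    private static final ThreadLocal<byte[]> BUFFER =
            ThreadLocal.withInitial(() -> new byte[8 * 1024]);

    static byte[] acquire() {
        // On a virtual thread this is effectively a fresh allocation per task,
        // since there is no supported way to key the cache on the carrier thread.
        return BUFFER.get();
    }
}
```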

The recent answer from Heinz K. at https://mail.openjdk.org/pipermail/loom-dev/2023-February/005320.html
made me wonder: we suggest that users use our MPMC queues as object pools, but are they the right/decent tools for the job?
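For reference, the acquire/release pattern the benchmark exercises is roughly the following (a sketch with my own class and method names, not the benchmark code itself): poll a pooled object from a shared MPMC queue, fall back to allocation on a miss, use it without letting it escape the calling thread, then offer it back.

```java
import org.jctools.queues.MpmcArrayQueue;

import java.util.Queue;
import java.util.function.Supplier;

// Sketch of the "queue as object pool" pattern (names are illustrative, not the
// benchmark's actual code): poll from a shared MPMC queue, allocate on a miss,
// and offer the object back once the caller is done with it.
final class QueuePool<T> {

    private final Queue<T> pool;
    private final Supplier<T> factory;

    QueuePool(int capacity, Supplier<T> factory) {
        this.pool = new MpmcArrayQueue<>(capacity);
        this.factory = factory;
    }

    T acquire() {
        final T pooled = pool.poll();
        return pooled != null ? pooled : factory.get();
    }

    void release(T object) {
        // If the pool is full the object is simply dropped and left to the GC.
        pool.offer(object);
    }
}
```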

With 3 threads:

Benchmark                               (burstSize)  (qCapacity)                      (qType)  (warmup)  Mode  Cnt     Score     Error  Units
QueueAsPoolBurstCost.acquireAndRelease            1       132000                         None      true  avgt   10    15.755 ±   0.309  ns/op
QueueAsPoolBurstCost.acquireAndRelease            1       132000           ArrayBlockingQueue      true  avgt   10   304.429 ±  14.945  ns/op
QueueAsPoolBurstCost.acquireAndRelease            1       132000        ConcurrentLinkedQueue      true  avgt   10   651.487 ±  27.480  ns/op
QueueAsPoolBurstCost.acquireAndRelease            1       132000  MpmcUnboundedXaddArrayQueue      true  avgt   10   404.714 ±   5.914  ns/op
QueueAsPoolBurstCost.acquireAndRelease            1       132000               MpmcArrayQueue      true  avgt   10   451.024 ±   7.468  ns/op
QueueAsPoolBurstCost.acquireAndRelease           10       132000                         None      true  avgt   10    82.535 ±   0.862  ns/op
QueueAsPoolBurstCost.acquireAndRelease           10       132000           ArrayBlockingQueue      true  avgt   10  2884.411 ± 417.531  ns/op
QueueAsPoolBurstCost.acquireAndRelease           10       132000        ConcurrentLinkedQueue      true  avgt   10  5856.916 ± 439.367  ns/op
QueueAsPoolBurstCost.acquireAndRelease           10       132000  MpmcUnboundedXaddArrayQueue      true  avgt   10  4304.779 ± 597.593  ns/op
QueueAsPoolBurstCost.acquireAndRelease           10       132000               MpmcArrayQueue      true  avgt   10  4395.881 ± 452.933  ns/op

I have introduced a fake CPU-consuming method (likely in the wrong place, given that it doesn't emulate what real users will do - meaning we should likely add some work before the acquire as well), and the numbers, as usual, look slightly different:
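The "fake CPU consume" step could look something like JMH's Blackhole.consumeCPU(tokens), with the (work) column being the number of tokens burned while the pooled object is held; the sketch below (illustrative names, not the actual benchmark method) shows where that work sits between acquire and release.

```java
import org.openjdk.jmh.infra.Blackhole;

import java.util.Queue;

// Illustrative shape of the measured operation (not the benchmark's actual code):
// acquire from the queue-backed pool, burn a calibrated amount of CPU via JMH's
// Blackhole.consumeCPU, then release the object back to the pool.
final class AcquireWorkRelease {

    static void run(Queue<Object> pool, long work) {
        Object pooled = pool.poll();
        if (pooled == null) {
            pooled = new Object(); // pool miss: allocate
        }
        Blackhole.consumeCPU(work); // simulated per-operation work, confined to this thread
        pool.offer(pooled);         // release back to the pool
    }
}
```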

Benchmark                               (burstSize)  (qCapacity)                      (qType)  (warmup)  (work)  Mode  Cnt     Score    Error  Units
QueueAsPoolBurstCost.acquireAndRelease            1       132000                         None      true       0  avgt   10    16.139 ±  0.182  ns/op
QueueAsPoolBurstCost.acquireAndRelease            1       132000                         None      true      10  avgt   10    30.127 ±  0.382  ns/op
QueueAsPoolBurstCost.acquireAndRelease            1       132000                         None      true     100  avgt   10   275.163 ±  1.003  ns/op
QueueAsPoolBurstCost.acquireAndRelease            1       132000           ArrayBlockingQueue      true       0  avgt   10   317.676 ± 11.457  ns/op
QueueAsPoolBurstCost.acquireAndRelease            1       132000           ArrayBlockingQueue      true      10  avgt   10   367.082 ± 16.133  ns/op
QueueAsPoolBurstCost.acquireAndRelease            1       132000           ArrayBlockingQueue      true     100  avgt   10  1180.306 ± 40.263  ns/op
QueueAsPoolBurstCost.acquireAndRelease            1       132000        ConcurrentLinkedQueue      true       0  avgt   10   664.938 ± 19.703  ns/op
QueueAsPoolBurstCost.acquireAndRelease            1       132000        ConcurrentLinkedQueue      true      10  avgt   10   620.152 ± 26.502  ns/op
QueueAsPoolBurstCost.acquireAndRelease            1       132000        ConcurrentLinkedQueue      true     100  avgt   10   653.506 ± 18.155  ns/op
QueueAsPoolBurstCost.acquireAndRelease            1       132000  MpmcUnboundedXaddArrayQueue      true       0  avgt   10   392.326 ±  9.412  ns/op
QueueAsPoolBurstCost.acquireAndRelease            1       132000  MpmcUnboundedXaddArrayQueue      true      10  avgt   10   397.462 ±  4.952  ns/op
QueueAsPoolBurstCost.acquireAndRelease            1       132000  MpmcUnboundedXaddArrayQueue      true     100  avgt   10   461.849 ± 23.687  ns/op
QueueAsPoolBurstCost.acquireAndRelease            1       132000               MpmcArrayQueue      true       0  avgt   10   455.009 ± 10.099  ns/op
QueueAsPoolBurstCost.acquireAndRelease            1       132000               MpmcArrayQueue      true      10  avgt   10   456.501 ± 14.344  ns/op
QueueAsPoolBurstCost.acquireAndRelease            1       132000               MpmcArrayQueue      true     100  avgt   10   483.445 ±  4.849  ns/op

Where "high" work now make the ArrayBlockingQueue a way less appealing solution (as expected? by me at least), but how much "high" translate into real-world "work" is not clear yet and makes me guess if our queues are good performer for this use case in realistic cases.

I'm aware that under very high contention lock-free queues aren't the best performers, but I wasn't expecting that to happen with just 3 threads, which means either I'm doing something wrong or I'm not considering other factors.

@nitsanw nitsanw merged commit 25c1a28 into JCTools:master Mar 2, 2023
@franz1981
Collaborator Author

@nitsanw you didn't give me your opinion!! :P:P
