Adding benchmark for thread-local object pool use case #374
This is a naive attempt to emulate a silly (but, believe me, soon-to-be quite frequent) use case: object pooling without letting the acquired object escape the consumer thread.
This pattern is very common in HFT, but not only there: as mentioned before, with Virtual Threads and Loom we need alternatives to cover for the lack of per-carrier-thread locals (see https://github.com/FasterXML/jackson-core/blob/390472bbdf4722fe058f48bb0eff5865c8d20f73/src/main/java/com/fasterxml/jackson/core/util/BufferRecyclers.java#L36).
The recent answer from Heinz K. at https://mail.openjdk.org/pipermail/loom-dev/2023-February/005320.html
made me wonder: we suggest that users use our MPMC queues as object pools, but... are they the right/decent tools for the job?
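For context, a minimal sketch of the pattern in question: a bounded concurrent queue used as an object pool, where a consumer acquires an instance, uses it, and returns it without the object escaping that thread. All names here (`QueuePool`, `acquire`, `release`) are illustrative, not JCTools API; `ArrayBlockingQueue` stands in for whatever MPMC queue backs the pool.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.function.Supplier;

// Hypothetical queue-backed object pool: acquire reuses a pooled instance
// when available, otherwise allocates; release returns the instance (or
// drops it if the pool is already full).
final class QueuePool<T> {
    private final ArrayBlockingQueue<T> q;
    private final Supplier<T> factory;

    QueuePool(int capacity, Supplier<T> factory) {
        this.q = new ArrayBlockingQueue<>(capacity);
        this.factory = factory;
    }

    // Non-blocking acquire: pooled instance or a fresh allocation.
    T acquire() {
        T t = q.poll();
        return (t != null) ? t : factory.get();
    }

    // Best-effort return; a full pool simply discards the object.
    void release(T t) {
        q.offer(t);
    }
}

public class PoolDemo {
    public static void main(String[] args) {
        QueuePool<StringBuilder> pool = new QueuePool<>(4, StringBuilder::new);
        StringBuilder sb = pool.acquire(); // pool empty: freshly allocated
        sb.append("work");
        sb.setLength(0);                   // reset before returning
        pool.release(sb);
        StringBuilder again = pool.acquire();
        System.out.println(sb == again);   // same instance reused
    }
}
```

The benchmark in this PR stresses exactly this acquire/use/release cycle from several threads at once.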
With 3 threads:
I have introduced a fake CPU consume method (likely in the wrong place, given that it doesn't emulate what real users will do; we should probably add some work before acquire as well) and, as usual, the numbers appear slightly different:
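For reference, a sketch of what such a fake CPU consume method typically looks like, in the spirit of JMH's `Blackhole.consumeCPU`: burn a given number of "tokens" of CPU work between acquire and release to emulate the consumer actually using the pooled object. This is a hypothetical stand-in, not the exact method used in the benchmark.

```java
// Hypothetical CPU-burning helper, in the style of JMH's
// Blackhole.consumeCPU: spins through an xorshift-like mixing loop
// for `tokens` iterations so the JIT cannot eliminate the work.
public class FakeWork {
    public static long consumeCpu(long tokens) {
        long t = System.nanoTime(); // non-zero seed
        for (long i = 0; i < tokens; i++) {
            t ^= (t << 13);
            t ^= (t >>> 7);
            t ^= (t << 17);
        }
        return t; // returning the result keeps the loop live
    }
}
```

Raising the token count is what "high" work means below: more CPU time spent per acquired object, and therefore less time hammering the queue itself.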
"High" work now makes ArrayBlockingQueue a far less appealing solution (as expected, at least by me), but how "high" translates into real-world work is not yet clear, and it makes me wonder whether our queues are good performers for this use case under realistic conditions. I'm aware that under very high contention lock-free queues aren't the best performers, but I wasn't expecting that to happen with just 3 threads, which means I'm either doing something wrong or not considering other factors.