@polytypic polytypic commented May 5, 2025

Queues and stacks were previously benchmarked with immediate values only, which unfortunately partially hides the potential cost of write barriers related to setting and clearing elements.

Partially hiding the cost of write barriers makes the queue benchmarks unrealistic. Queues and stacks are very rarely used with only immediate values.

@polytypic polytypic force-pushed the bench-queues-and-stacks-with-heap-blocks branch 2 times, most recently from a24b04d to efb17fa Compare May 5, 2025 08:20
@polytypic polytypic requested a review from lyrm May 5, 2025 08:27

polytypic commented May 5, 2025

I just realized that benchmarking data structures with immediate values can give unrealistic results in OCaml. The write barrier is much faster for immediate values than for non-immediate values, and data structures are relatively rarely used with only immediate values.

I developed a new queue that appeared much faster in benchmarks than some previous queues. However, once put into more realistic use, it performed poorly. Now, after months, I finally realized the reason: the real use case exposed the cost of the write barriers.

I would strongly recommend that the benchmarks of queues and stacks here be changed to use heap-allocated elements. Otherwise, the benchmark results may be misleading for most use cases and may lead to poor choices of data structures and implementation techniques.
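To make the difference concrete, here is a minimal sketch, not code from this PR. The `ring` type, `create`, `push`, `pop`, and `time_push_pop` are hypothetical names invented for illustration: a single-threaded bounded queue that sets a slot on push and clears it on pop, timed once with immediate `int` elements and once with heap-allocated `int ref` elements. The measured gap, if any, depends on the compiler version and runtime, so treat it as a way to observe the effect rather than a definitive benchmark.

```ocaml
(* Hypothetical single-threaded bounded queue; illustration only,
   not an implementation from this PR. *)
type 'a ring = {
  slots : 'a array;
  dummy : 'a; (* value used to clear a slot after a pop *)
  mutable head : int;
  mutable tail : int;
}

let create ~dummy capacity =
  { slots = Array.make capacity dummy; dummy; head = 0; tail = 0 }

let push q x =
  let n = Array.length q.slots in
  if q.tail - q.head >= n then false
  else begin
    (* Set: a store into a mutable slot, subject to the write barrier. *)
    q.slots.(q.tail mod n) <- x;
    q.tail <- q.tail + 1;
    true
  end

let pop q =
  if q.head = q.tail then None
  else begin
    let i = q.head mod Array.length q.slots in
    let x = q.slots.(i) in
    (* Clear: another store into the same mutable slot. *)
    q.slots.(i) <- q.dummy;
    q.head <- q.head + 1;
    Some x
  end

(* Time [rounds] push/pop pairs using elements produced by [make]. *)
let time_push_pop make dummy rounds =
  let q = create ~dummy 1024 in
  let start = Sys.time () in
  for i = 1 to rounds do
    ignore (push q (make i));
    ignore (pop q)
  done;
  Sys.time () -. start

let () =
  let rounds = 1_000_000 in
  (* Immediate elements: plain ints. *)
  let immediate = time_push_pop (fun i -> i) 0 rounds in
  (* Heap-block elements: every element is a freshly allocated ref cell. *)
  let boxed = time_push_pop (fun i -> ref i) (ref 0) rounds in
  Printf.printf "immediate: %.3fs  heap blocks: %.3fs\n" immediate boxed
```

The second timing also includes the cost of allocating each `ref`, so the two numbers do not isolate the write barrier on their own.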

@polytypic polytypic force-pushed the bench-queues-and-stacks-with-heap-blocks branch from efb17fa to 157accf Compare May 5, 2025 10:24

lyrm commented May 21, 2025

Thanks for the PR. I am reviewing it today. Also, I am working on fixing the CI issue.

@lyrm lyrm left a comment

I’ve fixed the CI issue. Could you rebase to check if it’s working on your side as well?

Thanks again for the PR! Have you run any benchmarks to see whether this changes how the different queues compare to each other? What about the two-stack queue, for example?

@polytypic polytypic force-pushed the bench-queues-and-stacks-with-heap-blocks branch from 157accf to ae94ae5 Compare May 22, 2025 10:26
polytypic commented

> Have you run any benchmarks to see if this changes how the different queues compare to each other?

I changed the benchmarks in multicore-bench and picos to allocate the elements just like in this PR.

> What about the two-stack queue for example?

The two-stack queue doesn't need to clear/set values, which means that allocating the elements only adds the allocation cost.
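As a rough illustration of why that is, here is a minimal single-threaded sketch of the two-stack idea. It is not the implementation benchmarked here, and the names `t`, `empty`, `push`, and `pop` are hypothetical. Elements go into freshly allocated immutable list cells, and a pop simply stops referencing the cell, so no existing mutable slot is ever set or cleared with an element.

```ocaml
(* Hypothetical single-threaded two-stack queue; illustration only. *)
type 'a t = {
  front : 'a list; (* elements ready to be popped, oldest first *)
  back : 'a list; (* recently pushed elements, newest first *)
}

let empty = { front = []; back = [] }

(* Push allocates a new immutable cons cell holding the element. *)
let push q x = { q with back = x :: q.back }

(* Pop takes from [front]; when [front] is empty, [back] is reversed
   to become the new [front]. No slot is overwritten in either case. *)
let pop q =
  match q.front with
  | x :: front -> Some (x, { q with front })
  | [] -> (
      match List.rev q.back with
      | [] -> None
      | x :: front -> Some (x, { front; back = [] }))
```

For example, pushing 1, 2, and 3 onto `empty` and then popping three times yields 1, 2, and 3 in that order.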

@polytypic polytypic merged commit 2a653b6 into main May 23, 2025
4 checks passed
@polytypic polytypic deleted the bench-queues-and-stacks-with-heap-blocks branch May 23, 2025 07:51

3 participants