Profile allocations in benchmarks #40

Merged: 5 commits merged on Dec 13, 2023

Conversation

nvzqz (Owner) commented Dec 11, 2023

This adds AllocProfiler, an allocator implementation that tracks the number of allocations and bytes allocated during benchmarks. It can wrap any GlobalAlloc, such as mimalloc.

Example benchmark output: [screenshot: Screen Shot 2023-12-11 at 05 34 17]
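
As an illustration of the wrapping idea in the description above, here is a minimal sketch of an allocator that counts allocations and bytes while delegating to any inner `GlobalAlloc`. The name `CountingAlloc`, its fields, and `stats` are hypothetical and are not taken from this PR; the actual `AllocProfiler` likely tracks more (for example per-thread or per-benchmark data) and is not reproduced here.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical counting wrapper (an assumption for illustration, not the
/// PR's `AllocProfiler`). Delegates all real work to `inner`.
pub struct CountingAlloc<A> {
    inner: A,
    alloc_count: AtomicUsize,
    alloc_bytes: AtomicUsize,
}

impl<A> CountingAlloc<A> {
    /// `const` so the wrapper can be used in a `static`.
    pub const fn new(inner: A) -> Self {
        Self {
            inner,
            alloc_count: AtomicUsize::new(0),
            alloc_bytes: AtomicUsize::new(0),
        }
    }

    /// Returns `(allocation count, bytes allocated)` observed so far.
    pub fn stats(&self) -> (usize, usize) {
        (
            self.alloc_count.load(Ordering::Relaxed),
            self.alloc_bytes.load(Ordering::Relaxed),
        )
    }
}

unsafe impl<A: GlobalAlloc> GlobalAlloc for CountingAlloc<A> {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // Record the request, then forward it to the wrapped allocator.
        self.alloc_count.fetch_add(1, Ordering::Relaxed);
        self.alloc_bytes.fetch_add(layout.size(), Ordering::Relaxed);
        unsafe { self.inner.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { self.inner.dealloc(ptr, layout) }
    }
}

// Registering the wrapper around the system allocator. Any other
// `GlobalAlloc` implementation could be passed to `new` instead.
#[global_allocator]
static GLOBAL: CountingAlloc<System> = CountingAlloc::new(System);
```

Because `alloc_zeroed` and `realloc` fall back to the trait's default implementations, which call `alloc`, they are counted as well. Shared atomics are the simplest design but contend under multi-threaded load, which is presumably why the later commits in this PR turn to thread-local state.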

nvzqz force-pushed the feat/alloc branch 5 times, most recently from 0ecb3df to 153d42e (December 11, 2023 20:26)
This adds `AllocProfiler`, an allocator implementation that tracks the
number of allocations and bytes allocated during benchmarks. It can wrap
any `GlobalAlloc`, such as `mimalloc`.
nvzqz force-pushed the feat/alloc branch 4 times, most recently from 437fe5e to bcabfea (December 12, 2023 01:04)
nvzqz force-pushed the feat/alloc branch 6 times, most recently from 2c8cfde to d34d708 (December 13, 2023 04:17)
This generally performs better than `thread_local!`. A later commit will
optimize it further by looking up the pthread-specific variable in the
linear table whose address is stored in `gs` on x86_64 and `tpidrro_el0`
on AArch64.
This removes the need to implement reuse via `thread_local!` drop.
This does the same work as `pthread_{get,set}specific` by indexing into
the thread-specific data table via the address obtained from the
`gs`/`tpidrro_el0` register on x86_64/AArch64 respectively.
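
For readers unfamiliar with the portable API that these commit messages refer to, the following is a rough, Unix-only sketch of per-thread state kept in a pthread key, with a destructor that runs at thread exit. Everything here is an assumption for illustration: `ThreadCounter` and `drop_counter` are hypothetical names, the FFI declarations are written by hand rather than taken from a crate, and the `pthread_key_t` width is assumed to be `unsigned long` on Apple platforms and `unsigned int` on Linux. The register-based fast path described above is not shown.

```rust
use std::ffi::c_void;
use std::os::raw::c_int;

// Assumption: key type is `unsigned long` on Apple targets and
// `unsigned int` on Linux (glibc, musl). Other targets may differ.
#[cfg(target_vendor = "apple")]
type PthreadKey = std::os::raw::c_ulong;
#[cfg(not(target_vendor = "apple"))]
type PthreadKey = std::os::raw::c_uint;

extern "C" {
    fn pthread_key_create(
        key: *mut PthreadKey,
        dtor: Option<unsafe extern "C" fn(*mut c_void)>,
    ) -> c_int;
    fn pthread_getspecific(key: PthreadKey) -> *mut c_void;
    fn pthread_setspecific(key: PthreadKey, value: *const c_void) -> c_int;
}

/// Called by the pthread runtime at thread exit for each thread whose
/// slot holds a non-null value.
unsafe extern "C" fn drop_counter(ptr: *mut c_void) {
    drop(unsafe { Box::from_raw(ptr as *mut usize) });
}

/// Hypothetical per-thread counter backed by a pthread key.
pub struct ThreadCounter {
    key: PthreadKey,
}

impl ThreadCounter {
    pub fn new() -> Self {
        let mut key: PthreadKey = 0;
        let rc = unsafe { pthread_key_create(&mut key, Some(drop_counter)) };
        assert_eq!(rc, 0, "pthread_key_create failed");
        Self { key }
    }

    /// Increments the calling thread's counter and returns the new value.
    pub fn increment(&self) -> usize {
        unsafe {
            let mut ptr = pthread_getspecific(self.key) as *mut usize;
            if ptr.is_null() {
                // First use on this thread: allocate and install the slot.
                ptr = Box::into_raw(Box::new(0usize));
                let rc = pthread_setspecific(self.key, ptr as *const c_void);
                assert_eq!(rc, 0, "pthread_setspecific failed");
            }
            *ptr += 1;
            *ptr
        }
    }
}
```

Note that this sketch allocates with `Box` on first use, which would re-enter the allocator if it were used inside an allocator's own `alloc` path; a real profiler would need to avoid or guard against that.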

This change also aligns some atomic variables using a new `CachePadded`
type, so that variables updated by different threads do not land on the
same cache line (false sharing).
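
The cache-line padding mentioned above can be sketched roughly as follows. This is not the PR's `CachePadded` definition: the tuple-struct shape, the `Counters` example, and the 128-byte alignment are assumptions (128 bytes is a common conservative choice on x86_64 and AArch64; the type in this PR may pick its alignment differently).

```rust
use std::sync::atomic::AtomicU64;

/// Hypothetical stand-in for the `CachePadded` type named in the commit
/// message. The 128-byte alignment is an assumption.
#[repr(align(128))]
pub struct CachePadded<T>(pub T);

/// Two counters that different threads may update concurrently.
/// Padding keeps each one in its own 128-byte-aligned region.
pub struct Counters {
    pub allocs: CachePadded<AtomicU64>,
    pub deallocs: CachePadded<AtomicU64>,
}

impl Counters {
    pub const fn new() -> Self {
        Self {
            allocs: CachePadded(AtomicU64::new(0)),
            deallocs: CachePadded(AtomicU64::new(0)),
        }
    }
}
```

The trade-off is memory: each padded value occupies a full 128 bytes regardless of the size of the wrapped type, so padding is usually reserved for a small number of hot variables.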
nvzqz (Owner, Author) commented Dec 13, 2023

I believe I've pored over this code enough to be convinced that it's correct, so I'm going to merge it. Also, I can't come up with ways to make it faster without spending much more time.

nvzqz merged commit 4ad26a5 into main on Dec 13, 2023 (32 checks passed)
nvzqz deleted the feat/alloc branch on December 13, 2023 09:33