std: optimize hash_map probe loop condition #10350
See ziglang#10337 for context.

In ziglang#10337 the `available` tracking fix necessitated an additional condition on the probe loop in both `getOrPut` and `getIndex` to prevent an infinite loop. Previously, this condition was implicit thanks to the guaranteed presence of a free slot. The new condition hurts the `HashMap` benchmarks (ziglang#10337 (comment)).

This commit removes that extra condition on the loop. Instead, when probing, first check whether the "home" slot is the target key; if so, return it. Otherwise, save the home slot's metadata to the stack and temporarily "free" the slot (but don't touch its value). Then continue with the original loop. Once again, the loop will be implicitly broken by the new "free" slot. The original metadata is restored before the function returns.

`getOrPut` has one additional gotcha: if the home slot is a tombstone and `getOrPut` misses, then the home slot is written with the new key; that is, its original metadata (the tombstone) is not restored.

Other changes:
- Test hash map misses.
- Test using `getOrPutAssumeCapacity` to get keys at the end (along with `get`).
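
The temporarily-free-a-slot probe described above can be modeled in a few lines. What follows is a minimal Python sketch, not the Zig code from this PR: the `Table` class, its `home` hook, the `_probe` helper, and the three metadata states are all hypothetical names chosen for illustration, and the sketch assumes linear probing and reuse of the first tombstone on a miss.

```python
FREE, USED, TOMBSTONE = "free", "used", "tombstone"


class Table:
    """Toy linear-probing table; a model of the idea, not Zig's std.HashMap."""

    def __init__(self, capacity, home=hash):
        self.meta = [FREE] * capacity
        self.keys = [None] * capacity
        self.home = home  # hypothetical hook: maps a key to an int

    def _probe(self, key):
        """Return (slot_of_key_or_None, first_reusable_slot_or_None)."""
        n = len(self.meta)
        home = self.home(key) % n
        if self.meta[home] == USED and self.keys[home] == key:
            return home, None  # fast path: the home slot holds the key
        saved = self.meta[home]
        # A non-USED home slot is the first place a new key could go.
        reusable = home if saved != USED else None
        self.meta[home] = FREE  # temporarily "free" the slot; its key is untouched
        try:
            i = (home + 1) % n
            # No bound check: wrapping around always reaches the freed home slot.
            while self.meta[i] != FREE:
                if self.meta[i] == USED and self.keys[i] == key:
                    return i, None
                if self.meta[i] == TOMBSTONE and reusable is None:
                    reusable = i
                i = (i + 1) % n
            if reusable is None and i != home:
                reusable = i  # a genuinely free slot ended the probe
            return None, reusable
        finally:
            self.meta[home] = saved  # restore the original metadata

    def get_index(self, key):
        """Return the slot holding key, or None on a miss."""
        return self._probe(key)[0]

    def get_or_put(self, key):
        """Return (slot, found_existing); raises if no slot can take the key."""
        slot, reusable = self._probe(key)
        if slot is not None:
            return slot, True
        if reusable is None:
            raise RuntimeError("table is full")
        # If `reusable` is a tombstoned home slot, it ends up USED here
        # instead of keeping its tombstone.
        self.meta[reusable], self.keys[reusable] = USED, key
        return reusable, False

    def remove(self, key):
        """Replace the key's slot with a tombstone; return False on a miss."""
        slot = self.get_index(key)
        if slot is None:
            return False
        self.meta[slot], self.keys[slot] = TOMBSTONE, None
        return True


t = Table(4, home=lambda k: k)  # identity hash: keys 0, 4, 8, 12 share home slot 0
for k in (0, 4, 8, 3):
    t.get_or_put(k)
t.remove(0)
t.remove(3)  # metadata is now: tombstone, used, used, tombstone
print(t.get_index(12))   # a miss with no free slot left; the probe still terminates
print(t.get_or_put(12))  # the tombstoned home slot takes the new key
```

In this model the probe loop carries no iteration limit even when the table holds only used slots and tombstones, because the freed home slot acts as the sentinel that ends the wrap-around.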
c4d6b84 to 3861c96

Just pushed a fix for the failing test at https://ci.ziglang.org/ziglang/zig/859/1/4 (due to mixing up `usize` and `u64`).

Thanks for the follow-up. Were you able to observe any perf improvements with these changes?

I tested locally with just

Good to know, thanks. I'm fine with merging this and giving it a spin in our perf tracking dashboard. But if it remains inconclusive then it probably makes sense to revert since this is slightly more complex than before.

Sounds good!

My observations:
Seems like it should be reverted to me.

Agreed; I'll dig into it more on Monday.

reverted in 6d04de7

This approach is inspired by @jorangreef's suggested optimization (#10337 (comment)).

But that doesn't quite work; an insert/remove cycle will eventually convert any free slots to tombstones. Joran and I considered a couple of other approaches before discovering the temporarily-free-a-slot tactic:

`[A;B;C]` with a max load of 2/3. Put A, Put B, Free A, Put C, and now it's back to needing a limit condition. (This may still be worth doing in the future to reduce the likelihood of a Miss's `O(n)` worst-case, even though it isn't flawless. Either that or rehash-in-place.)

cc: @Sahnvour
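
The `[A;B;C]` sequence can be replayed in a small model to see why a load limit on live entries does not by itself guarantee a free slot. This is a hedged Python sketch, not code from the PR: the `HOME` mapping (each key hashing to its own slot), the `put` and `free` helpers, and the three metadata states are assumptions made for illustration.

```python
FREE, USED, TOMBSTONE = "free", "used", "tombstone"

# Hypothetical home slots, chosen so each key hashes to its own slot.
HOME = {"A": 0, "B": 1, "C": 2}
MAX_LIVE = 2  # a max load of 2/3 on a 3-slot table

meta = [FREE] * 3
keys = [None] * 3


def put(key):
    """Insert by linear probing into the first free slot (no tombstone reuse)."""
    assert meta.count(USED) < MAX_LIVE, "load limit reached"
    i = HOME[key]
    for _ in range(len(meta)):  # bounded here so the sketch cannot hang
        if meta[i] == FREE:
            meta[i], keys[i] = USED, key
            return i
        i = (i + 1) % len(meta)
    raise RuntimeError("no free slot")


def free(key):
    """Remove a key, leaving a tombstone so later probes keep going."""
    i = keys.index(key)
    meta[i], keys[i] = TOMBSTONE, None


put("A")
put("B")
free("A")
put("C")

print(meta)              # slot states after the four operations
print(meta.count(USED))  # live entries, still within the 2/3 limit
print(meta.count(FREE))  # free slots left to stop an unbounded probe
```

Under these assumptions the table ends with two live entries and one tombstone, so the load limit is respected yet no free slot remains, and a probe for a missing key would need some other stopping condition.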