New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[compiler] use perfect hashes for switches with small strings #506
Open
quasilyte
wants to merge
1
commit into
master
Choose a base branch
from
quasilyte/stringswitch_opt
base: master
Could not load branches
Branch not found: {{ refName }}
Could not load tags
Nothing to show
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
quasilyte
force-pushed
the
quasilyte/stringswitch_opt
branch
from
April 26, 2022 14:49
fbebcaa
to
d7a479d
Compare
AlexK0
reviewed
Apr 26, 2022
quasilyte
force-pushed
the
quasilyte/stringswitch_opt
branch
from
April 26, 2022 15:13
d7a479d
to
aa69b74
Compare
Previously, we compiled a string switch as follow: 1. compute a tag (condition) string hash 2. switch over precomputed hashes of cases 3. inside a switch case, do `equals()` for a string So, we compile a switch as a 2-level comparison: hash matching plus `equals()` call on a string. For shorter strings we can avoid the third step by using a hash that will not result in collisions and identifies a string completely. Since we're using uint64_t for switch hashes, we have 8 bits for information. We can store up to 8 chars inside that without any losses. Our hash function will be more or less `memcpy(&h, s, s_len)`. Hash of a single char becomes a value in `0 <= x <= 255` range, which is great for C++ switch density: it's likely to get that switch compiled as a jump table (useful for lexers, etc). Another common case is a shared prefix of all case values. case 'OPERATION_DELETE' case 'OPERATION_INSERT' case 'OPERATION_UPDATE' All cases above have a length of 16 that is too much for our new hashing. But if we remove the common `OPERATION_` prefix, strings fit into 6 bytes. When we have such switch with common prefixes and varying suffixes that fit into 8 bytes, we compare the prefix before hashing, if it does match, variable cases tail is hashed (`DELETE`, `INSERT`, `UPDATE` from above). Since optimized form doesn't require extra if statement inside every case, the binary size gets smaller as well (but not dramatically). In average, we compile 60% of string switches using the optimized scheme. (~50% for short strings plus around ~10% for prefix form) Benchmark results: name old time/op new time/op delta SwitchOnlyDefault 70.0ns ± 0% 29.0ns ± 0% -58.57% Lexer 119ns ± 1% 78ns ± 2% -35.04% LexerMiss 87.0ns ± 0% 77.0ns ± 0% -11.49% LexerComplex 115ns ± 1% 93ns ± 2% -19.43% LexerComplexMiss 87.0ns ± 0% 74.6ns ± 1% -14.25% Switch8oneChar 158ns ± 0% 95ns ± 1% -39.81% Switch8oneCharMiss 108ns ± 1% 102ns ± 1% -5.48% Switch8oneCharNoDefault 164ns ± 1% 87ns ± 1% -46.67% Switch8oneCharNoDefaultMiss 117ns ± 0% 97ns ± 1% -17.35% Switch6perfectHash 258ns ± 1% 204ns ± 1% -21.19% Switch6perfectHashMiss 130ns ± 0% 119ns ± 1% -8.21% Switch6perfectHashNoDefault 271ns ± 1% 193ns ± 1% -28.72% Switch6perfectHashNoDefaultMiss 138ns ± 1% 115ns ± 1% -16.58% Switch12perfectHash 278ns ± 0% 194ns ± 4% -30.19% Switch12perfectHashMiss 139ns ± 1% 109ns ± 1% -21.62% Switch12perfectHashNoDefault 275ns ± 1% 190ns ± 0% -30.74% Switch12perfectHashNoDefaultMiss 142ns ± 1% 108ns ± 0% -24.05% Switch6perfectHashWithPrefix 286ns ± 0% 219ns ± 3% -23.40% Switch6perfectHashWithPrefixMiss 190ns ± 1% 143ns ± 2% -24.67% Switch6perfectHashWithPrefixNoDefault 293ns ± 0% 230ns ± 1% -21.58% Switch6perfectHashWithPrefixNoDefaultMiss 194ns ± 1% 153ns ± 2% -20.82% Switch12perfectHashWithPrefix 303ns ± 1% 225ns ± 2% -25.69% Switch12perfectHashWithPrefixMiss 159ns ± 0% 113ns ± 0% -28.93% Switch12perfectHashWithPrefixNoDefault 308ns ± 1% 224ns ± 1% -27.18% Switch12perfectHashWithPrefixNoDefaultMiss 141ns ± 1% 117ns ± 2% -17.14% [Geo mean] 171ns 130ns -23.93% Another change is that we now compile a default-only switch a little bit differently. We don't compute a hash and discard auto-generated variables to avoid C++ compiler warnings. Refs #418 Fixes #503
quasilyte
force-pushed
the
quasilyte/stringswitch_opt
branch
from
April 26, 2022 15:19
aa69b74
to
579f430
Compare
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Previously, we compiled a string switch as follow:
So, we compile a switch as a 2-level comparison: hash matching
plus
equals()
call on a string.For shorter strings we can avoid the third step by using a
hash that will not result in collisions and identifies a string completely.
Since we're using uint64_t for switch hashes, we have 8 bits for information.
We can store up to 8 chars inside that without any losses.
Our hash function will be more or less
memcpy(&h, s, s_len)
.Hash of a single char becomes a value in
0 <= x <= 255
range, whichis great for C++ switch density: it's likely to get that switch compiled
as a jump table (useful for lexers, etc).
Another common case is a shared prefix of all case values.
All cases above have a length of 16 that is too much for our new hashing.
But if we remove the common
OPERATION_
prefix, strings fit into 6 bytes.When we have such switch with common prefixes and varying suffixes that
fit into 8 bytes, we compare the prefix before hashing, if it does match,
variable cases tail is hashed (
DELETE
,INSERT
,UPDATE
from above).Since optimized form doesn't require extra if statement inside every case,
the binary size gets smaller as well (but not dramatically).
In average, we compile 60% of string switches using the optimized scheme.
(~50% for short strings plus around ~10% for prefix form)
Benchmark results:
Another change is that we now compile a default-only switch a little bit differently.
We don't compute a hash and discard auto-generated variables to avoid C++ compiler warnings.
Refs #418
Fixes #503