[compiler] use perfect hashes for switches with small strings #506

quasilyte · 2022-04-26T14:30:53Z

Previously, we compiled a string switch as follow:

1. compute a tag (condition) string hash
2. switch over precomputed hashes of cases
3. inside a switch case, do `equals()` for a string

So, we compile a switch as a 2-level comparison: hash matching
plus equals() call on a string.

For shorter strings we can avoid the third step by using a
hash that will not result in collisions and identifies a string completely.

Since we're using uint64_t for switch hashes, we have 8 bits for information.
We can store up to 8 chars inside that without any losses.
Our hash function will be more or less memcpy(&h, s, s_len).

Hash of a single char becomes a value in 0 <= x <= 255 range, which
is great for C++ switch density: it's likely to get that switch compiled
as a jump table (useful for lexers, etc).

Another common case is a shared prefix of all case values.

case 'OPERATION_DELETE'
case 'OPERATION_INSERT'
case 'OPERATION_UPDATE'

All cases above have a length of 16 that is too much for our new hashing.
But if we remove the common OPERATION_ prefix, strings fit into 6 bytes.

When we have such switch with common prefixes and varying suffixes that
fit into 8 bytes, we compare the prefix before hashing, if it does match,
variable cases tail is hashed (DELETE, INSERT, UPDATE from above).

Since optimized form doesn't require extra if statement inside every case,
the binary size gets smaller as well (but not dramatically).

In average, we compile 60% of string switches using the optimized scheme.
(~50% for short strings plus around ~10% for prefix form)

Benchmark results:

name                                        old time/op  new time/op  delta
SwitchOnlyDefault                           70.0ns ± 0%  29.0ns ± 0%  -58.57%
Lexer                                        119ns ± 1%    78ns ± 2%  -35.04%
LexerMiss                                   87.0ns ± 0%  77.0ns ± 0%  -11.49%
LexerComplex                                 115ns ± 1%    93ns ± 2%  -19.43%
LexerComplexMiss                            87.0ns ± 0%  74.6ns ± 1%  -14.25%
Switch8oneChar                               158ns ± 0%    95ns ± 1%  -39.81%
Switch8oneCharMiss                           108ns ± 1%   102ns ± 1%   -5.48%
Switch8oneCharNoDefault                      164ns ± 1%    87ns ± 1%  -46.67%
Switch8oneCharNoDefaultMiss                  117ns ± 0%    97ns ± 1%  -17.35%
Switch6perfectHash                           258ns ± 1%   204ns ± 1%  -21.19%
Switch6perfectHashMiss                       130ns ± 0%   119ns ± 1%   -8.21%
Switch6perfectHashNoDefault                  271ns ± 1%   193ns ± 1%  -28.72%
Switch6perfectHashNoDefaultMiss              138ns ± 1%   115ns ± 1%  -16.58%
Switch12perfectHash                          278ns ± 0%   194ns ± 4%  -30.19%
Switch12perfectHashMiss                      139ns ± 1%   109ns ± 1%  -21.62%
Switch12perfectHashNoDefault                 275ns ± 1%   190ns ± 0%  -30.74%
Switch12perfectHashNoDefaultMiss             142ns ± 1%   108ns ± 0%  -24.05%
Switch6perfectHashWithPrefix                 286ns ± 0%   219ns ± 3%  -23.40%
Switch6perfectHashWithPrefixMiss             190ns ± 1%   143ns ± 2%  -24.67%
Switch6perfectHashWithPrefixNoDefault        293ns ± 0%   230ns ± 1%  -21.58%
Switch6perfectHashWithPrefixNoDefaultMiss    194ns ± 1%   153ns ± 2%  -20.82%
Switch12perfectHashWithPrefix                303ns ± 1%   225ns ± 2%  -25.69%
Switch12perfectHashWithPrefixMiss            159ns ± 0%   113ns ± 0%  -28.93%
Switch12perfectHashWithPrefixNoDefault       308ns ± 1%   224ns ± 1%  -27.18%
Switch12perfectHashWithPrefixNoDefaultMiss   141ns ± 1%   117ns ± 2%  -17.14%
[Geo mean]                                   171ns        130ns       -23.93%

Another change is that we now compile a default-only switch a little bit differently.
We don't compute a hash and discard auto-generated variables to avoid C++ compiler warnings.

Refs #418
Fixes #503

compiler/code-gen/vertex-compiler.cpp

Previously, we compiled a string switch as follow: 1. compute a tag (condition) string hash 2. switch over precomputed hashes of cases 3. inside a switch case, do `equals()` for a string So, we compile a switch as a 2-level comparison: hash matching plus `equals()` call on a string. For shorter strings we can avoid the third step by using a hash that will not result in collisions and identifies a string completely. Since we're using uint64_t for switch hashes, we have 8 bits for information. We can store up to 8 chars inside that without any losses. Our hash function will be more or less `memcpy(&h, s, s_len)`. Hash of a single char becomes a value in `0 <= x <= 255` range, which is great for C++ switch density: it's likely to get that switch compiled as a jump table (useful for lexers, etc). Another common case is a shared prefix of all case values. case 'OPERATION_DELETE' case 'OPERATION_INSERT' case 'OPERATION_UPDATE' All cases above have a length of 16 that is too much for our new hashing. But if we remove the common `OPERATION_` prefix, strings fit into 6 bytes. When we have such switch with common prefixes and varying suffixes that fit into 8 bytes, we compare the prefix before hashing, if it does match, variable cases tail is hashed (`DELETE`, `INSERT`, `UPDATE` from above). Since optimized form doesn't require extra if statement inside every case, the binary size gets smaller as well (but not dramatically). In average, we compile 60% of string switches using the optimized scheme. (~50% for short strings plus around ~10% for prefix form) Benchmark results: name old time/op new time/op delta SwitchOnlyDefault 70.0ns ± 0% 29.0ns ± 0% -58.57% Lexer 119ns ± 1% 78ns ± 2% -35.04% LexerMiss 87.0ns ± 0% 77.0ns ± 0% -11.49% LexerComplex 115ns ± 1% 93ns ± 2% -19.43% LexerComplexMiss 87.0ns ± 0% 74.6ns ± 1% -14.25% Switch8oneChar 158ns ± 0% 95ns ± 1% -39.81% Switch8oneCharMiss 108ns ± 1% 102ns ± 1% -5.48% Switch8oneCharNoDefault 164ns ± 1% 87ns ± 1% -46.67% Switch8oneCharNoDefaultMiss 117ns ± 0% 97ns ± 1% -17.35% Switch6perfectHash 258ns ± 1% 204ns ± 1% -21.19% Switch6perfectHashMiss 130ns ± 0% 119ns ± 1% -8.21% Switch6perfectHashNoDefault 271ns ± 1% 193ns ± 1% -28.72% Switch6perfectHashNoDefaultMiss 138ns ± 1% 115ns ± 1% -16.58% Switch12perfectHash 278ns ± 0% 194ns ± 4% -30.19% Switch12perfectHashMiss 139ns ± 1% 109ns ± 1% -21.62% Switch12perfectHashNoDefault 275ns ± 1% 190ns ± 0% -30.74% Switch12perfectHashNoDefaultMiss 142ns ± 1% 108ns ± 0% -24.05% Switch6perfectHashWithPrefix 286ns ± 0% 219ns ± 3% -23.40% Switch6perfectHashWithPrefixMiss 190ns ± 1% 143ns ± 2% -24.67% Switch6perfectHashWithPrefixNoDefault 293ns ± 0% 230ns ± 1% -21.58% Switch6perfectHashWithPrefixNoDefaultMiss 194ns ± 1% 153ns ± 2% -20.82% Switch12perfectHashWithPrefix 303ns ± 1% 225ns ± 2% -25.69% Switch12perfectHashWithPrefixMiss 159ns ± 0% 113ns ± 0% -28.93% Switch12perfectHashWithPrefixNoDefault 308ns ± 1% 224ns ± 1% -27.18% Switch12perfectHashWithPrefixNoDefaultMiss 141ns ± 1% 117ns ± 2% -17.14% [Geo mean] 171ns 130ns -23.93% Another change is that we now compile a default-only switch a little bit differently. We don't compute a hash and discard auto-generated variables to avoid C++ compiler warnings. Refs #418 Fixes #503

quasilyte force-pushed the quasilyte/stringswitch_opt branch from fbebcaa to d7a479d Compare April 26, 2022 14:49

AlexK0 reviewed Apr 26, 2022

View reviewed changes

compiler/code-gen/vertex-compiler.cpp Outdated Show resolved Hide resolved

compiler/code-gen/vertex-compiler.cpp Outdated Show resolved Hide resolved

quasilyte force-pushed the quasilyte/stringswitch_opt branch from d7a479d to aa69b74 Compare April 26, 2022 15:13

quasilyte force-pushed the quasilyte/stringswitch_opt branch from aa69b74 to 579f430 Compare April 26, 2022 15:19

unserialize added the optimization Memory comsumption / CPU speedup label Jun 8, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[compiler] use perfect hashes for switches with small strings #506

[compiler] use perfect hashes for switches with small strings #506

quasilyte commented Apr 26, 2022

[compiler] use perfect hashes for switches with small strings #506

Are you sure you want to change the base?

[compiler] use perfect hashes for switches with small strings #506

Conversation

quasilyte commented Apr 26, 2022