Optimize SubstringSetMatcher [patch 3/5, replace flat_map]
Replace the base::flat_map<char, NodeID> with a somewhat more custom data structure. This is significantly tighter on memory, while also increasing performance.

The full details are in the comments, but in general, we pack the label and a 23-bit node ID together in a 32-bit block, which immediately halves the RAM used for edges (pair<char, NodeID> used 64 bits, due to padding). Furthermore, we also reduce the size of each node by implementing our own smaller size and capacity counters; no node can have more than 259 outgoing edges, so a uint32_t is meaningless for this. Finally, since most nodes have very few edges, we add a special case for when there are two edges or fewer, where we store those edges inline in the node instead of on the heap. This saves on RAM and memory allocation time, and makes for less pointer chasing.

Note that this changes node IDs to be 23-bit instead of 32-bit, which means we can hold 8M nodes instead of 4B. This is still tens of megabytes of data in practice, though; if it turns out to be a problem for some applications, it would probably be possible to have a template parameter for 31-bit IDs (causing Node to go from 12 to 14 bytes in the process).

We stop doing binary search and replace it with a simple linear search instead, which is just as fast (since we generally have few edges) and allows us to get by without sorting nodes.

SubstringSetMatcher.init_time: 60956 -> 36772 us (+65.8% perf)
SubstringSetMatcher.match_time: 138 -> 129 us (+7.0% perf)
SubstringSetMatcher.memory_usage: 25879 -> 13047 kB (-49.6% RAM)

Change-Id: I9ef6b24e5d20dff10a7736086584372d1c92c636
Bug: 1319422
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3596141
Commit-Queue: Steinar H Gunderson <sesse@chromium.org>
Reviewed-by: Dominic Battré <battre@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1001467}
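
To illustrate the edge packing and linear lookup described in the message, here is a minimal sketch. It is not the code from the patch: every name (PackEdge, EdgeLabel, EdgeTarget, FindEdge, kInvalidNodeId) is hypothetical, and both the bit layout (label in the low 9 bits) and the use of the all-ones ID as a "not found" sentinel are assumptions made for this example. The 9-bit label width follows from the 32-bit block holding a 23-bit node ID.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical names and layout; the real patch may differ.
constexpr uint32_t kLabelBits = 9;    // room for the 259 possible labels
constexpr uint32_t kNodeIdBits = 23;  // up to 2^23 (about 8M) nodes
constexpr uint32_t kLabelMask = (1u << kLabelBits) - 1;
// Assumption: the largest 23-bit value is reserved to mean "no such edge".
constexpr uint32_t kInvalidNodeId = (1u << kNodeIdBits) - 1;

static_assert(kLabelBits + kNodeIdBits == 32, "an edge must fit in 32 bits");

// Packs a label and a node ID into a single 32-bit word.
constexpr uint32_t PackEdge(uint32_t label, uint32_t node_id) {
  return (node_id << kLabelBits) | (label & kLabelMask);
}

constexpr uint32_t EdgeLabel(uint32_t edge) { return edge & kLabelMask; }
constexpr uint32_t EdgeTarget(uint32_t edge) { return edge >> kLabelBits; }

// Linear scan over an unsorted edge array. With only a handful of edges per
// node this does little work, and it needs no sorted order.
inline uint32_t FindEdge(const uint32_t* edges, size_t num_edges,
                         uint32_t label) {
  assert(label <= kLabelMask);
  for (size_t i = 0; i < num_edges; ++i) {
    if (EdgeLabel(edges[i]) == label) {
      return EdgeTarget(edges[i]);
    }
  }
  return kInvalidNodeId;
}
```

Each edge occupies 4 bytes here, compared with the 8 bytes the message attributes to a padded pair<char, NodeID>.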
Steinar H. Gunderson authored and Chromium LUCI CQ committed May 10, 2022
1 parent 010704b, commit 2f0ae63
Showing 2 changed files with 198 additions and 47 deletions.