Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
"Compiled" static header maps instead of big trie #33932
"Compiled" static header maps instead of big trie #33932
Changes from all commits
9b0d62c
2e83788
be5dd61
611e1dd
150040e
90ba622
e055146
dd3a9fa
1990767
5df1270
92625f3
23042de
0808002
b293819
9eccaa9
a3f8919
d1a8ed8
7ece7eb
57b9055
facdef8
941d3fa
d75190d
30bab4b
5469839
dc91907
4348681
9bd6190
9ee31c0
405ac4d
a743ba2
c7458f4
c4858ec
56c5e4e
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you found passing by const ref was faster in another case, but not here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly left this one for the sake of leaving the external API unchanged. Feels more reasonable to buck the style guide for internal implementation than for API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding an ASSERT(node_contents.size() > 1);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a minor comment (no AI):
I think the original trie can be converted to using the same "shrinked-range" (only the needed range of characters) technique. So in terms of memory I think this data-structure is more expensive.
That said, from the CPU perf point-of-view, I tend to agree that this data-structure should perform better, although it is not easy for me to say which of the changes in the data-structure (using per-length buckets, splitting not at the prefix vs condensing a trie) provide the most benefit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing just shrunk-range was actually the original PR from which this one forked off as a prerequisite; it made performance a bit worse (more so than this version) for the miss case and made little difference for the hits (worse for short strings, better for long strings). I think this version is also still better memory-wise, because every pointer is 64 bits, and this trie has far fewer pointers.
Consider the case of even just one entry,
x-envoy-banana
- in this model it's the root length node and a leaf node, so it's a vector with 13 nullptrs and one pointer (because the length node doesn't do the minimizing thing), a length, and one leaf node containing a value and a string.In a regular "shrunk" trie, this is 14 nodes each containing a default value (a pointer) and a pointer to the next node, and a vector of chars one character long and a length - so that's 48 pointers, which is significantly larger than the "short path" trie. Even if you cut out the common "x-envoy-" prefix entirely and assume that's shared among so many headers that we can amortize it entirely, it's still 18 pointers just for the 'banana' part of the sequence, which is about the same as the "compiled" trie for the whole thing (which also theoretically shares "common prefixes"!)
Splitting at a more optimal branch point is definitely the main performance enhancer for hits. (It's similar to what gperf does.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are correct.
I keep forgetting that this is not a compressed-prefix-trie, but one that has per-character branching.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah, a compressed-prefix trie (assuming this is a name for "a branch is per-character but also matches the longest string common to all its children") would probably be nearly as fast as the "branch at best position" trie, and a similar size, maybe slightly smaller, since it would also only be a few hops to completion. But it would involve more string-compares.