"Compiled" static header maps instead of big trie #33932
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,235 @@ | ||
#pragma once | ||
|
||
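#include <algorithm> | ||
#include <array> | ||
#include <cstdint> | ||
#include <cstring> | ||
#include <iterator> | ||
#include <memory> | ||
#include <string> | ||
#include <utility> | ||
#include <vector> | ||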
|
||
#include "envoy/common/pure.h" | ||
|
||
#include "absl/strings/string_view.h" | ||
|
||
namespace Envoy { | ||
|
||
/** | ||
* This is a specialized structure intended for static header maps, but | ||
* there may be other use cases. | ||
* The structure is: | ||
* 1. a length-based lookup table so only keys the same length as the | ||
* target key are considered. | ||
* 2. a trie that branches on the "most divisions" position of the key. | ||
* | ||
* Using this structure is not recommended unless the table is | ||
* initialize-once, use-many, as the "compile" operation is expensive. | ||
* | ||
* Unlike a regular trie, this structure cannot be used for prefix-based | ||
* matching. | ||
* | ||
* For example, if we consider the case where the set of headers is | ||
* `x-prefix-banana` | ||
* `x-prefix-babana` | ||
* `x-prefix-apple` | ||
* `x-prefix-pineapple` | ||
* `x-prefix-barana` | ||
* `x-prefix-banaka` | ||
* | ||
* A standard front-first trie looking for `x-prefix-banana` would walk | ||
* 7 nodes through the tree, first for `x`, then for `-`, etc. | ||
* | ||
* This structure first jumps to matching length, eliminating in this | ||
* example case apple and pineapple. | ||
* Then the "best split" node is on | ||
* `x-prefix-banana` | ||
*             ^ | ||
* so the first node has 3 non-miss branches, n, b and r for that position. | ||
* Down that n branch, the "best split" is on | ||
* `x-prefix-banana` | ||
*               ^ | ||
* which has two branches, n or k. | ||
* Down the n branch is the leaf node (only `x-prefix-banana` remains) - at | ||
* this point a regular string-compare checks if the key is an exact match | ||
* for the string node. | ||
*/ | ||
|
||
template <class Value> class CompiledStringMap { | ||
class Node { | ||
public: | ||
// While it is usual to take a string_view by value, in this | ||
// performance-critical context with repeatedly passing the same | ||
// value, passing it by reference benchmarks out slightly faster. | ||
virtual Value find(const absl::string_view& key) PURE; | ||
virtual ~Node() = default; | ||
}; | ||
|
||
class LeafNode : public Node { | ||
public: | ||
LeafNode(absl::string_view key, Value&& value) : key_(key), value_(std::move(value)) {} | ||
Value find(const absl::string_view& key) override { | ||
// String comparison unnecessarily checks size equality first, we can skip | ||
// to memcmp here because we already know the sizes are equal. | ||
|
||
// Since this is a super-hot path we don't even ASSERT here, to avoid adding | ||
// slowdown in debug builds. | ||
if (memcmp(key.data(), key_.data(), key.size())) { | ||
return {}; | ||
} | ||
return value_; | ||
} | ||
|
||
private: | ||
const std::string key_; | ||
const Value value_; | ||
}; | ||
|
||
class BranchNode : public Node { | ||
public: | ||
BranchNode(size_t index, uint8_t min, std::vector<std::unique_ptr<Node>>&& branches) | ||
: index_(index), min_(min), branches_(std::move(branches)) {} | ||
Value find(const absl::string_view& key) override { | ||
uint8_t k = static_cast<uint8_t>(key[index_]); | ||
|
||
// Possible optimization was tried here, populating empty nodes with | ||
// a function that returns {} to reduce branching vs checking for null | ||
// nodes. Checking for null nodes benchmarked faster. | ||
if (k < min_ || k >= min_ + branches_.size() || branches_[k - min_] == nullptr) { | ||
return {}; | ||
} | ||
return branches_[k - min_]->find(key); | ||
} | ||
|
||
private: | ||
const size_t index_; | ||
const uint8_t min_; | ||
|
||
// Possible optimization was tried here, using std::array<std::unique_ptr<Node>, 256> | ||
// rather than a smaller-range vector with bounds, to keep locality and reduce | ||
// comparisons. It didn't help. | ||
const std::vector<std::unique_ptr<Node>> branches_; | ||
}; | ||
|
||
public: | ||
// The caller owns the string-views during `compile`. Ownership of the passed in | ||
// Values is transferred to the CompiledStringMap. | ||
using KV = std::pair<absl::string_view, Value>; | ||
|
||
/** | ||
* Returns the value with a matching key, or the default value | ||
* (typically nullptr) if the key was not present. | ||
* @param key the key to look up. | ||
*/ | ||
Value find(absl::string_view key) const { | ||
You found passing by const ref was faster in another case, but not here?
Mostly left this one for the sake of leaving the external API unchanged. Feels more reasonable to buck the style guide for internal implementation than for API. |
||
const size_t key_size = key.size(); | ||
if (key_size >= table_.size() || table_[key_size] == nullptr) { | ||
|
||
return {}; | ||
} | ||
return table_[key_size]->find(key); | ||
}; | ||
/** | ||
* Construct the lookup table. This can be a somewhat slow multi-pass | ||
* operation if the input table is large. | ||
* @param initial a vector of key->value pairs. This is taken by value because | ||
* we're going to modify it. If the caller still wants the original | ||
* then it can be copied in, if not it can be moved in. | ||
* Note that the keys are string_views - the base string data must | ||
* exist for the duration of compile(). The leaf nodes take copies | ||
* of the key strings, so the string_views can be invalidated once | ||
* compile has completed. | ||
*/ | ||
void compile(std::vector<KV> initial) { | ||
Why is the parameter called
|
||
if (initial.empty()) { | ||
return; | ||
} | ||
std::sort(initial.begin(), initial.end(), | ||
[](const KV& a, const KV& b) { return a.first.size() < b.first.size(); }); | ||
size_t longest = initial.back().first.size(); | ||
|
||
table_.resize(longest + 1); | ||
|
||
auto range_start = initial.begin(); | ||
// Populate the sub-nodes for each length of key that exists. | ||
while (range_start != initial.end()) { | ||
// Find the first key whose length differs from the current key length. | ||
// Everything in between is keys with the same length. | ||
auto range_end = | ||
nit: const |
||
std::find_if(range_start, initial.end(), [len = range_start->first.size()](const KV& e) { | ||
return e.first.size() != len; | ||
}); | ||
std::vector<KV> node_contents; | ||
// Populate a node for the keys in that range. | ||
node_contents.reserve(range_end - range_start); | ||
std::move(range_start, range_end, std::back_inserter(node_contents)); | ||
table_[range_start->first.size()] = createEqualLengthNode(node_contents); | ||
range_start = range_end; | ||
} | ||
} | ||
|
||
private: | ||
/** | ||
* Details of a node branch point; the index into the string at which | ||
* characters should be looked up, the lowest valued character in the | ||
* branch, the highest valued character in the branch, and how many | ||
* branches there are. | ||
*/ | ||
struct IndexSplitInfo { | ||
uint8_t index_, min_, max_, count_; | ||
size_t size() { return max_ - min_ + 1; } | ||
size_t offsetOf(uint8_t c) { return c - min_; } | ||
}; | ||
|
||
/** | ||
* @param node_contents the key-value pairs to be branched upon. | ||
* @return details of the index on which the node should branch | ||
* - the index which produces the most child branches. | ||
*/ | ||
static IndexSplitInfo findBestSplitPoint(const std::vector<KV>& node_contents) { | ||
IndexSplitInfo best{0, 0, 0, 0}; | ||
Consider adding an ASSERT(node_contents.size() > 1); |
||
for (size_t i = 0; i < node_contents[0].first.size(); i++) { | ||
would this be faster?
I don't think it would (it's just reading a value from an address), and we don't care about the microoptimization performance here anyway.
to make this code more understandable consider naming: consider renaming: |
||
std::array<bool, 256> hits{}; | ||
IndexSplitInfo info{static_cast<uint8_t>(i), 255, 0, 0}; | ||
for (size_t j = 0; j < node_contents.size(); j++) { | ||
uint8_t v = node_contents[j].first[i]; | ||
if (!hits[v]) { | ||
hits[v] = true; | ||
info.count_++; | ||
info.min_ = std::min(v, info.min_); | ||
info.max_ = std::max(v, info.max_); | ||
} | ||
} | ||
if (info.count_ > best.count_) { | ||
best = info; | ||
} | ||
} | ||
return best; | ||
} | ||
|
||
/* | ||
* @param node_contents the set of key-value pairs that will be children of | ||
* this node. | ||
* @return the recursively generated tree node that leads to all of node_contents. | ||
* If there is only one entry in node_contents then a LeafNode, otherwise a BranchNode. | ||
*/ | ||
static std::unique_ptr<Node> createEqualLengthNode(std::vector<KV> node_contents) { | ||
if (node_contents.size() == 1) { | ||
return std::make_unique<LeafNode>(node_contents[0].first, std::move(node_contents[0].second)); | ||
} | ||
IndexSplitInfo best = findBestSplitPoint(node_contents); | ||
nit: const
|
||
std::vector<std::unique_ptr<Node>> nodes; | ||
nodes.resize(best.size()); | ||
std::sort(node_contents.begin(), node_contents.end(), | ||
[index = best.index_](const KV& a, const KV& b) { | ||
return a.first[index] < b.first[index]; | ||
}); | ||
auto range_start = node_contents.begin(); | ||
// Populate the sub-nodes for each character-branch. | ||
while (range_start != node_contents.end()) { | ||
// Find the first key whose character at position [best.index_] differs from the | ||
// character of the current range. | ||
// Everything in the range has keys with the same character at this index. | ||
auto range_end = std::find_if(range_start, node_contents.end(), | ||
[index = best.index_, c = range_start->first[best.index_]]( | ||
const KV& e) { return e.first[index] != c; }); | ||
std::vector<KV> next_contents; | ||
|
||
next_contents.reserve(range_end - range_start); | ||
std::move(range_start, range_end, std::back_inserter(next_contents)); | ||
nodes[best.offsetOf(range_start->first[best.index_])] = createEqualLengthNode(next_contents); | ||
range_start = range_end; | ||
} | ||
return std::make_unique<BranchNode>(best.index_, best.min_, std::move(nodes)); | ||
} | ||
std::vector<std::unique_ptr<Node>> table_; | ||
}; | ||
|
||
} // namespace Envoy |
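The "best split" choice described in the header comment can be checked in isolation. The following is a minimal sketch, not the PR's code: `SplitSketch` and `bestSplitSketch` are hypothetical names, and the function only mirrors the idea behind `findBestSplitPoint` (pick the index with the most distinct characters, keeping the earliest index on ties). It assumes a non-empty set of equal-length keys.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

// Hypothetical stand-in for IndexSplitInfo: the index to branch on and how
// many distinct characters appear at that index.
struct SplitSketch {
  std::size_t index;
  std::size_t distinct;
};

// Returns the index with the most distinct characters across all keys.
// Ties keep the earliest index. Precondition (assumed): keys is non-empty
// and every key has the same length.
SplitSketch bestSplitSketch(const std::vector<std::string_view>& keys) {
  assert(!keys.empty());
  SplitSketch best{0, 0};
  for (std::size_t i = 0; i < keys[0].size(); ++i) {
    std::array<bool, 256> seen{};
    std::size_t distinct = 0;
    for (std::string_view key : keys) {
      assert(key.size() == keys[0].size());
      const auto c = static_cast<std::uint8_t>(key[i]);
      if (!seen[c]) {
        seen[c] = true;
        ++distinct;
      }
    }
    if (distinct > best.distinct) {
      best = {i, distinct};
    }
  }
  return best;
}
```

For the four length-15 keys in the header comment (`x-prefix-banana`, `x-prefix-babana`, `x-prefix-barana`, `x-prefix-banaka`), this sketch picks index 11 with three distinct characters (`n`, `b`, `r`). For the two keys left on the `n` branch it picks index 13 (`n` or `k`).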
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
#include "source/common/common/compiled_string_map.h" | ||
|
||
#include "gmock/gmock.h" | ||
#include "gtest/gtest.h" | ||
|
||
namespace Envoy { | ||
|
||
using testing::IsNull; | ||
|
||
TEST(CompiledStringMapTest, FindsEntriesCorrectly) { | ||
CompiledStringMap<const char*> map; | ||
map.compile({ | ||
{"key-1", "value-1"}, | ||
{"key-2", "value-2"}, | ||
{"longer-key", "value-3"}, | ||
{"bonger-key", "value-4"}, | ||
{"bonger-bey", "value-5"}, | ||
{"only-key-of-this-length", "value-6"}, | ||
}); | ||
EXPECT_EQ(map.find("key-1"), "value-1"); | ||
EXPECT_EQ(map.find("key-2"), "value-2"); | ||
EXPECT_THAT(map.find("key-0"), IsNull()); | ||
EXPECT_THAT(map.find("key-3"), IsNull()); | ||
EXPECT_EQ(map.find("longer-key"), "value-3"); | ||
EXPECT_EQ(map.find("bonger-key"), "value-4"); | ||
EXPECT_EQ(map.find("bonger-bey"), "value-5"); | ||
EXPECT_EQ(map.find("only-key-of-this-length"), "value-6"); | ||
EXPECT_THAT(map.find("songer-key"), IsNull()); | ||
EXPECT_THAT(map.find("absent-length-key"), IsNull()); | ||
} | ||
|
||
TEST(CompiledStringMapTest, EmptyMapReturnsNull) { | ||
CompiledStringMap<const char*> map; | ||
map.compile({}); | ||
EXPECT_THAT(map.find("key-1"), IsNull()); | ||
} | ||
|
||
} // namespace Envoy |
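The test keys above exercise the length-indexed first stage: `absent-length-key` has a length that no stored key shares, so it can be rejected before any character is inspected. A rough sketch of that stage alone follows. `bucketByLength` and `lengthBucketExists` are hypothetical helpers written for illustration, and the table holds plain key lists rather than the PR's node pointers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

using LengthTable = std::vector<std::vector<std::string_view>>;

// Groups keys by length so that table[n] holds every key of length n.
// Mirrors the sort-then-scan-ranges approach of compile(), simplified.
LengthTable bucketByLength(std::vector<std::string_view> keys) {
  LengthTable table;
  if (keys.empty()) {
    return table;
  }
  std::stable_sort(keys.begin(), keys.end(),
                   [](std::string_view a, std::string_view b) { return a.size() < b.size(); });
  table.resize(keys.back().size() + 1);
  auto range_start = keys.begin();
  while (range_start != keys.end()) {
    const std::size_t len = range_start->size();
    // First key whose length differs; everything before it shares `len`.
    const auto range_end = std::find_if(
        range_start, keys.end(), [len](std::string_view k) { return k.size() != len; });
    table[len].assign(range_start, range_end);
    range_start = range_end;
  }
  return table;
}

// True if at least one stored key has the same length as `key`. A false
// result means the lookup can stop immediately.
bool lengthBucketExists(const LengthTable& table, std::string_view key) {
  return key.size() < table.size() && !table[key.size()].empty();
}
```

With the six keys from `FindsEntriesCorrectly`, the table has 24 slots (the longest key is 23 characters), of which only slots 5, 10 and 23 are populated. `key-0` still lands in a populated bucket and needs the per-character stage to be rejected.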
IIUC the "best split" can be anywhere in the string. I think modifying the example to the following will clarify how this should work.
then the best split would've been in the character before-last.
It can't be anywhere in the string, only at an index where there's a difference. The example is specifically showing that 3 branches would be "best" over 2 branches. I don't understand what your example clarifies that the existing example doesn't.
My comment was looking at it from a "trie" perspective.
When I first read the comment, I thought that in each branch, the structure only keeps the "suffix" (after the branch point). I think that updating the example to a case where the branching happens due to a "best-split" that is not the first split will reduce that confusion for future readers.
Hopefully the md file explanation covers this sufficiently (though it doesn't do the second split earlier in the string).
If you'd like me to rearrange the strings in the new example so that it does the "last" index split first and the earlier one second I can do that, but I think the diagram having numbers in it probably makes it clear enough. (And late first would be just as potentially confusing as early first, and having three sequential splits would be getting so long as to be harder to read...)
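To make the "whole key, not suffix" point from this thread concrete, here is a simplified sketch that records which indexes a lookup visits. It is not the PR's class layout: `WalkNode`, `buildWalkTree` and `walkFind` are hypothetical, branches use a `std::map` instead of an offset vector, and the input is assumed to be non-empty, distinct, equal-length keys, with lookups using a key of that same length.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical node: a leaf stores the entire key; a branch stores the
// character index it splits on and one child per character seen there.
struct WalkNode {
  std::string full_key;                                 // leaf only
  std::size_t index = 0;                                // branch only
  std::map<char, std::unique_ptr<WalkNode>> children;   // branch only
  bool isLeaf() const { return children.empty(); }
};

// Index with the most distinct characters; earliest index wins ties.
std::size_t mostDistinctIndex(const std::vector<std::string_view>& keys) {
  std::size_t best_index = 0;
  std::size_t best_count = 0;
  for (std::size_t i = 0; i < keys[0].size(); ++i) {
    std::array<bool, 256> seen{};
    std::size_t count = 0;
    for (std::string_view key : keys) {
      const auto c = static_cast<std::uint8_t>(key[i]);
      if (!seen[c]) {
        seen[c] = true;
        ++count;
      }
    }
    if (count > best_count) {
      best_index = i;
      best_count = count;
    }
  }
  return best_index;
}

// Recursively builds the tree: one key becomes a leaf, otherwise branch on
// the index with the most distinct characters.
std::unique_ptr<WalkNode> buildWalkTree(const std::vector<std::string_view>& keys) {
  auto node = std::make_unique<WalkNode>();
  if (keys.size() == 1) {
    node->full_key = std::string(keys[0]);
    return node;
  }
  node->index = mostDistinctIndex(keys);
  std::map<char, std::vector<std::string_view>> groups;
  for (std::string_view key : keys) {
    groups[key[node->index]].push_back(key);
  }
  for (const auto& [c, group] : groups) {
    node->children[c] = buildWalkTree(group);
  }
  return node;
}

// Appends each branch index visited to `visited`; returns true only if the
// leaf reached holds exactly `key`.
bool walkFind(const WalkNode& root, std::string_view key, std::vector<std::size_t>& visited) {
  const WalkNode* node = &root;
  while (!node->isLeaf()) {
    visited.push_back(node->index);
    const auto it = node->children.find(key[node->index]);
    if (it == node->children.end()) {
      return false;
    }
    node = it->second.get();
  }
  return node->full_key == key;
}
```

Built from the four length-15 example keys, a lookup of `x-prefix-banana` visits indexes 11 and then 13 before the leaf comparison. A lookup of `y-prefix-banana` visits the same two indexes and is rejected only by the final whole-key comparison, since no branch ever inspects index 0. That is why the leaf has to keep the full key rather than a suffix.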