
Make TrieLookupTable destruction non-recursive #34783

Closed · wants to merge 5 commits

Conversation

ravenblackx (Contributor)

Commit Message: Make TrieLookupTable destruction non-recursive
Additional Description: Apparently someone might feed a 20000-character key into this trie, for example in a fuzz test, but never a 40000-character string, because that would be ridiculous. So I have to do this, or we'd have to revert back to the version that was worse in every other way, because that absolutely makes sense, and we definitely should not have someone who is in charge of the ridiculous test fix it.

Signed-off-by: Raven Black <ravenblack@dropbox.com>
adisuissa (Contributor) left a comment

Thanks!
Left one high-level question.

source/common/common/utility.h (resolved)
source/common/common/utility.h (outdated, resolved)
adisuissa (Contributor)

/assign @jmarantz

botengyao (Member) left a comment

Thanks for the quick improvement!

source/common/common/utility.h (outdated, resolved)
    for (uint32_t i = 0; i < flat_tree.size(); i++) {
      for (std::unique_ptr<TrieNode>& child : flat_tree[i]->children_) {
        if (child != nullptr) {
          flat_tree.push_back(std::move(child));
Member

Consider maintaining an additional total node counter during add, and then reserving the flat_tree vector, to avoid multiple reallocations here?
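
A minimal sketch of that idea, using a hypothetical simplified `Node`/`Trie` rather than the actual Envoy `TrieLookupTable`: a counter maintained in `add()` lets the destructor reserve the flattening vector once instead of growing it repeatedly.

    #include <cstdint>
    #include <memory>
    #include <string>
    #include <vector>

    // Hypothetical simplified trie, only to illustrate "count nodes on add,
    // reserve before flattening"; names and layout are assumptions.
    struct Node {
      std::unique_ptr<Node> children_[256];
    };

    class Trie {
    public:
      void add(const std::string& key) {
        Node* current = &root_;
        for (unsigned char c : key) {
          if (current->children_[c] == nullptr) {
            current->children_[c] = std::make_unique<Node>();
            ++node_count_; // maintained during add, as suggested above
          }
          current = current->children_[c].get();
        }
      }

      ~Trie() {
        std::vector<std::unique_ptr<Node>> flat_tree;
        flat_tree.reserve(node_count_); // one allocation instead of repeated growth
        for (std::unique_ptr<Node>& child : root_.children_) {
          if (child != nullptr) {
            flat_tree.push_back(std::move(child));
          }
        }
        // Each node's children are moved into flat_tree before the node itself
        // is destroyed, so no destructor call recurses.
        for (size_t i = 0; i < flat_tree.size(); i++) {
          for (std::unique_ptr<Node>& child : flat_tree[i]->children_) {
            if (child != nullptr) {
              flat_tree.push_back(std::move(child));
            }
          }
        }
      }

    private:
      Node root_;
      uint64_t node_count_{0};
    };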

ravenblackx (Contributor, Author)

I considered doing a reserve; it didn't seem worth it since the destruction of these things is infrequent. The increased complexity is not worth the performance saving IMO. If someone else wants to make that change later, feel free, but I didn't even want to make this change at all, and I certainly don't want to make it more complicated than it needs to be to resolve someone else's weird test scenario.

botengyao (Member) · Jun 18, 2024

sg, and agreed this is the kind of edge case that can cause problems and is pre-existing, but it's not weird: a key can be large, so we want to prevent the problem as much as possible. Thanks for the improvement!

Contributor

Maybe just drop a TODO to do a counting pass for reserving, if this shows up in a flame graph in the future?

Contributor

Actually, you can do this destruction explicitly with a stack whose size will be proportional to the depth of the recursion, rather than a vector whose size will be that of the whole tree. That would be a better TODO than just precalculating the amount to reserve.
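
A hedged sketch of that explicit-stack idea, again with a hypothetical simplified `Node` rather than the real TrieNode: each stack entry holds a node plus the index of its next unvisited child, so the stack grows with the depth of the tree rather than with the total number of nodes.

    #include <cstdint>
    #include <memory>
    #include <stack>
    #include <utility>

    // Hypothetical simplified node, for illustration only.
    struct Node {
      std::unique_ptr<Node> children_[256];
    };

    // Destroys a subtree iteratively. Each stack entry is one level of the
    // current path (node + index of the next child to visit), so the stack's
    // size is proportional to the tree depth, not the node count.
    void destroySubtree(std::unique_ptr<Node> root) {
      std::stack<std::pair<std::unique_ptr<Node>, uint32_t>> work;
      work.emplace(std::move(root), 0);
      while (!work.empty()) {
        auto& [node, next_child] = work.top();
        // Skip over null child slots.
        while (next_child < 256 && node->children_[next_child] == nullptr) {
          ++next_child;
        }
        if (next_child == 256) {
          // All children handled; this node's children_ are all null by now,
          // so destroying it here cannot recurse.
          work.pop();
        } else {
          std::unique_ptr<Node> child = std::move(node->children_[next_child]);
          ++next_child;
          work.emplace(std::move(child), 0);
        }
      }
    }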

jmarantz (Contributor) left a comment

I'll wait till Boteng reviews before doing a pass.

@ravenblackx I think the original code was buggy also; we shouldn't be recursing through so many levels. Envoy might need to run in an environment with smaller stacks than our test machines as well, so it's good we are sorting this out.

Signed-off-by: Raven Black <ravenblack@dropbox.com>
ravenblackx (Contributor, Author)

> @ravenblackx I think the original code was buggy also; we shouldn't be recursing through so many levels. Envoy might need to run in an environment with smaller stacks than our test machines as well, so it's good we are sorting this out.

Yes, that's why I'm disgruntled for this to be operating on a "fix our external test or we'll revert your change even though the actual problem was pre-existing" basis. Feels like a certain company is exerting undue power here - if someone else's PR broke one of my external-to-envoy tests based on a problem the PR didn't introduce, I'm pretty sure I wouldn't be able to threaten to revert their PR if they don't immediately address a pre-existing problem.

botengyao (Member) left a comment

lgtm, defer to @jmarantz for another pass.


jmarantz previously approved these changes Jun 20, 2024

jmarantz (Contributor)

@RyanTheOptimist for senior approval

RyanTheOptimist (Contributor) left a comment

Thanks for fixing this!

@@ -709,7 +707,7 @@ template <class Value> class TrieLookupTable {
    children_[child_index - min_child_] = std::move(node);
  }

private:
  Value value_{};
Contributor

According to the style guide, members should be private. Can we use friend or accessors to expose the innards?

ravenblackx (Contributor, Author)

Made it a struct instead. Seems daft to have a private type need to friend its parent.

Contributor

According to the style guide, structs should be passive objects which simply carry data. They should not enforce invariants, for example. I think this is definitely a class, so friend or accessors would work nicely. (I wonder if, instead of exposing the innards to the parent, we could perhaps do more work in this class itself. But that's likely outside the scope of this PR.)

https://google.github.io/styleguide/cppguide.html#Structs_vs._Classes
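
For illustration, one possible shape of the friend-based version being suggested here (a hypothetical sketch only, not the code in this PR): the node type stays a class with private members, and the enclosing table is befriended so its destructor can still reach `children_` directly.

    #include <cstdint>
    #include <memory>
    #include <utility>

    template <class Value> class TrieLookupTable {
    private:
      class TrieNode {
      public:
        void setChild(uint8_t index, std::unique_ptr<TrieNode> node) {
          children_[index] = std::move(node);
        }

      private:
        // Members stay private per the style guide; the enclosing table gets
        // access via a friend declaration instead of the node being a struct.
        friend class TrieLookupTable<Value>;
        Value value_{};
        std::unique_ptr<TrieNode> children_[256];
      };

      TrieNode root_;
    };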

        if (child != nullptr) {
          flat_tree.push_back(std::move(child));
        }
      }
Contributor

To make sure I understand: once we finish moving flat_tree[i]->children_ into flat_tree we are finished with flat_tree[i], right? At that point we could nuke flat_tree[i]. As such it seems like flat_tree is conceptually split into two parts:

  1. the head, which contains nodes that have already been processed and can be deleted.
  2. the tail, which contains nodes that still need to be processed.

Is that right? If so, I think I would suggest a slightly different approach. Instead of putting every node into a new std::vector flat_tree, what if we instead used something like a std::deque nodes_to_process? Then we could do something like:

    // To avoid stack overflow on recursive destruction if the tree is very deep,
    // destroy the elements iteratively.
    std::deque<std::unique_ptr<TrieNode>> nodes_to_process;
    for (std::unique_ptr<TrieNode>& child : root_.children_) {
      if (child != nullptr) {
        nodes_to_process.push_back(std::move(child));
      }
    }
    while (!nodes_to_process.empty()) {
      // std::deque::pop_front() returns void, so take the front element first.
      std::unique_ptr<TrieNode> node = std::move(nodes_to_process.front());
      nodes_to_process.pop_front();
      for (std::unique_ptr<TrieNode>& child : node->children_) {
        if (child != nullptr) {
          nodes_to_process.push_back(std::move(child));
        }
      }
    }

I think this is both a bit more clear and a bit more efficient.

Member

+1 for this.

ravenblackx (Contributor, Author) · Jun 20, 2024

A deque would perform significantly worse than a vector, though it would save on flattening the entire tree. Went with jmarantz's superior suggestion of a stack - since we don't really care about the order of deletion, consuming from the back provides all the advantages of the deque without giving up the performance advantage of the vector.

Contributor

Why would a deque perform significantly worse than a vector? It uses contiguous memory so it should be very performant? That being said, a stack is also fine!

Contributor

std::deque is surprisingly poor in performance in my distant-past experience, at least in some dimensions. YMMV; my measurements and impressions are more than 10 years old, so maybe not to be trusted :)

Having said that, a std::stack is backed by a std::deque by default, so I was thinking of using a vector, but using it as a stack :)
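
A minimal sketch of that vector-used-as-a-stack destruction, with the same hypothetical simplified `Node` as above (not the real TrieNode): since deletion order doesn't matter, popping pending nodes off the back of a plain vector keeps only the unprocessed frontier in memory while staying on the vector's contiguous fast path.

    #include <memory>
    #include <vector>

    // Hypothetical simplified node, for illustration only.
    struct Node {
      std::unique_ptr<Node> children_[256];
    };

    class Trie {
    public:
      ~Trie() {
        // A std::vector used as a stack: push pending children, pop from the
        // back. Only not-yet-processed nodes are held, and no destructor call
        // recurses because every node's children are moved out before the
        // node itself is destroyed.
        std::vector<std::unique_ptr<Node>> stack;
        for (std::unique_ptr<Node>& child : root_.children_) {
          if (child != nullptr) {
            stack.push_back(std::move(child));
          }
        }
        while (!stack.empty()) {
          std::unique_ptr<Node> node = std::move(stack.back());
          stack.pop_back();
          for (std::unique_ptr<Node>& child : node->children_) {
            if (child != nullptr) {
              stack.push_back(std::move(child));
            }
          }
          // `node` goes out of scope here with all children already null.
        }
      }

    private:
      Node root_;
    };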

Signed-off-by: Raven Black <ravenblack@dropbox.com>
Signed-off-by: Raven Black <ravenblack@dropbox.com>
Signed-off-by: Raven Black <ravenblack@dropbox.com>
ravenblackx (Contributor, Author)

Maybe prefer #34849 instead, since it fixes the destructor stack overflow and further improves performance.

jmarantz (Contributor)

That one looks good; should we submit this first though while the other gets reviewed? Or close this PR?

ravenblackx (Contributor, Author)

We should close this PR if we can get agreement that that one is preferred.

ravenblackx (Contributor, Author)

Actually, yeah, let's just close this one, because I don't want to make friend accessors for members of private internal data structures, so this one is stalled anyway. The other version doesn't have that problem, and is simpler, and is more performant, and uses less RAM, and cleans up the dependency mess a bit, so I can't think of any valid reason for this PR to be preferred.

@ravenblackx ravenblackx deleted the node_depth branch June 24, 2024 15:44
ravenblackx added a commit that referenced this pull request Jul 10, 2024
Commit Message: Another refactor of TrieLookupTable
Additional Description: The previous refactor magnified the pre-existing problem of a recursive destructor. PR #34783 is being annoying in review, so here's an alternative that simplifies it by putting the trie's data into a flat structure while the trie is being filled, so there's no need for a special destructor, and it also further improves performance.
In addition to changing the implementation again, this PR also moves
`TrieLookupTable` out of `utility.h` and into its own separate header
and library, since it's used in relatively few places.
Risk Level: Small risk, testing is pretty thorough.
Testing: Passes all the existing tests, plus the additional new test for
not inducing a stack overflow.
Docs Changes: n/a
Release Notes: n/a
Platform Specific Features: n/a
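
To make the "flat structure while the trie is being filled" idea concrete, here is a hedged sketch of one way that can look (an illustration only, not the actual implementation in #34849): the table owns every node in a single vector and children are referenced by index, so tearing the table down just destroys that vector and no special destructor is needed.

    #include <cstdint>
    #include <string>
    #include <vector>

    // Hypothetical flat-storage trie, for illustration only.
    class FlatTrie {
    public:
      FlatTrie() : nodes_(1) {} // index 0 is the root

      void add(const std::string& key) {
        uint32_t current = 0;
        for (unsigned char c : key) {
          uint32_t child = nodes_[current].children_[c];
          if (child == 0) { // 0 means "no child"; the root is never a child
            nodes_.push_back(Node{});
            child = static_cast<uint32_t>(nodes_.size() - 1);
            // Re-index after push_back, which may have reallocated nodes_.
            nodes_[current].children_[c] = child;
          }
          current = child;
        }
      }

      // No user-defined destructor: destroying nodes_ is a flat, non-recursive
      // loop over plain structs, so the tree depth no longer matters.

    private:
      struct Node {
        uint32_t children_[256] = {};
      };
      std::vector<Node> nodes_;
    };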

Benchmarks (note that RequestHeaders and ResponseHeaders are no longer actually relevant, since the static headers no longer use this trie):

`bazel run -c opt test/common/common:trie_lookup_table_speed_test -- --benchmark_repetitions=10 --benchmark_out_format=csv --benchmark_out=somefile.csv`

Benchmark | MeanBefore | MeanAfter | Difference (>100% means before was faster)
-- | -- | -- | --
bmTrieLookupsRequestHeaders | 45.5528ns | 50.5076ns | 111%
bmTrieLookupsResponseHeaders | 41.4313ns | 47.0212ns | 113%
bmTrieLookups/10/Mixed | 303.578ns | 253.114ns | 83%
bmTrieLookups/100/Mixed | 351.339ns | 260.583ns | 74%
bmTrieLookups/1000/Mixed | 576.878ns | 334.986ns | 58%
bmTrieLookups/10000/Mixed | 768.84ns | 518.188ns | 67%
bmTrieLookups/10/8 | 12.7108ns | 13.8114ns | 109%
bmTrieLookups/100/8 | 17.8588ns | 14.103ns | 79%
bmTrieLookups/1000/8 | 29.3718ns | 24.4657ns | 83%
bmTrieLookups/10000/8 | 34.1951ns | 30.2386ns | 88%
bmTrieLookups/10/128 | 1018.77ns | 467.067ns | 46%
bmTrieLookups/100/128 | 749.413ns | 507.53ns | 68%
bmTrieLookups/1000/128 | 1083.53ns | 658.86ns | 61%
bmTrieLookups/10000/128 | 1319.94ns | 848.408ns | 64%

---------

Signed-off-by: Raven Black <ravenblack@dropbox.com>