
Reflection-based deterministic message hashing #30761

Merged
merged 44 commits into from Dec 13, 2023

Conversation

ravenblackx
Contributor

@ravenblackx ravenblackx commented Nov 7, 2023

Commit Message: Reflection-based deterministic message hashing
Additional Description: Comparing config used to be done with "deterministic" serialization and a hash of that, but it turned out deterministic serialization was not in fact deterministic enough (Any messages could contain reordered maps or even other fields). The recommendation in protobuf docs is "if you want a deterministic order do it yourself". (Note that the linked comment also suggests that SetSerializationDeterministic should work for our purposes here, but protocolbuffers/protobuf#5731 is the relevant unaddressed bug about Any fields.)

The prior method of achieving determinism by expanding Any via TextFormat serialization is slow enough that it shows up on performance graphs as a significant cost (about 1/4 of our total startup time).

This change is quite a lot of code, but should give us a deterministic hash with a much faster runtime than transforming to and then hashing a human-readable string.

Risk Level: Some; if it's broken it might cause config updates to not apply, or cache entries to collide.
Testing: Many unit tests, and a bit extra in the existing hash test. Benchmark results from bazel run -c opt test/common/protobuf:utility_speed_test:

-----------------------------------------------------------------------------------
Benchmark                                         Time             CPU   Iterations
-----------------------------------------------------------------------------------
bmHashByDeterministicHash/map                 35663 ns        35662 ns        20288
bmHashByTextFormat/map                       376608 ns       376503 ns         1892
bmHashByDeterministicHash/recursion           13682 ns        13681 ns        51381
bmHashByTextFormat/recursion                  30239 ns        30239 ns        23463
bmHashByDeterministicHash/repeatedFields      21024 ns        21023 ns        33263
bmHashByTextFormat/repeatedFields            150444 ns       150443 ns         4645

Docs Changes: n/a
Release Notes: n/a
Platform Specific Features: n/a
Runtime Guard: envoy.restart_features.use_fast_protobuf_hash

Signed-off-by: Raven Black <ravenblack@dropbox.com>
…rently

Signed-off-by: Raven Black <ravenblack@dropbox.com>
hash_type(reflection->Get##get_type(message, field)); \
}

#define MAP_SORT_BY(get_type) \
Contributor

Can you pass in the variable name rather than capturing the free variable, i.e. MAP_SORT_BY(map, get_type)?

Also, the map variable at call-sites is really a vector. Call it that? I'm actually not sure we really need to macro-ize the call to std::sort. I think you could do this readably without macros by instead just making a template function that returns the comparator:

std::sort(map.begin(), map.end(), makeCompareFn<get_type>);

I think that will be more readable at call-sites and only slightly more verbose.

Contributor Author

I don't think you can template e.g. makeCompareFn<int> to dispatch to reflection->GetInt(...), can you? Other than by explicitly writing a specialization for each of the possible value types. If we want to go that way, I might as well just manually expand the macro in full throughout.
(It is frustrating that the reflection API doesn't have a Get template with specializations that would make this easy!)

I've now done it with a middle-ground template which to me has the worst readability of all possible options. Do you prefer this over the full expansion?

Contributor

I see how that got awkward fast :) Let me reflect on that (no pun intended).

Contributor Author

Restructured it without that template; it's fewer lines of code.

Then separated the sorting out into its own helper class, then made the comparison function its own helper class so there are no std::function or even lambdas involved any more. Then got rid of the original helper class because, with the comparer extracted, the remainder is more suitable as a plain function again. :)

Signed-off-by: Raven Black <ravenblack@dropbox.com>
@ravenblackx
Contributor Author

/retest
(mysql integration test passes locally and does not appear to be related)

@ravenblackx
Contributor Author

/retest

@ravenblackx
Contributor Author

/coverage


Coverage for this Pull Request will be rendered here:

https://storage.googleapis.com/envoy-pr/30761/coverage/index.html

The coverage results are (re-)rendered each time the CI envoy-presubmit (check linux_x64 coverage) job completes.


Signed-off-by: Raven Black <ravenblack@dropbox.com>
The reduction is necessary because the new uncovered lines are
unreachable enum paths that cannot be tested, and the existing
covered function had its LoC reduced which also contributed to
a coverage percentage downswing.

Signed-off-by: Raven Black <ravenblack@dropbox.com>

CC @envoyproxy/coverage-shephards: FYI only for changes made to (test/per_file_coverage.sh).
envoyproxy/coverage-shephards assignee is @RyanTheOptimist


@@ -15,7 +15,7 @@ declare -a KNOWN_LOW_COVERAGE=(
"source/common/matcher:94.6"
"source/common/network:94.4" # Flaky, `activateFileEvents`, `startSecureTransport` and `ioctl`, listener_socket do not always report LCOV
"source/common/network/dns_resolver:91.4" # A few lines of MacOS code not tested in linux scripts. Tested in MacOS scripts
"source/common/protobuf:96.5"
"source/common/protobuf:96.3"
Contributor

Can we find a way to keep the existing coverage threshold for this directory?

Contributor Author

Not a relevant one. Literally every new line of code that is reachable is covered. The only feasible ways to re-increase the coverage percentage would be to add nonsense padding lines in covered areas, or to add a test to something completely unrelated to this change.

OR, an alternative I considered: give the helper class a separate constructor that allows constructing it with invalid input fields, expose that constructor only for testing (though it would still be built for production), and use it to add test coverage of the unreachable lines by making them reachable. Pretty sure that's even worse than the other options.

Contributor

Fair enough. Coverage LGTM

Contributor Author

Ended up actually improving the coverage: the final version of deterministic_hash.cc now has 100% coverage (the switch with unreachable paths was excised, thanks to jmarantz pointing out that we don't actually have to sort a list in order to hash it ignoring order), and the intermediate/current version of utility.cc still has the old lines covered for the lifespan of the runtime guard.

Signed-off-by: Raven Black <ravenblack@dropbox.com>
Signed-off-by: Raven Black <ravenblack@dropbox.com>
Signed-off-by: Raven Black <ravenblack@dropbox.com>
@ravenblackx
Contributor Author

Updated to use xxHash64Value, added an explicit scalar constraint on hashScalarField (even though it would be implicit from xxHash64Value, it makes sense to be explicit too), and refreshed the benchmarks in the PR description after the change.

@ravenblackx
Contributor Author

/retest

The performance of the hash operation is improved by 2-10x depending on the structure of the message,
which is expected to reduce config update time or startup time by 10-25%. The new algorithm is also
used for http_cache_filter hashing, which will effectively cause a one-time cache flush on update
for users with a persistent cache. To revert this behavior set ``envoy.restart_features.use_fast_protobuf_hash`` to false.
Contributor

How would you feel about making this default to the current behavior?

I would like, when we import this into our mono-repo, to have products run the new mode in staging for a few weeks first before we turn it on.

Contributor Author

I would prefer not to. My understanding is that runtime guards for performance improvements are generally supposed to be emergency shutoffs, not optional activates - if you make it opt-in then it doesn't get tested.

Contributor

Not forever. Just for some amount of time (a quarter?)

@alyssawilk may have more context on an appropriate amount of time a large change should bake before flipping it to the default. It will improve things; it will get tested and turned on.

I just don't want the eventual productionizing of this to catch us by surprise, and we wouldn't be able to disable it prior to importing it.

Contributor Author

False by default it is.

Contributor

Generally when we default to false we assign an owner to hassle about flipping it to true. Josh, as you're looking at testing this, are you OK driving the default flip? Alternately, if this is no higher risk than most flags, should we leave it true by default upstream and import-false? I'm fine either way; I just want to make sure we have a plan on how to move forward.

Contributor

I think I don't need to take that on -- I like this change a lot, but it's solving a problem our service is not currently suffering from, afaict. Even if we have like a delay till end of January to make sure we have a chance to flip the flag off, I'd be OK.

I think for the service my team runs, we are being particularly paranoid right now, and we'd just want a chance to have this flag imported into our monorepo first, and then turn it off, then it could be turned on in Envoy main. After N months we could flip it on ourselves; we just can't take on that risk right now.

test/common/protobuf/utility_speed_test.cc
Signed-off-by: Raven Black <ravenblack@dropbox.com>
Contributor

@jmarantz jmarantz left a comment

LGTM but as this is pretty deep stuff I think I'd like to see Adi and Yan both review.

@@ -296,7 +296,7 @@ TEST_F(LookupRequestTest, PragmaNoFallback) {
TEST(HttpCacheTest, StableHashKey) {
Key key;
key.set_host("example.com");
ASSERT_EQ(stableHashKey(key), 9582653837550152292u);
ASSERT_EQ(stableHashKey(key), 6153940628716543519u);
Contributor

Why ASSERT_EQ rather than EXPECT_EQ? Usually I use ASSERT_EQ only when a failure might result in a crash later in the function, like

ASSERT_EQ(5, array.size());
EXPECT_EQ("foo", array[4]);

no big deal though

Contributor

@jmarantz jmarantz left a comment

Oh I see Ryan is already the senior maintainer on it but I'd still like Adi to review first.

Contributor

@adisuissa adisuissa left a comment

Overall LGTM, thanks!
Left a few minor comments.

changelogs/current.yaml
source/common/protobuf/deterministic_hash.cc
source/common/runtime/runtime_features.cc
Signed-off-by: Raven Black <ravenblack@dropbox.com>
Contributor

@adisuissa adisuissa left a comment

Thanks for fixing this!
One small minor comment, otherwise LGTM

source/common/protobuf/deterministic_hash.cc
Signed-off-by: Raven Black <ravenblack@dropbox.com>
Signed-off-by: Raven Black <ravenblack@dropbox.com>
Signed-off-by: Raven Black <ravenblack@dropbox.com>
@ravenblackx
Contributor Author

/retest

@jmarantz
Contributor

I think this just needs Senior Maintainer approval. Ryan?

Contributor

@RyanTheOptimist RyanTheOptimist left a comment

Looks great!

value.set_index(2);
a2.mutable_any()->PackFrom(value);
EXPECT_NE(hash(a1), hash(a2));
}
Contributor

Excellent tests!

@ravenblackx
Contributor Author

/retest

Signed-off-by: Raven Black <ravenblack@dropbox.com>
@ravenblackx
Contributor Author

/retest

