Fix slow saving of latest witnesses #11354
Conversation
Codecov Report: Attention: Patch coverage is

Additional details and impacted files:

```
@@            Coverage Diff             @@
##           master   #11354      +/-   ##
==========================================
+ Coverage   71.08%   71.17%   +0.09%
==========================================
  Files         783      784       +1
  Lines      156833   157305     +472
  Branches   156833   157305     +472
==========================================
+ Hits       111477   111969     +492
+ Misses      40528    40493      -35
- Partials     4828     4843      +15
```

Flags with carried forward coverage won't be shown.
```diff
@@ -1040,6 +1040,9 @@ impl<'a> ChainStoreUpdate<'a> {
             DBCol::LatestChunkStateWitnesses => {
                 store_update.delete(col, key);
             }
+            DBCol::LatesWitnessesByIndex => {
```
typo: Latest
```diff
@@ -45,21 +47,22 @@ impl LatestWitnessesKey {
     /// `LatestWitnessesKey` has custom serialization to ensure that the binary representation
     /// starts with big-endian height and shard_id.
     /// This allows to query using a key prefix to find all witnesses for a given height (and shard_id).
-    pub fn serialized(&self) -> [u8; 64] {
-        let mut result = [0u8; 64];
+    pub fn serialized(&self) -> [u8; 72] {
```
If we're no longer using range iteration, does it matter what encoding we're using? Should we instead use borsh?
The encoding doesn't matter for the write path now, but it still matters for the read path. When someone queries for saved witnesses at a given height we can quickly find them using the prefix.
```shell
./neard view-state latest-witnesses --height 119372230 --shard-id 0
# create prefix (119372230, 0) and query `DBCol::LatestChunkStateWitnesses` with it
```
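To illustrate why the big-endian encoding matters for the read path, here is a simplified sketch of a key of this shape. The real `LatestWitnessesKey` has more fields and serializes to 72 bytes; the field names and sizes below are illustrative, not nearcore's. Because height and shard_id come first in big-endian form, the database's lexicographic key order matches numeric order, so a `(height, shard_id)` prefix selects exactly the witnesses for that height and shard:

```rust
// Simplified sketch (not the actual nearcore struct): a key whose binary
// form starts with big-endian height and shard_id.
struct LatestWitnessesKey {
    height: u64,
    shard_id: u64,
    random_suffix: u64, // stands in for the remaining key bytes
}

impl LatestWitnessesKey {
    fn serialized(&self) -> [u8; 24] {
        let mut result = [0u8; 24];
        // Big-endian so that lexicographic byte order == numeric order.
        result[0..8].copy_from_slice(&self.height.to_be_bytes());
        result[8..16].copy_from_slice(&self.shard_id.to_be_bytes());
        result[16..24].copy_from_slice(&self.random_suffix.to_be_bytes());
        result
    }

    /// Prefix that matches every key for a given (height, shard_id).
    fn prefix(height: u64, shard_id: u64) -> [u8; 16] {
        let mut p = [0u8; 16];
        p[0..8].copy_from_slice(&height.to_be_bytes());
        p[8..16].copy_from_slice(&shard_id.to_be_bytes());
        p
    }
}
```

With little-endian (or borsh's default integer encoding) the byte order would not follow numeric height order, and prefix scans over a height range would break.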
Ahhh, gotcha. That makes sense. Thanks.
```rust
let (key_bytes, witness_bytes) = item?;
store_update.delete(DBCol::LatestChunkStateWitnesses, &key_bytes);
// Go over witnesses with increasing indexes and remove them until the limits are satisfied.
while !info.is_within_limits() && info.lowest_index < info.next_witness_index {
```
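The eviction loop can be sketched in isolation. The field and function names below mirror the snippet above, but the struct layout is a guess and a plain `HashMap` stands in for the `DBCol::LatestWitnessesByIndex` column:

```rust
use std::collections::HashMap;

// Hypothetical bookkeeping struct; the real one lives in nearcore.
struct LatestWitnessesInfo {
    lowest_index: u64,       // index of the oldest witness still stored
    next_witness_index: u64, // index the next saved witness will receive
    count: u64,
    total_size: u64,
    max_count: u64,
    max_size: u64,
}

impl LatestWitnessesInfo {
    fn is_within_limits(&self) -> bool {
        self.count <= self.max_count && self.total_size <= self.max_size
    }
}

// Evict witnesses starting from the lowest index until the limits are
// satisfied. No column iteration is needed to find the eviction victim:
// `lowest_index` points directly at it.
fn evict_until_within_limits(
    info: &mut LatestWitnessesInfo,
    by_index: &mut HashMap<u64, Vec<u8>>, // index -> serialized witness
) {
    while !info.is_within_limits() && info.lowest_index < info.next_witness_index {
        if let Some(witness) = by_index.remove(&info.lowest_index) {
            info.count -= 1;
            info.total_size -= witness.len() as u64;
        }
        info.lowest_index += 1;
    }
}
```

Each iteration is a single keyed delete, so the cost is proportional to the number of witnesses evicted, not to the total number stored.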
What's the maximum number of old witnesses evicted this way? Is it (max witness size / min witness size)? If there are a bunch of small witnesses followed by a big witness, would that mean we have to look up quite a few witnesses when processing the large one? Should we place an upper limit on how many times this loop iterates, just to avoid blocking the main processing path?
Speaking of which, I wonder if it makes sense for this whole thing to be on a separate thread (as in, a separate actor). I guess it's a bit overkill but the fewer things on the main thread the better.
Oh wait I see that this is debug-only and not recommended for production use. Okay, maybe this is not a big deal then.
Yeah it could potentially take a long time if it needs to remove a lot of saved witnesses :/
In the average case it only removes one or two, but in the worst case scenario it could need to remove thousands of witnesses.
It's a debug-only optional feature, and I'm not sure anyone is really using it, so I didn't spend too much time optimizing. We can fix it in a follow-up PR if it becomes a problem.
Fixes: near#11258

Changes in this PR:
* Improved observability of saving latest witnesses:
  * Added metrics
  * Added a tracing span, which will be visible in span analysis tools
  * Added a printout in the logs with details about saving the latest witness
* Fixed the extreme slowness of `save_latest_chunk_state_witness`; the new solution doesn't iterate anything
* Started saving witnesses produced during shadow validation, which was needed to properly test the change

The previous solution used `store().iter()` to find the witness with the lowest height that needs to be removed to free up space, but it turned out that this takes a really long time, ~100ms!

The new solution doesn't iterate anything; instead it maintains a mapping from integer indexes to saved witnesses. The first observed witness gets index 0, the second one gets 1, the third gets 2, and so on. When it's time to free up space, we delete the witness with the lowest index. We maintain two pointers to the ends of this "queue" and move them accordingly as witnesses are added and removed.

This greatly improves the time needed to save the latest witness: with the new code, generating the database update usually takes under 1ms, and committing it takes under 6ms (on shadow validation):

![image](https://github.com/near/nearcore/assets/149345204/06f379d3-1a36-4aa0-8c5f-043bab7bc36c)

([view the metrics here](https://nearone.grafana.net/d/admakiv9pst8gd/save-latest-witnesses-stats?orgId=1&var-chain_id=mainnet&var-node_id=jan-mainnet-node&var-shard_id=All&from=1716234291000&to=1716241491000))

~7ms is still a non-negligible amount of time, but it's way better than the previous ~100ms. It's a debug-only feature, so 7ms might be acceptable.
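The two-pointer "queue" described in the PR can be sketched as follows. This is a minimal model under assumptions: the names are illustrative, and an in-memory `HashMap` stands in for the database column keyed by index:

```rust
use std::collections::HashMap;

// Sketch of the index queue: witnesses are numbered 0, 1, 2, ... as they
// arrive, and two pointers delimit the range currently stored.
struct WitnessQueue {
    lowest_index: u64,       // oldest stored witness
    next_witness_index: u64, // index assigned to the next witness
    by_index: HashMap<u64, Vec<u8>>,
    total_size: usize,
    max_size: usize,
}

impl WitnessQueue {
    // Saving needs no iteration: assign the next index, bump the pointer,
    // then pop from the front (lowest index) until the size limit holds.
    fn save(&mut self, witness: Vec<u8>) {
        self.total_size += witness.len();
        self.by_index.insert(self.next_witness_index, witness);
        self.next_witness_index += 1;
        while self.total_size > self.max_size && self.lowest_index < self.next_witness_index {
            if let Some(old) = self.by_index.remove(&self.lowest_index) {
                self.total_size -= old.len();
            }
            self.lowest_index += 1;
        }
    }
}
```

This is why the write path no longer depends on the encoding of the height-based keys: eviction victims are found by index, while the height-prefixed keys remain only for the debug query path.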