Fix slow saving of latest witnesses #11354

Merged
merged 4 commits into near:master from fix-latest-slow on May 22, 2024

Conversation

jancionear
Contributor

Fixes: #11258

Changes in this PR:

  • Improved observability of saving latest witnesses
    • Added metrics
    • Added a tracing span, which will be visible in span analysis tools
    • Added a printout in the logs with details about saving the latest witness
  • Fixed the extreme slowness of `save_latest_chunk_state_witness`; the new solution doesn't iterate over anything
  • Started saving witnesses produced during shadow validation; I needed that to properly test the change

The previous solution used `store().iter()` to find the witness with the lowest height that had to be removed to free up space, but it turned out that this takes a really long time: ~100ms!

The new solution doesn't iterate over anything; instead, it maintains a mapping from integer indexes to saved witnesses.
The first observed witness gets index 0, the second one gets index 1, the third gets index 2, and so on.
When it's time to free up space, we delete the witness with the lowest index.

We maintain two pointers to the ends of this "queue" and move them accordingly as witnesses are added and removed.
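
To make the queue mechanism concrete, here is a minimal Rust sketch of the bookkeeping (not the actual nearcore implementation: the struct layout, the `BTreeMap` standing in for the by-index column, and the limit constants are illustrative assumptions; only `is_within_limits`, `lowest_index`, and `next_witness_index` mirror names that appear in this PR):

```rust
use std::collections::BTreeMap;

// Illustrative limits; the real limits are configured elsewhere in nearcore.
const MAX_SAVED_WITNESSES: u64 = 1000;
const MAX_TOTAL_WITNESS_SIZE: u64 = 4 * 1024 * 1024 * 1024;

/// Simplified bookkeeping for the saved-witness "queue".
/// In the actual code this info is itself persisted in the database.
struct LatestWitnessesInfo {
    lowest_index: u64,       // index of the oldest witness still stored
    next_witness_index: u64, // index the next saved witness will receive
    count: u64,
    total_size: u64,
}

impl LatestWitnessesInfo {
    fn is_within_limits(&self) -> bool {
        self.count <= MAX_SAVED_WITNESSES && self.total_size <= MAX_TOTAL_WITNESS_SIZE
    }
}

/// Save a new witness, evicting from the low end of the queue until the
/// limits hold again. The witness to evict is always the one at
/// `lowest_index`, so no column iteration is needed.
fn save_latest_witness(
    info: &mut LatestWitnessesInfo,
    by_index: &mut BTreeMap<u64, Vec<u8>>, // stand-in for the by-index DB column
    witness: Vec<u8>,
) {
    info.count += 1;
    info.total_size += witness.len() as u64;

    // Go over witnesses with increasing indexes and remove them until the limits are satisfied.
    while !info.is_within_limits() && info.lowest_index < info.next_witness_index {
        if let Some(removed) = by_index.remove(&info.lowest_index) {
            info.count -= 1;
            info.total_size -= removed.len() as u64;
        }
        info.lowest_index += 1;
    }

    // Store the new witness under the next free index and advance the pointer.
    by_index.insert(info.next_witness_index, witness);
    info.next_witness_index += 1;
}
```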

This greatly improves the time needed to save the latest witness - with the new code, generating the database update usually takes under 1ms, and committing it takes under 6ms (on shadow validation):

![image](https://github.com/near/nearcore/assets/149345204/06f379d3-1a36-4aa0-8c5f-043bab7bc36c)

([view the metrics here](https://nearone.grafana.net/d/admakiv9pst8gd/save-latest-witnesses-stats?orgId=1&var-chain_id=mainnet&var-node_id=jan-mainnet-node&var-shard_id=All&from=1716234291000&to=1716241491000))

~7ms is still a non-negligible amount of time, but it's way better than the previous ~100ms. It's a debug-only feature, so 7ms might be acceptable.

@jancionear jancionear added the A-stateless-validation Area: stateless validation label May 20, 2024
@jancionear jancionear requested a review from a team as a code owner May 20, 2024 21:50

codecov bot commented May 21, 2024

Codecov Report

Attention: Patch coverage is 2.08333%, with 94 lines in your changes missing coverage. Please review.

Project coverage is 71.17%. Comparing base (9063a2c) to head (8dc17c6).
Report is 7 commits behind head on master.

| Files | Patch % | Lines |
|---|---|---|
| chain/chain/src/store/latest_witnesses.rs | 0.00% | 56 Missing ⚠️ |
| chain/chain/src/metrics.rs | 0.00% | 32 Missing ⚠️ |
| chain/chain/src/garbage_collection.rs | 0.00% | 3 Missing ⚠️ |
| ...client/src/stateless_validation/shadow_validate.rs | 0.00% | 3 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #11354      +/-   ##
==========================================
+ Coverage   71.08%   71.17%   +0.09%     
==========================================
  Files         783      784       +1     
  Lines      156833   157305     +472     
  Branches   156833   157305     +472     
==========================================
+ Hits       111477   111969     +492     
+ Misses      40528    40493      -35     
- Partials     4828     4843      +15     
| Flag | Coverage Δ |
|---|---|
| backward-compatibility | 0.24% <0.00%> (-0.01%) ⬇️ |
| db-migration | 0.24% <0.00%> (-0.01%) ⬇️ |
| genesis-check | 1.38% <0.00%> (-0.01%) ⬇️ |
| integration-tests | 37.18% <1.04%> (+0.11%) ⬆️ |
| linux | 68.68% <2.08%> (-0.12%) ⬇️ |
| linux-nightly | 70.62% <2.08%> (+0.10%) ⬆️ |
| macos | 52.22% <1.04%> (+<0.01%) ⬆️ |
| pytests | 1.60% <0.00%> (-0.01%) ⬇️ |
| sanity-checks | 1.40% <0.00%> (-0.01%) ⬇️ |
| unittests | 65.56% <1.04%> (+0.06%) ⬆️ |
| upgradability | 0.29% <0.00%> (-0.01%) ⬇️ |

Flags with carried forward coverage won't be shown.


@@ -1040,6 +1040,9 @@ impl<'a> ChainStoreUpdate<'a> {
     DBCol::LatestChunkStateWitnesses => {
         store_update.delete(col, key);
     }
+    DBCol::LatesWitnessesByIndex => {
Contributor

typo: Latest

@@ -45,21 +47,22 @@ impl LatestWitnessesKey {
     /// `LatestWitnessesKey` has custom serialization to ensure that the binary representation
     /// starts with big-endian height and shard_id.
     /// This allows to query using a key prefix to find all witnesses for a given height (and shard_id).
-    pub fn serialized(&self) -> [u8; 64] {
-        let mut result = [0u8; 64];
+    pub fn serialized(&self) -> [u8; 72] {
Contributor

If we're no longer using range iteration, does it matter what encoding we're using? Should we instead use borsh?

Contributor Author

The encoding doesn't matter for the write path now, but it still matters for the read path. When someone queries for saved witnesses at a given height we can quickly find them using the prefix.

./neard view-state latest-witnesses --height 119372230 --shard-id 0
# create prefix (119372230, 0) and query `DBCol::LatestChunkStateWitnesses` with it
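
As a rough sketch of that read path (illustrative only; the real key is 72 bytes and contains more fields than shown here), the big-endian layout means all witnesses for a given height and shard share a fixed-size key prefix, so they can be found with a single prefix lookup instead of a scan:

```rust
// Hypothetical helper: build the (height, shard_id) key prefix.
// Big-endian encoding keeps keys ordered by height, and every witness for
// this (height, shard_id) pair starts with these same 16 bytes.
fn latest_witness_key_prefix(height: u64, shard_id: u64) -> [u8; 16] {
    let mut prefix = [0u8; 16];
    prefix[..8].copy_from_slice(&height.to_be_bytes());
    prefix[8..].copy_from_slice(&shard_id.to_be_bytes());
    prefix
}
```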

Contributor

Ahhh, gotcha. That makes sense. Thanks.

-        let (key_bytes, witness_bytes) = item?;
-        store_update.delete(DBCol::LatestChunkStateWitnesses, &key_bytes);
+        // Go over witnesses with increasing indexes and remove them until the limits are satisfied.
+        while !info.is_within_limits() && info.lowest_index < info.next_witness_index {
Contributor

What's the maximum number of old witnesses evicted this way? Is it (max witness size / min witness size)? If there are a bunch of small witnesses followed by a big witness, would that mean we have to look up quite a few witnesses when processing the large one? Should we place an upper limit on how many times this loop iterates, just to not block the main processing path?

Contributor

Speaking of which, I wonder if it makes sense for this whole thing to be on a separate thread (as in, a separate actor). I guess it's a bit overkill but the fewer things on the main thread the better.

Contributor

Oh wait I see that this is debug-only and not recommended for production use. Okay, maybe this is not a big deal then.

Contributor Author

Yeah it could potentially take a long time if it needs to remove a lot of saved witnesses :/
In the average case it only removes one or two, but in the worst case scenario it could need to remove thousands of witnesses.

It's a debug-only optional feature, and I'm not sure if anyone is really using it, so I didn't spend too much time optimizing things. We can fix it in a follow-up PR if it becomes a problem.
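
If that ever becomes a problem, one possible follow-up (suggested above, not implemented in this PR) would be to cap how many witnesses a single save is allowed to evict; a sketch, reusing the simplified types from the earlier example:

```rust
// Hypothetical cap on evictions per save; the value is illustrative only.
const MAX_EVICTIONS_PER_SAVE: u64 = 100;

/// Evict at most `MAX_EVICTIONS_PER_SAVE` old witnesses, so a single large
/// witness can't stall the caller for an unbounded amount of time.
fn evict_with_cap(
    info: &mut LatestWitnessesInfo,
    by_index: &mut std::collections::BTreeMap<u64, Vec<u8>>,
) {
    let mut evicted = 0u64;
    while !info.is_within_limits()
        && info.lowest_index < info.next_witness_index
        && evicted < MAX_EVICTIONS_PER_SAVE
    {
        if let Some(removed) = by_index.remove(&info.lowest_index) {
            info.count -= 1;
            info.total_size -= removed.len() as u64;
        }
        info.lowest_index += 1;
        evicted += 1;
    }
}
```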

@jancionear jancionear added this pull request to the merge queue May 22, 2024
Merged via the queue into near:master with commit c515b1c May 22, 2024
26 of 29 checks passed
@jancionear jancionear deleted the fix-latest-slow branch May 22, 2024 09:36
marcelo-gonzalez pushed a commit to marcelo-gonzalez/nearcore that referenced this pull request May 23, 2024