
[Forknet V2] Aggressive database trimmer that works with memtrie. #11420

Open · wants to merge 3 commits into master
Conversation

robin-near (Contributor) commented on May 30, 2024:

Closes #11280

A rewrite of #11310

Includes two new commands:

  • neard database aggressive-trimming --obliterate-disk-trie deletes all State keys except those needed for memtrie operation (large values not inlined by flat storage, as well as some trie nodes near the root). This is used to test that memtrie-only nodes can work.
  • neard fork-network finalize now rebuilds the database by copying out FlatState as well as the aforementioned necessary State keys, and then swapping the new database in to replace the old.
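The rebuild performed by finalize can be sketched roughly as follows. This is a minimal, hypothetical approximation: a HashMap keyed by (column, key) stands in for the real column store, and none of these names are the actual neard API.

```rust
use std::collections::HashMap;

// Stand-in for the column store: (column name, key) -> value.
type Db = HashMap<(&'static str, Vec<u8>), Vec<u8>>;

// Build a fresh database containing FlatState plus only the State keys
// that memtrie still needs (large values not inlined by flat storage,
// and trie nodes near the root).
fn rebuild(old: &Db, important_state_keys: &[Vec<u8>]) -> Db {
    let mut new = Db::new();
    // 1. Copy every FlatState entry verbatim.
    for ((col, key), val) in old {
        if *col == "FlatState" {
            new.insert((*col, key.clone()), val.clone());
        }
    }
    // 2. Copy only the necessary State keys.
    for key in important_state_keys {
        if let Some(val) = old.get(&("State", key.clone())) {
            new.insert(("State", key.clone()), val.clone());
        }
    }
    // 3. The caller then swaps `new` in, replacing the old database (elided).
    new
}

fn main() {
    let mut old = Db::new();
    old.insert(("FlatState", b"a".to_vec()), b"1".to_vec());
    old.insert(("State", b"root".to_vec()), b"2".to_vec());
    old.insert(("State", b"stale".to_vec()), b"3".to_vec());
    let new = rebuild(&old, &[b"root".to_vec()]);
    assert_eq!(new.len(), 2);
    assert!(!new.contains_key(&("State", b"stale".to_vec())));
    println!("ok");
}
```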

codecov bot commented on May 30, 2024:

Codecov Report

Attention: Patch coverage is 14.43299% with 249 lines in your changes missing coverage. Please review.

Project coverage is 71.19%. Comparing base (a4d9742) to head (df53d23).
Report is 42 commits behind head on master.

Files Patch % Lines
tools/database/src/utils.rs 0.00% 140 Missing ⚠️
tools/fork-network/src/trim_database.rs 0.00% 31 Missing ⚠️
core/store/src/trie/mem/parallel_loader.rs 59.70% 27 Missing ⚠️
tools/fork-network/src/cli.rs 0.00% 24 Missing ⚠️
tools/database/src/aggressive_trimming.rs 0.00% 21 Missing and 1 partial ⚠️
core/store/src/lib.rs 0.00% 3 Missing ⚠️
core/store/src/trie/mem/loading.rs 66.66% 0 Missing and 1 partial ⚠️
tools/database/src/commands.rs 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #11420      +/-   ##
==========================================
- Coverage   71.44%   71.19%   -0.26%     
==========================================
  Files         785      789       +4     
  Lines      158765   159645     +880     
  Branches   158765   159645     +880     
==========================================
+ Hits       113432   113656     +224     
- Misses      40436    41047     +611     
- Partials     4897     4942      +45     
Flag Coverage Δ
backward-compatibility 0.23% <0.00%> (-0.01%) ⬇️
db-migration 0.23% <0.00%> (-0.01%) ⬇️
genesis-check 1.36% <0.00%> (-0.02%) ⬇️
integration-tests 37.37% <12.37%> (-0.20%) ⬇️
linux 68.64% <14.43%> (-0.29%) ⬇️
linux-nightly 70.71% <14.43%> (-0.22%) ⬇️
macos 52.01% <14.43%> (-0.27%) ⬇️
pytests 1.58% <0.00%> (-0.02%) ⬇️
sanity-checks 1.37% <0.00%> (-0.02%) ⬇️
unittests 65.92% <14.43%> (-0.26%) ⬇️
upgradability 0.28% <0.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown.

@robin-near robin-near changed the title Draft trimmer [Forknet V2] Aggressive database trimmer that works with memtrie. May 31, 2024
@robin-near robin-near force-pushed the delete3 branch 2 times, most recently from 4296581 to a3c0caa Compare May 31, 2024 21:18
@robin-near robin-near marked this pull request as ready for review May 31, 2024 21:21
@robin-near robin-near requested a review from a team as a code owner May 31, 2024 21:21
@robin-near robin-near requested review from tayfunelmas, Longarithm and marcelo-gonzalez and removed request for Longarithm May 31, 2024 21:21
marcelo-gonzalez (Contributor) left a comment:

Overall this looks pretty sane to me, but sadly it doesn't seem to work right now on localnet... @robin-near, let me know if you have trouble reproducing with the instructions I posted in that linked comment and I can give more details. I'm not sure if this is just some quirk of the localnet pytests that makes something break here, but it looks like something fairly simple is going wrong that hopefully shouldn't be too bad to fix.

})
.collect::<Vec<_>>();
split_points.sort();
split_points.remove(0); // skip first split point
Contributor commented:

Nit: could we also just start iterating from 1 below instead of this O(n) operation? Though if I understand correctly this will have around 256 elements, so it doesn't matter too much?

robin-near (Contributor, Author) replied:

I'm afraid of a crash, though, if the whole range happens to be the empty prefix. I could also start iterating from 1, but that makes it harder to see at a glance that the code is safe.

The O(n) operation is trivial anyway, since this will have at most a couple of thousand elements.
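For illustration, the pattern under discussion can be sketched like this (a standalone approximation, not the PR's actual code): sort the prefix split points, then drop the leading empty prefix, guarded so an all-empty input cannot panic.

```rust
// Sketch of the split-point handling: after sorting, the first element is
// the empty prefix, which delimits nothing and can be dropped. Guarding
// with is_empty() keeps remove(0) safe even in degenerate cases.
fn trim_split_points(mut split_points: Vec<Vec<u8>>) -> Vec<Vec<u8>> {
    split_points.sort();
    if !split_points.is_empty() {
        // O(n) removal, but n is at most a couple of thousand here.
        split_points.remove(0);
    }
    split_points
}

fn main() {
    let pts = vec![vec![0x80], vec![], vec![0x40]];
    assert_eq!(trim_split_points(pts), vec![vec![0x40], vec![0x80]]);
    assert!(trim_split_points(vec![]).is_empty());
    println!("ok");
}
```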

}
important_keys
});
for shard_uid in shard_uids {
Contributor commented:

Not a big deal to do this now, but do we also want to parallelize this part? We might wait a while working on a shard that doesn't fully utilize all cores, when we could be making progress on other shards too.

robin-near (Contributor, Author) replied:

It wouldn't be of much help. This part is negligible compared to the copying-over of flat state. In practice it'll take at most a few seconds per shard, whereas the copying will take a few minutes.
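For reference, per-shard parallelism of the kind being discussed would look roughly like this (a generic std::thread sketch with a placeholder work function, not the PR's code):

```rust
use std::thread;

// Placeholder for the per-shard trimming work; returns a fake count of
// processed keys so the example has something to check.
fn trim_shard(shard_uid: u64) -> u64 {
    shard_uid + 1
}

fn main() {
    let shard_uids = vec![0u64, 1, 2, 3];
    // One thread per shard; real code would bound this with a thread pool.
    let handles: Vec<_> = shard_uids
        .into_iter()
        .map(|uid| thread::spawn(move || trim_shard(uid)))
        .collect();
    // Join all shards and combine their results.
    let total: u64 = handles.into_iter().map(|h| h.join().unwrap()).sum();
    assert_eq!(total, 10);
    println!("ok");
}
```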

let temp_store = temp_storage.get_hot_store();
let temp_store_path = temp_store_home.join("data");

trim_database::trim_database(store, &near_config.genesis.config, temp_store)?;
Contributor commented:

I'm getting this error:

thread 'main' panicked at tools/fork-network/src/cli.rs:148:88:
called `Result::unwrap()` on an `Err` value: MissingTrieValue(TrieStorage, 11111111111111111111111111111111)

To reproduce, apply the diff and run the two commands I posted in this comment: #11310 (comment)

I'm not sure exactly what is happening, but the error comes from the call to prepare_memtrie_state_trimming() in trim_database().

robin-near (Contributor, Author) replied:

Oh. I think I know what happened, let me fix that.

robin-near (Contributor, Author):

Hmm, I had some trouble running the commands and I'm not sure whether I did something wrong or something else is off, but I uploaded a fix that I think addresses the issue (handling empty tries); do you mind trying again?
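The empty-trie case can be sketched as follows. This is a hypothetical illustration, not the actual fix: the 32 ones in the earlier panic message are the base58 rendering of an all-zeros hash, which is what nearcore uses for an empty state root, so trimming has to special-case it rather than try to read a root node that doesn't exist.

```rust
// Stand-in for nearcore's CryptoHash.
type CryptoHash = [u8; 32];

// Hypothetical key collector: an empty trie (all-zeros root) has no nodes,
// so looking one up would surface MissingTrieValue; return early instead.
fn important_state_keys(state_root: &CryptoHash) -> Vec<Vec<u8>> {
    if state_root == &[0u8; 32] {
        return Vec::new();
    }
    // ... walk the trie from `state_root` to collect keys (elided);
    // here we just return the root hash itself as a placeholder.
    vec![state_root.to_vec()]
}

fn main() {
    assert!(important_state_keys(&[0u8; 32]).is_empty());
    assert_eq!(important_state_keys(&[7u8; 32]).len(), 1);
    println!("ok");
}
```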

Contributor commented:

Ah, now I'm getting this:

thread 'main' panicked at tools/fork-network/src/cli.rs:148:88:
called `Result::unwrap()` on an `Err` value: MissingTrieValue(TrieStorage, 6Mswb4EnCzycL7n6bMEGaE87nnEboTj7K2BS3GyJjZbS)

Actually, I don't know why I recommended doing the whole thing with the local mocknet setup. You can reproduce this by just first running python3 tests/sanity/transactions.py

and then after that test is done, write something like this to /tmp/validators.json:

[{"account_id": "node0", "public_key": "ed25519:sqYjRhP3P1vzYY2nA7w2hCTZoA8hknuXpSeaReRxTKf", "amount": "1000000000000000000000000000000000"}, {"account_id": "node1", "public_key": "ed25519:4WFsnnwUj69smErNuidUEfeJs3fpHCQUJNsYzeghQVyT", "amount": "1000000000000000000000000000000000"}]

then just run the sequence of fork-network commands:

neard --home ~/.near/test4_finished fork-network init
neard --home ~/.near/test4_finished fork-network amend-access-keys
neard --home ~/.near/test4_finished fork-network set-validators --validators /tmp/validators.json
neard --home ~/.near/test4_finished fork-network finalize

marcelo-gonzalez (Contributor) commented:

Still getting a crash like the one I posted in the other comment :(. Also, one thing I just realized: do we really want to make this the default behavior of fork-network finalize, or would it be better to add an option controlling whether to do it? With the current PR, we'd be forcing this DB trimming on the network. Is this expected to work properly in theory even if we're not loading memtries? I'm not familiar enough with that code to tell whether it should all still be fine, but if not, then unless we make the behavior implemented in this PR an optional flag, we'd be forcing every forknet/mocknet run to load memtries.
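If the trimming were made opt-in as suggested, the shape of the change could look like this (purely hypothetical flag name and structure, not the PR's actual CLI):

```rust
// Hypothetical finalize command with an opt-in trimming flag.
struct FinalizeCmd {
    // e.g. a `--trim-database` switch; hypothetical, not an actual neard flag.
    trim_database: bool,
}

// Return the ordered steps finalize would run for a given configuration.
fn finalize_steps(cmd: &FinalizeCmd) -> Vec<&'static str> {
    let mut steps = Vec::new();
    if cmd.trim_database {
        // Only runs when explicitly requested, so runs that don't load
        // memtries keep the old, untrimmed behavior.
        steps.push("trim-database");
    }
    steps.push("swap-in-new-db");
    steps
}

fn main() {
    assert_eq!(
        finalize_steps(&FinalizeCmd { trim_database: false }),
        vec!["swap-in-new-db"]
    );
    assert_eq!(
        finalize_steps(&FinalizeCmd { trim_database: true }),
        vec!["trim-database", "swap-in-new-db"]
    );
    println!("ok");
}
```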

Successfully merging this pull request may close these issues.

[Forknet] Copy out State and FlatState instead of deleting all other columns