New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(epoch-sync): adding EpochSyncInfo validation tool #10146
Conversation
tools/epoch-sync/Cargo.toml
Outdated
near-store.workspace = true | ||
|
||
[features] | ||
new_epoch_sync = ["nearcore/new_epoch_sync"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this crate currently only does anything at all when this feature is enabled, wouldn’t it be better if the dependency on the crate itself was optional over implementing a crate feature like this right now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried my best :) 67aea72
Please, check this commit
neard/Cargo.toml
Outdated
@@ -40,6 +40,7 @@ near-config-utils.workspace = true | |||
near-crypto.workspace = true | |||
near-database-tool.workspace = true | |||
near-dyn-configs.workspace = true | |||
near-epoch-sync-tool = { path = "../tools/epoch-sync", optional = true } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I was thinking that the path would remain in the Cargo.toml
, just when specyfing an optional dependency there’d be
near-epoch-sync-tool = { workspace = true, optional = true }
There are a couple of examples of this in e.g. nearcore/Cargo.toml
around near-jsonrpc
dependency. Is there any reason why you have chosen to do it this way instead? Did something not work the way you expected?
tools/epoch-sync/Cargo.toml
Outdated
near-store.workspace = true | ||
|
||
[features] | ||
default = ["nearcore/new_epoch_sync"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is a required feature for this crate to work, I believe you want
nearcore = { workspace = true, features = ["new_epoch_sync"] }
The rule of thumb to use in making the choice between these two options is: "would this crate compile if the default features were not enabled for some reason (e.g. somebody built code with the --no-default-features
flag)?" If the answer is "yes", then the current suggested approach is fine. Otherwise the feature has to be enabled unconditionally as suggested in this comment.
let mut hash_to_next_hash = HashMap::new(); | ||
let mut cur_head = *chain_head; | ||
while let Some(prev_hash) = hash_to_prev_hash.get(&cur_head) { | ||
hash_to_next_hash.insert(*prev_hash, cur_head); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can assert!()
that the insertion happened. This would detect forks in the recent epochs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, with forks, the mapping from cur_hash
to next_hash
is not correct, as we may encounter key collision.
To avoid that I don't build this mapping from the whole hash_to_prev_hash
mapping, but only from one chain that end with provided chain_head
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add this as a comment to the code.
fn get_hash_to_prev_hash_from_block_header( | ||
storage: &NodeStorage, | ||
) -> anyhow::Result<HashMap<CryptoHash, CryptoHash>> { | ||
let mut hash_to_prev_hash = HashMap::new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you are iterating over all headers available, I think you may encounter forks.
Are forks garbage collected in the normal garbage collection or sooner (in finalization of blocks?)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even with forks mapping from cur_hash
to prev_hash
is unique, as there will not be two blocks with the same hash, so there will not be key collision. Value collision may happen, but it is irrelevant for HashMap
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add this as a comment to the code.
|
||
// Edge case if we exactly at the epoch boundary. | ||
if epoch_ids.contains(&cur_hash) { | ||
cur_hash = hash_to_prev_hash[&cur_hash]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it ok to skip the header head block?
Don't we need to check that the last block of an epoch exists and has block info?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EpochSyncInfo
is only constructed when the next epoch has a finalised block, because we need a header with epoch_sync_data_hash
present.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, please add that as a comment to the code.
let next_epoch_first_hash = hash_to_next_hash[last_header.hash()]; | ||
tracing::debug!("Creating EpochSyncInfo from block {:?}", last_header); | ||
|
||
let epoch_sync_info = chain_update.create_epoch_sync_info( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm lost.
The function is called validate
, but you're not reading any EpochSyncInfo from storage, but create them and validate the ones created?
A few comments explaining what you're validating will be helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I will add the comment. I validate the epoch sync implementation, not the DB.
|
||
// Edge case if we exactly at the epoch boundary. | ||
if epoch_ids.contains(&cur_hash) { | ||
cur_hash = hash_to_prev_hash[&cur_hash]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, please add that as a comment to the code.
fn get_hash_to_prev_hash_from_block_header( | ||
storage: &NodeStorage, | ||
) -> anyhow::Result<HashMap<CryptoHash, CryptoHash>> { | ||
let mut hash_to_prev_hash = HashMap::new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add this as a comment to the code.
let mut hash_to_next_hash = HashMap::new(); | ||
let mut cur_head = *chain_head; | ||
while let Some(prev_hash) = hash_to_prev_hash.get(&cur_head) { | ||
hash_to_next_hash.insert(*prev_hash, cur_head); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add this as a comment to the code.
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## master #10146 +/- ##
==========================================
- Coverage 72.04% 71.89% -0.16%
==========================================
Files 706 707 +1
Lines 141305 141788 +483
Branches 141305 141788 +483
==========================================
+ Hits 101804 101937 +133
- Misses 34807 35131 +324
- Partials 4694 4720 +26
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
Adding tool for testing epoch sync implementation.
Right now only one subcommand is available
validate-epoch-sync-info
. It constructsEpochSyncInfo
for every finalised epoch in DB and checks thatEpochSyncInfo
passes validation.Currently we only have one validation (kinda) implemented -- checking of
epoch_sync_data_hash
.As epoch sync is under feature and that feature includes DB upgrade, the tool first creates a snapshot of the DB, and then modifies the config to work with the snapshot.