-
Notifications
You must be signed in to change notification settings - Fork 216
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-3710: Introduce commit_purge API #3811
GH-3710: Introduce commit_purge API #3811
Conversation
The failure comes from the ScratchTrie which caches the Trie objects, rather than bytes. To ensure that no V is deserialized while deleting, we're bypassing a cache, and that leads to a failure after first key is deleted.
With a cache that operates at byte level we can still perform optimizations such as avoiding deserialization.
If an operation only uses get() instead of get_raw() it may fail while writing root to db.
Fixture was regenerated on top of commit 8038340.
This also ensures no new EraInfo objects are produced, and passes all the integration tests.
The remote assets are only compiled for x86-64 which prevents running the tests on other platforms (like ARM aarch64). Add ability to override the base URL so that users can run the tests using assets from a different location, or locally cached assets by specifying a `file://<local_directory>` URL. Signed-off-by: Alexandru Sardan <alexandru@casperlabs.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Partial review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Further comments, review still incomplete.
/// Outcomes: | ||
/// * Ok(Some(range)) -- these keys should be purged | ||
/// * Ok(None) -- nothing to do, either done, or there is not enough eras to purge | ||
fn calculate_purge_eras( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fn calculate_purge_eras( | |
fn calculate_prune_eras( |
It seems like we prefer this process to be called "pruning", which can be seen in other places already.
And if not, then change "prune" to "purge" in other places (for consistency).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Fraser999 Thoughts on renaming? There's also chainspec entry purge_batch_size
to change which we've already communicated to test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Ed's preference was also "prune", so maybe worth changing now before the PR merges.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like you renamed it to prune everywhere except this file! 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course I forgot 🤦 Fixed
This regression was introduced 04de5f0 and resulted in purge starting from next switch block after activation, instead of starting within first block activation.
/// Outcomes: | ||
/// * Ok(Some(range)) -- these keys should be purged | ||
/// * Ok(None) -- nothing to do, either done, or there is not enough eras to purge | ||
fn calculate_purge_eras( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like you renamed it to prune everywhere except this file! 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optimistic approval, given the one final comment on upgrade_scenario_11.sh is addressed.
I'll be doing a second review today, but nevertheless, we need a stamp from @dwerner and/or @EdHastingsCasperLabs on this. |
// Can't find key | ||
// Most likely this chain did not yet ran an auction, or recently completed a | ||
// prune |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this comment is accurate, and while it is basically harmless to do so I don't think it is necessary to log this circumstance as a warn
.
As far as I recall, a new network w/ genesis at this binary or higher would never have such an entry (correct behavior). An existing network upgraded into this binary must have such an entry. In a future binary we will either remove this handling altogether, or if it is left in place it would essentially translate to noop on networks that had already previously gone thru a migration.
Am i forgetting something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your comment is correct, the warn!
here was useful for testing early in the development to get the math right. I removed the warning.
@@ -2084,4 +2123,39 @@ where | |||
let max_height = self.config.max_runtime_call_stack_height() as usize; | |||
RuntimeStack::new_system_call_stack(max_height) | |||
} | |||
|
|||
/// Commit a prune of leaf nodes from the tip of the merkle trie. | |||
pub fn commit_prune( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick, but could you put this new func after commit_genesis and commit_upgrade for purposes of logical grouping?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
pub fn write_era_summary(&mut self, key: Key, value: EraInfo) { | ||
if let Key::EraSummary = key { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't think we can get away w/ changing the name of this public function in 1.x
instead, consider making a new function with the new behavior and forward the old function to it, mark the old fn as deprecated, and todo! remove it in 2.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call, I renamed it back (although I think we would be fine, as we don't reimport RuntimeContext as pub)
error!( | ||
previous_block_height, | ||
%state_root_hash, | ||
"commit_prune: root not found"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't we kill the node in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can do this and do a panic. It's impossible to hit this case. CC @Fraser999
types/src/key.rs
Outdated
Key::SystemContractRegistry => { | ||
result.append(&mut SYSTEM_CONTRACT_REGISTRY_KEY.to_bytes()?) | ||
result.append(&mut PADDING_BYTES.to_bytes()?) | ||
} | ||
Key::EraSummary => { | ||
result.append(&mut PADDING_BYTES.to_bytes()?); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe just | these arms together?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What a monster. See nitpicks but LGTM, congrats to all involved. Lets test the heck out it.
bors r+ |
3811: GH-3710: Introduce commit_purge API r=mpapierski a=mpapierski Closes #3710 This PR implements the "commit_purge" API in EE and additional machinery in the ContractRuntime component to execute it after each block. There are also improvements to the global state to ensure that deleting an entry from the tip of the trie does not involve deserialization, and additionally, scratch trie was modified to cache the bytes of a trie rather than the trie object itself to allow lazy deserialization. This should reduce the memory footprint of a large delete operation by about 6x. Currently, commit_purge API is called repeatedly after each block is executed with a slice of Key::EraInfo whose length is configured by chainspec's `purge_batch_size`. After several blocks, when there are no more Key::EraInfo entries, the process will be no-op, and at the next protocol version can be disabled by setting `purge_batch_size = 0` The purge of Key::EraInfo consists of two steps: - At the upgrade point last Key::EraInfo(activation_point_era - 1) will be copied into a Key::EraSummary. - At the end of each executed block there will be a call to commit_purge to delete N's first Key::EraInfo entries starting from 0. Co-authored-by: Michał Papierski <michal@casperlabs.io> Co-authored-by: Michał Papierski <michal@papierski.net> Co-authored-by: Alexandru Sardan <alexandru@casperlabs.io> Co-authored-by: Karan Dhareshwar <karan@casperlabs.io>
This PR was included in a batch that successfully built, but then failed to merge into release-1.4.14. It will not be retried. Additional information: {"message":"You're not authorized to push to this branch. Visit https://docs.github.com/articles/about-protected-branches/ for more information.","documentation_url":"https://docs.github.com/articles/about-protected-branches"} |
Closes #3710
This PR implements the "commit_purge" API in EE and additional machinery in the ContractRuntime component to execute it after each block.
There are also improvements to the global state to ensure that deleting an entry from the tip of the trie does not involve deserialization, and additionally, scratch trie was modified to cache the bytes of a trie rather than the trie object itself to allow lazy deserialization. This should reduce the memory footprint of a large delete operation by about 6x.
Currently, commit_purge API is called repeatedly after each block is executed with a slice of Key::EraInfo whose length is configured by chainspec's
purge_batch_size
. After several blocks, when there are no more Key::EraInfo entries, the process will be no-op, and at the next protocol version can be disabled by settingpurge_batch_size = 0
The purge of Key::EraInfo consists of two steps: