feat(trie): parallel storage roots #6903

Merged
merged 1 commit into from Mar 8, 2024

Conversation

rkrasiuk
Member

@rkrasiuk rkrasiuk commented Mar 1, 2024

Description

Supersedes #6576.
Builds on top of #6896.

Creates ParallelStateRoot and AsyncStateRoot incremental root calculators. See respective docs for details & differences.

Intended usage:
ParallelStateRoot - for internal use in the blockchain tree (integration will be done in a follow-up) and externally in sync environments
AsyncStateRoot - for external use in async environments
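
A minimal sketch of the two-phase idea behind a parallel root calculator, under loud assumptions: it uses only the standard library, `toy_storage_root` and `toy_state_root` are hypothetical names, and `DefaultHasher` stands in for real trie hashing, so this is not a Merkle Patricia Trie and not reth's API. Phase 1 computes each account's storage digest on its own scoped thread; phase 2 folds the results serially in address order.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};
use std::thread;

/// Hypothetical stand-in for one account's changed storage: slot -> value.
type Storage = BTreeMap<u64, u64>;

/// Toy "storage root": a hash over the sorted slots. NOT a real trie root.
fn toy_storage_root(storage: &Storage) -> u64 {
    let mut hasher = DefaultHasher::new();
    for (slot, value) in storage {
        slot.hash(&mut hasher);
        value.hash(&mut hasher);
    }
    hasher.finish()
}

/// Phase 1: compute every storage root on its own scoped thread.
/// Phase 2: fold the (address, storage_root) pairs serially, in address order.
fn toy_state_root(accounts: &BTreeMap<u64, Storage>) -> u64 {
    let storage_roots: Vec<(u64, u64)> = thread::scope(|scope| {
        // Spawn all workers first so they run concurrently, then join in order.
        let handles: Vec<_> = accounts
            .iter()
            .map(|(address, storage)| scope.spawn(move || (*address, toy_storage_root(storage))))
            .collect();
        handles.into_iter().map(|h| h.join().expect("worker panicked")).collect()
    });

    let mut hasher = DefaultHasher::new();
    for (address, root) in storage_roots {
        address.hash(&mut hasher);
        root.hash(&mut hasher);
    }
    hasher.finish()
}
```

One thread per account keeps the sketch short; a work-stealing pool (the diff excerpts in this thread use `into_par_iter`) avoids spawning a thread for every target.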

@rkrasiuk rkrasiuk added the C-enhancement (New feature or request) and A-trie (Related to Merkle Patricia Trie implementation) labels Mar 1, 2024
@rkrasiuk rkrasiuk marked this pull request as ready for review March 1, 2024 15:47
@rkrasiuk rkrasiuk requested a review from Rjected March 1, 2024 16:34
Member

@gakonst gakonst left a comment

OK with me, but I want @mattsse's approval

Comment on lines 210 to 266
#[tokio::test]
async fn random_async_root() {
    let manager = TaskManager::new(Handle::current());
    let task_executor = Arc::new(manager.executor());

    let factory = create_test_provider_factory();
    let consistent_view = ConsistentDbView::new(factory.clone());

    let mut rng = rand::thread_rng();
    let mut state = (0..100)
        .map(|_| {
            let address = Address::random();
            let account =
                Account { balance: U256::from(rng.gen::<u64>()), ..Default::default() };
            let mut storage = HashMap::<B256, U256>::default();
            let has_storage = rng.gen_bool(0.7);
            if has_storage {
                for _ in 0..100 {
                    storage.insert(
                        B256::from(U256::from(rng.gen::<u64>())),
                        U256::from(rng.gen::<u64>()),
                    );
                }
            }
            (address, (account, storage))
        })
        .collect::<HashMap<_, _>>();
Member

this should be a fuzztest with the techniques we used in the serial processor?
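
The thread does not show which techniques the serial processor's tests use, so the following is only a hedged sketch of the general shape: a randomized differential test that generates many states and checks that a parallel computation agrees with a serial reference. Everything here is hypothetical (`XorShift`, `digest`, `serial_root`, `parallel_root`, `find_mismatch`), the digests are toy arithmetic rather than trie roots, and a hand-rolled generator is used only to stay within the standard library; a property-testing crate would normally supply generation and shrinking.

```rust
use std::collections::BTreeMap;
use std::thread;

/// Tiny xorshift64 generator (seed must be nonzero); avoids external crates.
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Hypothetical state: address -> storage slot values.
type State = BTreeMap<u64, Vec<u64>>;

/// Toy per-account digest standing in for a storage root.
fn digest(slots: &[u64]) -> u64 {
    slots
        .iter()
        .fold(0xcbf2_9ce4_8422_2325_u64, |acc, s| (acc ^ *s).wrapping_mul(0x0100_0000_01b3_u64))
}

/// Serial reference implementation.
fn serial_root(state: &State) -> u64 {
    state.iter().fold(0u64, |acc, (addr, slots)| acc.rotate_left(5) ^ *addr ^ digest(slots))
}

/// Parallel implementation: per-account digests on scoped threads, serial fold.
fn parallel_root(state: &State) -> u64 {
    let digests: Vec<(u64, u64)> = thread::scope(|scope| {
        let handles: Vec<_> = state
            .iter()
            .map(|(addr, slots)| scope.spawn(move || (*addr, digest(slots))))
            .collect();
        handles.into_iter().map(|h| h.join().expect("worker panicked")).collect()
    });
    digests.into_iter().fold(0u64, |acc, (addr, d)| acc.rotate_left(5) ^ addr ^ d)
}

/// Random state: up to 15 accounts, roughly 70% of them with storage.
fn random_state(rng: &mut XorShift) -> State {
    let accounts = rng.next_u64() % 16;
    let mut state = State::new();
    for _ in 0..accounts {
        let addr = rng.next_u64();
        let has_storage = rng.next_u64() % 10 < 7;
        let slot_count = if has_storage { rng.next_u64() % 32 } else { 0 };
        let slots: Vec<u64> = (0..slot_count).map(|_| rng.next_u64()).collect();
        state.insert(addr, slots);
    }
    state
}

/// Returns the first generated state on which the two implementations disagree.
fn find_mismatch(seed: u64, cases: usize) -> Option<State> {
    let mut rng = XorShift(seed);
    (0..cases)
        .map(|_| random_state(&mut rng))
        .find(|state| serial_root(state) != parallel_root(state))
}
```

Returning the offending state, rather than just a boolean, makes a failure reproducible without rerunning the generator.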

Comment on lines +100 to +128
let provider_ro = self.view.provider_ro()?;
let hashed_cursor_factory =
    HashedPostStateCursorFactory::new(provider_ro.tx_ref(), &hashed_state_sorted);
let trie_cursor_factory = provider_ro.tx_ref();

let hashed_account_cursor =
    hashed_cursor_factory.hashed_account_cursor().map_err(ProviderError::Database)?;
let trie_cursor =
    trie_cursor_factory.account_trie_cursor().map_err(ProviderError::Database)?;

let walker = TrieWalker::new(trie_cursor, prefix_sets.account_prefix_set)
    .with_updates(retain_updates);
let mut account_node_iter = AccountNodeIter::new(walker, hashed_account_cursor);
let mut hash_builder = HashBuilder::default().with_updates(retain_updates);
Member

makes me think we may want helpers for setting these things up cuz now it's a bit verbose

Collaborator

@mattsse mattsse left a comment

the async impl is dangerous because this can clog the tokio pool with a lot of blocking work
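
One common way to address this kind of concern is to keep long-running work off the async worker threads altogether. The sketch below illustrates that with the standard library only: the heavy job runs on a dedicated OS thread and the caller gets a channel receiver back. `expensive_root` and `spawn_root_job` are hypothetical names and the arithmetic is a toy. In a tokio program, `tokio::task::spawn_blocking` serves a similar purpose; the thread does not say which approach, if any, the PR adopted.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical CPU-heavy job standing in for a state root computation.
fn expensive_root(leaves: Vec<u64>) -> u64 {
    leaves
        .iter()
        .fold(0u64, |acc, leaf| acc.rotate_left(7) ^ leaf.wrapping_mul(0x9E37_79B9_7F4A_7C15))
}

/// Runs the job on a dedicated OS thread and hands back a receiver, so the
/// calling thread stays free until it chooses to wait for the result.
fn spawn_root_job(leaves: Vec<u64>) -> mpsc::Receiver<u64> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // A send error only means the caller dropped the receiver; ignore it.
        let _ = tx.send(expensive_root(leaves));
    });
    rx
}
```

The caller can do other work and call `recv()` on the returned receiver only when the result is actually needed.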

Comment on lines +82 to +96
let mut storage_roots = storage_root_targets
    .into_par_iter()
Collaborator

how large would this be usually?
and how long does one iteration take?

let mut storage_roots = storage_root_targets
    .into_par_iter()
    .map(|(hashed_address, prefix_set)| {
        let provider_ro = self.view.provider_ro()?;
Collaborator

is this even necessary?
since Tx: Send + Sync, you should be able to access this in this scope, so you only need to create it once

although the tx ops will sync via mutex

Member Author

absolutely, because the calculations will choke on that mutex

// Pre-calculate storage roots in parallel for accounts which were changed.
debug!(target: "trie::parallel_state_root", len = storage_root_targets.len(), "pre-calculating storage roots");
let mut storage_roots = storage_root_targets
    .into_par_iter()
Collaborator

probably a good idea to use chunks here instead to reduce ro txs and rayon overhead

Member Author

the aforementioned overhead is smaller than the overhead of consecutive blocking ops imo
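
To make the trade-off in this exchange concrete, here is a hedged, standard-library-only sketch in which the number of read handles opened is counted explicitly. `ReadHandle`, `root_for`, and `roots_chunked` are hypothetical and the "root" is toy arithmetic. With `chunk_size == 1` every target gets its own handle and thread; larger chunks open fewer handles but process the targets inside a chunk one after another. The sketch does not measure which cost dominates in practice.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Hypothetical read-only handle; opening one bumps a counter so the cost is visible.
struct ReadHandle;

impl ReadHandle {
    fn open(opened: &AtomicUsize) -> Self {
        opened.fetch_add(1, Ordering::Relaxed);
        ReadHandle
    }

    /// Toy lookup standing in for a storage root computation.
    fn root_for(&self, target: u64) -> u64 {
        target.wrapping_mul(31).rotate_left(3)
    }
}

/// Processes `targets` in chunks, using one thread and one handle per chunk.
/// `chunk_size` must be nonzero; `chunk_size == 1` means one handle per target.
fn roots_chunked(targets: &[u64], chunk_size: usize, opened: &AtomicUsize) -> Vec<u64> {
    thread::scope(|scope| {
        let handles: Vec<_> = targets
            .chunks(chunk_size)
            .map(|chunk| {
                scope.spawn(move || {
                    // One handle per chunk; targets within the chunk run sequentially.
                    let handle = ReadHandle::open(opened);
                    chunk.iter().map(|t| handle.root_for(*t)).collect::<Vec<u64>>()
                })
            })
            .collect();
        // Joining in spawn order keeps the output aligned with the input order.
        handles.into_iter().flat_map(|h| h.join().expect("worker panicked")).collect()
    })
}
```

Both settings produce the same results; only the number of handles and the amount of sequential work per thread differ.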

@rkrasiuk
Member Author

rkrasiuk commented Mar 4, 2024

Benchmarks for state size x where x / 2 accounts have storage tries updated

State size: 1 000
Screenshot 2024-03-04 at 15 39 18

State size: 3 000
Screenshot 2024-03-04 at 15 39 53

State size: 5 000
Screenshot 2024-03-04 at 15 40 17

State size: 10 000
Screenshot 2024-03-04 at 15 40 40

@rkrasiuk
Member Author

rkrasiuk commented Mar 4, 2024

Updated numbers with intermediate nodes committed to database:

State size: 1 000
Screenshot 2024-03-04 at 19 19 57

State size: 3 000
Screenshot 2024-03-04 at 19 20 25

State size: 5 000
Screenshot 2024-03-04 at 19 20 43

State size: 10 000
Screenshot 2024-03-04 at 19 21 01


#[cfg(feature = "parallel")]
/// Implementation of parallel state root computation.
pub mod parallel_root;
Member

nit: cfg after doccomments

@gakonst
Member

gakonst commented Mar 7, 2024

Beta is cut. Should we merge and ship to benchmarkooors?

@rkrasiuk rkrasiuk enabled auto-merge March 8, 2024 13:16
@rkrasiuk rkrasiuk added this pull request to the merge queue Mar 8, 2024
Merged via the queue into main with commit 9569692 Mar 8, 2024
28 checks passed
@rkrasiuk rkrasiuk deleted the rkrasiuk/trie-parallel branch March 8, 2024 13:31
Labels
A-trie Related to Merkle Patricia Trie implementation C-enhancement New feature or request

5 participants