Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make eth1 caching work with fast synced node #709

Merged
merged 26 commits into from
Dec 19, 2019

Conversation

pawanjay176
Copy link
Member

Issue Addressed

Addresses #637

Proposed Changes

Adds methods to fetch deposit_root and deposit_count at any block height and uses them to populate the BlockCache in place of the calls to read contract state directly at previous blocks.

@pawanjay176
Copy link
Member Author

Currently, the issue is eth1 block caching starts simultaneously along with deposit caching and therefore we get wrong values for the deposit_root and deposit_count until the deposit cache is in sync with deposits on the eth1 side.

We could potentially stall the block caching until the deposit caching is in sync to get around this.

@pawanjay176 pawanjay176 added the work-in-progress PR is a work-in-progress label Dec 11, 2019
/// and returns the deposit count as `index + 1`.
///
/// Returns 0 if no logs are present.
/// Note: This function assumes that `block_number` > `deposit_contract_deploy_block`
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't think of a good way to handle the "before deposit contract deployed" case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we will have to take note of when the deposit contract was deployed and return an error if we search prior to this. This scenario should never happen, only if a testnet has been badly configured.

Copy link
Member

@paulhauner paulhauner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, it looks like you're on the right track!

The next step will be the hard part, that is implementing it into the beacon chain. I'm thinking you'll likely have to modify the following function such that it can read blocks from the block cache and "fill in" their deposit count and root from the deposit cache:

/// Calculates and returns `(new_eth1_data, all_eth1_data)` for the given `state`, based upon the
/// blocks in the `block` iterator.
///
/// `prev_eth1_hash` is the `eth1_data.block_hash` at the start of the voting period defined by
/// `state.slot`.
fn eth1_data_sets<'a, I>(
blocks: I,
prev_eth1_hash: Hash256,
voting_period_start_seconds: u64,
spec: &ChainSpec,
log: &Logger,
) -> Option<(Eth1DataBlockNumber, Eth1DataBlockNumber)>
where
I: DoubleEndedIterator<Item = &'a Eth1Block> + Clone,
{
let eth1_follow_distance = spec.eth1_follow_distance;
let in_scope_eth1_data = blocks
.rev()
.skip_while(|eth1_block| eth1_block.timestamp > voting_period_start_seconds)
.skip(eth1_follow_distance as usize)
.filter_map(|block| Some((block.clone().eth1_data()?, block.number)));
if in_scope_eth1_data
.clone()
.any(|(eth1_data, _)| eth1_data.block_hash == prev_eth1_hash)
{
let new_eth1_data = in_scope_eth1_data
.clone()
.take(eth1_follow_distance as usize);
let all_eth1_data =
in_scope_eth1_data.take_while(|(eth1_data, _)| eth1_data.block_hash != prev_eth1_hash);
Some((
HashMap::from_iter(new_eth1_data),
HashMap::from_iter(all_eth1_data),
))
} else {
error!(
log,
"The previous eth1 hash is not in cache";
"previous_hash" => format!("{:?}", prev_eth1_hash)
);
None
}
}

Given that we might be collecting several hundred blocks whilst we're voting, having the O(log n) binary search through the deposits would be nice.

Some(
self.logs
.iter()
.take_while(|log| block_number >= log.block_number)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering that the logs are ordered, we could do a binary search here.

@pawanjay176 pawanjay176 added the ready-for-review The code is ready for review label Dec 17, 2019
@paulhauner paulhauner self-assigned this Dec 17, 2019
@paulhauner paulhauner removed the work-in-progress PR is a work-in-progress label Dec 17, 2019
Copy link
Member

@paulhauner paulhauner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good. A couple of minor comments and a potential optimization that I think is necessary.

Keen to hear your thoughts.

@@ -3,6 +3,8 @@ use eth2_hashing::hash;
use tree_hash::TreeHash;
use types::{Deposit, Hash256};

const DEPOSIT_CONTRACT_TREE_DEPTH: usize = 32;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This const tends to keep getting duplicated around, perhaps we can import this instead:

pub const DEPOSIT_TREE_DEPTH: usize = 32;

DepositCache {
logs: Vec::new(),
roots: Vec::new(),
// 0 to be compatible with Service::Config. Should be ideally 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we change Service::Config? Your suggestion seems quite reasonable.

pub fn get_deposit_root_from_cache(&self, block_number: u64) -> Option<Hash256> {
let index = self.get_deposit_count_from_cache(block_number)?;
let roots = self.roots.get(0..index as usize)?;
let tree = DepositDataTree::create(roots, index as usize, DEPOSIT_CONTRACT_TREE_DEPTH);
Copy link
Member

@paulhauner paulhauner Dec 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you mentioned, this is going to be fairly inefficient.

For our testnet, this means that if we sync a cache of 4,096 blocks we're going to have to find the root of a list of 16,384 for each of those blocks. I just benched a tree hash root of 16k hashes at 8ms, so we're looking at 32 secs of hashing just to fill the cache.

I think a fairly easy solution to this would be to attach a DepositDataTree and a Vec<u64, Hash256> to the deposit cache. Each time we import a deposit we add the deposit to the DepositDataTree and then push the new root into the Vec. In this scenario, we incrementally build the deposit tree once and when we want to resolve a block number to a deposit root we can just binary search the Vec.

Thoughts?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, this sounds perfect 👍

@paulhauner paulhauner added waiting-on-author The reviewer has suggested changes and awaits thier implementation. and removed ready-for-review The code is ready for review labels Dec 18, 2019
}

/// Mirrors the merkle tree of deposits in the eth1 deposit contract.
///
/// Provides `Deposit` objects with merkle proofs included.
#[derive(Default)]
pub struct DepositCache {
logs: Vec<DepositLog>,
roots: Vec<Hash256>,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@paulhauner Perhaps this should be named leafs instead of roots?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

@pawanjay176 pawanjay176 added ready-for-review The code is ready for review and removed waiting-on-author The reviewer has suggested changes and awaits thier implementation. labels Dec 18, 2019
Copy link
Member

@paulhauner paulhauner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! There's a couple of tiny comments, but I'm happy to merge this in once you've looked at them :)

/// and queries the `deposit_roots` map to get the corresponding `deposit_root`.
pub fn get_deposit_root_from_cache(&self, block_number: u64) -> Option<Hash256> {
let index = self.get_deposit_count_from_cache(block_number)?;
Some(self.deposit_roots.get(index as usize)?.clone())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like how you indexed by deposit count, this is better than my suggestion.

}

/// Mirrors the merkle tree of deposits in the eth1 deposit contract.
///
/// Provides `Deposit` objects with merkle proofs included.
#[derive(Default)]
pub struct DepositCache {
logs: Vec<DepositLog>,
roots: Vec<Hash256>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

.binary_search_by(|deposit| deposit.block_number.cmp(&block_number));
match index {
Ok(index) => return self.logs.get(index).map(|x| x.index + 1),
Err(prev) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be next instead of prev?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup you are right

@paulhauner paulhauner added waiting-on-author The reviewer has suggested changes and awaits thier implementation. and removed ready-for-review The code is ready for review labels Dec 18, 2019
@pawanjay176 pawanjay176 added ready-for-review The code is ready for review and removed waiting-on-author The reviewer has suggested changes and awaits thier implementation. labels Dec 19, 2019
@paulhauner paulhauner added ready-to-merge and removed ready-to-merge ready-for-review The code is ready for review labels Dec 19, 2019
Copy link
Member

@paulhauner paulhauner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work @pawanjay176!

I didn't think this would make it into v0.1.1 but you made it happen!

@paulhauner paulhauner merged commit 74b327b into sigp:master Dec 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants