Skip to content

feat(state bridge): gossip contract bytecode and storage#1319

Merged
KolbyML merged 1 commit intoethereum:masterfrom
KolbyML:next-state-pr
Jun 20, 2024
Merged

feat(state bridge): gossip contract bytecode and storage#1319
KolbyML merged 1 commit intoethereum:masterfrom
KolbyML:next-state-pr

Conversation

@KolbyML
Copy link
Copy Markdown
Member

@KolbyML KolbyML commented Jun 15, 2024

What was wrong?

We weren't gossiping contract bytecode and storage, Milos really wanted me to do this before working on being able to execute to the front of the chain. It isn't really that big of a deal, so I am getting this done quick

How was it fixed?

By adding code to gossip the contract bytecode

For storage because the storage root is updated multiple times per a block, we maintain a cache of updated storage keys per an account.

This implementation is meant to be simple and work, I am aware of optimizations, but those are yet to be seen as needed so we will wait util we run into walls and go from there.

@KolbyML KolbyML force-pushed the next-state-pr branch 2 times, most recently from b4e3235 to afae361 Compare June 16, 2024 09:20
@KolbyML KolbyML requested a review from morph-dev June 16, 2024 22:49
@KolbyML KolbyML marked this pull request as ready for review June 16, 2024 22:49
@KolbyML KolbyML self-assigned this Jun 17, 2024
Copy link
Copy Markdown
Collaborator

@morph-dev morph-dev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say no major issues, but will approve once #1318 is merged.


// gossip contract bytecode
let code = state.database.code_by_hash_ref(account.code_hash)?;
self.gossip_contract_bytecode(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we gossip contract's code every block its storage is updated?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a simple/naive implementation that works. We can optimize it when it becomes a bottleneck, but for now gossiping it every block the contract's storage is touched isn't an issue

B256::from_slice(&raw_address_hash)
}

pub fn address_to_path(address: Address) -> Vec<u8> {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe address_to_nibbles or address_to_nibble_path?

))
}

pub fn full_key_path_to_address_hash(key_path: &[u8]) -> B256 {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe nibble_path_to_packed_path?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That doesn't make sense to me as we are getting the address's hash or keccak(addressO

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is very generic, and it's not aware and it doesn't care if input is address, hash of address, hash of storage slot, or anything else.

What it does is:

  • takes byte slice that should be 64 bytes long (otherwise it panics at B256::from_slice(...))
    • actually even 63 bytes would work, but it shouldn't, so maybe add the length check
    • and maybe return Result if input is wrong, instead of panicing
  • assumes every byte is a nibble (i.e. <16)
    • maybe there should be a check for this, otherwise there would be weird result
  • packs pairs of two consecutive nibbles in a byte
  • converts resulting 32 bytes into B256

That's why, I would give it generic name, so it can be reused in the future for other similar things. And nibble_path_to_packed_path (or full_nibble_path_to_packed_path) describes it very well.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this ever be used for something else though? I think we should give it a generic name if we ever use it in a generic way

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's generic enough that I would even extracted into separate file (e.g. utils.rs). Reason is that if I'm writing new code and I feel like there might be generic function that I need to use (and I would say this would fall into that category), I'm going to look for it in some library or some utils.rs or similar, or by searching by name (and I'm not likely going to find it inside content.rs with full_key_path_to_address_hash name). Of course this assumes that when you are writing code, you are not familiar with the entire codebase, which is the case in general.

So it is good coding practice to extract generic functions and give them generic names, even if they are currently used only in one place.

I'm not too keen on this, so it's up to you.

Also, you didn't address other comments regarding checking the validity of the input (length can be 63, bytes can have invalid values), which I would might be useful/good even if it is not generic.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, you didn't address other comments regarding checking the validity of the input (length can be 63, bytes can have invalid values), which I would might be useful/good even if it is not generic.

I will add this as I think it is good

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will move it to utils.rs. I will change the name from full_key_path_to_address_hash to full_nibble_path_to_address_hash. I see this function as an inversion to the util function address_to_nibble_path. I want to keep address_hash in the name because I want when people read full_nibble_path_to_address_hash to understand what it is used for. I find full_nibble_path_to_packed_path very confusing because the name is so generic I don't understand what it would be used for when reading it.

So if we find a more generic usecase for it in the future we can update it to the generic name. But for now I will do the inbetween name which still contains address_hash as you said it is up to me. We still maintain the possibility of changing it depending on what the future brings

pub struct StateConfig {
/// This flag when enabled will storage all the trie keys from block execution in a cache, this
/// is needed for gossiping the storage trie's changes.
pub cache_contract_storage_changes: bool,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this flag? In which case we don't want this?

Copy link
Copy Markdown
Member Author

@KolbyML KolbyML Jun 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only want this flag enabled if we are gossiping storage changes. I want to build an execution client so by default this feature for our bridge wouldn't be enabled.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will start by saying that I don't expect you to change anything in this PR, but to keep in mind for future work and to clarify my confusion.

I'm not sure what you mean by "be an execution client", considering that all this code is inside portal-bridge. Can you explain this a bit more?

Also, I think it would make sense to call it something like gossip_state_changes, and have a another config (e.g. EvmConfig) that would be initialized with this value.

Copy link
Copy Markdown
Member Author

@KolbyML KolbyML Jun 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean by "be an execution client", considering that all this code is inside portal-bridge. Can you explain this a bit more?

This code will be moved into a crate called trin-execution. If we are able to execute to the head of the chain we built an execution client. If we are building an execution client (which we are as a byproduct of gossiping state at the head of the chain), then we can release our execution client to the public for others to use. It will be an execution client like Besu, Geth, Reth, Nethermind. The difference is we will be building it without devp2p and will never add devp2p. Devp2p is likely to be dropped sooner or later.

Copy link
Copy Markdown
Member Author

@KolbyML KolbyML Jun 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I think it would make sense to call it something like gossip_state_changes

This argument is only needed for gossip storage trie changes, as the account trie we can gossip it for free. The storage trie's require storing extra stuff in the database, the majority of users won't need this so it is off by default

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code will be moved into a crate called trin-execution

Yes, this makes sense. It's something I wanted to suggest as well. Thank you for clarifying.

self.db
.put(
keccak256(
[keccak256(address).as_slice(), b"reverse hash lookup"].concat(),
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe make "reverse hash lookup" a const and maybe make it a prefix to the keccak?

async fn gossip_storage(
&self,
account_proof: &AccountProof,
storage_proof: &AccountProof,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I commented on the other PR, but AccountProof should really be called TrieProof (or something similar) and it shows here.

@KolbyML KolbyML force-pushed the next-state-pr branch 2 times, most recently from 79552b2 to bd1e8b1 Compare June 19, 2024 10:41
@KolbyML
Copy link
Copy Markdown
Member Author

KolbyML commented Jun 19, 2024

@morph-dev ready for another look let me know if I missed anything


// check if node is a leaf then check if the leaf is for a contract, if it is gossip
// the contract bytecode and storage
if let Some(encoded_last_node) = account_proof.proof.last() {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should probably be

let Some(encoded_last_node) = account_proof.proof.last() else {
  bail!(...)
}

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of bail!() I will use continue, as we are not meant to break out the loop, thank you for showing me this cool way to prevent indents

if let Some(encoded_last_node) = account_proof.proof.last() {
let decoded_node = decode_node(&mut encoded_last_node.as_ref())
.expect("Should should only be passing valid encoded nodes");
if let Node::Leaf(leaf) = decoded_node {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also here

let Node::Leaf(leaf) = decoded_node else {
  bail!(...)
}

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of bail!() I will use continue, as we are not meant to break out the loop, thank you for showing me this cool way to prevent indents

pub struct StateConfig {
/// This flag when enabled will storage all the trie keys from block execution in a cache, this
/// is needed for gossiping the storage trie's changes.
pub cache_contract_storage_changes: bool,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will start by saying that I don't expect you to change anything in this PR, but to keep in mind for future work and to clarify my confusion.

I'm not sure what you mean by "be an execution client", considering that all this code is inside portal-bridge. Can you explain this a bit more?

Also, I think it would make sense to call it something like gossip_state_changes, and have a another config (e.g. EvmConfig) that would be initialized with this value.

account_proof: &TrieProof,
code: Bytecode,
) -> anyhow::Result<StateContentValue> {
let account_proof = EncodedTrieProof::from(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should probably extract TrieProof out of trie_walker.rs and do something like this:

impl From<&TrieProof> for EncodedTrieProof { ... }

That way you don't have to duplicate this conversion logic in so many places.

You can also add other useful / common functions to it (e.g. getting the last node, converting path to Nibbles).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will do the main suggestion now

and we can do

You can also add other useful / common functions to it (e.g. getting the last node, converting path to Nibbles).

In the future if it seems needed

))
}

pub fn full_key_path_to_address_hash(key_path: &[u8]) -> B256 {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is very generic, and it's not aware and it doesn't care if input is address, hash of address, hash of storage slot, or anything else.

What it does is:

  • takes byte slice that should be 64 bytes long (otherwise it panics at B256::from_slice(...))
    • actually even 63 bytes would work, but it shouldn't, so maybe add the length check
    • and maybe return Result if input is wrong, instead of panicing
  • assumes every byte is a nibble (i.e. <16)
    • maybe there should be a check for this, otherwise there would be weird result
  • packs pairs of two consecutive nibbles in a byte
  • converts resulting 32 bytes into B256

That's why, I would give it generic name, so it can be reused in the future for other similar things. And nibble_path_to_packed_path (or full_nibble_path_to_packed_path) describes it very well.

@KolbyML
Copy link
Copy Markdown
Member Author

KolbyML commented Jun 20, 2024

@morph-dev ready for another look, let me know if I missed anything

Copy link
Copy Markdown
Collaborator

@morph-dev morph-dev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

// check if node is a leaf then check if the leaf is for a contract, if it is gossip
// the contract bytecode and storage
let Some(encoded_last_node) = account_proof.proof.last() else {
continue;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think we should at least print error here

self.gossip_account(&account_proof, block_tuple.header.header.hash())
.await?;

// check if node is a leaf then check if the leaf is for a contract, if it is gossip
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this comment seems out of place now. Maybe remove it and add appropriate comments where they are needed

.map(|bytes| EncodedTrieNode::from(bytes.to_vec()))
.collect::<Vec<EncodedTrieNode>>(),
);
let trie_proof = EncodedTrieProof::from(account_proof);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should remove this line and simplify this ever further:

Ok(StateContentValue::AccountTrieNodeWithProof(
    AccountTrieNodeWithProof {
        proof: account_proof.into(),
        block_hash,
    },
))

Also in several places below.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea!

))
}

pub fn full_key_path_to_address_hash(key_path: &[u8]) -> B256 {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's generic enough that I would even extracted into separate file (e.g. utils.rs). Reason is that if I'm writing new code and I feel like there might be generic function that I need to use (and I would say this would fall into that category), I'm going to look for it in some library or some utils.rs or similar, or by searching by name (and I'm not likely going to find it inside content.rs with full_key_path_to_address_hash name). Of course this assumes that when you are writing code, you are not familiar with the entire codebase, which is the case in general.

So it is good coding practice to extract generic functions and give them generic names, even if they are currently used only in one place.

I'm not too keen on this, so it's up to you.

Also, you didn't address other comments regarding checking the validity of the input (length can be 63, bytes can have invalid values), which I would might be useful/good even if it is not generic.

pub struct StateConfig {
/// This flag when enabled will storage all the trie keys from block execution in a cache, this
/// is needed for gossiping the storage trie's changes.
pub cache_contract_storage_changes: bool,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code will be moved into a crate called trin-execution

Yes, this makes sense. It's something I wanted to suggest as well. Thank you for clarifying.

@KolbyML KolbyML merged commit ac6ea6f into ethereum:master Jun 20, 2024
@@ -0,0 +1,52 @@
use alloy_primitives::{keccak256, Address, B256};

pub fn full_nibble_path_to_address_hash(key_path: &[u8]) -> B256 {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Most (I would say all not-trivial) public functions should have documentation. This also "fixes" the confusion with having more generic names. This greatly helps with coding because of the IDE.

In this case it is more acceptable to keep your name because we have address_to_nibble_path next to it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants