feat(state bridge): gossip contract bytecode and storage#1319
feat(state bridge): gossip contract bytecode and storage#1319KolbyML merged 1 commit intoethereum:masterfrom
Conversation
b4e3235 to
afae361
Compare
portal-bridge/src/bridge/state.rs
Outdated
|
|
||
| // gossip contract bytecode | ||
| let code = state.database.code_by_hash_ref(account.code_hash)?; | ||
| self.gossip_contract_bytecode( |
There was a problem hiding this comment.
why do we gossip contract's code every block its storage is updated?
There was a problem hiding this comment.
This is a simple/naive implementation that works. We can optimize it when it becomes a bottleneck, but for now gossiping it every block the contract's storage is touched isn't an issue
| B256::from_slice(&raw_address_hash) | ||
| } | ||
|
|
||
| pub fn address_to_path(address: Address) -> Vec<u8> { |
There was a problem hiding this comment.
nit: maybe address_to_nibbles or address_to_nibble_path?
| )) | ||
| } | ||
|
|
||
| pub fn full_key_path_to_address_hash(key_path: &[u8]) -> B256 { |
There was a problem hiding this comment.
nit: maybe nibble_path_to_packed_path?
There was a problem hiding this comment.
That doesn't make sense to me as we are getting the address's hash or keccak(addressO
There was a problem hiding this comment.
This function is very generic, and it's not aware and it doesn't care if input is address, hash of address, hash of storage slot, or anything else.
What it does is:
- takes byte slice that should be 64 bytes long (otherwise it panics at
B256::from_slice(...))- actually even 63 bytes would work, but it shouldn't, so maybe add the length check
- and maybe return
Resultif input is wrong, instead of panicing
- assumes every byte is a nibble (i.e. <16)
- maybe there should be a check for this, otherwise there would be weird result
- packs pairs of two consecutive nibbles in a byte
- converts resulting 32 bytes into
B256
That's why, I would give it generic name, so it can be reused in the future for other similar things. And nibble_path_to_packed_path (or full_nibble_path_to_packed_path) describes it very well.
There was a problem hiding this comment.
Would this ever be used for something else though? I think we should give it a generic name if we ever use it in a generic way
There was a problem hiding this comment.
I think it's generic enough that I would even extracted into separate file (e.g. utils.rs). Reason is that if I'm writing new code and I feel like there might be generic function that I need to use (and I would say this would fall into that category), I'm going to look for it in some library or some utils.rs or similar, or by searching by name (and I'm not likely going to find it inside content.rs with full_key_path_to_address_hash name). Of course this assumes that when you are writing code, you are not familiar with the entire codebase, which is the case in general.
So it is good coding practice to extract generic functions and give them generic names, even if they are currently used only in one place.
I'm not too keen on this, so it's up to you.
Also, you didn't address other comments regarding checking the validity of the input (length can be 63, bytes can have invalid values), which I would might be useful/good even if it is not generic.
There was a problem hiding this comment.
Also, you didn't address other comments regarding checking the validity of the input (length can be 63, bytes can have invalid values), which I would might be useful/good even if it is not generic.
I will add this as I think it is good
There was a problem hiding this comment.
I will move it to utils.rs. I will change the name from full_key_path_to_address_hash to full_nibble_path_to_address_hash. I see this function as an inversion to the util function address_to_nibble_path. I want to keep address_hash in the name because I want when people read full_nibble_path_to_address_hash to understand what it is used for. I find full_nibble_path_to_packed_path very confusing because the name is so generic I don't understand what it would be used for when reading it.
So if we find a more generic usecase for it in the future we can update it to the generic name. But for now I will do the inbetween name which still contains address_hash as you said it is up to me. We still maintain the possibility of changing it depending on what the future brings
| pub struct StateConfig { | ||
| /// This flag when enabled will storage all the trie keys from block execution in a cache, this | ||
| /// is needed for gossiping the storage trie's changes. | ||
| pub cache_contract_storage_changes: bool, |
There was a problem hiding this comment.
Why do we need this flag? In which case we don't want this?
There was a problem hiding this comment.
We only want this flag enabled if we are gossiping storage changes. I want to build an execution client so by default this feature for our bridge wouldn't be enabled.
There was a problem hiding this comment.
I will start by saying that I don't expect you to change anything in this PR, but to keep in mind for future work and to clarify my confusion.
I'm not sure what you mean by "be an execution client", considering that all this code is inside portal-bridge. Can you explain this a bit more?
Also, I think it would make sense to call it something like gossip_state_changes, and have a another config (e.g. EvmConfig) that would be initialized with this value.
There was a problem hiding this comment.
I'm not sure what you mean by "be an execution client", considering that all this code is inside portal-bridge. Can you explain this a bit more?
This code will be moved into a crate called trin-execution. If we are able to execute to the head of the chain we built an execution client. If we are building an execution client (which we are as a byproduct of gossiping state at the head of the chain), then we can release our execution client to the public for others to use. It will be an execution client like Besu, Geth, Reth, Nethermind. The difference is we will be building it without devp2p and will never add devp2p. Devp2p is likely to be dropped sooner or later.
There was a problem hiding this comment.
Also, I think it would make sense to call it something like gossip_state_changes
This argument is only needed for gossip storage trie changes, as the account trie we can gossip it for free. The storage trie's require storing extra stuff in the database, the majority of users won't need this so it is off by default
There was a problem hiding this comment.
This code will be moved into a crate called
trin-execution
Yes, this makes sense. It's something I wanted to suggest as well. Thank you for clarifying.
| self.db | ||
| .put( | ||
| keccak256( | ||
| [keccak256(address).as_slice(), b"reverse hash lookup"].concat(), |
There was a problem hiding this comment.
nit: maybe make "reverse hash lookup" a const and maybe make it a prefix to the keccak?
portal-bridge/src/bridge/state.rs
Outdated
| async fn gossip_storage( | ||
| &self, | ||
| account_proof: &AccountProof, | ||
| storage_proof: &AccountProof, |
There was a problem hiding this comment.
I think I commented on the other PR, but AccountProof should really be called TrieProof (or something similar) and it shows here.
79552b2 to
bd1e8b1
Compare
|
@morph-dev ready for another look let me know if I missed anything |
portal-bridge/src/bridge/state.rs
Outdated
|
|
||
| // check if node is a leaf then check if the leaf is for a contract, if it is gossip | ||
| // the contract bytecode and storage | ||
| if let Some(encoded_last_node) = account_proof.proof.last() { |
There was a problem hiding this comment.
this should probably be
let Some(encoded_last_node) = account_proof.proof.last() else {
bail!(...)
}There was a problem hiding this comment.
Instead of bail!() I will use continue, as we are not meant to break out the loop, thank you for showing me this cool way to prevent indents
portal-bridge/src/bridge/state.rs
Outdated
| if let Some(encoded_last_node) = account_proof.proof.last() { | ||
| let decoded_node = decode_node(&mut encoded_last_node.as_ref()) | ||
| .expect("Should should only be passing valid encoded nodes"); | ||
| if let Node::Leaf(leaf) = decoded_node { |
There was a problem hiding this comment.
Also here
let Node::Leaf(leaf) = decoded_node else {
bail!(...)
}There was a problem hiding this comment.
Instead of bail!() I will use continue, as we are not meant to break out the loop, thank you for showing me this cool way to prevent indents
| pub struct StateConfig { | ||
| /// This flag when enabled will storage all the trie keys from block execution in a cache, this | ||
| /// is needed for gossiping the storage trie's changes. | ||
| pub cache_contract_storage_changes: bool, |
There was a problem hiding this comment.
I will start by saying that I don't expect you to change anything in this PR, but to keep in mind for future work and to clarify my confusion.
I'm not sure what you mean by "be an execution client", considering that all this code is inside portal-bridge. Can you explain this a bit more?
Also, I think it would make sense to call it something like gossip_state_changes, and have a another config (e.g. EvmConfig) that would be initialized with this value.
| account_proof: &TrieProof, | ||
| code: Bytecode, | ||
| ) -> anyhow::Result<StateContentValue> { | ||
| let account_proof = EncodedTrieProof::from( |
There was a problem hiding this comment.
You should probably extract TrieProof out of trie_walker.rs and do something like this:
impl From<&TrieProof> for EncodedTrieProof { ... }That way you don't have to duplicate this conversion logic in so many places.
You can also add other useful / common functions to it (e.g. getting the last node, converting path to Nibbles).
There was a problem hiding this comment.
I will do the main suggestion now
and we can do
You can also add other useful / common functions to it (e.g. getting the last node, converting path to Nibbles).
In the future if it seems needed
| )) | ||
| } | ||
|
|
||
| pub fn full_key_path_to_address_hash(key_path: &[u8]) -> B256 { |
There was a problem hiding this comment.
This function is very generic, and it's not aware and it doesn't care if input is address, hash of address, hash of storage slot, or anything else.
What it does is:
- takes byte slice that should be 64 bytes long (otherwise it panics at
B256::from_slice(...))- actually even 63 bytes would work, but it shouldn't, so maybe add the length check
- and maybe return
Resultif input is wrong, instead of panicing
- assumes every byte is a nibble (i.e. <16)
- maybe there should be a check for this, otherwise there would be weird result
- packs pairs of two consecutive nibbles in a byte
- converts resulting 32 bytes into
B256
That's why, I would give it generic name, so it can be reused in the future for other similar things. And nibble_path_to_packed_path (or full_nibble_path_to_packed_path) describes it very well.
|
@morph-dev ready for another look, let me know if I missed anything |
| // check if node is a leaf then check if the leaf is for a contract, if it is gossip | ||
| // the contract bytecode and storage | ||
| let Some(encoded_last_node) = account_proof.proof.last() else { | ||
| continue; |
There was a problem hiding this comment.
nit: I think we should at least print error here
portal-bridge/src/bridge/state.rs
Outdated
| self.gossip_account(&account_proof, block_tuple.header.header.hash()) | ||
| .await?; | ||
|
|
||
| // check if node is a leaf then check if the leaf is for a contract, if it is gossip |
There was a problem hiding this comment.
nit: this comment seems out of place now. Maybe remove it and add appropriate comments where they are needed
| .map(|bytes| EncodedTrieNode::from(bytes.to_vec())) | ||
| .collect::<Vec<EncodedTrieNode>>(), | ||
| ); | ||
| let trie_proof = EncodedTrieProof::from(account_proof); |
There was a problem hiding this comment.
I think you should remove this line and simplify this ever further:
Ok(StateContentValue::AccountTrieNodeWithProof(
AccountTrieNodeWithProof {
proof: account_proof.into(),
block_hash,
},
))Also in several places below.
| )) | ||
| } | ||
|
|
||
| pub fn full_key_path_to_address_hash(key_path: &[u8]) -> B256 { |
There was a problem hiding this comment.
I think it's generic enough that I would even extracted into separate file (e.g. utils.rs). Reason is that if I'm writing new code and I feel like there might be generic function that I need to use (and I would say this would fall into that category), I'm going to look for it in some library or some utils.rs or similar, or by searching by name (and I'm not likely going to find it inside content.rs with full_key_path_to_address_hash name). Of course this assumes that when you are writing code, you are not familiar with the entire codebase, which is the case in general.
So it is good coding practice to extract generic functions and give them generic names, even if they are currently used only in one place.
I'm not too keen on this, so it's up to you.
Also, you didn't address other comments regarding checking the validity of the input (length can be 63, bytes can have invalid values), which I would might be useful/good even if it is not generic.
| pub struct StateConfig { | ||
| /// This flag when enabled will storage all the trie keys from block execution in a cache, this | ||
| /// is needed for gossiping the storage trie's changes. | ||
| pub cache_contract_storage_changes: bool, |
There was a problem hiding this comment.
This code will be moved into a crate called
trin-execution
Yes, this makes sense. It's something I wanted to suggest as well. Thank you for clarifying.
| @@ -0,0 +1,52 @@ | |||
| use alloy_primitives::{keccak256, Address, B256}; | |||
|
|
|||
| pub fn full_nibble_path_to_address_hash(key_path: &[u8]) -> B256 { | |||
There was a problem hiding this comment.
nit: Most (I would say all not-trivial) public functions should have documentation. This also "fixes" the confusion with having more generic names. This greatly helps with coding because of the IDE.
In this case it is more acceptable to keep your name because we have address_to_nibble_path next to it.
What was wrong?
We weren't gossiping contract bytecode and storage, Milos really wanted me to do this before working on being able to execute to the front of the chain. It isn't really that big of a deal, so I am getting this done quick
How was it fixed?
By adding code to gossip the contract bytecode
For storage because the storage root is updated multiple times per a block, we maintain a cache of updated storage keys per an account.
This implementation is meant to be simple and work, I am aware of optimizations, but those are yet to be seen as needed so we will wait util we run into walls and go from there.