[WIP] The Merge #2229
Conversation
Marking as "do not merge" on request by @mkalinin. Looking forward to the merge of the merge though!
```python
def process_application_payload(state: BeaconState, body: BeaconBlockBody) -> None:
    """
    Note: This function is designed to be able to be run in parallel with
```
It's worth noting that the other `process_block` functions in eth2 take ~2ms on average hardware - pragmatically there's not that much to be gained from parallelization here. I would guess this falls in the category of being a "nice touch": it makes testing and reasoning easier, for example, when functions are "pure" in the order-independent sense.
In general, making the eth1 state transition synchronous here leaves smaller margins for attestation production, an issue we're already seeing during finalization today, when the beacon chain is doing "nothing". It would be good to define behaviors / describe more in the validator guide what the client should do when eth1 is momentarily unable to produce a block on time (for whatever reason: a hiccup in the client, hardware, etc.), what the next proposer should do, and other failure scenarios, as well as to investigate / simulate tolerances for failures and delays.
To clarify, the above comment is of course not taking into account future eth2 data requirements - the 2ms figure is from current load, which is mainly CPU-bound. With more data and shards, these timings may change dramatically, but at that point eth1 and eth2 might end up competing for the same (scarce) resource, disk, which is also hard to parallelize.
The concerns you have listed above are important, and addressing them is part of the readiness criteria for the consensus upgrade. Transaction throughput (the size of the application block) is one of the evaluation criteria for the chain after the upgrade; attesters depend heavily on this property, and the time restriction we have within a slot turns into a restriction on the block size. The other potential approach to pushing the limits is delayed execution, which comes with new complexity related to verification of the gas limit per block.
Proposers may follow the pending block technique as it is currently implemented by Ethereum clients, which means the application block is always ready in advance and the proposer just needs to pick it up.
The other side of the problem of synchronous interaction between the consensus and application chains is the resource exhaustion attack vector, and its analysis and mitigation.
All of the above is out of the scope of this spec but is definitely on the critical path to the upgrade.
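The pending block technique mentioned above can be sketched roughly as follows. This is a minimal illustration, not spec code: `PendingPayloadCache`, `build_payload`, and `on_new_head` are all hypothetical names; the point is only that the payload is rebuilt in the background so the proposer picks it up rather than building it synchronously at proposal time.

```python
import threading


class PendingPayloadCache:
    """Keeps an application payload continuously rebuilt in the background,
    so the proposer never constructs one synchronously at its slot.
    All names here are illustrative, not from the spec."""

    def __init__(self, build_payload):
        # build_payload is a hypothetical hook, e.g. wrapping the tx pool.
        self._build_payload = build_payload
        self._lock = threading.Lock()
        self._payload = None

    def on_new_head(self, parent_root):
        # Rebuild whenever the chain head (or the tx pool) changes.
        payload = self._build_payload(parent_root)
        with self._lock:
            self._payload = payload

    def take(self):
        # Called by the proposer at its slot: the payload is already built.
        with self._lock:
            return self._payload
```

A proposer would subscribe `on_new_head` to head events and call `take()` when producing a block, keeping payload construction off the slot's critical path.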
One thing to note is that the primary attestation issue today is seen on epoch boundaries where the epoch transition must be processed before the block is created and before the block is gossiped at each step (to get the correct proposer shuffling to validate the signature).
My estimation is that this is causing late receipt of blocks across the network and thus the incorrect/late attestations. This epoch (shuffling) processing problem is independent of any computation you do after the block is gossiped/forwarded, and the epoch processing issue is a major target for optimizations.
If the epoch boundary block can be gossiped quickly (same speed as non-epoch boundary blocks), then there should be ample time (3s+) to compute the block's payload to attest. Eth1 blocks today are processed in 200ms so as long as you don't do that processing as a gossip validation, I don't see this compounding our current issue.
Note: another parallelization design goal is that it ensures the application layer could be pulled out into a shard. By reducing the same-slot processing dependencies between the beacon and application layers, we keep future designs open.
```python
    application_state = get_application_state(state.application_state_root)
    application_state_transition(application_state, beacon_chain_data, body.application_payload)
```
Is the beacon node holding the application state?
It seems to me that we can just do `application_state_transition(state.application_state_root, beacon_chain_data, body.application_payload)`. That saves 1 round trip of retrieving and passing the application state.
> Is the beacon node holding the application state?
This is implementation dependent. If a client is connected to both networks (consensus and application), then it has direct access to the application state. The other type of client has separate pieces of software representing the consensus and application parts, interacting via an RPC protocol. This is the reason behind keeping the spec abstract in that regard.
The other reason is the difference between the commitment schemes of the beacon state and the application state, that is, binary tree vs MPT. If the scheme were the same, we could think of embedding the application state into the beacon state. The application state would then be part of the beacon state for monolithic clients and could still be represented as a state root for clients with a separated architecture.
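The two client architectures described above could be captured behind one abstraction, which is roughly what keeping `get_application_state` abstract in the spec allows. The sketch below is illustrative only; `ApplicationStateProvider` and both implementations are hypothetical names, not part of the spec or any client.

```python
from abc import ABC, abstractmethod


class ApplicationStateProvider(ABC):
    """Abstracts how a client resolves an application state root to a
    state object. Names here are illustrative, not from the spec."""

    @abstractmethod
    def get_application_state(self, state_root):
        ...


class MonolithicProvider(ApplicationStateProvider):
    """Single process connected to both networks: direct local access."""

    def __init__(self, local_db):
        self.local_db = local_db

    def get_application_state(self, state_root):
        return self.local_db[state_root]


class SeparatedProvider(ApplicationStateProvider):
    """Separated architecture: the consensus piece asks the application
    piece over an RPC boundary."""

    def __init__(self, rpc_call):
        self.rpc_call = rpc_call

    def get_application_state(self, state_root):
        return self.rpc_call("get_application_state", state_root)
```

The spec's `get_application_state(state.application_state_root)` then works identically for both architectures; only the provider wiring differs.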
> It seems to me that we can just do `application_state_transition(state.application_state_root, beacon_chain_data, body.application_payload)`. That saves 1 round trip of retrieving and passing the application state.
Currently, it follows logic similar to the beacon chain `state_transition` function, which accepts the `(state, block)` tuple and mutates the state by applying the state transition.
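That mutate-in-place style can be illustrated with a toy model. The field names (`block_number`, `last_payload`, etc.) are made up for this sketch; only the shape - resolve the root to a state object, then mutate it - mirrors the quoted spec code.

```python
def application_state_transition(application_state, beacon_chain_data, payload):
    """Toy transition in the beacon-chain state_transition style: takes
    the state object itself and mutates it in place. Fields are
    illustrative, not from the spec."""
    application_state["block_number"] += 1
    application_state["beacon_data"] = beacon_chain_data
    application_state["last_payload"] = payload


def process_application_payload(state, body, get_application_state):
    # Spec shape: first resolve the root to a state object (this is the
    # round trip discussed above), then run the mutating transition.
    application_state = get_application_state(state["application_state_root"])
    application_state_transition(
        application_state, state["beacon_data"], body["application_payload"]
    )
    return application_state
```

Passing only the root instead would push the resolve step inside the transition function; the observable behavior is the same, but the `(state, block)` form keeps the symmetry with the beacon chain's own `state_transition`.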
The expedited quick merge described by Vitalik (see https://notes.ethereum.org/@vbuterin/B1mUf6DXO) is different and would happen before this PR.
quick pass. a couple of nits and some comments
Co-authored-by: Paul Hauner <paul@paulhauner.com>
Co-authored-by: Danny Ryan <dannyjryan@gmail.com>
closing in favor of #2257
The Merge
Work in progress. Introduces spec changes to implement the executable beacon chain proposal.
The executable beacon chain is the version of Ethereum Mainnet with PoS consensus driven by the Beacon Chain. The Consensus Upgrade (from PoW to PoS) is the core change coming with the Merge.
Based on
Related
Implementations
TODO