Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarification on get_pow_block #2636

Closed
paulhauner opened this issue Sep 29, 2021 · 16 comments
Closed

Clarification on get_pow_block #2636

paulhauner opened this issue Sep 29, 2021 · 16 comments
Labels
Bellatrix CL+EL Merge

Comments

@paulhauner
Copy link
Contributor

paulhauner commented Sep 29, 2021

The spec defines a get_pow_block function which returns a PoWBlock structure. In practice, I expect get_pow_block to map to the eth_getBlockByHash JSON-RPC call.

Assuming we use eth_getBlockByHash, does get_pow_block filter the result so that it only contains pre-merge, non-PoS blocks?

From the naming, I think it should apply the filter.

However, if we look at get_block_at_terminal_total_difficulty, we can see it iterates through the entire pow_chain. This communicates that there might be multiple blocks in pow_chain where block.total_difficulty >= TERMINAL_TOTAL_DIFFICULTY. This conflicts with EIP-3675 which states:

PoW blocks that are descendants of any terminal PoW block MUST NOT be imported. This implies that a terminal PoW block will be the last PoW block in the canonical chain.

I raise this for the following reasons:

  1. In an implementation it is desirable to bound the number of loops that get_block_at_terminal_total_difficulty can perform. When doing recursive look-ups to a remote server it's desirable to clearly understand the properties of that recursion.
  2. I think the specification should be clearer on get_pow_block to avoid differences in implementation.
  3. If get_pow_block also returns PoS blocks, then it seems like it needs a different name.

On a side note, if get_pow_block should filter the results, how does it do that? I assume there's some simple way, but I haven't seen it in the API docs I've been looking at.

@djrtwo
Copy link
Contributor

djrtwo commented Sep 29, 2021

EDIT: I disagree with myself based on discord discussion

You're right.

If the MUST NOT condition is followed on the execution-client, then the head of the pow chain is simply either (a) the terminal block OR (b) not yet the terminal block. And in such a case, just a call to eth_getBlockByNumber("latest") should be sufficient to implement this function.

We should consider simplifying the spec here by calling block = get_pow_head() (or just pow_chain[0]) and parent = get_pow_block(block.parent_hash) with no loop. From a specification standpoint, this would be putting more trust in the execution-layer MUST NOT but the unification of the spec is the two specs in concert so this doesn't seem crazy

@djrtwo
Copy link
Contributor

djrtwo commented Sep 29, 2021

Some relevant discussion on discord

We need to specify what the consensus and execution layer do when the consensus re-orgs from a transitioned chain to an non-transitioned chain.

@paulhauner suggests sending 0x00..00 which I think I agree with. This would trigger the execution layer to revert to a TD fork choice on the pow chain (no pos blocks). And to still support the "MUST NOT import beyond TTD blocks"

In a 1:1 consensus to execution relationship, I do think that this clears up the ambiguity in the spec around item (3). That is, you would never be trying to build an initial execution payload on an execution-layer that thought PoS blocks were in the canonical chain.

That said, this does not support many:1 consesnsu to execution, but our general set fork choice construction in the API and specs already does not support this.

@paulhauner
Copy link
Contributor Author

paulhauner commented Sep 29, 2021

That said, this does not support many:1 consesnsu to execution, but our general set fork choice construction in the API and specs already does not support this.

This is likely tangential, but we'll need to consider mechanisms on the APIs to lock to 1:1 if that's how we're going to specify.

@mkalinin
Copy link
Collaborator

@paulhauner suggests sending 0x00..00 which I think I agree with. This would trigger the execution layer to revert to a TD fork choice on the pow chain (no pos blocks). And to still support the "MUST NOT import beyond TTD blocks"

Trying to understand whether handling this edge case worth an additional trigger in the execution-layer fork choice. Suppose, there is a beacon block B and it has two children C and C', C is the valid transition block, i.e. it contains a payload built on top of the valid terminal PoW block and it's valid wrt beacon chain state transition, C' does not have any payload hence is non-transition block but it's a valid beacon chain block after all. And the network is in favour of C' eventually.

For simplicity, let's assume that a halve of nodes has observed C prior to C', and another halve has observed C' as the first block in this pair, therefore, a halve of nodes in the network has switched their execution engines to PoS and designated terminal PoW block T as the tip of their PoW chain (the chain produced by the PoW consensus mechanism). Suppose that a halve of the switched nodes has terminal PoW block T' as the tip of the PoW chain according to the TD fork choice rule. And if there is a proposer from this halve of the switched nodes then we want to let it propose a transition block with a payload built on top of T' instead of being forced to build a payload on top of T.

Under the assumption that T and T' blocks are both valid terminal PoW blocks, do we really care that much that a proposer could be forced by the other network participant to favour one terminal PoW block over the other in not that likely case? The case when the proposer of C intentionally decided to avoid the transition or having its execution engine unaware of the terminal PoW block at least 12 seconds after the other proposer has seen a terminal block (due to PoW network partitioning maybe). I would also add another argument here, T and T' are likely gonna have the same difficulty value hence same TD and if they have the same parent (which is also likely), then the TD fork choice would favour the one which has been observed first and is a matter of chance hence choosing between two valid terminal blocks is likely a matter of chance as well.

In an exceptional case we could have several non-transition blocks built on top of C. In the worst case it would give an opportunity to build a valid terminal PoW block with the difficulty ~5% less than its sisters have, it would require about 900 seconds of a time without transition (75 non-transition blocks on the beacon chain). Considering potential difficulty drop and the chance of having a row of non-transition blocks, I don't think that this is a big surface for minority fork attack.

Considering the above (I hope my analysis does make sense) the options are as follows:

  1. Rename pow_chain to block_chain, and say that the block_chain abstractly represents all blocks in the canonical chain disregarding the consensus mechanism that has built each of these blocks
  2. Do care about the edge case. And if we take this option, then IMO a better way to do that would be to have an explicit Engine API method, like engine_getTerminalPowBlock() that would use the TD fork choice to fine the terminal PoW block and return it back to the consensus-layer, as suggested by @paulhauner. We should consider the following, how easy for consensus-layer client it would be to catch up with the change of the pow_chain If the switch back to the TD fork choice that affects the pow_chain has happened.

I agree to rename get_pow_block and PowBlock to something like get_pow_attributes and PowAttributes to avoid the confusion.

@mkalinin
Copy link
Collaborator

This is likely tangential, but we'll need to consider mechanisms on the APIs to lock to 1:1 if that's how we're going to specify.

What do you mean the "mechanisms"? Could we explicitly say that Engine API spec we have is designed to handle 1:1 case and not every part of it might work in many-to-one relationship?

A rough idea of handling many-to-one use case is outlined in the Shared execution client section of the Engine API design space doc. The exercise was to consider this use case in the API design and if possible avoid decisions in the design that could restrict the many-to-one use case and it seems that the fork choice state concurrently updated by multiple consensus clients is a problem. I haven't thought much of it but I'd prefer implementations decide on how to handle the many-to-one use case as long as the API allows for it. I'd be happy to discuss many-to-one use case in a separate thread.

@hwwhww
Copy link
Contributor

hwwhww commented Oct 1, 2021

Off the main issue of this PR, should we clarify the error handling of this function? i.e., if block with given block_hash is unavailable.

Option 1.

  • Allow get_pow_block to return None. Change it to get_pow_block(block_hash: Hash32) -> Optional[PowBlock].
  • In on_block, if get_pow_block result in None, raise exception.

Option 2.

  • Just add a note in get_pow_block: "If the block is not found, the operation is considered invalid."

@hwwhww hwwhww added the Bellatrix CL+EL Merge label Oct 1, 2021
@mkalinin
Copy link
Collaborator

mkalinin commented Oct 4, 2021

Off the main issue of this PR, should we clarify the error handling of this function? i.e., if block with given block_hash is unavailable.

Option 1.

  • Allow get_pow_block to return None. Change it to get_pow_block(block_hash: Hash32) -> Optional[PowBlock].
  • In on_block, if get_pow_block result in None, raise exception.

Option 2.

  • Just add a note in get_pow_block: "If the block is not found, the operation is considered invalid."

Actually, with this change #2595 get_pow_block should never request attributes of an unknown block. There could be an edge case when for some reason the execution client processes the payload and then refuses to return its parent. So, we may change the response type to Optional[PowBlock] and add an assertion to the spec that ensures the block data has been returned.

@mkalinin
Copy link
Collaborator

mkalinin commented Oct 6, 2021

Actually, I am not quite right on the above and there is a normal case which get_pow_block would return nothing in. This is when EL is syncing and terminal block hits the network. The spec must be updated to handle such a case.

@hwwhww
Copy link
Contributor

hwwhww commented Oct 19, 2021

@mkalinin

Rename pow_chain to block_chain, and say that the block_chain abstractly represents all blocks in the canonical chain disregarding the consensus mechanism that has built each of these blocks

A nitpicking question around get_pow_block in the validator guide:

In get_pow_block_at_terminal_total_difficulty:

def get_pow_block_at_terminal_total_difficulty(pow_chain: Sequence[PowBlock]) -> Optional[PowBlock]:
    # `pow_chain` abstractly represents all blocks in the PoW chain
    for block in pow_chain:
        parent = get_pow_block(block.parent_hash)
        ...

If the validator already maintains all blocks, do they really have to call get_pow_block(block.parent_hash) (which we implied it's a JSON-RPC call) to get the parent block?

Can we just query in the given pow_chain? for example:

query = [pow_block for pow_block in pow_chain if pow_block.block_hash == block.parent_hash]
parent = query[0] if query else None

It seems we touched the line between pseudo-code and reality. Reusing get_pow_block is more convenient but it is a bit confusing when I read it.

@mkalinin
Copy link
Collaborator

I see what you mean. Re-using pow_chain is probably the right way of doing this query. Could we iterate with index starting from i = 1 and get block and its parent by index: block = pow_chain[i], parent = pow_chain[i - 1]?

@hwwhww
Copy link
Contributor

hwwhww commented Oct 20, 2021

@mkalinin
That works too if we guaranteed pow_chain is ordered.

Another option is creating another abstract helper get_pow_block_from_chain(pow_chain: Sequence[PowBlock], block_hash: Hash32) -> PowBlock so that clients can implement it with more efficient data structure.

@mkalinin
Copy link
Collaborator

@mkalinin That works too if we guaranteed pow_chain is ordered.

Another option is creating another abstract helper get_pow_block_from_chain(pow_chain: Sequence[PowBlock], block_hash: Hash32) -> PowBlock so that clients can implement it with more efficient data structure.

I think client implementations override get_pow_block_at_terminal_total_difficulty function as iterating over the whole chain is suboptimal. For simplicity, we could also use Dict[Hash32, PowBlock] instead of Sequence[PowBlock] to be able to request by hash, requirement on ordering pow_chain by the block height would be a bit odd to me as PowBlock doesn't have the number field which the order is applied to.

@djrtwo
Copy link
Contributor

djrtwo commented Oct 29, 2021

addressed in #2694

@djrtwo djrtwo closed this as completed Oct 29, 2021
@hwwhww
Copy link
Contributor

hwwhww commented Nov 2, 2021

Reopen it because I think the original problem of this issue hasn't been solved yet?

@mkalinin's comment: #2636 (comment)

@mkalinin
Copy link
Collaborator

mkalinin commented Nov 4, 2021

Thanks @hwwhww! Getting back to the original problem. In the event when a valid transition block isn't accepted by the network, do we want a subsequent proposer to pick a transition PoW block according to the total difficulty rule. Or are we OK that the proposer will pick the same transition PoW block as the previous proposer picked if it has imported the previous transition block by a chance and switched its fork choice rule to the PoS?

@mkalinin
Copy link
Collaborator

mkalinin commented Dec 8, 2021

Closing this issue without any follow-ups to the spec. Switching fork choice rule back and force doesn't seem to be valuable enough to take the implementation and design complexity it introduces.

The only real case that is affected by not doing the switch back to TD rule seems to be the case when the valid transition block isn't accepted by the network and the validator that has accepted this block is proposing yet another transition block. The choice of a terminal block in this pretty rare case will happen according to the new fork choice rule rather than to the TD rule which doesn't seem vulnerable to any kind of attack as long as a terminal block satisfies the corresponding conditions. The latter must always be true, otherwise, the previous transition block must have not been accepted by the node.

@mkalinin mkalinin closed this as completed Dec 8, 2021
bors bot pushed a commit to sigp/lighthouse that referenced this issue Aug 10, 2022
## Issue Addressed

NA

## Proposed Changes

Removes three types of TODOs:

1. `execution_layer/src/lib.rs`: It was [determined](ethereum/consensus-specs#2636 (comment)) that there is no action required here.
2. `beacon_processor/worker/gossip_methods.rs`: Removed TODOs relating to peer scoring that have already been addressed via `epe.penalize_peer()`.
    - It seems `cargo fmt` wanted to adjust some things here as well 🤷 
3. `proto_array_fork_choice.rs`: it would be nice to remove that useless `bool` for cleanliness, but I don't think it's something we need to do and the TODO just makes things look messier IMO.


## Additional Info

There should be no functional changes to the code in this PR.

There are still some TODOs lingering, those ones require actual changes or more thought.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bellatrix CL+EL Merge
Projects
None yet
Development

No branches or pull requests

4 participants