Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: eth1: Fix earliest block of interest for eth1 voting #4588

Closed

Conversation

pietjepuk2
Copy link

@pietjepuk2 pietjepuk2 commented Feb 3, 2023

This bug popped up on Zhejiang, where (and when) the follow distance was very small at 12 compared to the voting period length of 2048.

From what I can tell what was going on is that the earliestBlockOfInterest was way too high (not even before the voting period start). When it was time for Nimbus to propose, latestCandidateBlock then didn't have any valid candidates to choose from.

This commit fixes it, but does make it such that on mainnet 2048 more blocks will be cached in the worst cast. At least, I assume that's what syncBlockRange does, build a cache. Although I do wonder if blocks outside this range are indeed evicted, and when (restart? every N minutes?).

  • Check with devs if above understanding is correct
  • Update / amend commit message

@pietjepuk2 pietjepuk2 changed the title eth1: Fix earliest block of interest for eth1 voting WIP: eth1: Fix earliest block of interest for eth1 voting Feb 3, 2023
@github-actions
Copy link

Unit Test Results

         9 files  ±0    1 062 suites  ±0   30m 25s ⏱️ - 1m 40s
  3 552 tests ±0    3 315 ✔️ ±0  237 💤 ±0  0 ±0 
15 401 runs  ±0  15 136 ✔️ ±0  265 💤 ±0  0 ±0 

Results for commit 998796a. ± Comparison against base commit 956aee2.

@zah
Copy link
Contributor

zah commented Mar 9, 2023

This fix doesn't look correct to me. Ultimately, the problem is in the is_candidate_block function which is defined in the spec as follows:

def is_candidate_block(block: Eth1Block, period_start: uint64) -> bool:
    return (
        block.timestamp + SECONDS_PER_ETH1_BLOCK * ETH1_FOLLOW_DISTANCE <= period_start
        and block.timestamp + SECONDS_PER_ETH1_BLOCK * ETH1_FOLLOW_DISTANCE * 2 >= period_start
    )

Thus, the set of candidate blocks is not defined by the voting period, but rather by the follow distance. In every network where ETH1_FOLLOW_DISTANCE is very different from EPOCH_PER_ETH1_VOTING_PERIOD * SLOTS_PER_EPOCH, there will be risk that blocks that were voted during the period (e.g. the majority vote) may drop out of the candidate set from the perspective of a honest validator.

This is most likely an accidental oversight in the spec because in all existing networks EPOCH_PER_ETH1_VOTING_PERIOD * SLOTS_PER_EPOCH happens to be equal to ETH1_FOLLOW_DISTANCE (including Sepolia right now). The proper fix is to redefine is_candidate_block in terms of SLOTS_PER_ETH1_VOTING_PERIOD in the spec. Also, the SECONDS_PER_ETH1_BLOCK constant should be removed in favor of the regular SECONDS_PER_SLOT constant (because they don't match after the merge).

@etan-status
Copy link
Contributor

Note that there is https://eips.ethereum.org/EIPS/eip-6110 which may make this obsolete.

zah added a commit that referenced this pull request Mar 10, 2023
@zah
Copy link
Contributor

zah commented Mar 10, 2023

After some additional offline conversations with @pietjepuk2, it became clear that this fix remains useful and it was merged it in f91fe56. Thank you very much for this in-depth investigation.

@zah zah closed this Mar 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants