Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple epoch transitions per epoch #3276

Closed
twoeths opened this issue Oct 1, 2021 · 8 comments · Fixed by #3532
Closed

Multiple epoch transitions per epoch #3276

twoeths opened this issue Oct 1, 2021 · 8 comments · Fixed by #3532
Assignees
Labels
scope-performance Performance issue and ideas to improve performance.

Comments

@twoeths
Copy link
Contributor

twoeths commented Oct 1, 2021

Describe the bug

Right now for each epoch, we have 3 or 4 or 5 epoch transitions

Screen Shot 2021-10-01 at 17 22 08

The checkpoint state cache shows that we have no more than 3 states per epoch

Screen Shot 2021-10-01 at 17 36 47

Expected behavior

  1. There are 1 or 2 redundant epoch transitions per epoch as we can simply get it from the cache
  2. Investigate why we have up to 3 epoch transitions per epoch? Ideally most of the time we should have exactly 1 epoch transition per epoch.

At least when getting attester duties, we may not need to rotate to the current epoch (most of the time) thanks to the nextEpochShuffling cache

const state = await chain.getHeadStateAtCurrentEpoch();

@dapplion
Copy link
Contributor

dapplion commented Oct 2, 2021

@tuyennhv The first chart screenshot can be very miss-leading, three dots don't mean 3 epoch transitions. You should look into the actual count of epoch transitions and look at the step delta.

@dapplion
Copy link
Contributor

dapplion commented Oct 2, 2021

Screenshot from 2021-10-02 05-48-06

This is Contabo-5 and here it's irrefutable that is most of the epochs 2 epoch transitions are run

@twoeths
Copy link
Contributor Author

twoeths commented Oct 2, 2021

@dapplion do you think 2 epoch transitions per epoch is good enough? In my understanding, it should be 1 most of the time

@dapplion
Copy link
Contributor

dapplion commented Oct 2, 2021

@dapplion do you think 2 epoch transitions per epoch is good enough? In my understanding, it should be 1 most of the time

No, 2 is very bad! hahaha It would be very strange that there a re-org always at the epoch boundary, we must be doing something wrong

@twoeths
Copy link
Contributor Author

twoeths commented Oct 5, 2021

I was able to catch the additional non-regular epoch transition from

const targetState = await chain.regen

when we validate a gossip AggregateAndProof that comes from probably an out-of-sync node.

  • our head is at slot 1408418, epoch 44013
  • attSlot = 1408418
  • attTarget is at epoch 44013 too but attTargetRoot is 0xe05fd6d8353f56da988f171ed7e12dde3199afadc47c891517f8e5da995e9014 which is block 1408386 (3rd block of epoch 44012) => this node is around 32 slots away from the network head

then we use regen to roll to the current epoch (which requires an epoch transition) in order to:

  • get committee indices
  • get pubkey of aggregator
  • get domains to verify signature
  • get pubkeys of attesters

in this exact example, we could achieve the same result by just using the state at block 1408386 (head of the producer node) since we know it's 1 epoch earlier and we have the nextShuffling cache

@dapplion
Copy link
Contributor

dapplion commented Oct 5, 2021

Oh good find @tuyennhv ! I've been rethinking our regen strategy and have my rough thoughts here.

Could you take a look at this proposal and check if it would solve the issue? https://github.com/ChainSafe/lodestar/blob/dapplion%2Fregen-blackbox/packages/lodestar/src/chain/regen/interface.ts

@twoeths
Copy link
Contributor Author

twoeths commented Oct 5, 2021

@twoeths
Copy link
Contributor Author

twoeths commented Oct 5, 2021

Oh good find @tuyennhv ! I've been rethinking our regen strategy and have my rough thoughts here.

Could you take a look at this proposal and check if it would solve the issue? https://github.com/ChainSafe/lodestar/blob/dapplion%2Fregen-blackbox/packages/lodestar/src/chain/regen/interface.ts

@dapplion yeah using await regen.getAttesterShuffling(targetCheckpoint); would work nicely, no epoch transition required there

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
scope-performance Performance issue and ideas to improve performance.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants