Setup ReplayStage confirmation scaffolding for duplicate slots #9698
Codecov Report
@@ Coverage Diff @@
## master #9698 +/- ##
=========================================
+ Coverage 79.9% 80.0% +0.1%
=========================================
Files 409 410 +1
Lines 107768 108588 +820
=========================================
+ Hits 86179 86950 +771
- Misses 21589 21638 +49
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
@carllin - sup with this PR? Is it ded?
@mvines it's on hold because I haven't had time to get it in yet 😢
This stale pull request has been automatically closed. Thank you for your contributions.
yup, will update the existing docs soon
Account for some switch check edge cases
…le duplicate/confirmed/freeze/dead cases
…on same fork valid, TODO: tests
…ad of supermajority
(cherry picked from commit 52703ba)
Problem
Duplicate versions of a slot potentially partition and stall the cluster
Summary of Changes
Here's how I imagine the full fix working:
What's in this PR (goal is to prevent voting on duplicate slots asap):
a) If the duplicate fork is already confirmed, we (eventually) signal the cluster here which version of this slot we have: https://github.com/solana-labs/solana/compare/master...carllin:FixReplayStage?expand=1#diff-5984b6b0429f857c13a2a362669ab37cR543. This is because only 1 fork should be confirmed unless > 1/3 of the cluster is malicious.
b) Otherwise, ReplayStage processes those duplicates and marks them, along with the entire fork descended from those duplicate slots, here: https://github.com/solana-labs/solana/compare/master...carllin:FixReplayStage?expand=1#diff-5984b6b0429f857c13a2a362669ab37cR559-R563
In 1b) after a fork has been marked duplicate, we remove it as a candidate for voting here: https://github.com/solana-labs/solana/compare/master...carllin:FixReplayStage?expand=1#diff-5984b6b0429f857c13a2a362669ab37cR1326-R1331. However, if you've already voted on this fork and you can't generate a switching proof, then you will continue to generate banks on this fork to avoid liveness issues (more details here: https://github.com/solana-labs/solana/compare/master...carllin:FixReplayStage?expand=1#diff-5984b6b0429f857c13a2a362669ab37cR1271-R1278).
If a duplicate slot is confirmed, the entire fork is added back into the candidate set by clearing the duplicate flag here: https://github.com/solana-labs/solana/compare/master...carllin:FixReplayStage?expand=1#diff-fb925e9cb1a6c2044ceaa55aa7c8f255R393
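To make the mark/filter/clear cycle above concrete, here is a minimal sketch. It is not the code in this PR: `ForkTracker`, `mark_duplicate`, `mark_duplicate_confirmed`, and `vote_candidates` are hypothetical names, and the sketch assumes that "the entire fork" means the slot plus everything descended from it.

```rust
use std::collections::{HashMap, HashSet};

type Slot = u64;

/// Hypothetical, simplified fork tracker (not the PR's actual types).
#[derive(Default)]
struct ForkTracker {
    /// slot -> direct children
    children: HashMap<Slot, Vec<Slot>>,
    /// slots currently flagged as duplicate and not yet confirmed
    unconfirmed_duplicates: HashSet<Slot>,
}

impl ForkTracker {
    fn add_slot(&mut self, slot: Slot, parent: Option<Slot>) {
        self.children.entry(slot).or_default();
        if let Some(p) = parent {
            self.children.entry(p).or_default().push(slot);
        }
    }

    /// The slot itself plus every slot descended from it.
    fn fork_from(&self, slot: Slot) -> Vec<Slot> {
        let mut stack = vec![slot];
        let mut out = Vec::new();
        while let Some(s) = stack.pop() {
            out.push(s);
            if let Some(kids) = self.children.get(&s) {
                stack.extend(kids.iter().copied());
            }
        }
        out
    }

    /// Flag the duplicate slot and the fork descended from it.
    fn mark_duplicate(&mut self, slot: Slot) {
        for s in self.fork_from(slot) {
            self.unconfirmed_duplicates.insert(s);
        }
    }

    /// Once the duplicate slot is confirmed, clear the flag so the
    /// fork becomes a vote candidate again.
    fn mark_duplicate_confirmed(&mut self, slot: Slot) {
        for s in self.fork_from(slot) {
            self.unconfirmed_duplicates.remove(&s);
        }
    }

    /// Slots still eligible for voting, in ascending order.
    fn vote_candidates(&self) -> Vec<Slot> {
        let mut v: Vec<Slot> = self
            .children
            .keys()
            .copied()
            .filter(|s| !self.unconfirmed_duplicates.contains(s))
            .collect();
        v.sort_unstable();
        v
    }
}
```

With a tree 0 -> 1 -> 2 and 0 -> 3, marking slot 1 as duplicate leaves only slots 0 and 3 as candidates; confirming slot 1 brings 1 and 2 back. The sketch deliberately leaves out the "already voted and can't switch" case described above.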
What's in follow-up PR:
Need to detect if the cluster has confirmed some alternate version V of a duplicate slot (imagine a validator has a dead slot, or a valid, playable version of a slot, but the rest of the cluster confirmed a different version). Can this be done by having another version of EpochSlots, but for confirmed slots instead of completed slots?
If you see a slot is confirmed by supermajority in 1) and your version of the slot is dead or unconfirmed for some expiration time, then dump your version of the slot and download another version from a trusted validator OR a random stake-weighted validator who claims to have a confirmed version of the block. For v2 we can ask for a proof with an RSA accumulator that this version of the slot is the one included in a future confirmed block.
When a confirmed version of a duplicate slot with hash V is found, and it's not the same as your currently played version:
a) Clear the currently played version from
i) status cache: https://github.com/solana-labs/solana/compare/master...carllin:FixReplayStage?expand=1#diff-92c739d9ad61135b886d1a44957fe485R83
ii) and progress map in replay stage: https://github.com/solana-labs/solana/compare/master...carllin:FixReplayStage?expand=1#diff-5984b6b0429f857c13a2a362669ab37cR486
b) Set the confirmed blockhash in blockstore.
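A rough sketch of how the follow-up steps might fit together, under stated assumptions: every name here (`LocalVersion`, `Action`, `reconcile`, `LocalState`, `dump_slot`) is hypothetical, the status cache, progress map, and blockstore are stood in for by plain collections, and the actual download of the confirmed version is left out.

```rust
use std::collections::{HashMap, HashSet};
use std::time::Duration;

type Slot = u64;
type Hash = [u8; 32];

/// What this validator currently holds for a slot (hypothetical model).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LocalVersion {
    Dead,
    Replayed(Hash),
}

#[derive(Debug, PartialEq, Eq)]
enum Action {
    Keep,
    /// Dump the local version and fetch the version with this hash.
    DumpAndRepair(Hash),
}

/// Decide what to do with the local version of a duplicate slot.
/// `cluster_confirmed` is the hash V the supermajority confirmed, if known.
fn reconcile(
    local: LocalVersion,
    cluster_confirmed: Option<Hash>,
    waited: Duration,
    expiration: Duration,
) -> Action {
    let confirmed = match cluster_confirmed {
        Some(h) => h,
        None => return Action::Keep, // nothing confirmed yet
    };
    if local == LocalVersion::Replayed(confirmed) {
        return Action::Keep; // we already played the confirmed version
    }
    if waited < expiration {
        return Action::Keep; // give the local version time to resolve
    }
    Action::DumpAndRepair(confirmed)
}

/// Stand-ins for the status cache, progress map, and blockstore.
#[derive(Default)]
struct LocalState {
    status_cache_slots: HashSet<Slot>,
    progress: HashMap<Slot, LocalVersion>,
    confirmed_hashes: HashMap<Slot, Hash>,
}

impl LocalState {
    /// a) clear the played version, b) record the confirmed hash.
    fn dump_slot(&mut self, slot: Slot, confirmed: Hash) {
        self.status_cache_slots.remove(&slot);
        self.progress.remove(&slot);
        self.confirmed_hashes.insert(slot, confirmed);
    }
}
```

A caller would run `reconcile` per duplicate slot and, on `DumpAndRepair`, call `dump_slot` before requesting the confirmed version from a peer.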
Fixes #