Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SIMD-0046: Optimistic cluster restart automation #46

Open
wants to merge 103 commits into
base: main
Choose a base branch
from

Conversation

wen-coding
Copy link

No description provided.

@wen-coding wen-coding marked this pull request as draft April 10, 2023 16:06
@wen-coding wen-coding marked this pull request as ready for review April 10, 2023 16:06
@wen-coding wen-coding marked this pull request as draft April 10, 2023 16:07
proposals/0024-repair-and-restart.md Outdated Show resolved Hide resolved
proposals/0024-repair-and-restart.md Outdated Show resolved Hide resolved
proposals/0024-repair-and-restart.md Outdated Show resolved Hide resolved
proposals/0024-repair-and-restart.md Outdated Show resolved Hide resolved
proposals/0024-repair-and-restart.md Outdated Show resolved Hide resolved
proposals/0024-repair-and-restart.md Outdated Show resolved Hide resolved
proposals/0024-repair-and-restart.md Outdated Show resolved Hide resolved
proposals/0024-repair-and-restart.md Outdated Show resolved Hide resolved
proposals/0024-repair-and-restart.md Outdated Show resolved Hide resolved

So after a validator sees that 75% of the validators received 75% of the votes,
wait for 10 more minutes so that the message it sent out have propagated, then
restart from the Heaviest slot everyone agreed on.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From our last call I was thinking once each validator has figured out the heaviest fork and repaired up to the highest oc slot, the validator would:

  1. Issue a "hard fork" at the highest oc slot, which also changes the gossip shred version
  2. Execute the existing "--wait-for-supermajority" logic (ie, purge all slots above the highest oc slot, wait for 80%)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed. I think we probably. should wait for 75% here because we assume 5% could be non-conforming.

proposals/0024-repair-and-restart.md Outdated Show resolved Hide resolved

We calculate "enough" stake as follows. When there are 80% validators joining
the restart, assuming 5% restarted validators can make mistakes in voting, any
block with more than 67% - 5% - (100-80)% = 42% could potentially be

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this calculation.
What if the other 100% - 42% = 58% pick some other block?
Why should the minority 42% block be optimistically confirmed?

Why should ever a block with less than 67% vote be optimistically confirmed?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal here is to prevent false negative (if a slot was oc'ed before the restart, you must pick it here), not to prevent false positive (it's okay if we pick a slot here which isn't oc'ed). Because when we select Heaviest later we should see the competing fork and count the votes accordingly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prolly add the motivation and justification for these values to the document

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

Comment on lines 119 to 122
2.1 If vote_on_child + stake_on_validators_not_in_restart >= 62%, pick child.
For example, if 80% validators are in restart, child has 42% votes, then
42 + (100-80) = 62%, pick child. 62% is chosen instead of 67% because 5%
could make the wrong votes.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similarly here, I am not sure why it is safe to go below 67%?!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal here is to prevent false negative at all costs and it's okay to have false positive. Let's say X is the first block having only 62% but not 67%, we know if 75% of the validators decide to pick this fork, it will be instantly oc'ed and we won't kick another oc'ed slot out. Does that make sense?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similarly, add the motivation and justification in the doc

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

wen-coding and others added 2 commits May 1, 2023 11:02
@mvines mvines marked this pull request as ready for review May 2, 2023 17:05
proposals/0024-repair-and-restart.md Outdated Show resolved Hide resolved
proposals/0024-repair-and-restart.md Outdated Show resolved Hide resolved
proposals/0024-repair-and-restart.md Outdated Show resolved Hide resolved
sender's last voted fork. the most significant bit is always
`last_voted_slot`, least significant bit is `last_voted_slot-81000`.

The number of ancestor slots sent is hard coded at 81000, because that's
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A quick justification for 81000 would be nice, to help the reader with context. I don't recall myself either! But I assume we're trying to limit the max size of the crds message?

Copy link
Author

@wen-coding wen-coding Aug 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember that we set it at 9 hours because we assumed that most validator admins would wake up and join the choir in 9 hours.

In reality if the cluster gets stuck, new blocks would be confirmed and voted on very slowly, so very likely 81k slots is overkill.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added motivation to the description.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. So there's no underlying reason why 81k couldn't be larger beyond this? Like if we said 24 hours worth of slots is the very worst case, is there any downside to doing so if in reality all future restarts are handled in less than 9 hours?

Copy link
Author

@wen-coding wen-coding Aug 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main downside is that pulling from blockstore is slower and distributed RestartLastVotedForkSlots is larger. Right now when a validator receives RestartLastVotedForkSlots it will just discard anything older than its local root, because if its root is not on the RestartLastVotedForkSLots it can't participate in wen_restart anyway. And repairing anything before root slot doesn't make sense. So making this number larger mostly just wastes network bandwidth.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, 81k is already larger than one UDP packet, so it needs to be chopped into chunks. So 24 hours is not much worse, just more packets.

mvines
mvines previously approved these changes Aug 14, 2023
Copy link
Contributor

@mvines mvines left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Latest iteration reads well for me, and feels well-specified enough. Love the name of the new restart protocol! Wen implementation?

@jacobcreech
Copy link
Contributor

cc @ripatel-fd looks like this SIMD is getting close to consensus with some approval from the Labs-side. Could ya'll on Firedancer take a look at it?

@jacobcreech jacobcreech added standard SIMD in the Standard category core Standard SIMD with type Core labels Aug 16, 2023
@0xOsprey
Copy link

Can I pay (or donate to the charity of the contributors choice) for someone to get this merged?

@lheeger-jump lheeger-jump changed the title Optimistic cluster restart automation SIMD-0046: Optimistic cluster restart automation May 21, 2024
Comment on lines 203 to 207
We set the line at 42% when 80% join the restart, it's possible that
different validators see different 80%, so their must-have blocks might
be different, but in reality this case should be rare. Whenever some block
gets to 42%, repair could be started, because when more validators join the
restart, this number will only go up but will never go down.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recommend explaining where 42% comes from here to before this, so it doesn't appear as a magic number

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

Comment on lines +232 to +234
For example, if 80% validators are in restart, the number would be
`67% - 5% - (100-80)% = 42%`. If 90% validators are in restart, the number
would be `67% - 5% - (100-90)% = 52%`.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similarly, recommend moving the explanation of 5% before this number appears

Copy link
Author

@wen-coding wen-coding May 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added explanation on line 158.

Comment on lines +342 to +346
The two added gossip messages `RestartLastVotedForkSlots` and
`RestartHeaviestFork` will only be sent and processed when the validator is
restarted in the new proposed optimistic `cluster restart` mode. They will also
be filtered out if a validator is not in this mode. So random validator\
restarting in the new mode will not bring extra burden to the system.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could there be more explicit discussion of the "optimistic" aspect of this such that it won't violate any safety properties?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added on line 80.

combined stake would be `2 * (67% - 5% - X)`. Because we only allow one
RestartHeaviestFork per pubkey, even if the any validator can put both
children in its RestartHeaviestFork, the children's total stake should be
less than `100% - 5% - X`. We can caculate that if `134% - 2 * X < 95% - X`,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: calculate

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Even if the last block D on the list may not be optimistically confirmed,
it already has at least `42% - 5% = 37%` stake, with no competing sibling
E getting more than `42%` stake. This is equal to the case where `5%` stake
jumped ship from fork E to fork D, 80% of the cluster can switch to fork E

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recommend precise terminology "switch proof" instead of "jumped ship"

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


Even if the last block D on the list may not be optimistically confirmed,
it already has at least `42% - 5% = 37%` stake, with no competing sibling
E getting more than `42%` stake. This is equal to the case where `5%` stake

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be a different letter? E was used for the parent above already

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, changed.

it already has at least `42% - 5% = 37%` stake, with no competing sibling
E getting more than `42%` stake. This is equal to the case where `5%` stake
jumped ship from fork E to fork D, 80% of the cluster can switch to fork E
if that turns out to be the heavist fork.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: heaviest

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

on the list:

Assume block A is one such block, it would have `67%` stake, discounting
`5%` non-conforming and people not participating in wen_restart, it should

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe clearer to call this dishonest nodes, who will vote on blocks they do not actually have

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because validators can misbehave either due to being malicious or due to software bugs, we want to use non-conforming which doesn't automatically imply people are behaving poorly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Standard SIMD with type Core standard SIMD in the Standard category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants