Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New finalizer with multi parent support #3388

Merged
merged 2 commits into from
Jun 11, 2021

Conversation

nzpr
Copy link
Collaborator

@nzpr nzpr commented Apr 25, 2021

This is replacement for #3366 as it has been closed by bors.

Old finalizer is not suitable for multi parent as it is possible that last finalized block is stuck. This happen due to the fact that only children that have current LFB as a main parent are subject to be next LFB.

Finalizer introduced in this PR seeks for LFB only using tips and DAG and input data, without involving requirement to know previous LFB. Due to this change, algorithm for finalization is rewritten significantly, keeping the main logic of following CliqueOracle output do decide if some message is finalized .

@nzpr nzpr force-pushed the new-finalizer branch 2 times, most recently from 29e61e9 to 9c07529 Compare April 25, 2021 13:35
@nzpr nzpr changed the base branch from feature/leaderless to feature/deploymerge May 17, 2021 17:02
@nzpr nzpr requested review from zsluedem and tgrospic May 17, 2021 19:30
@nzpr nzpr changed the base branch from feature/deploymerge to feature/leaderless May 18, 2021 04:51
@nzpr nzpr added this to the Enable external validators milestone May 18, 2021
@nzpr nzpr changed the base branch from feature/leaderless to dev May 18, 2021 06:40
@nzpr nzpr changed the base branch from dev to feature/leaderless May 18, 2021 06:40
@nzpr nzpr added the feature-branch Development parallel to dev branch label May 18, 2021
@nzpr nzpr removed this from the Enable external validators milestone May 18, 2021
@nzpr nzpr mentioned this pull request May 19, 2021
@nzpr nzpr changed the base branch from feature/leaderless to feature/deploymerge May 19, 2021 11:17
@nzpr nzpr force-pushed the new-finalizer branch 6 times, most recently from 6fe19f6 to 7cd30cb Compare May 28, 2021 04:58
@tgrospic tgrospic added this to the Block merge milestone May 28, 2021
@nzpr nzpr force-pushed the new-finalizer branch 9 times, most recently from 3929626 to c38d66a Compare June 9, 2021 06:05
@nzpr nzpr requested a review from tgrospic June 9, 2021 07:11
@nzpr nzpr force-pushed the new-finalizer branch 5 times, most recently from 964f6d5 to 6cdd205 Compare June 9, 2021 18:41
newLfbEffect(lfb.blockHash).as(lfb.blockHash) <*
dag
.withAncestors(lfb.blockHash, dag.isFinalized(_).not)
.flatMap(_.toList.traverse(finalisationEffect))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we don't have atomic operation on LMDB, here is possible for node to be killed between saving directly finalized block newLfbEffect and related operations in finalisationEffect.
Is it better to save LFB at the end? Can this be more safe so that finalisationEffect can be executed again?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, indeed, otherwise if node is killed finalisationEffect might be never called for some messages that are ancestor of LFB. I'l push fix in commit addressing comments

// in theory if each stake is high enough, e.g. Long.MaxValue, sum of them might result in negative value
require(activeStakeTotal > 0, message = "Long overflow when computing total stake")
require(activeStakeAgreeing > 0, message = "Long overflow when computing total stake")
activeStakeAgreeing > activeStakeTotal.toFloat / 2
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this calculation here in finalizer? I don't see that was in previous calculation. This should be information from safety oracle.

Copy link
Collaborator Author

@nzpr nzpr Jun 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because previous finalizer were just running clique oracle on all LFB children. This particular logic were not on that version. New version traverses down from tips until it finds messages worth to be computed faultTolerance agains. If you see clique oracle code - if less than half of a stake is agreed on message, max clique is not computed for this message, and minimal fault tolerance is returned straight away. So this message for sure cannot be finalized.

Do you mean this function better to be moved inside CliqueOracle?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern is if we are sure here that agreeing validators cannot have equivocations.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand this. There is always only one main parent, right? So agreeing chains are well defined. We need to get to the point where enough of these chains join, and find first message that pass fault toleranceThreshold.

@tgrospic tgrospic linked an issue Jun 11, 2021 that may be closed by this pull request
Copy link
Collaborator

@tgrospic tgrospic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nzpr nzpr merged commit ebbe3dd into rchain:feature/deploymerge Jun 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug-fix enhancement feature-branch Development parallel to dev branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Modify the DAG representation to support block merge
3 participants