New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scp timeout short circuit v blocking fix #1935

Merged
merged 7 commits into from Feb 5, 2019

Conversation

Projects
5 participants
@MonsieurNicolas
Copy link
Contributor

MonsieurNicolas commented Jan 30, 2019

This PR supersedes #1786

It adds a few comments and tests to cover updating values of (p,p').
The actual code change is not necessary because of the logic at

if (areBallotsLessAndCompatible(ballot, *mPrepared))
{
continue;
(also current code would crash if we ever hit that case because we check in SCP invariants that p and p' are not compatible) - but out of clarity and paranoia, it doesn't really hurt to add the condition explicitly.

While working on this, I noticed an actual bug, fixed in the "short-circuiting logic..." commit:
this relates to the rule (9) in the SCP white paper where we should be evaluating any potential counter (greater than the current counter), to see if there is no v-blocking set that is passed that counter.

The previous code was bailing out if there were any validator on a smaller counter than the current counter.
Effect was that the node would stay longer in the current state, increasing consensus time, relying on timeout instead of observation from messages from other nodes.

@MonsieurNicolas MonsieurNicolas added this to In progress in v10.2.0 via automation Jan 30, 2019

@MonsieurNicolas MonsieurNicolas changed the title Scp timeout v blocking fix Scp timeout short circuit v blocking fix Jan 30, 2019

@graydon
Copy link
Contributor

graydon left a comment

So I think I agree with @jonjove that the effect of this change is (semantically) just to change the early break in the second loop of attemptBump from break to a continue. It took me a while to convince myself this is enough to implement the requirements of the paper / internet-draft, given the boundary conditions and the way the conditions are coded though.

In the process of deciding it was correct, I refactored and commented it a fair bit; if you'd like to pull in that change too, I've posted it here:

graydon@1aa9bc8

If you like I can file it as a subsequent cleanup PR later.


if (!mPreparedPrime ||
((compareBallots(*mPreparedPrime, ballot) < 0) &&
!areBallotsCompatible(*mPrepared, ballot)))

This comment has been minimized.

@graydon

graydon Jan 31, 2019

Contributor

If this 3rd condition is over-paranoid rather than a condition we believe could ever occur, should it be an assertion?

This comment has been minimized.

@MonsieurNicolas

MonsieurNicolas Jan 31, 2019

Author Contributor

there is already an assert as it would violate SCP invariants

@MonsieurNicolas

This comment has been minimized.

Copy link
Contributor Author

MonsieurNicolas commented Feb 1, 2019

cool, I added @graydon 's cleanup commit that makes it better. @jonjove r+ when you get a chance

allCounters.insert(eCounter);
}
uint32_t c = statementBallotCounter(e.second.statement);
if (c >= localCounter)

This comment has been minimized.

@jonjove

jonjove Feb 1, 2019

Contributor

I think this should be strict inequality. By the time this line is reached, we have already determined (line 1392) that hasVBlockingSubsetStrictlyAheadOf(localNode, mLatestEnvelopes, localCounter) == true. Therefore the condition on line 1413 cannot be satisfied for n = localCounter, so it may as well be excluded from allCounters.

This comment has been minimized.

@MonsieurNicolas

MonsieurNicolas Feb 5, 2019

Author Contributor

yes agree, fixed

// there should at worst have been some _maximal_ counter in the set to
// advance to, that by definition no v-blocking subset was "strictly
// ahead of".
assert(allCounters.empty());

This comment has been minimized.

@jonjove

jonjove Feb 1, 2019

Contributor

I don't think this is actually reachable. Suppose that there exists a v-blocking set strictly ahead of localCounter. Then there exists at least one element in mLatestEnvelopes with counter c > localCounter, because the empty set is not a v-blocking set. Therefore allCounters contains a counter c > localCounter; specifically allCounters is not empty. Now fix the highest counter C in allCounters. If there does not exist a v-blocking set strictly ahead of C, then this line cannot be reached and we are done. Suppose instead that there does exist a v-blocking set strictly ahead of C. Then, as before, there exists at least one element in mLatestEnvelopes with counter c > C, because the empty set is not a v-blocking set. But this implies that allCounters includes c > C, which contradicts that C is the highest counter in allCounters.

This comment has been minimized.

@MonsieurNicolas

MonsieurNicolas Feb 5, 2019

Author Contributor

I agree, this comment and assert are confusing, I am removing them. I could have put an assert(false) here but this doesn't seem to meet the bar for crashing consensus

@MonsieurNicolas MonsieurNicolas force-pushed the MonsieurNicolas:scpTimeoutVBlockingFix branch from 8e97998 to b943785 Feb 5, 2019

@jonjove

This comment has been minimized.

Copy link
Contributor

jonjove commented Feb 5, 2019

r+ b943785

@latobarita

This comment has been minimized.

Copy link
Contributor

latobarita commented on b943785 Feb 5, 2019

saw approval from jonjove
at MonsieurNicolas@b943785

This comment has been minimized.

Copy link
Contributor

latobarita replied Feb 5, 2019

merging MonsieurNicolas/stellar-core/scpTimeoutVBlockingFix = b943785 into auto

This comment has been minimized.

Copy link
Contributor

latobarita replied Feb 5, 2019

MonsieurNicolas/stellar-core/scpTimeoutVBlockingFix = b943785 merged ok, testing candidate = 3f48dbc

This comment has been minimized.

This comment has been minimized.

Copy link
Contributor

latobarita replied Feb 5, 2019

fast-forwarding master to auto = 3f48dbc

latobarita added a commit that referenced this pull request Feb 5, 2019

Merge pull request #1935 from MonsieurNicolas/scpTimeoutVBlockingFix
Scp timeout short circuit v blocking fix

Reviewed-by: jonjove

@latobarita latobarita merged commit b943785 into stellar:master Feb 5, 2019

1 check passed

default all tests passed
Details

v10.2.0 automation moved this from In progress to Done Feb 5, 2019

@choza13

This comment has been minimized.

Copy link

choza13 commented Feb 5, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment