Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix BestFinalized method #7619

Merged
merged 4 commits into from Oct 23, 2020
Merged

Conversation

farazdagi
Copy link
Contributor

@farazdagi farazdagi commented Oct 23, 2020

What type of PR is this?

Bug fix

What does this PR do? Why is it needed?

  • When writing tests for init-sync, interesting bug was discovered (introduced in Fix 2nd Stage of Sync #7006):
    • When peers are selected, peers with higher epochs are preferred (i.e. more knowledgeable peers are preferred).
    • During Medalla incident we wanted to make sure that head slot reported by peers also affects sorting.
    • Sorting wasn't implemented correctly: peers having the same head slot but different epoch were compared using
pidEpoch[potentialPIDs[i]] > pidEpoch[potentialPIDs[j]] && pidHead[potentialPIDs[i]] > pidHead[potentialPIDs[j]]

Which is always false in that case i.e. peer epoch is being ignored, producing almost random data (depending on order items are fed into slice.Sort, which is arbitrary).

  • What makes it a bit worse: if peer with lower epoch has higher head (happens quite often in period of non-finality, especially), another peer with higher epoch will not be ranked higher (as that conditional above will be false).

Which issues(s) does this PR fix?

N/A

Other notes for review

  • Since we have a separate BestNonFinalized() function (where sorting is done by head slot), which is used in init-sync for syncing slots coming after the finalized checkpoint, BestFinalized() should always use epoch as the main factor when selecting peers (head slots only adding to that).
  • I've rewritten test, to rely on test vector:
    • Preserved the original test case
    • Added regression test
    • Added couple of more tests that might be interesting (including checking that max limit of returned peers is enforced indeed - that was untested)

@farazdagi farazdagi added the Bug Something isn't working label Oct 23, 2020
@farazdagi farazdagi requested a review from a team as a code owner October 23, 2020 01:54
@farazdagi farazdagi self-assigned this Oct 23, 2020
@farazdagi farazdagi added the Ready For Review A pull request ready for code review label Oct 23, 2020
@prylabs-bulldozer prylabs-bulldozer bot merged commit 6a2bb65 into master Oct 23, 2020
@delete-merged-branch delete-merged-branch bot deleted the fix-best-finalized-method branch October 23, 2020 02:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Something isn't working Ready For Review A pull request ready for code review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants