[Fix] Add grace period before starting block sync #3249

Merged
merged 1 commit into mainnet-staging from fix/update-is-block-synced
May 10, 2024

raychu86
Contributor

@raychu86 raychu86 commented May 8, 2024

Motivation

This PR simply adds an additional grace period for the nodes to wait before attempting to call try_block_sync.

Ideally, try_block_sync would not be called until the first PrimaryPing message is received. However, during the initial bootup of validators, nodes must already consider themselves synced in order to send a PrimaryPing in the first place. Hence the heuristic wait.
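
As a rough illustration of the heuristic wait, the sketch below gates a sync attempt on time elapsed since startup. It is not the code from this PR: `SYNC_GRACE_PERIOD`, its 10-second value, `grace_period_elapsed`, and `sync_tick` are all hypothetical names and values, and `try_block_sync` is passed in as a plain closure.

```rust
use std::time::{Duration, Instant};

/// Hypothetical grace period; the value actually used in the PR is not shown here.
const SYNC_GRACE_PERIOD: Duration = Duration::from_secs(10);

/// Returns true once at least `grace` has passed between `started_at` and `now`.
fn grace_period_elapsed(started_at: Instant, now: Instant, grace: Duration) -> bool {
    // Saturates to zero if `now` is earlier than `started_at`.
    now.saturating_duration_since(started_at) >= grace
}

/// Hypothetical sync-loop step: calls `try_block_sync` only after the grace period.
/// Returns whether the sync attempt was made.
fn sync_tick(started_at: Instant, now: Instant, mut try_block_sync: impl FnMut()) -> bool {
    if !grace_period_elapsed(started_at, now, SYNC_GRACE_PERIOD) {
        // Still booting: give peers time to deliver block locators first.
        return false;
    }
    try_block_sync();
    true
}
```

Passing `now` explicitly keeps the check deterministic in tests; a real node would more likely sleep for the grace period once before entering its sync loop.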

This is needed because we've observed that during bootup, nodes would occasionally mark themselves as synced when they were in fact much further behind than the other validators. This caused them to start processing proposals and certificates from the other validators instead of actually syncing.

Here is an example of an observed sequence of logs that should not happen:

DEBUG Skipping batch proposal (node is syncing)
INFO No connected validators
DEBUG Fetched {} missing previous certificates for round {} from '{}'
DEBUG Primary is not ready to propose the next round
DEBUG Primary is safely skipping a batch proposal (please connect to more validators)

Upon further debugging, this seems to happen when the validator node calls try_block_sync before receiving block locators from peers via an Event::PrimaryPing. With no block locators to sync from, the call to try_block_sync makes the node think it is synced, so it starts processing proposals/certificates from peers.
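
The failure mode can be pictured with a deliberately simplified check. This is an assumption-laden sketch, not snarkOS's actual sync logic: `naive_is_block_synced` is a hypothetical function, and `peer_heights` stands in for heights derived from peers' block locators.

```rust
/// Hypothetical, simplified sync check: "synced" means no known peer is ahead of us.
/// `peer_heights` stands in for heights derived from peers' block locators.
fn naive_is_block_synced(local_height: u32, peer_heights: &[u32]) -> bool {
    // `all` on an empty iterator is vacuously true, so a node that has
    // not yet heard from any peer reports itself as synced.
    peer_heights.iter().all(|&peer| peer <= local_height)
}
```

With an empty slice the check returns true even at height 0, which mirrors the premature "synced" state described above. Once any peer reports a greater height, the same check returns false.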

@raychu86 raychu86 changed the title from "[Fix] Do not set is_block_synced to true when there are no sync peers." to "[Fix] Add grace period before starting block sync" May 10, 2024
@raychu86 raychu86 force-pushed the fix/update-is-block-synced branch from 706ded7 to 996b3ba Compare May 10, 2024 18:05
@howardwu howardwu merged commit 6ac2fe5 into mainnet-staging May 10, 2024
24 checks passed
@howardwu howardwu deleted the fix/update-is-block-synced branch May 10, 2024 23:37
2 participants