Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Merged by Bors] - Fix per-slot timer in presence of clock changes #3243

Closed
wants to merge 1 commit into from

Conversation

michaelsproul
Copy link
Member

@michaelsproul michaelsproul commented Jun 5, 2022

Issue Addressed

Fixes a timing issue that results in spurious fork choice notifier failures:

WARN Error signalling fork choice waiter     slot: 3962270, error: ForkChoiceSignalOutOfOrder { current: Slot(3962271), latest: Slot(3962270) }, service: beacon

There’s a fork choice run that is scheduled to run at the start of every slot by the timer, which creates a 12s interval timer when the beacon node starts up. The problem is that if there’s a bit of clock drift that gets corrected via NTP (or a leap second for that matter) then these 12s intervals will cease to line up with the start of the slot. This then creates the mismatch in slot number that we see above.

Lighthouse also runs fork choice 500ms before the slot begins, and these runs are what is conflicting with the start-of-slot runs. This means that the warning in current versions of Lighthouse is mostly cosmetic because fork choice is up to date with all but the most recent 500ms of attestations (which usually isn’t many).

Proposed Changes

Fix the per-slot timer so that it continually re-calculates the duration to the start of the next slot and waits for that.

A side-effect of this change is that we may skip slots if the per-slot task takes >12s to run, but I think this is an unlikely scenario and an acceptable compromise.

@michaelsproul michaelsproul added bug Something isn't working ready-for-review The code is ready for review low-hanging-fruit Easy to resolve, get it before someone else does! labels Jun 5, 2022
Copy link
Member

@paulhauner paulhauner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@michaelsproul michaelsproul added ready-for-merge This PR is ready to merge. and removed ready-for-review The code is ready for review labels Jun 6, 2022
@michaelsproul
Copy link
Member Author

bors r+

bors bot pushed a commit that referenced this pull request Jun 6, 2022
## Issue Addressed

Fixes a timing issue that results in spurious fork choice notifier failures:

```
WARN Error signalling fork choice waiter     slot: 3962270, error: ForkChoiceSignalOutOfOrder { current: Slot(3962271), latest: Slot(3962270) }, service: beacon
```

There’s a fork choice run that is scheduled to run at the start of every slot by the `timer`, which creates a 12s interval timer when the beacon node starts up. The problem is that if there’s a bit of clock drift that gets corrected via NTP (or a leap second for that matter) then these 12s intervals will cease to line up with the start of the slot. This then creates the mismatch in slot number that we see above.

Lighthouse also runs fork choice 500ms before the slot begins, and these runs are what is conflicting with the start-of-slot runs. This means that the warning in current versions of Lighthouse is mostly cosmetic because fork choice is up to date with all but the most recent 500ms of attestations (which usually isn’t many).

## Proposed Changes

Fix the per-slot timer so that it continually re-calculates the duration to the start of the next slot and waits for that.

A side-effect of this change is that we may skip slots if the per-slot task takes >12s to run, but I think this is an unlikely scenario and an acceptable compromise.
@bors bors bot changed the title Fix per-slot timer in presence of clock changes [Merged by Bors] - Fix per-slot timer in presence of clock changes Jun 7, 2022
@bors bors bot closed this Jun 7, 2022
@michaelsproul michaelsproul deleted the per-slot-timer branch June 7, 2022 02:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working low-hanging-fruit Easy to resolve, get it before someone else does! ready-for-merge This PR is ready to merge.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants