OCPBUGS-82519: Fix chronyd NTP failover #616
Stop/start the chronyd process instead of using chronyc offline/online.

The chronyc offline command was unreliable: chronyd could re-acquire NTP sources despite being told to go offline, causing clock frequency interference with phc2sys (clockcheck events). Replace it with process stop/start, using the same pattern as phc2sys.

The disable path uses `go p.cmdStop()` (non-blocking) to avoid a deadlock: cmdStop blocks on exitCh, but the ntpfailover FSM callback runs inside the process scanner goroutine, which must finish before cmdRun can send to exitCh.

Co-authored-by: Cursor <cursoragent@cursor.com>
@vitus133: This pull request references Jira Issue OCPBUGS-82519, which is invalid. The bug has been updated to refer to the pull request using the external bug tracker.
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: vitus133
@vitus133: all tests passed!