server: fix hot ranges logging scheduling interval change #111305
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ericharmeling and @xinhaoz)
-- commits
line 30 at r1:
What is the impact if a user hits this?
Previously, j82w (Jake) wrote…
They would never see the new interval value take effect. In practice, though, I believe this will rarely occur; it's just that in the test we immediately override the setting after server startup. I can change the release note to a bug fix.
Hot ranges telemetry events are logged on a schedule controlled by the cluster setting `TelemetryHotRangesStatsInterval`. A ticker initialized with this interval duration controls the logging cycles. After the ticker is initialized, a callback function is registered at scheduler startup to reset the ticker whenever the interval setting changes. The scheduler start is part of the async tenant server startup process, which means the interval can be changed before the callback that resets the ticker is registered. If this occurs, the ticker is never updated with the new duration.

More concretely, the following can happen:

1. Hot ranges scheduler starts
2. Ticker is initialized with the `TelemetryHotRangesStatsInterval` value
3. `TelemetryHotRangesStatsInterval` setting is changed
4. Callback function is registered to watch for `TelemetryHotRangesStatsInterval` changes and reset the ticker
5. Scheduler waits for ticker notifications
6. Ticker never updates with the new duration

To fix this, we register the callback function before the ticker is started. The function is changed to send a channel notification. This ensures that the ticker is always updated with the most recent interval value.

Release note (bug fix): Fixes a potential bug where changing the setting `server.telemetry.hot_ranges_stats.interval` directly after startup is a no-op.

Fixes: cockroachdb#111104
Reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ericharmeling and @j82w)
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @ericharmeling)
bors r+
Build succeeded.