Alerting: Enable group-level rule evaluation jittering by default, remove feature toggle #82212

Merged: 4 commits, Feb 9, 2024

@alexweav (Contributor) commented Feb 8, 2024

What is this feature?

This PR removes the jitterAlertRules feature toggle. Grafana now behaves as if the toggle were always enabled.
The jitter behavior was introduced in #80766.

The effect is that the evaluation times of rule groups are distributed over their group interval. Rules within a group still evaluate together on all replicas, but that shared evaluation time is now spread across the rule's evaluation interval rather than synchronized with every other rule on the system.

This feature has been globally active in Grafana Cloud for a while now, without issue. It heavily reduces load on datasources by decreasing query spikes. The lack of spiky evaluations improves alert evaluation time and notification p99 time by roughly 10% across the board.

Jitter is now on by default, but it can still be disabled with the disable_jitter option under the [unified_alerting] config section.
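As a sketch, assuming the option takes a boolean value, opting out in a custom configuration file might look like this:

```ini
[unified_alerting]
# Assumed boolean: true turns jitter off, so rules evaluate in sync again.
disable_jitter = true
```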

Which issue(s) does this PR fix?:

Fixes #53744

Special notes for your reviewer:

Please check that:

  • It works as expected from a user's perspective.
  • If this is a pre-GA feature, it is behind a feature toggle.
  • The docs are updated, and if this is a notable improvement, it's added to our What's New doc.

@alexweav alexweav added this to the 10.4.x milestone Feb 8, 2024
@alexweav alexweav requested review from grafanabot and a team as code owners February 8, 2024 22:50
@alexweav alexweav requested review from a team, rwwiv, JacobsonMT, yuri-tceretian and grobinson-grafana and removed request for a team February 8, 2024 22:50
@grafana-pr-automation grafana-pr-automation bot added area/frontend type/docs Flags the technical writing team for documentation support; auto adds to org-wide docs project labels Feb 8, 2024
@alexweav alexweav requested review from torkelo and a team as code owners February 9, 2024 20:04
@alexweav alexweav requested review from diegommm, undef1nd and suntala and removed request for a team February 9, 2024 20:04
@alexweav alexweav merged commit 5bbe9c6 into main Feb 9, 2024
20 checks passed
@alexweav alexweav deleted the alexweav/jitter-always branch February 9, 2024 21:53
Labels: add to changelog, add to what's new, area/alerting (Grafana Alerting), area/backend, area/frontend, no-backport (Skip backport of PR), type/docs (Flags the technical writing team for documentation support; auto adds to org-wide docs project)
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Improve alert scheduling in Grafana Alerting
3 participants