Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alerting: Fix scheduler to sort rules before evaluation #88006

Merged
merged 1 commit into from
May 17, 2024

Conversation

yuri-tceretian
Copy link
Contributor

@yuri-tceretian yuri-tceretian commented May 16, 2024

What is this feature?
This PR fixes Grafnaa Alerting scheduler to sort the rules that are scheduled for the current tick before spreading them out.

Why do we need this feature?
This is needed to guarantee that rule is consistently evaluated every X interval. The order of items scheduled for evaluation is important because the index of element is used to calculate a delay. That delay can be up to 10 second (base interval of tick). Therefore, one time rule can be evaluated at time X + 5 second, and the next time, for example, Y = X+eval interval, at Y + 2 seconds and therefore the rule is evaluated 3 seconds earlier than it is supposed to.

This is especially important in HA mode where we rely on the fact that rules are evaluated at the same time.

Who is this feature for?
Alerting users

Special notes for your reviewer:

One possible downside of this change is that now all instances in HA mode will send the same query at pretty much the same time, which may cause some contention of the data source side.

Please check that:

  • It works as expected from a user's perspective.
  • If this is a pre-GA feature, it is behind a feature toggle.
  • The docs are updated, and if this is a notable improvement, it's added to our What's New doc.

to make sure that the order is stable between evaluations.
This is especially important in HA mode.
@yuri-tceretian yuri-tceretian added type/bug area/alerting Grafana Alerting add to changelog backport v11.0.x Mark PR for automatic backport to v11.0.x labels May 16, 2024
@yuri-tceretian yuri-tceretian self-assigned this May 16, 2024
@yuri-tceretian yuri-tceretian requested a review from a team as a code owner May 16, 2024 20:35
@yuri-tceretian yuri-tceretian requested review from jtheory, rwwiv, JacobsonMT and grobinson-grafana and removed request for a team May 16, 2024 20:35
@grafana-delivery-bot grafana-delivery-bot bot added this to the 11.1.x milestone May 16, 2024
Copy link
Contributor

This PR must be merged before a backport PR will be created.

Copy link
Contributor

This PR must be merged before a backport PR will be created.

@yuri-tceretian yuri-tceretian changed the title Alerting: Fix scheduler to sort rules scheduled for evaluation Alerting: Fix scheduler to sort rules before evaluation May 16, 2024
@yuri-tceretian yuri-tceretian modified the milestones: 11.1.x, 10.0.x May 17, 2024
@yuri-tceretian yuri-tceretian merged commit 05d6813 into main May 17, 2024
28 checks passed
@yuri-tceretian yuri-tceretian deleted the yuri-tceretian/sort-scheduled-rules branch May 17, 2024 15:38
grafana-delivery-bot bot pushed a commit that referenced this pull request May 17, 2024
sort rules scheduled for evaluation to make sure that the order is stable between evaluations.
This is especially important in HA mode.

(cherry picked from commit 05d6813)
yuri-tceretian added a commit that referenced this pull request May 20, 2024
…8021)

* Alerting: Fix scheduler to sort rules before evaluation (#88006)

sort rules scheduled for evaluation to make sure that the order is stable between evaluations.
This is especially important in HA mode.

(cherry picked from commit 05d6813)

* use old generators

---------

Co-authored-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
add to changelog area/alerting Grafana Alerting area/backend backport v11.0.x Mark PR for automatic backport to v11.0.x type/bug
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

None yet

2 participants