vmalert: add "evaluation_offset" option to group configuration #3409

Closed
stevenbeaullieu opened this issue Nov 29, 2022 · 8 comments
Labels: enhancement, vmalert

@stevenbeaullieu

Is your feature request related to a problem? Please describe.
For recording rules with a high interval, say > 5m, it would be desirable to be able to specify when they evaluate. One issue we had was with a 60m interval: the rule was evaluating toward the end of the hour but writing with a timestamp at the top of the hour. This excluded it from federation because of staleness. If we could give it an evaluation-offset of 2m or something like that, it would evaluate and write the result within the first 5m of the hour, so it would be included in federation. Other use cases are recording rules that run once every 24 hours to produce metrics for other things to consume. Since the metrics need to be available before whatever depends on them, it makes sense to be able to specify when they run rather than leave it random.

Describe the solution you'd like
Add a configuration option to group configurations called "evaluation_offset" that works similarly to the "scrape_offset" for vmagent scrape configurations. Instead of an algorithm choosing when the rules run, they would run at the specified offset within their interval. So a 1m interval with an evaluation_offset of 0s would run at the top of the minute, and a 60m interval with an evaluation_offset of 2m would run at 2m after the hour, every hour. This would be paired with the existing -datasource.queryTimeAlignment to either keep the timestamp at the top of the interval, or have the timestamp match the offset exactly if alignment is disabled.

Describe alternatives you've considered
We've worked around this for now, but could simplify some solutions with this ability.

Additional context
N/A

@hagen1778 added the enhancement, help wanted, and vmalert labels Nov 29, 2022
@spotifyprism

@hagen1778 I can take a look at this

@spotifyprism

spotifyprism commented Mar 31, 2023

@stevenbeaullieu just wanted to clarify here. From my understanding, datasource.queryTimeAlignment & groupIntervalOffset would be mutually exclusive params. So as per the above example, you would set an interval offset of x and set datasource.queryTimeAlignment to false to prevent the timestamp from being written at the top of the interval. Correct?

@stevenbeaullieu
Author

@spotifyprism Yes, but you could also keep queryTimeAlignment as true and it would write at the top of the interval and evaluate at the offset. queryTimeAlignment would keep doing what it's doing and this new configuration would just override when the rule is evaluated.

In hindsight our original problem would likely have been fixed by setting queryTimeAlignment to false, but we weren't sure about other repercussions of that. I'm guessing there would have been none, but didn't have the appetite to change it at the time.

For the original problem, we no longer have recording rules with an interval > 5m, so we haven't had an issue with staleness anymore. I could still see such a configuration being useful to mirror the capability with vmagent.

@spotifyprism

spotifyprism commented Mar 31, 2023

yeah sounds good, queryTimeAlignment & rule evaluation are two separate mutually exclusive problems.
Yup, I think where this feature might be useful is when a user wants to have some kind of control of when the rules in a group get evaluated within the group interval vs it being totally random

@Haleygo
Collaborator

Haleygo commented Jul 22, 2023

> Yes, but you could also keep queryTimeAlignment as true and it would write at the top of the interval and evaluate at the offset.

Sounds weird, but true I think.
queryTimeAlignment was introduced to produce deterministic results regardless of the number of vmalert replicas or the time they were started. So if I have a group with interval 1h and start it at 11:58, the group evaluation gets series from 11:00 (firstTS) as inputs and writes the result with 11:58 (secondTS).
Now that the group has eval_offset, it shouldn't affect firstTS, because we still want the result to be consistent (users can turn off queryTimeAlignment if they don't want it), so it will only change secondTS to secondTS+eval_offset.
@hagen1778 wdyt?

> so it will only change the secondTS to secondTS+eval_offset.

Update:
If offset is set, evalTS will first be aligned by group_interval and then moved to the nearest possible timestamp with the offset. queryTimeAlignment/lookBack has no effect on a group with eval_offset.
Sticking with the example group, which has group_interval: 1h and offset: 5m:

  1. The group starts at 11:02; 11:02%1h < 5m, so the result will be generated with timestamp 11:05 using raw series from 11:05 [the actual evaluation time is not fixed, but it must be later than group_offset so that the raw series data is already there, avoiding a partial response].
  2. The group starts at 11:06; 11:06%1h >= 5m, so the rule will be evaluated at 12:05 [after sleeping 59 minutes] for the first time, with raw series from 12:05.

@hagen1778
Collaborator

> And now group has eval_offset, it shouldn't affect the firstTS cause we still want the result to be consistent[users can turn off queryTimeAlignment if they don't want it], so it will only change the secondTS to secondTS+eval_offset.

I think it will not be secondTS+eval_offset but eval_interval % eval_offset. For example, with 11:58 (secondTS) and 11:00 (firstTS), the user would be able to set eval_offset: 1m and secondTS will become 11:01.

In other words: at 11:01 vmalert will execute the expression with time=11:00 and will persist its results with time=11:00.

If the user disables queryTimeAlignment, it becomes the following: at 11:01 vmalert will execute the expression with time=11:01 and will persist its results with time=11:01.

hagen1778 added a commit that referenced this issue Sep 6, 2023
Adds `eval_offset` attribute for Groups. 
If specified, Group will be evaluated at the exact time offset on the range of [0...evaluationInterval]. 
The setting might be useful for cron-like rules which must be evaluated at specific moments of time. 

#3409

Signed-off-by: Haley Wang <pipilong.25@gmail.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
@hagen1778
Collaborator

@stevenbeaullieu support for eval_offset was added in 45c0e4b. It will likely be included in the next major release. Meanwhile, you can try building vmalert from sources.

@hagen1778 removed the help wanted label Sep 6, 2023
hagen1778 pushed a commit that referenced this issue Sep 7, 2023
Adds `eval_offset` attribute for Groups.
If specified, Group will be evaluated at the exact time offset on the range of [0...evaluationInterval].
The setting might be useful for cron-like rules which must be evaluated at specific moments of time.

#3409

Signed-off-by: Haley Wang <pipilong.25@gmail.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 45c0e4b)
@valyala
Collaborator

valyala commented Oct 2, 2023

The eval_offset option is available in vmalert starting from v1.94.0. See these docs for more details.

Closing the feature request as done.

@valyala closed this as completed Oct 2, 2023