printf query on resolved alert not showing current state #3906

Closed
ojle opened this Issue Mar 2, 2018 · 6 comments

ojle commented Mar 2, 2018

I have the following alert rule:

groups:
- name: loadavg
  rules:
  - alert: HighLoadAvg
    expr: ((system_load5 / count without (cpu, mode) (cpu_usage_idle{cpu!="cpu-total"})) > 1.3)
    for: 2m
    labels:
      severity: warning
    annotations:
      description : 'load average: {{ printf `system_load1{instance="%s"}` $labels.instance | query | first | value }}, {{ printf `system_load5{instance="%s"}` $labels.instance | query | first | value }}, {{ printf `system_load15{instance="%s"}` $labels.instance | query | first | value }}'

When the alert fires, the email shows the current load stats:
WARNING - load average: 4.16, 3.54, 2.25
When it resolves, the email also shows the current load stats:
OK - load average: 1.19, 2.64, 2.79

If I try a similar rule for disk space, i.e.

description : 'disk space: {{ $labels.path }} {{ printf `disk_used_percent{path="/tmp",server="%s"}` $labels.server | query | first | value | humanize }}%'

I only get the current stat when the alert fires, i.e.
WARNING - disk space: /tmp 82.72%
and then, when it resolves,
OK - disk space: /tmp 82.72%

Am I missing something here?
How and why is the query being evaluated on the resolved alert for system_load?

Running Prometheus 2.0.0.

brian-brazil (Member) commented Mar 2, 2018

I see nothing odd here; a 1- or 2-core machine can produce such output.
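
As a quick check of that claim, the sketch below applies the rule's condition (load5 divided by the core count, compared against 1.3) to the load5 values from the two emails. The core counts are assumptions for illustration; the report does not say how many cores the machine has.

```python
# Threshold from the alert expression: system_load5 / <number of cores> > 1.3
THRESHOLD = 1.3


def is_firing(load5: float, cores: int) -> bool:
    """Return True if the per-core 5-minute load exceeds the threshold."""
    return load5 / cores > THRESHOLD


# load5 values taken from the two emails in the report.
emails = {"WARNING": 3.54, "OK": 2.64}

# Core counts are assumed; the report does not state them.
for cores in (1, 2, 4):
    for label, load5 in emails.items():
        print(f"{cores} core(s), {label} email, load5={load5}: "
              f"firing={is_firing(load5, cores)}")
```

With one or two cores, the values printed in the "OK" email still satisfy the expression, which is consistent with them having been rendered while the alert was still firing.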

ojle (Author) commented Mar 2, 2018

But with the disk space alert, once it was resolved, I was supposed to get the current value, as I did with load average.
So instead of:
OK - disk space: /tmp 82.72%
I should get:
OK - disk space: /tmp 40%

To be clear, the load average alert, when resolved, gave the current server load with
{{ printf `system_load1{instance="%s"}` $labels.instance | query | first | value }} ....
and the disk space alert, once resolved, didn't give the current disk state with
{{ printf `disk_used_percent{path="/tmp",server="%s"}` $labels.server | query | first | value | humanize }}%

brian-brazil (Member) commented Mar 2, 2018

Resolved alerts do not have annotations set; you are misinterpreting what's going on.

ojle (Author) commented Mar 3, 2018

OK, then do you have a valid explanation of what is going on here?
If it is as you say, then after the alert fired with the message:
WARNING - load average: 4.16, 3.54, 2.25
once it was resolved, I was supposed to get:
OK - load average: 4.16, 3.54, 2.25
right?

Instead I got:
OK - load average: 1.19, 2.64, 2.79

brian-brazil (Member) commented Mar 3, 2018

Another evaluation of the alert updated the annotations.
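
A toy model of that behaviour, for illustration only: annotations are re-rendered on every evaluation at which the expression is true, and the resolved notification reuses whatever was rendered last. The `Alert` class, the `evaluate` function, the two-core assumption, and the third sample are hypothetical; this is not Prometheus code, and it ignores the rule's `for: 2m` clause.

```python
from dataclasses import dataclass, field

THRESHOLD = 1.3  # from the rule: system_load5 / cores > 1.3


@dataclass
class Alert:
    """Hypothetical stand-in for an alert's state between evaluations."""
    firing: bool = False
    annotations: dict = field(default_factory=dict)


def evaluate(alert, load, cores):
    """Run one evaluation; return notification text on a state change, else None."""
    load1, load5, load15 = load
    if load5 / cores > THRESHOLD:
        # Expression is true: annotations are rendered from the current values.
        alert.annotations = {
            "description": f"load average: {load1}, {load5}, {load15}"
        }
        if not alert.firing:
            alert.firing = True
            return "WARNING - " + alert.annotations["description"]
        return None
    if alert.firing:
        # Expression is false: nothing is re-rendered, so the resolved
        # notification carries the annotations of the last firing evaluation.
        alert.firing = False
        return "OK - " + alert.annotations["description"]
    return None


# First two samples are from the emails; the third is a made-up "recovered" sample.
samples = [(4.16, 3.54, 2.25), (1.19, 2.64, 2.79), (0.5, 2.1, 2.6)]
alert = Alert()
messages = [evaluate(alert, s, cores=2) for s in samples]
for message in messages:
    print(message)
# WARNING - load average: 4.16, 3.54, 2.25
# None
# OK - load average: 1.19, 2.64, 2.79
```

Under the same model, the disk space emails would be consistent with the usage staying at 82.72% on every evaluation until the one at which it dropped below the threshold.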

lock bot commented Mar 22, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked and limited conversation to collaborators Mar 22, 2019
