Is there a way let rule support levels #1989
grobie added kind/question component/rules labels Sep 14, 2016
The common way to handle this use case is to define two alerts with different thresholds and labels. You can use the same alertname in both rules, which makes it easier to create silences and inhibit rules in Alertmanager. You can use configuration management systems to script or automate the creation of such similar rules.
Alerting on high load is usually very noisy and will quickly lead to alert fatigue. It's better to alert on symptoms than on causes. I recommend reading this document for more information on this topic: https://docs.google.com/document/d/199PqyG3UsyXlwieHaqbGiWVa8eMWi8zzAn0YfcApr8Q/edit
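As a rough illustration of scripting such similar rules, here is a minimal Python sketch that renders one rule per level under a shared alertname, in the same rule syntax used in this thread. The helper names (`render_rule`, `render_rules`), the `LEVELS` table, and the choice of labels are assumptions made for the example, not something Prometheus or the commenters provide.

```python
# Hypothetical generator for per-level alert rules sharing one alertname.
# LEVELS, render_rule and render_rules are illustrative names only.

LEVELS = [
    # (level label, lower bound (exclusive), upper bound (inclusive) or None)
    ("Error", 20, 30),
    ("Critical", 30, None),
]


def render_rule(alertname, metric, level, lower, upper, hold="10s"):
    """Render a single ALERT block for one severity level."""
    expr = f"{metric} > {lower}"
    if upper is not None:
        # Bound the range from above so the levels do not overlap.
        expr += f" and {metric} <= {upper}"
    return "\n".join([
        f"ALERT {alertname}",
        f"  IF {expr}",
        f"  FOR {hold}",
        "  LABELS {",
        f'    level = "{level}",',
        f'    threshold = "{lower}"',
        "  }",
    ])


def render_rules(alertname, metric, levels=LEVELS):
    """Render one rule per level, all with the same alertname."""
    return "\n".join(render_rule(alertname, metric, *lvl) for lvl in levels)


if __name__ == "__main__":
    print(render_rules("InstanceLoad", 'node_load1{job="node"}'))
```

Keeping the thresholds in one table means a new level or a changed bound is a one-line edit, and every generated rule keeps the same alertname for silences and inhibit rules.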
Thanks @grobie. Here are my rules:
ALERT InstanceLoad
IF node_load1{job="node"} > 20 and node_load1{job="node"} <= 30
FOR 10s
LABELS {
event_id = "E2.1.3",
type = "server",
subtype = "load",
resource="{{$labels.instance}}/load",
level = "Error",
threshold = "20",
value = "{{$value}}",
resolved_threshold = "20",
instance="{{$labels.instance}}"
}
ALERT InstanceLoad
IF node_load1{job="node"} > 30
FOR 10s
LABELS {
event_id = "E2.1.3",
type = "server",
subtype = "load",
resource="{{$labels.instance}}/load",
level = "Critical",
threshold = "30",
value = "{{$value}}",
resolved_threshold = "20",
instance="{{$labels.instance}}"
}
With this rule config, I get error and critical alerts with the alerting value, but no real resolved info. I just try: The But with Prometheus's resolution logic, an alert counts as resolved as soon as it is no longer firing, so I can't get the value at the time it is resolved. I use a webhook to process the alert JSON data and check for a real resolution in my own app.
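The webhook-side check described above could look roughly like the following Python sketch. It assumes the Alertmanager webhook payload shape (a top-level `alerts` list whose entries carry `status` and `labels`) and the `resource` and `level` labels from the rules above. The function name `update_active` and the in-memory `active` dictionary are hypothetical. A resource is reported as recovered only when none of its levels is still firing, so an Error alert that clears because the load rose into the Critical range is not treated as a real resolution.

```python
# Hypothetical webhook-side bookkeeping; update_active and the `active`
# dict are illustrative names, not part of Prometheus or Alertmanager.

def update_active(active, payload):
    """Apply one webhook payload to `active` (resource -> set of firing levels).

    Returns the sorted list of resources that had firing levels before this
    payload and have none afterwards.
    """
    had_alerts = {}
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        resource = labels.get("resource")
        level = labels.get("level")
        if resource is None or level is None:
            continue  # not one of the levelled alerts
        levels = active.setdefault(resource, set())
        # Remember whether the resource was alerting before this payload.
        had_alerts.setdefault(resource, bool(levels))
        if alert.get("status") == "firing":
            levels.add(level)
        else:
            levels.discard(level)
    # Decide only after the whole payload is applied, so an escalation
    # (Error resolved, Critical firing) does not look like a recovery.
    recovered = sorted(r for r, had in had_alerts.items() if had and not active[r])
    for resource in [r for r in had_alerts if not active[r]]:
        del active[resource]
    return recovered
```

This is only a sketch: it keys on `resource` and `level` and ignores the other labels. It also assumes the resolved and firing notifications for an escalation arrive in the same payload or in an order that never leaves the resource without a firing level, which the `FOR` delay may not guarantee.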
aecolley commented Sep 18, 2016
@songjiayang If you want to look up alerts which have cleared, use the If you want the values when the alerts clear, you can use the fact that
Thanks @aecolley, I will try.
brian-brazil closed this Feb 13, 2017
lock bot commented Mar 24, 2019
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
songjiayang commented Sep 14, 2016 (edited)
I am a big fan of Prometheus and want to use its rules and alerting system, but I find that rules don't support different levels.
Example with load1: if an instance's load1 is in the range 20~30, alert as error; if it is bigger than 30, alert as critical. The resolved alert should be sent only when load1 < 20. Can I configure it to work this way? Can anybody help me?