send firing and resolved status alert to alertmanager in loop #3606
Comments
Can you tell us exactly where you believe the bug is?
When executing alert rule statements, Prometheus just assumes the data has been stored. But in our test environment node-exporter restarts periodically, deliberately making Prometheus lose some data, such as:
In this case, Prometheus sends the alert with an inactive status, which updates the resolved time at Alertmanager; as a result, Alertmanager sends alert[resolved] to the receivers. So I made a change: before executing the alert rule statement I added some code that parses the statement and tries to get data from node_statement{target='1.2.3.4'}; when the result is empty, it does not go forward.
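For readers hitting the same thing: rather than patching the rule evaluation code, a rough rule-level approximation of the same idea is to evaluate the condition over a look-back window, so a short scrape gap does not empty the result and immediately resolve the alert. This is only a sketch under assumptions: the metric name, threshold, and 10m window below are hypothetical (not from this issue), written in the Prometheus 1.x rule syntax matching the reporter's v1.8.2.

```
# Hypothetical example, Prometheus 1.x rule syntax.
# max_over_time keeps returning a sample for up to 10m after the last scrape,
# so a brief gap in node_load1 does not immediately drop the alert to inactive.
ALERT HighNodeLoad
  IF max_over_time(node_load1{job="node"}[10m]) > 4
  FOR 2m
  LABELS { severity = "warning" }
```

Once the series has been gone for longer than the look-back window, the expression is empty again and the alert still resolves, so this only papers over short gaps.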
An alert needs to be active on every alert cycle, otherwise it is considered resolved. This is expected behaviour.
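To illustrate the point (hypothetical rules, not taken from this issue): an alert built on the exporter's own metrics stops being active as soon as those series go stale, whereas an alert on the scrape health metric keeps producing a sample on every scrape attempt while the target is unreachable, so it stays firing instead of flapping to resolved. Prometheus 1.x syntax, since the reporter runs v1.8.2; metric, job, and threshold are illustrative.

```
# Illustrative only. The first rule resolves once node_memory_MemAvailable goes
# stale (exporter unreachable); the second keeps firing because `up` is written
# by Prometheus itself on every scrape attempt.
ALERT LowMemory
  IF node_memory_MemAvailable{job="node"} < 1e9
  FOR 2m
  LABELS { severity = "warning" }

ALERT InstanceDown
  IF up{job="node"} == 0
  FOR 4m
  LABELS { severity = "critical" }
  ANNOTATIONS {
    summary = "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 4 minutes"
  }
```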
brian-brazil added the kind/question label Dec 21, 2017
Thanks, but if the data acquired from the scrape job does not exist in Prometheus, I don't think it should be considered resolved in this situation.
It sounds like you're looking for alerting with the
strzelecki-maciek commented Jan 31, 2018
I'll just add my 2 cents here. I had a funny issue with a "rogue alert resolver" Prometheus instance. I share the alert definitions between nodes. The pair of alerts in question had a "job not up" definition over 2m and 4m. Two nodes scrape these jobs every minute. The third node was scraping this job every 5 minutes and kept resolving the alerts every couple of minutes. It could not possibly ever have seen a "job is up" state (the exporters were shut down), yet it kept resolving. Seems like a similar issue. Obviously the job scrape interval needs to be smaller than the alert threshold. On the other hand, this is unexpected behaviour: missing (time-misaligned?) data produces an OK state even though the job is consistently reported as DOWN.
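A hedged sketch of the mismatch described above (job name, intervals, and thresholds are illustrative, not taken from the comment): on the node that scrapes every 5 minutes, the `up` sample can be older than the staleness window just before the next scrape, so the alert expression intermittently returns nothing, the 2m/4m FOR timers reset, and the alert flaps back to inactive with a resolved notification even though the job never came back up.

```
# Illustrative only, not the commenter's actual config.
#
# prometheus.yml on the "third node": the scrape interval is longer than the
# FOR windows of the shared alert rules.
scrape_configs:
  - job_name: 'some-exporter'
    scrape_interval: 5m

# Shared alert definition (Prometheus 1.x rule syntax). With a 5m scrape
# interval, the `up` series can go stale between scrapes, the expression
# returns nothing, the FOR timer resets, and the alert flaps to resolved.
ALERT JobNotUp
  IF up{job="some-exporter"} == 0
  FOR 2m
  LABELS { severity = "critical" }
```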
brian-brazil closed this Apr 18, 2018
lock bot commented Mar 22, 2019
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
yylt commented Dec 21, 2017
What did you do?
I created a scrape job to collect information from an unreachable node, and Prometheus sends alerts to Alertmanager.
What did you expect to see?
Prometheus should always send firing-status alerts (or nothing) to Alertmanager, and should never send a resolved-status alert to Alertmanager.
What did you see instead? Under which circumstances?
In fact, what I see is the same as this issue:
prometheus/alertmanager#952
Environment
prometheus v1.8.2
alertmanager v0.11.0
System information:
Linux 3.10.0
Prometheus configuration file:
prometheus -alertmanager.url=http://alertmanager-service:9093 -web.listen-address=:9091 -config.file=/etc/prometheus/prometheus.yaml
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'service']