I had some luck minimizing duplicate notifications by tweaking the --cluster.pushpull-interval and --cluster.gossip-interval flags in the alertmanager startup command to values other than the defaults. I started with the defaults of 1m0s and 200ms respectively and changed them drastically until I got either more or fewer notifications, then slowly narrowed it down. It was quite painstaking.
To me, it seems to be related to the latency between the alertmanagers over the wire. For instance, I have 4 alertmanagers communicating over a tunnel between NYC and CA. Sometimes it's fast, but sometimes, because of high ISP latency, their communication is slow. It would be nice to know if you have the same luck. I still occasionally get 2 to 3 duplicate notifications, but I'd rather get multiple alerts than none.
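The flag tuning described in this comment could look roughly like the sketch below. It reuses the alert02 command from the original report and adds the two cluster flags. The interval values (30s and 100ms) are purely hypothetical starting points, not values reported in this thread, and the right numbers likely depend on the latency between peers. The script only builds and prints the command so it can be reviewed before running.

```shell
#!/bin/sh
# Sketch only: the two interval values below are hypothetical, not recommendations.
# Per the comment above, the defaults are 1m0s (pushpull) and 200ms (gossip).
PUSHPULL_INTERVAL="30s"
GOSSIP_INTERVAL="100ms"

# Same base command as alert02 in the report, plus the two tuned cluster flags.
ALERTMANAGER_CMD="/bin/alertmanager \
--config.file=/etc/alertmanager/config.yml \
--storage.path=/alertmanager \
--cluster.listen-address=prom02:9094 \
--cluster.peer=prom01:9094 \
--cluster.pushpull-interval=${PUSHPULL_INTERVAL} \
--cluster.gossip-interval=${GOSSIP_INTERVAL}"

# Print the assembled command for review instead of starting Alertmanager.
echo "$ALERTMANAGER_CMD"
```

Changing one variable at a time and watching the duplicate rate is probably the least painful way to repeat the narrowing-down process described above.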
@devinodaniel
Thanks for your help. My nodes are on the same LAN, so latency between the Alertmanagers is low. At present I receive duplicate notifications only occasionally (about 5% of the time). From your description, it seems that some duplicate notifications are inevitable. I will try your suggestions.
What did you do?
3 Prometheus nodes for HA
3 Alertmanager nodes for HA
alert01 startup command:
/bin/alertmanager --config.file=/etc/alertmanager/config.yml --storage.path=/alertmanager --log.level=debug --cluster.listen-address=prom01:9094
alert02 startup command:
/bin/alertmanager --config.file=/etc/alertmanager/config.yml --storage.path=/alertmanager --log.level=debug --cluster.listen-address=prom02:9094 --cluster.peer=prom01:9094
alert03 startup command:
/bin/alertmanager --config.file=/etc/alertmanager/config.yml --storage.path=/alertmanager --log.level=debug --cluster.listen-address=prom03:9094 --cluster.peer=prom01:9094
And the "Cluster Status" is ready on all Alertmanager nodes.
What did you expect to see?
While the instance-down alert is firing, I expect to receive just one notification.
What did you see instead? Under which circumstances?
While the instance-down alert is firing, I sometimes receive two notifications (and sometimes just one).
Environment
System information:
Linux 4.12.14-94.41-default x86_64
Alertmanager version:
alertmanager, version 0.20.0 (branch: HEAD, revision: f74be04)
build user: root@00c3106655f8
build date: 20191211-14:13:14
go version: go1.13.5
Prometheus version:
prometheus, version 2.16.0 (branch: HEAD, revision: b90be6f32a33c03163d700e1452b54454ddce0ec)
build user: root@7ea0ae865f12
build date: 20200213-23:50:02
go version: go1.13.8