Alertmanager peerReconnectTimeout config #107

jquick · 2023-06-23T22:12:52Z

We are running Grafana in HA mode with alerting on Kubernetes, which enables the Alertmanager gossip network. Alertmanager has the following default:

--cluster.reconnect-timeout value: length of time to attempt to reconnect to a lost peer (default: "6h0m0s")

Ref: https://github.com/prometheus/alertmanager/blob/main/cmd/alertmanager/main.go#L230

This works fine, except that with pod turnover the gossip network keeps trying to reach expired or terminated pods every 10 seconds for 6 hours. I didn't see an easy way to change this, but it would be nice if it were configurable.


jquick · 2023-06-24T22:22:38Z

I'm going to open this on the main project instead.

jquick closed this as completed Jun 24, 2023
