Currently, the Alertmanager notify stages support operations such as deduplication and grouping, among others. We have observed that mutating alerts opens up a number of interesting possibilities, such as:
attach exemplars to each alert via annotations and paint them on the corresponding receiver
group alerts by a certain grouping key and attach a root cause which can be fetched from a certain external API
Currently, alerts are dispatched to each receiver in parallel, which means that any stage is invoked concurrently, so updating the annotations map from within a stage can cause a panic.
Some of the things that would be useful:
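To illustrate the concurrency problem described above, here is a minimal sketch of one way around it: each receiver goroutine copies the annotations before adding to them, so the shared map is only ever read. The `Alert` type, `withAnnotation`, and `fanOut` are hypothetical stand-ins for illustration, not Alertmanager's actual types or API. Note that the Go runtime aborts the process on unsynchronized concurrent map writes, which is why the shared map must not be written from parallel stages.

```go
package main

import (
	"fmt"
	"sync"
)

// Alert is a simplified, hypothetical stand-in for an Alertmanager alert.
type Alert struct {
	Labels      map[string]string
	Annotations map[string]string
}

// withAnnotation returns a copy of the alert with one extra annotation.
// The original alert's maps are only read, never written.
func withAnnotation(a *Alert, key, value string) *Alert {
	annotations := make(map[string]string, len(a.Annotations)+1)
	for k, v := range a.Annotations {
		annotations[k] = v
	}
	annotations[key] = value
	return &Alert{Labels: a.Labels, Annotations: annotations}
}

// fanOut simulates dispatching one shared alert to several receivers in
// parallel. Each goroutine works on its own copy of the annotations.
func fanOut(shared *Alert, receivers []string) map[string]*Alert {
	out := make(map[string]*Alert, len(receivers))
	var mu sync.Mutex // guards out, which is written by every goroutine
	var wg sync.WaitGroup
	for _, r := range receivers {
		wg.Add(1)
		go func(r string) {
			defer wg.Done()
			mutated := withAnnotation(shared, "receiver", r)
			mu.Lock()
			out[r] = mutated
			mu.Unlock()
		}(r)
	}
	wg.Wait()
	return out
}

func main() {
	shared := &Alert{
		Labels:      map[string]string{"alertname": "HighLatency"},
		Annotations: map[string]string{"summary": "p99 latency is high"},
	}
	results := fanOut(shared, []string{"slack", "pagerduty"})
	fmt.Println(results["slack"].Annotations["receiver"])
	fmt.Println(len(shared.Annotations)) // the shared alert is unchanged
}
```

Copying per receiver avoids the crash, but it also means each receiver could end up with different mutated data, which is exactly the inconsistency a shared mutation stage would avoid.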
support mutation stages that are common across all receivers (we would hate to see different sample traces across various receivers)
support grouping of alerts post mutation to facilitate more complex things
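The two requests above can be sketched as a small pipeline: mutate every alert once, before any fan-out to receivers, and then group on a key that only exists after mutation. Everything here is an assumption made for illustration: the `Alert` and `Mutator` types, the `root_cause` annotation, and the static `causes` map, which stands in for a lookup against an external API.

```go
package main

import (
	"fmt"
	"sort"
)

// Alert is a simplified, hypothetical stand-in for an Alertmanager alert.
type Alert struct {
	Labels      map[string]string
	Annotations map[string]string
}

// Mutator is a hypothetical mutation stage: it returns a modified copy.
type Mutator func(Alert) Alert

// rootCauseMutator attaches a "root_cause" annotation based on the
// "service" label. The causes map stands in for an external API lookup.
func rootCauseMutator(causes map[string]string) Mutator {
	return func(a Alert) Alert {
		annotations := make(map[string]string, len(a.Annotations)+1)
		for k, v := range a.Annotations {
			annotations[k] = v
		}
		if cause, ok := causes[a.Labels["service"]]; ok {
			annotations["root_cause"] = cause
		}
		return Alert{Labels: a.Labels, Annotations: annotations}
	}
}

// mutateAll applies the mutator once per alert, before any receiver runs,
// so that all receivers later see identical mutated alerts.
func mutateAll(alerts []Alert, m Mutator) []Alert {
	out := make([]Alert, 0, len(alerts))
	for _, a := range alerts {
		out = append(out, m(a))
	}
	return out
}

// groupByAnnotation groups alerts by an annotation value. Alerts without
// the annotation land in the group keyed by the empty string.
func groupByAnnotation(alerts []Alert, key string) map[string][]Alert {
	groups := make(map[string][]Alert)
	for _, a := range alerts {
		v := a.Annotations[key]
		groups[v] = append(groups[v], a)
	}
	return groups
}

func main() {
	alerts := []Alert{
		{Labels: map[string]string{"alertname": "HighLatency", "service": "checkout"}},
		{Labels: map[string]string{"alertname": "ErrorRate", "service": "cart"}},
		{Labels: map[string]string{"alertname": "QueueDepth", "service": "email"}},
	}
	causes := map[string]string{"checkout": "db-failover", "cart": "db-failover"}

	groups := groupByAnnotation(mutateAll(alerts, rootCauseMutator(causes)), "root_cause")

	// Sort the group keys so the printed order is deterministic.
	keys := make([]string, 0, len(groups))
	for k := range groups {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%q: %d alert(s)\n", k, len(groups[k]))
	}
}
```

Running the mutation once and sharing its result is what keeps the attached data (for example, sample traces) the same across receivers. A real stage would also need a timeout on the external call.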
Currently, hooks such as AlertStoreCallback can be used to achieve some of this, but the callback blocks the POST/PUT API, which can slow down the rule manager.
I would just advise caution about putting blocking operations (such as API calls) in a notification stage, as this can upset the quite delicate failover semantics when running a cluster of Alertmanagers in high availability mode.
I wonder how much of this can be done before the alerts are sent to the Alertmanager (i.e. in the ruler) using something like annotations? In fact, just last month at our in-person dev summit, we (the Prometheus contributors) agreed to add support for a third dimension, in addition to annotations and labels, that is intended for more opaque metadata, an example of which could be exemplars.