The recommended alert:
expr: up{app="", job="kubernetes-pods"} == 0
<...snip...>
Using this example for our app like this:
up{app="skjemautfylling", namespace="skjemadigitalisering"} == 0
generates many alerts in our Slack channel. Experimenting with this expression and its subexpressions in Prometheus shows that the query returns a result as soon as ANY pod matching the criteria goes down. This happens every time we deploy, since the old pods are shut down and their up value becomes 0 instead of 1.
Replacing the query with:
sum(up{app="skjemautfylling", namespace="skjemadigitalisering"}) < 2
seems to work better (where 2 is the number of pods you expect to always be up).
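One caveat with the sum-based query, offered as a sketch rather than a tested fix: sum() over an empty result returns nothing, so if every matching target disappears from Prometheus there is no value to compare and the alert stays silent. Assuming that case should alert as well, absent() can be combined with the sum. The threshold 2 is the same expected pod count as above:

```
# Fires when fewer than 2 matching targets report up == 1 ...
sum(up{app="skjemautfylling", namespace="skjemadigitalisering"}) < 2
  or
# ... or when no matching up series exists at all (assumed to be an error here).
absent(up{app="skjemautfylling", namespace="skjemadigitalisering"})
```

The first branch covers the partial-outage case; the second covers the case where the sum has no series left to add up.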