Recommended up alert seems to be buggy #126

Closed
7er opened this issue Oct 26, 2020 · 1 comment


7er commented Oct 26, 2020

The recommended alert:

  - alert: applikasjon nede
    expr: up{app="", job="kubernetes-pods"} == 0
    <...snip...>

Using this example for our app like this:

up{app="skjemautfylling", namespace="skjemadigitalisering"} == 0

generates many alerts in our Slack channel. Experimenting with this expression and its subexpressions in Prometheus indicates that the query returns a result, and the alert therefore fires, if ANY pod matching the selector is down. This happens every time we deploy, since the old pods go down and their up value becomes 0 instead of 1.

Replacing the query with:

sum(up{app="skjemautfylling", namespace="skjemadigitalisering"}) < 2

seems to work better, where 2 is the number of pods you expect to always be up.
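
For illustration, the replacement query could be placed in an alert entry of the same shape as the recommended one. This is only a sketch: the alert name and the for duration are my own assumptions, not taken from the recommended alert, and the threshold of 2 has to match the pod count you actually expect.

```yaml
# Sketch only: the alert name and the `for` duration are illustrative assumptions.
- alert: skjemautfylling has too few pods up
  # Fires when fewer than 2 matching targets report up == 1.
  # 2 is the number of pods expected to always be up.
  expr: sum(up{app="skjemautfylling", namespace="skjemadigitalisering"}) < 2
  # Assumed grace period so a brief dip during a deploy does not notify.
  for: 2m
```

Because sum() collapses all matching pods into a single series, a single old pod reporting 0 during a deploy no longer produces its own alert, as long as at least 2 pods stay up.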

@Kyrremann

Fixed here: https://docs.nais.io/observability/alerts/#kubernetes-resources
