Revise alerts granularity and what gets alerted on #115

Open
einari opened this issue Oct 7, 2018 · 0 comments
einari commented Oct 7, 2018

Today we're getting a lot of alerts that aren't really problems. We need to refine these so that we can trust that when there is an alert, there is a reason for it.

The alert manager itself seems to be a rather unstable piece of software, since it crashes all the time. Kubernetes brings it up again immediately, so it's not really down for long.

Ideally, the alert manager shouldn't go down at all. Beyond that, it would be more useful to alert only when it, or any other pod, does not come back up within a defined threshold.
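
To make the threshold idea concrete, here is a minimal sketch in Python of the decision logic: alert only when a pod has been continuously not ready for at least the threshold, so a crash followed by a quick restart stays silent. The names `Sample`, `should_alert` and `threshold_seconds` are hypothetical and only illustrate the rule; they are not part of our setup or of any alerting tool's API. In practice this would more likely be expressed in the alerting rules themselves, for example with the `for` clause of a Prometheus alerting rule.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Sample:
    """One readiness observation for a pod (hypothetical shape)."""
    timestamp: float  # seconds on any monotonically increasing clock
    ready: bool       # True if the pod was ready at that time


def should_alert(samples: Iterable[Sample], now: float,
                 threshold_seconds: float) -> bool:
    """Return True only if the pod has been continuously not ready
    for at least threshold_seconds as of `now`."""
    down_since: Optional[float] = None
    for sample in sorted(samples, key=lambda s: s.timestamp):
        if sample.ready:
            # Any recovery resets the outage window.
            down_since = None
        elif down_since is None:
            # First not-ready sample of the current outage.
            down_since = sample.timestamp
    return down_since is not None and now - down_since >= threshold_seconds
```

With a threshold of, say, five minutes, a pod that crashes and is restarted by Kubernetes within seconds never triggers an alert, while one that stays down past the threshold does.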


einari added this to the 2.2.0 milestone Oct 7, 2018