Reduce the Ops Car Alarms (Noisy Ignored Alerts) #865

Open
Firefishy opened this issue Mar 23, 2023 · 4 comments
Labels
help-wanted Issues where help is needed to implement them

Comments

@Firefishy
Member

Firefishy commented Mar 23, 2023

There are many spurious car alarms (alerts that ops ignore).

TO BE COMPLETED with examples

  • Alertmanager may need additional tuning
  • The team should know how to silence alerts.
  • Reduce other alerts?
  • Cronjob emails?
  • Chef email alerts?
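
As a starting point for the "how to silence" item, here is a minimal sketch of creating a time-limited silence through Alertmanager's v2 HTTP API (`POST /api/v2/silences`). The label names, the `ops` author, and the base URL passed to `post_silence` are placeholders for illustration, not values taken from our setup, and the helper names are hypothetical.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone


def build_silence(labels, hours, created_by, comment, now=None):
    """Build a silence body for Alertmanager's POST /api/v2/silences.

    `labels` maps label names to the exact values to match (no regex).
    """
    start = now or datetime.now(timezone.utc)
    end = start + timedelta(hours=hours)
    return {
        "matchers": [
            {"name": name, "value": value, "isRegex": False, "isEqual": True}
            for name, value in sorted(labels.items())
        ],
        # RFC 3339 timestamps; the silence expires on its own at endsAt.
        "startsAt": start.isoformat(),
        "endsAt": end.isoformat(),
        "createdBy": created_by,
        "comment": comment,
    }


def post_silence(base_url, silence):
    """Send the silence to Alertmanager and return the parsed JSON reply.

    `base_url` is an assumption, e.g. "http://localhost:9093".
    """
    request = urllib.request.Request(
        f"{base_url}/api/v2/silences",
        data=json.dumps(silence).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

For example, `build_silence({"alertname": "ExampleAlert", "instance": "example-host"}, 4, "ops", "ticket logged")` produces a four-hour silence for one alert on one host. The same can be done from the command line with `amtool silence add` or from the Alertmanager web UI.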
@Firefishy added the help-wanted label Mar 23, 2023
@tomhughes
Member

I do tune them as best I can but there are some which it's very hard to get right - any noise is not for want of trying!

@Firefishy
Member Author

Not a criticism of the improvements at all. I am partially to blame as some of my stuff has been needlessly alerting until I finally started using Alertmanager silencing.

@tomhughes
Member

I do wish Alertmanager had Nagios's "acknowledge" feature, which silences an alert not for a fixed time but until the alarm clears. At that point it resets and will alert again if it retriggers.

It's great for things like hardware faults where you don't know how long they will take to fix: you log a ticket or whatever, then acknowledge the alert, and as soon as it is fixed the alert rearms.

@tomhughes
Member

One thing that I would like to get rid of is the old hwraid monitors and their alerts, given we have Prometheus alerting on degraded arrays now. So I think cciss-vol-statusd and the like can go now?
