Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MaaS: Disable alarms/checks with regex #1691

Merged
merged 1 commit into from Jan 23, 2017

Conversation

major
Copy link
Contributor

@major major commented Jan 19, 2017

This patch allows for regular expressions to be used in the
maas_excluded_checks variable.

It also adds maas_excluded_alarms and allows deployers to use a
regular expression to exclude only alarms for a particular check.
The check itself (and any related graphing) is maintained.

This comes from a discussion in rcbops/u-suk-dev#948 where some of the
checks are good to have, but the alarms aren't needed.

Connects rcbops/u-suk-dev#1019

@major major self-assigned this Jan 19, 2017
@major
Copy link
Contributor Author

major commented Jan 19, 2017

This patch is up as a RFC. Obviously, the conditional belongs in each check file, but I've added it to just one of them to generate some discussion.

@npawelek and @BjoernT -- let me know if this is going in the right direction.

@BjoernT
Copy link
Contributor

BjoernT commented Jan 19, 2017

It seems we should probably support regex so we can disable alarms per subset of host or very specifically rather than just disabling all alarms of a given name

@major
Copy link
Contributor Author

major commented Jan 19, 2017

That's a decent idea, @BjoernT. I guess we could make maas_excluded_alarms into a list of regexes perhaps? I'd have to loop through and apply the regex to the name of the alarm each time.

@npawelek
Copy link

I like that idea @BjoernT.

@major For alarms that are included in the list, will the alarm still be included in the server-side yaml or will it iterate over and be excluded? If excluded, one thing that comes to mind is if a yaml were to include only check and no alarm definition, it could be portrayed that it's not in-use and may be subject to be deleted by a member of support (this is an education piece). It may be worth noting that alarms section within the yaml definition can also have a disabled boolean, it's just that none of our templates include it. It would probably be a good idea to include the alarm, but mark it as disabled. This also ties back to https://github.com/rcbops/u-suk-dev/issues/948, "Agent-based check/alarm configuration (yamls) are missing ‘disabled’ flag under each alarm section."

@prometheanfire
Copy link
Contributor

@npawelek ya, that does sound like the preferred way of disabling an alarm.

@major the regex solution sounds like what we need, but I'm not sure how much that that will complicate the logic in the template though.

@major
Copy link
Contributor Author

major commented Jan 19, 2017

Let me know how this patch looks. I've only applied it to one of the checks.

Copy link
Contributor

@prometheanfire prometheanfire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks better :D

@@ -6,6 +6,7 @@ timeout : "{{ maas_check_timeout }}"
alarms :
idle_percent_average :
label : idle_percent_average--{{ inventory_hostname|quote }}
disabled : {{ idle_percent_average | match(maas_excluded_alarms_regex) | ternary('true', 'false') }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really wish the maas agent didn't make us use double negatives.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why we can't fix this. :trollface:

@prometheanfire prometheanfire self-assigned this Jan 19, 2017
@major
Copy link
Contributor Author

major commented Jan 19, 2017

Oof, I have some "order of operations" issues going on here.

@major
Copy link
Contributor Author

major commented Jan 20, 2017

recheck_ceph

Ceph mirrors have some shenanigans.

@major major force-pushed the mhayden-usukdev-1019 branch 3 times, most recently from df26d1b to 4e79469 Compare January 20, 2017 19:40
@major major changed the title [RFC] Allow MaaS alarms to be disabled Allow MaaS alarms to be disabled Jan 20, 2017
@major major changed the title Allow MaaS alarms to be disabled Disable MaaS alarms/checks with regex Jan 20, 2017
@major
Copy link
Contributor Author

major commented Jan 20, 2017

I've heavily tested this on a AIO without errors.

@BjoernT and @npawelek: This should allow you to disable a check + alarm, or just the alarm, based on a regular expression.

This patch allows for regular expressions to be used in the
`maas_excluded_checks` variable.

It also adds `maas_excluded_alarms` and allows deployers to use a
regular expression to exclude **only** alarms for a particular check.
The check itself (and any related graphing) is maintained.

This comes from a discussion in rcbops/u-suk-dev#948 where some of the
checks are good to have, but the alarms aren't needed.

Connects rcbops/u-suk-dev#1019
@BjoernT
Copy link
Contributor

BjoernT commented Jan 20, 2017

I take your word that you have heavily tested, I don't have the cycles for it now. So thanks and 👍

@major major changed the title Disable MaaS alarms/checks with regex MaaS: Disable alarms/checks with regex Jan 23, 2017
@Frank-ZhangXin Frank-ZhangXin merged commit 254243a into rcbops:master Jan 23, 2017
@major major deleted the mhayden-usukdev-1019 branch January 23, 2017 15:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants