Bug 1913464: configure a warning level alert for when the CSV phase isn't success #609
Conversation
obsoletes #585

/hold until after change freeze

# --------- ElasticsearchCSVNotSuccessful ---------
- eval_time: 10m
  alertname: ElasticsearchOperatorCSVNotSuccessful
Would "ElasticSearchNotReconciled" be more descriptive from a user perspective?
I think it's descriptive enough. We would also be pointing to documentation from the message of the alert (in the future).
for what it's worth, i don't have any attachment to ElasticsearchOperatorCSVNotSuccessful as an alertname. if we want to change it, i'm happy to do so
/approve

"message": "Elasticsearch Operator CSV has not reconciled succesfully."
"summary": "Elasticsearch Operator CSV Not Successful"
"expr": |
  csv_succeeded{name =~ "elasticsearch-operator.*"} == 0
does this expression simply check the status of the csv to denote if it was successful?
iirc, csv_succeeded == 0 is the failed phase
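For context, here is a minimal sketch of how the expression under review might sit in a complete Prometheus alerting rule. The group name and the `for` duration are assumptions made for illustration and are not taken from this PR; the `warning` severity follows the PR title, and the annotations are copied verbatim from the diff.

```yaml
groups:
  - name: elasticsearch-operator-csv.rules   # hypothetical group name
    rules:
      - alert: ElasticsearchOperatorCSVNotSuccessful
        # csv_succeeded is 0 while the matching CSV is not in its successful phase
        expr: |
          csv_succeeded{name =~ "elasticsearch-operator.*"} == 0
        for: 10m                              # assumption: the real duration is not shown in the diff
        labels:
          severity: warning                   # "warning level alert" per the PR title
        annotations:
          message: "Elasticsearch Operator CSV has not reconciled succesfully."
          summary: "Elasticsearch Operator CSV Not Successful"
```

The regex matcher on `name` is what lets a single rule cover every versioned CSV name that starts with `elasticsearch-operator`.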
/lgtm

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ewolinetz, yithian

The full list of commands accepted by this bot can be found here. The pull request process is described here.
@yithian: This pull request references Bugzilla bug 1913464, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug
/hold cancel

/bugzilla refresh
@ewolinetz: This pull request references Bugzilla bug 1913464, which is valid. 3 validation(s) were run on this bug
/retest Please review the full test history for this PR and help us cut down flakes.
@yithian: All pull requests linked via external trackers have merged: Bugzilla bug 1913464 has been moved to the MODIFIED state.

@ewolinetz can we backport this to 4.5?
Description
Includes a warning-level alert for when the elasticsearch-operator CSV's phase is not "success", i.e. when csv_succeeded{name =~ "elasticsearch-operator.*"} == 0.
We (OpenShift SRE-P) will configure alertmanager to suppress our alert-level alerts for ElasticsearchClusterNotHealthy while this alert is active.
/cc @alanconway
/assign @ewolinetz
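The suppression mentioned in the description could be expressed as an Alertmanager inhibit rule. The fragment below is only a sketch of that idea, not the configuration SRE-P actually deployed: it uses the classic `source_match` / `target_match` keys (newer Alertmanager releases prefer `source_matchers` / `target_matchers`), and it assumes that matching on alert names alone is sufficient.

```yaml
inhibit_rules:
  # While the CSV alert is firing, mute the cluster-health alert.
  - source_match:
      alertname: ElasticsearchOperatorCSVNotSuccessful
    target_match:
      alertname: ElasticsearchClusterNotHealthy
    # No `equal` list is given, so the two alerts do not need to share any
    # label values. Adding one (for example a namespace label) is a design
    # choice that depends on which labels both alerts actually carry.
```

Because an inhibit rule applies only while the source alert is firing, the cluster-health alerts would resume on their own once the CSV alert resolves.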