OCPBUGS-13696: Warn about CBT enabled VMs via vsphere-problem-detector #371
@vr4manta: This pull request references Jira Issue OCPBUGS-13696, which is valid. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (wduan@redhat.com), skipping review request. The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Skipping CI for Draft Pull Request.
cc @jsafrane
- alert: VSphereOpenshiftVmsCBTMismatch
  # Using min_over_time to make sure the metric is `1` for the whole 5 minutes.
  # A missed scrape (e.g. due to a pod restart) will result in Prometheus re-evaluating the alerting rule.
  expr: min_over_time(vsphere_vm_cbt_checks{cbt=~"enabled"}[5m]) > 0 and ignoring (cbt) min_over_time(vsphere_vm_cbt_checks{cbt=~"disabled"}[5m]) > 0
Do you know why we need ignoring(cbt) here?
IMO, `and on()` should work here as well.
Sorry, I just realized we didn't finish this. Without ignoring, it would not work. I did some quick research on how to get this "and" clause to work, and this was the recommendation. I can try on() to see if that works. Is there a reason to avoid ignoring?
To clarify why I used ignoring: while creating this alert, I learned that I cannot "and" the same metric against itself with different label values, in this case enabled vs. disabled. From my research, "and" by default requires the "cbt" label to have the same value on both sides. Adding ignoring(cbt) excludes the cbt label from that matching. Looking into on(), it appears to be a similar way of solving the problem. I am currently recreating the issue to verify that on() will work.
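For anyone following along, here is a rough Python sketch of how PromQL pairs series for a set operator like "and". It is only a simplified model of the label matching, not Prometheus code, and the `job` label and its value are made up for illustration; the real `vsphere_vm_cbt_checks` series may carry different labels.

```python
def signature(labels, on=None, ignoring=None):
    """Return the label subset used to pair series across both sides."""
    if on is not None:
        # on(<labels>): match only on the listed labels (possibly none).
        return frozenset((k, v) for k, v in labels.items() if k in on)
    # Default / ignoring(<labels>): match on every label except the ignored ones.
    skip = set(ignoring or ())
    return frozenset((k, v) for k, v in labels.items() if k not in skip)


def vector_and(lhs, rhs, on=None, ignoring=None):
    """Keep each left-hand series that has a matching right-hand series."""
    rhs_signatures = {signature(labels, on, ignoring) for labels in rhs}
    return [labels for labels in lhs if signature(labels, on, ignoring) in rhs_signatures]


# Hypothetical label sets for the two sides of the alert expression.
enabled = [{"cbt": "enabled", "job": "vsphere-problem-detector"}]
disabled = [{"cbt": "disabled", "job": "vsphere-problem-detector"}]

print(vector_and(enabled, disabled))                    # [] because cbt differs
print(vector_and(enabled, disabled, ignoring=["cbt"]))  # the enabled series is kept
print(vector_and(enabled, disabled, on=[]))             # the enabled series is kept
```

Under these assumptions, ignoring(cbt) and on() give the same result here because the two sides differ only in the cbt label. They would diverge if the series also differed in some other label: on() with no labels still matches everything, while ignoring(cbt) still requires the remaining labels to agree.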
I have tested with on() and it works. I've checked in that version based on your recommendation. Thanks!
Renamed mismatch to match changes to code
@vr4manta: The following tests failed, say
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: gnufied, vr4manta
ab03ebe into openshift:master
@vr4manta: Jira Issue OCPBUGS-13696: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-13696 has been moved to the MODIFIED state. In response to this:
Added new alert for CBT Check