Generate an Event, and a Prometheus based Alert for non evictable VMs #5392

Merged
kubevirt-bot merged 7 commits into kubevirt:master from alert-eviction-blocker on Apr 20, 2021

Conversation

ezrasilvera
Contributor

@ezrasilvera ezrasilvera commented Apr 5, 2021

What this PR does / why we need it:
When the VM eviction strategy is set to LiveMigration and the VM is not migratable, this is currently detected only when the eviction is performed.
We want to alert the user/administrator about this situation as soon as it happens (either when the VM is created or when the VM status changes to not migratable).
To generate the alert, I added a new metric that monitors the VM eviction-blocking status.
In addition to the Prometheus alert, I also added a regular event in the corresponding virt-handler.
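
A minimal sketch of the condition described above, for illustration only. The `VMState` type, the `nonEvictableValue` helper, and the strategy string `"LiveMigrate"` are assumptions made for this example; they are not KubeVirt's API types or the code added in this PR. The idea is that a gauge-style value of 1 marks a VM that requests live migration on eviction but cannot currently be migrated.

```go
package main

import "fmt"

// VMState is a hypothetical, simplified view of a VM used only in this sketch.
type VMState struct {
	Name             string
	Node             string
	EvictionStrategy string // assumed value "LiveMigrate" when live migration is requested
	Migratable       bool
}

// nonEvictableValue returns 1 when the VM asks for live migration on eviction
// but is not migratable, and 0 otherwise. A gauge metric could expose this value.
func nonEvictableValue(vm VMState) float64 {
	if vm.EvictionStrategy == "LiveMigrate" && !vm.Migratable {
		return 1
	}
	return 0
}

func main() {
	vms := []VMState{
		{Name: "vm-a", Node: "node01", EvictionStrategy: "LiveMigrate", Migratable: false},
		{Name: "vm-b", Node: "node01", EvictionStrategy: "LiveMigrate", Migratable: true},
		{Name: "vm-c", Node: "node02", EvictionStrategy: "", Migratable: false},
	}
	for _, vm := range vms {
		fmt.Printf("%s on %s: non-evictable=%v\n", vm.Name, vm.Node, nonEvictableValue(vm))
	}
}
```

Evaluating the condition whenever the VM is created or its status changes, rather than at eviction time, is what would let an alert fire before a node drain is attempted.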

Special notes for your reviewer:

Release note:

"NONE"

@kubevirt-bot kubevirt-bot added do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. dco-signoff: no Indicates the PR's author has not DCO signed all their commits. size/L labels Apr 5, 2021
@kubevirt-bot kubevirt-bot added dco-signoff: yes Indicates the PR's author has DCO signed all their commits. release-note-none Denotes a PR that doesn't merit a release note. and removed dco-signoff: no Indicates the PR's author has not DCO signed all their commits. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Apr 5, 2021
@ezrasilvera
Contributor Author

/retest

@ezrasilvera
Contributor Author

/retest

@ezrasilvera
Contributor Author

/hold

@kubevirt-bot kubevirt-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 6, 2021
@ezrasilvera
Contributor Author

/retest

@ezrasilvera
Contributor Author

/unhold

@kubevirt-bot kubevirt-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 6, 2021
@ezrasilvera
Contributor Author

/retest

@ezrasilvera ezrasilvera changed the title from "Generate and Event, and a Prometheus based Alert for non evictable VMs" to "Generate an Event, and a Prometheus based Alert for non evictable VMs" Apr 7, 2021
@ezrasilvera ezrasilvera force-pushed the alert-eviction-blocker branch 2 times, most recently from 1011abe to 9929b02 Compare April 7, 2021 20:21
@kubevirt-bot kubevirt-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 7, 2021
Signed-off-by: Ezra Silvera <ezra@il.ibm.com>
… migration

Signed-off-by: Ezra Silvera <ezra@il.ibm.com>
Signed-off-by: Ezra Silvera <ezra@il.ibm.com>
…able mentric

Signed-off-by: Ezra Silvera <ezra@il.ibm.com>
Signed-off-by: Ezra Silvera <ezra@il.ibm.com>
Signed-off-by: Ezra Silvera <ezra@il.ibm.com>
@kubevirt-bot kubevirt-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 7, 2021
…atus of a VM

Signed-off-by: Ezra Silvera <ezra@il.ibm.com>
@sonarcloud

sonarcloud bot commented Apr 7, 2021

Kudos, SonarCloud Quality Gate passed!

0 Bugs (rating A)
0 Vulnerabilities (rating A)
0 Security Hotspots (rating A)
0 Code Smells (rating A)

No Coverage information
0.0% Duplication

@ezrasilvera
Contributor Author

/retest

@ezrasilvera
Contributor Author

/retest

@yuhaohaoyu
Contributor

/lgtm

@kubevirt-bot
Contributor

@yuhaohaoyu: changing LGTM is restricted to collaborators

In response to this:

/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@yuhaohaoyu
Contributor

/lgtm

@kubevirt-bot
Contributor

@yuhaohaoyu: changing LGTM is restricted to collaborators

In response to this:

/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@yuhaohaoyu
Contributor

/lgtm

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Apr 13, 2021
@yuhaohaoyu
Contributor

/approve

Annotations: map[string]string{
	"description": "Eviction policy for {{ $labels.name }} (on node {{ $labels.node }}) is set to Live Migration but the VM is not migratable",
	"summary":     "The VM's eviction strategy is set to Live Migration but the VM is not migratable",
},
Member

Would it make sense to attempt to succinctly explain to the admin why this is a problem, that this would cause difficulty during node drain?

Contributor Author

I think we do that as part of the runbook, where we include full details on the issue (including overview, mitigation, etc.). Here we just explain the alert itself. If you still think it should go here as well, I can add it.

Contributor Author

The PR for the runbook is here: kubevirt/monitoring#6

Member

ok. fair point.

/lgtm
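
For readers unfamiliar with the `{{ $labels.name }}` placeholders in the annotation text reviewed above: Prometheus expands alert annotations with Go templates, where `$labels` holds the label values of the firing series. The sketch below imitates that expansion with the standard `text/template` package. The `renderAnnotation` helper and the sample label values are hypothetical, and this is an approximation for illustration, not Prometheus's own rendering code.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// description is the annotation text quoted in the review thread.
const description = "Eviction policy for {{ $labels.name }} (on node {{ $labels.node }}) is set to Live Migration but the VM is not migratable"

// renderAnnotation is a hypothetical helper: it declares $labels at the start
// of the template so that the annotation text can reference label values.
func renderAnnotation(text string, labels map[string]string) (string, error) {
	tmpl, err := template.New("annotation").Parse("{{ $labels := .Labels }}" + text)
	if err != nil {
		return "", err
	}
	data := struct{ Labels map[string]string }{Labels: labels}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// Sample label values are made up for this example.
	out, err := renderAnnotation(description, map[string]string{"name": "testvm", "node": "node01"})
	if err != nil {
		fmt.Println("render error:", err)
		return
	}
	fmt.Println(out)
}
```

This only shows how the `name` and `node` labels end up in the message; the explanation of why the situation matters during node drain is left to the runbook, as agreed in the thread.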

@stu-gott
Member

/approve

@kubevirt-bot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: stu-gott, yuhaohaoyu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubevirt-bot kubevirt-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 20, 2021
@stu-gott
Member

/test pull-kubevirt-e2e-k8s-1.19-sig-storage

@kubevirt-bot kubevirt-bot merged commit 1baccf7 into kubevirt:master Apr 20, 2021
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. lgtm Indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. size/L
4 participants