Docs: Document handling eviction requests #11286
Conversation
/cc @EdDev
Force-pushed from 6edd986 to 7e0ec6d
Force-pushed from 7e0ec6d to 4361c3d
Thanks for enlightening me. Would you elaborate a bit more?
| External | N/A | True | True | False | 429 - Eviction blocked by PDB |

To summarize:
1. The eviction request is granted only if both the webhook and the PDB allow it.
Why do we need two mechanisms? Why is the PDB not enough?
The PDB just blocks the eviction request. I think it is still in use as a precaution in case the webhook is unavailable, since the webhook is configured to be ignored on failure (`failurePolicy: Ignore`) [1].
The webhook marks the VMI for eviction.
Would you explain why two different mechanisms are needed? What are the problems and benefits of each of them? Please include the answer in the text of this documentation.
Please see the note under Pod Eviction Webhook; it should answer this question.
Thanks.
1. The eviction request is granted only if both the webhook and the PDB allow it.
2. The eviction request's initiator might get a 429 response, but the VMI will be migrated in the background.
# Evacuation Controller
Why are we using two terms? Is there a difference between Evacuation and Eviction?
I don't understand your answer. Is there a difference between "Evacuation" and "Eviction"? If so, would you define each? If not, can we use one term for both?
The name Evacuation controller is AFAIU historic.
It is also present in VMI.status.evacuationNodeName.
In the KubeVirt CR spec.configuration, we have evictionStrategy.
So it seems that unfortunately there are two names which refer to the same process.
IMO this naming issue deserves a dedicated discussion.
WDYT?
I believe that this doc should provide the explanation you just gave.
After another read of the description of PR #2091, it seems that both terms have been there from day one.
|-----------------------|------------------------------------------------------------------------------------------------------------------|
| None | No action will be taken; according to the specified `RunStrategy` the VirtualMachine will be restarted or shut down |
| LiveMigrate | The VirtualMachine will be migrated instead of being shut down |
| LiveMigrateIfPossible | Same as `LiveMigrate`, but only if the VirtualMachine is Live-Migratable; otherwise it will behave as `None` |
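The quoted strategy table can be read as decision logic. A minimal Python sketch of that mapping (the function name, argument names, and return strings are illustrative, not KubeVirt's actual code):

```python
# Hypothetical sketch of how an eviction strategy maps to an action for a
# virt-launcher pod. All names here are illustrative only.
def strategy_action(strategy: str, live_migratable: bool) -> str:
    if strategy == "None":
        # No special handling: per the VM's RunStrategy it is restarted
        # or shut down when the pod goes away.
        return "shutdown-or-restart"
    if strategy == "LiveMigrate":
        return "migrate"
    if strategy == "LiveMigrateIfPossible":
        # Falls back to "None" behaviour for non-migratable VMs.
        return "migrate" if live_migratable else "shutdown-or-restart"
    if strategy == "External":
        # KubeVirt only marks the VMI; an external controller (e.g. capk)
        # performs the actual teardown.
        return "mark-and-delegate"
    raise ValueError(f"unknown eviction strategy: {strategy}")
```

The interesting branch is `LiveMigrateIfPossible`, which degrades gracefully instead of blocking the eviction.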
Can you please define when a machine is Live-Migratable?
The conditions which define whether a VM is live-migratable are determined by the virt-handler.
They are pretty complex, and IMO listing them would make this document less readable.
Nevertheless, I do think there is a place to document them in a separate document in a future PR.
docs/handling-eviction-requests.md
Outdated
In case the pod is not a `virt-launcher` pod, the eviction request is approved.
Otherwise, the webhook takes action according to the VMI's eviction strategy:

| Eviction Strategy | Is VMI migratable | Is VMI marked for eviction | Does Webhook approve eviction | Response |
The first three columns are a condition and the last two are a result? I.e., if (eviction strategy, is VMI migratable, is VMI marked for eviction) holds => (True|False, response code)? If so, can you mention this in the table description/legend somewhere?
Does N/A in the Is VMI migratable column mean an arbitrary value, i.e. both True and False apply? If so, it might be better to use "Either of True or False" or "True/False" or similar notation to make the value interpretation easier. Or does N/A mean "does not apply"?
The Eviction strategy and whether the VMI is migratable are the inputs.
The Is VMI marked for eviction, Does Webhook approve eviction and Response columns are the result.
I've added an explanation; please tell me if it is more readable now.
Changed N/A to True/False.
N/A meant "not applicable".
# Pod Disruption Budget

In order for the eviction of VMIs to happen in a controlled manner, KubeVirt protects part of the `virt-launcher` pods with a PDB which blocks eviction requests.

`virt-controller` has a `Disruption Budget Controller` which decides whether a `virt-launcher` pod should be protected based on the eviction strategy of its controlling VMI:
Is the controller responsible for rendering PDBs for virt-launcher pods?
Yes.
I also added a note saying that the migration controller expands this PDB to cover both the source and target pods during migration.
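As a rough sketch of what the Disruption Budget Controller decides, under the assumption that the PDB pins one pod normally and two during a live migration (the helper name and the returned shape are hypothetical):

```python
def desired_pdb(eviction_strategy: str, migration_in_progress: bool):
    """Hypothetical sketch: should the Disruption Budget Controller protect
    a VMI's virt-launcher pod with a PDB, and how many pods must the PDB
    keep available?"""
    if eviction_strategy == "None":
        # No protection: the pod may be freely evicted.
        return None
    # During a live migration the migration controller widens the PDB so it
    # covers both the source and the target virt-launcher pods.
    return {"minAvailable": 2 if migration_in_progress else 1}
```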
The eviction request's initiator will observe one of the following responses:

| Eviction Strategy | Is VMI migratable | Is VMI marked for eviction | Does Webhook approve eviction | Does PDB allow eviction | Response |
|-----------------------|-------------------|----------------------------|-------------------------------|-------------------------|----------------------------------|
How should the table be read? Either as
- If "Eviction strategy" == None then ("Is VMI migratable" does not apply, "VMI is not marked for eviction", "Webhook approves eviction", ...)
or
- If ("Eviction strategy" == None, "Is VMI migratable" can be of any value, "Is VMI marked for eviction" == false, "Does Webhook approve eviction" == true, "Does PDB allow eviction" == true) then Response
or differently?
Depending on the eviction strategy and whether the VMI is migratable, the webhook decides what to do.
Both the webhook and the PDB should approve the eviction in order for the initiator to receive 200; otherwise 429 is returned.
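That two-gate flow can be sketched as follows. This is a simplification with invented names; since the webhook is registered with `failurePolicy: Ignore`, an unavailable webhook counts as approval and the PDB acts as the safety net:

```python
def eviction_status(webhook_available: bool, webhook_approves: bool,
                    pdb_allows: bool) -> int:
    """Hypothetical sketch of the HTTP status an eviction initiator sees."""
    if webhook_available and not webhook_approves:
        return 429  # denied by the webhook
    if not pdb_allows:
        return 429  # kube-api blocks the eviction because of the PDB
    return 200      # both gates passed; the pod is evicted
```

Note how the PDB gate still fires when the webhook is down, which is the precaution discussed earlier in this thread.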
Force-pushed from 42ede1f to 7ab39f6
Change: Answered @ingvagabund's review comments.
Looks good. Thank you for addressing the comments.
/test pull-kubevirt-e2e-k8s-1.29-sig-storage
Force-pushed from 7ab39f6 to 70cb759
Change: Answered @dankenigsberg's comments.
Currently, when the criteria for VMI evacuation are met [1]:
1. The eviction admitter marks the VMI for evacuation.
2. The eviction admitter approves the eviction request.
3. kube-api denies the eviction request because the virt-launcher is protected by a PDB [2].

When a descheduler [3] tries to evict a virt-launcher pod owned by a VMI that could be evacuated, it receives a denial for the eviction request and cannot understand that a VMI evacuation was triggered in the background.

In order to support descheduler integration [4], deny the eviction request with a special message when the VMI is marked for evacuation. This is done in order to:
1. Signal the descheduler that an evacuation was triggered although the eviction was denied.
2. Use the existing K8s eviction API.

In this scenario, kube-api will not check whether the virt-launcher is protected by a PDB.

Note: In case the eviction admitter is down, kube-api will treat the eviction request as if it was approved by the admitter, and will continue to check whether a PDB protects the virt-launcher. In this case the VMI will not be marked for evacuation and the eviction initiator receives a regular denial response because of the PDB.

[1] kubevirt#11286
[2] https://kubernetes.io/docs/concepts/scheduling-eviction/api-eviction/#how-api-initiated-eviction-works
[3] https://github.com/kubernetes-sigs/descheduler
[4] kubevirt/community#258

Signed-off-by: Orel Misan <omisan@redhat.com>
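The admitter behaviour described in that commit message could be sketched like this; the admission-response shape and the denial text are illustrative, not KubeVirt's exact strings:

```python
def admit_eviction(is_virt_launcher: bool, marked_for_evacuation: bool) -> dict:
    """Hypothetical sketch: deny with a recognisable message once the VMI
    is marked for evacuation, so a descheduler can tell that evacuation was
    triggered even though the eviction itself was refused."""
    if not is_virt_launcher:
        # Non virt-launcher pods are passed through.
        return {"allowed": True}
    if marked_for_evacuation:
        return {
            "allowed": False,
            "message": "eviction denied: VMI evacuation in progress",  # illustrative text
        }
    return {"allowed": True}
```

Because the denial happens in the webhook, kube-api never reaches the PDB check in this path, which is exactly what lets the initiator see the special message instead of a generic PDB denial.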
| None | No action will be taken; according to the specified `RunStrategy` the VirtualMachine will be restarted or shut down |
| LiveMigrate | The VirtualMachine will be migrated instead of being shut down |
| LiveMigrateIfPossible | Same as `LiveMigrate`, but only if the VirtualMachine is Live-Migratable; otherwise it will behave as `None` |
| External | The VirtualMachine will be protected by a PDB and `vmi.Status.EvacuationNodeName` will be set on eviction. This is mainly useful for cluster-api-provider-kubevirt (capk), which needs a way for VMIs to be blocked from eviction, yet signal capk that eviction has been called on the VMI so the capk controller can handle tearing the VMI down |
PDB plays the same role as in other strategies, no?
What is the value that is set on vmi.Status.EvacuationNodeName?
If I understand correctly, I'd say: On an eviction request, vmi.Status.EvacuationNodeName is set to ???. An external party (such as cluster-api-provider-kubevirt) is responsible for the actual evacuation.
The descriptions in this table were copied from here.
PDB plays the same role as in other strategies, no?
Yes, it does.
What is the value that is set on vmi.Status.EvacuationNodeName?
The name of the node the virt-launcher is currently scheduled on, and has to be evacuated from.
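So marking a VMI for evacuation boils down to recording that node name in the VMI status. A hypothetical sketch of such a patch body; the JSON-patch path mirrors the status field discussed above, everything else is illustrative:

```python
import json

def evacuation_patch(current_node: str) -> str:
    """Build an illustrative JSON-patch body that records the node the
    virt-launcher is scheduled on (and must be evacuated from)."""
    return json.dumps([
        {"op": "add", "path": "/status/evacuationNodeName", "value": current_node},
    ])
```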
I suggest performing the change both in the code and in this doc in a follow-up.
WDYT?
| External | The VirtualMachine will be protected by a PDB and `vmi.Status.EvacuationNodeName` will be set on eviction. This is mainly useful for cluster-api-provider-kubevirt (capk), which needs a way for VMIs to be blocked from eviction, yet signal capk that eviction has been called on the VMI so the capk controller can handle tearing the VMI down |

# Pod Eviction Webhook

`virt-api` serves a [validating webhook](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/) which intercepts **all** eviction requests in the cluster:
Why is "all" in boldface? I believe that this is a sad implementation detail, not something we want.
IMO it's something that is important to emphasize: this webhook sees all eviction requests, and there is currently no way of filtering out just virt-launchers.
The webhook admits eviction requests **before** `kube-api` checks them against [Pod Disruption Budget](https://kubernetes.io/docs/concepts/workloads/pods/disruptions/#pod-disruption-budgets) objects.

In case the pod is not a `virt-launcher` pod, the eviction request is approved.
Isn't it a serious problem? What if there's another application in the cluster that wants to have fancy evacuation?
AFAIK, there could be multiple webhooks (called in a random order).
Approving the request does not mean the end of the process - there could be additional webhooks and a PDB.
There is also a KEP about this subject.
Maybe you want to rephrase to something like "the eviction request is passed back to chain"
Done.
Currently, it is not possible to filter eviction requests based on pod labels [1][2]. For this reason, the pod eviction admitter has to catch all eviction requests in the cluster and only handle eviction requests on virt-launcher pods. Eviction requests on non-virt-launcher pods should be approved.

Since the existing tests have global variables set by a global BeforeEach function, it is difficult to refactor in place. Create a new Describe clause in order to hold the refactored test. Add new helper functions.

For additional details please see [3].

[1] kubernetes/kubernetes#110169 (comment)
[2] https://kubernetes.slack.com/archives/C0EG7JC6T/p1707054818877809
[3] kubevirt#11286

Signed-off-by: Orel Misan <omisan@redhat.com>
The main job of the pod eviction admitter is triggering VMI evacuation. This is done by patching the VMI's `Status.EvacuationNodeName` to the node name it is currently scheduled on. The evacuation request will be approved by the admitter but will eventually fail because of a PodDisruptionBudget protecting the virt-launcher pod.

Create a new test to cover all the possible scenarios by setting:
- Cluster-wide eviction strategy
- VMI eviction strategy
- VMI migratable status

Remove old tests.

For additional details please see [1].

[1] kubevirt#11286

Signed-off-by: Orel Misan <omisan@redhat.com>
In some scenarios, VMI evacuation cannot be triggered. The evacuation request will be approved by the admitter, but it might eventually fail because of a PodDisruptionBudget protecting the virt-launcher pod.

Create a new test to cover all the possible scenarios by setting:
- Cluster-wide eviction strategy
- VMI eviction strategy
- VMI migratable status

Remove old tests.

For additional details please see [1].

[1] kubevirt#11286

Signed-off-by: Orel Misan <omisan@redhat.com>
@dankenigsberg what do you think is missing in order to move forward with the PR?
I think it is mergeable, but I'm no expert on the subject matter; I cannot really tell if it is all true, and I was not sure I'm qualified to give it a /lgtm.
@xpivarc could you please take a look?
/approve
/hold
Required labels detected, running phase 2 presubmits:
Force-pushed from 70cb759 to 9c705b2
/hold cancel
Document the way KubeVirt handles K8s' API-initiated eviction [1].

[1] https://kubernetes.io/docs/concepts/scheduling-eviction/api-eviction/

Signed-off-by: Orel Misan <omisan@redhat.com>
Force-pushed from 9c705b2 to 4a93ff5
/approve
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: enp0s3, xpivarc

The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
Required labels detected, running phase 2 presubmits:
What this PR does
Before this PR:
One needs to dig through a lot of code in order to understand how KubeVirt handles K8s' API-initiated eviction [1].
After this PR:
The way KubeVirt handles K8s' API-initiated eviction is documented.
[1] https://kubernetes.io/docs/concepts/scheduling-eviction/api-eviction/
Fixes #
Why we need it and why it was done in this way
The following tradeoffs were made:
The following alternatives were considered:
Links to places where the discussion took place:
Special notes for your reviewer
Checklist
This checklist is not enforcing, but it's a reminder of items that could be relevant to every PR.
Approvers are expected to review this list.
Release note