
Docs: Document handling eviction requests #11286

Merged
merged 1 commit into kubevirt:main from orelmisan:handling-eviction-req-doc on May 8, 2024

Conversation

orelmisan
Member

What this PR does

Before this PR:
One needs to dig through a lot of code in order to understand how KubeVirt handles K8s' API-initiated eviction [1].

After this PR:
The way KubeVirt handles K8s' API-initiated eviction is documented.

[1] https://kubernetes.io/docs/concepts/scheduling-eviction/api-eviction/

Fixes #

Why we need it and why it was done in this way

The following tradeoffs were made:

The following alternatives were considered:

Links to places where the discussion took place:

Special notes for your reviewer

Checklist

This checklist is not enforced, but it's a reminder of items that could be relevant to every PR.
Approvers are expected to review this list.

Release note

NONE

@kubevirt-bot added the release-note-none (Denotes a PR that doesn't merit a release note.) and dco-signoff: yes (Indicates the PR's author has DCO signed all their commits.) labels on Feb 18, 2024
@orelmisan
Member Author

/cc @EdDev

Member

@dankenigsberg dankenigsberg left a comment


Thanks for enlightening me. Would you elaborate a bit more?

| External | N/A | True | True | False | 429 - Eviction blocked by PDB |

To summarize:
1. The eviction request is granted only if both the webhook and the PDB allow it.
Member


Why do we need two mechanisms? Why is the PDB not enough?

Member Author


The PDB just blocks the eviction request. I think it is still in use as a precaution in case the webhook is unavailable, since the webhook is configured to be ignored in case of failure (failurePolicy: Ignore) [1].

The webhook marks the VMI for eviction.

[1] https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#failure-policy
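For illustration only, a minimal sketch of a webhook registration that uses `failurePolicy: Ignore` and intercepts eviction requests; the names, namespace and path are hypothetical, not KubeVirt's actual configuration:

```yaml
# Hypothetical sketch: a validating webhook registered for the pods/eviction
# subresource with failurePolicy: Ignore. If virt-api is unreachable,
# kube-api skips the webhook and the eviction falls through to the PDB check.
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: virt-api-eviction-validator        # hypothetical name
webhooks:
  - name: pod-eviction.example.kubevirt.io # hypothetical name
    failurePolicy: Ignore                  # webhook failures do not block the eviction
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE"]
        resources: ["pods/eviction"]       # API-initiated eviction requests
    clientConfig:
      service:
        namespace: kubevirt                # hypothetical
        name: virt-api
        path: /launcher-eviction-validate  # hypothetical path
        port: 443
    sideEffects: NoneOnDryRun
    admissionReviewVersions: ["v1"]
```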

Member


Would you explain why two different mechanisms are needed? What are the problems and benefits of each of them? Please include the answer in the text of this documentation.

Member Author


Please see the note under Pod Eviction Webhook, it should answer this question.

Member


Thanks.

1. The eviction request is granted only if both the webhook and the PDB allow it.
2. The eviction request's initiator might get a 429 response, but the VMI will be migrated in the background.

# Evacuation Controller
Member


Why are we using two terms? Is there a difference between Evacuation and Eviction?

Member Author


The Evacuation Controller was introduced by #2091.
Initially, this mechanism was triggered using a taint.
Support for API-initiated evictions came later, in #4012.
Maybe it's a good time to revisit it.

Member


I don't understand your answer. Is there a difference between "Evacuation" and "Eviction"? If so, would you define each? If no, can we use one term for both?

Member Author


The name Evacuation controller is AFAIU historic.
It is also present in VMI.status.evacuationNodeName.

In KubeVirt CR spec.configuration, we have evictionStrategy.

So it seems that unfortunately there are two names which refer to the same process.
IMO this naming issue deserves a dedicated discussion.
WDYT?

Member


I believe that this doc should provide the explanation you just gave.

Member Author

@orelmisan orelmisan Mar 13, 2024


After another read of the description of PR #2091, it seems that both terms have been there from day one.

| Eviction Strategy     | Description |
|-----------------------|-------------|
| None                  | No action will be taken, according to the specified 'RunStrategy' the VirtualMachine will be restarted or shutdown |
| LiveMigrate           | The VirtualMachine will be migrated instead of being shutdown |
| LiveMigrateIfPossible | Same as `LiveMigrate` but only if the VirtualMachine is Live-Migratable, otherwise it will behave as `None` |


Can you please define when a machine is Live-Migratable?

Member Author


The conditions which define whether a VMI is live-migratable are determined by `virt-handler`.
They are pretty complex and IMO listing them would make this document less readable.
Nevertheless, I do think there is a place to document them in a separate document in a future PR.
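For reference, migratability surfaces on the VMI as a `LiveMigratable` condition reported by `virt-handler`; a hedged sketch of what that status excerpt can look like (reason and message values are examples only):

```yaml
# Hypothetical VMI status excerpt: virt-handler reports migratability
# through the LiveMigratable condition.
status:
  conditions:
    - type: LiveMigratable
      status: "False"
      reason: InterfaceNotLiveMigratable   # example reason, actual values vary
      message: "cannot migrate VMI which does not use masquerade to connect to the pod network"
```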

In case the pod is not a `virt-launcher` pod - the eviction request is approved.
Otherwise, the webhook takes action according to the VMI's eviction strategy:

| Eviction Strategy | Is VMI migratable | Is VMI marked for eviction | Does Webhook approve eviction | Response |


The first three columns are a condition and the last two are a result? I.e., if (eviction strategy, is VMI migratable, is VMI marked for eviction) holds => (True|False, response code)? If so, can you mention this in the table description/legend somewhere?

Does N/A in the "Is VMI migratable" column mean an arbitrary value, i.e. both True and False apply? If so, it might be better to use "Either of True or False" or "True/False" or similar notation to make the value interpretation easier. Or does N/A mean "does not apply"?

Member Author


The Eviction Strategy and Is VMI migratable columns are the inputs.
The Is VMI marked for eviction, Does Webhook approve eviction and Response columns are the result.
I've added an explanation, please tell me if it is more readable now.

Changed N/A to True/False.
N/A meant "not applicable".

# Pod Disruption Budget
In order for the eviction of VMIs to happen in a controlled manner, KubeVirt protects some of the `virt-launcher` pods with a PDB which blocks eviction requests.

`virt-controller` has a `Disruption Budget Controller` which decides whether a `virt-launcher` pod should be protected based on the eviction strategy of its controlling VMI:


Is the controller responsible for rendering PDBs for virt-launcher pods?

Member Author


Yes.

I also added a note saying that the migration controller also expands this PDB to cover both the source and target pods during migration.
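As an illustration (names and label values are hypothetical; the actual object is rendered by `virt-controller`), such a PDB could look roughly like:

```yaml
# Hypothetical sketch of a PDB protecting a single virt-launcher pod.
# With minAvailable equal to the number of selected pods, any eviction is blocked.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: kubevirt-disruption-budget-abcde    # hypothetical name
  namespace: default
spec:
  minAvailable: 1                           # expanded during migration so both source and target pods stay covered
  selector:
    matchLabels:
      kubevirt.io/created-by: 1234-example-vmi-uid  # hypothetical value selecting the VMI's launcher pod(s)
```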

The eviction request's initiator will observe one of the following responses:

| Eviction Strategy | Is VMI migratable | Is VMI marked for eviction | Does Webhook approve eviction | Does PDB allow eviction | Response |
|-----------------------|-------------------|----------------------------|-------------------------------|-------------------------|----------------------------------|


How should the table be read? Either as

  • If "Eviction strategy None" then ("Is VMI migratable" does not apply, "VM is not marked for eviction", "Webhook approves eviction", ...)

or

  • If ("Eviction strategy" == None, "Is VMI migratable" can be of any value, "Is VMI marked for eviction" == false, "Does Webhook approve eviction" == true, "Does PDB allow eviction" == true) then Response

or differently?

Member Author


Depending on the eviction strategy and on whether the VMI is migratable, the webhook decides what to do.
Both the webhook and the PDB must allow the eviction for the initiator to receive a 200; otherwise a 429 is returned (see the sketch of such a response below).
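For context, a PDB-blocked eviction is reported to the initiator as a `429 Too Many Requests` carrying a `Status` object; a sketch of its rough shape (exact fields and message text may differ):

```yaml
# Hypothetical shape of the response body for an eviction blocked by a PDB.
apiVersion: v1
kind: Status
status: Failure
code: 429
reason: TooManyRequests
message: "Cannot evict pod as it would violate the pod's disruption budget."
```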

@orelmisan force-pushed the handling-eviction-req-doc branch 2 times, most recently from 42ede1f to 7ab39f6 on February 26, 2024 09:45
@orelmisan
Member Author

Change: Answered @ingvagabund review comments.

@ingvagabund

Looks good. Thank you for addressing the comments.

@brianmcarey
Member

/test pull-kubevirt-e2e-k8s-1.29-sig-storage

@orelmisan
Member Author

Change: Answered @dankenigsberg comments.

orelmisan added a commit to orelmisan/kubevirt that referenced this pull request Mar 18, 2024
Currently, when the criteria for VMI evacuation are met [1]:
1. The eviction admitter marks the VMI for evacuation.
2. The eviction admitter approves the eviction request.
3. kube-api denies the eviction request because the virt-launcher is
protected by a PDB [2].

When a descheduler[3] tries to evict a virt-launcher pod
owned by a VMI that could be evacuated,
it receives a denial for the eviction request, and cannot understand
that a VMI evacuation was triggered in the background.

In order to support descheduler integration [4], deny the eviction
request with a special message when the VMI is marked for evacuation.
This is done in order to:
1. Signal the descheduler that an evacuation was triggered although
the eviction was denied.
2. Use the existing K8s eviction API.

In this scenario, kube-api will not check whether the virt-launcher is
protected by a PDB.

Note: In case the Eviction admitter is down - kube-api will treat the
eviction request as if it was approved by the admitter, and will continue to
check whether a PDB protects the virt-launcher.
In this case the VMI will not be marked for evacuation and the
eviction initiator receives a regular denial response because of the PDB.

[1] kubevirt#11286
[2] https://kubernetes.io/docs/concepts/scheduling-eviction/api-eviction/#how-api-initiated-eviction-works
[3] https://github.com/kubernetes-sigs/descheduler
[4] kubevirt/community#258

Signed-off-by: Orel Misan <omisan@redhat.com>
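For reference, an API-initiated eviction (what `kubectl drain` or the descheduler issues, and what the eviction admitter intercepts) is a POST of an `Eviction` object to the pod's `eviction` subresource; a sketch with hypothetical names:

```yaml
# Hypothetical Eviction body POSTed to
# /api/v1/namespaces/default/pods/virt-launcher-example-abcde/eviction
apiVersion: policy/v1
kind: Eviction
metadata:
  name: virt-launcher-example-abcde   # hypothetical virt-launcher pod name
  namespace: default
```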
orelmisan added a commit to orelmisan/kubevirt that referenced this pull request Mar 18, 2024
orelmisan added a commit to orelmisan/kubevirt that referenced this pull request Mar 18, 2024
| Eviction Strategy     | Description |
|-----------------------|-------------|
| None                  | No action will be taken, according to the specified 'RunStrategy' the VirtualMachine will be restarted or shutdown |
| LiveMigrate           | The VirtualMachine will be migrated instead of being shutdown |
| LiveMigrateIfPossible | Same as `LiveMigrate` but only if the VirtualMachine is Live-Migratable, otherwise it will behave as `None` |
| External              | The VirtualMachine will be protected by a PDB and vmi.Status.EvacuationNodeName will be set on eviction. This is mainly useful for cluster-api-provider-kubevirt (capk) which needs a way for VMIs to be blocked from eviction, yet signal capk that eviction has been called on the VMI so the capk controller can handle tearing the VMI down |
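For orientation, the eviction strategy can be set cluster-wide in the KubeVirt CR (`spec.configuration.evictionStrategy`) or per VMI (`spec.evictionStrategy`); a hedged sketch with illustrative resource names:

```yaml
# Cluster-wide default, set in the KubeVirt CR.
apiVersion: kubevirt.io/v1
kind: KubeVirt
metadata:
  name: kubevirt
  namespace: kubevirt
spec:
  configuration:
    evictionStrategy: LiveMigrate
---
# Per-VMI override on the VirtualMachineInstance spec.
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstance
metadata:
  name: example-vmi                  # hypothetical name
spec:
  evictionStrategy: LiveMigrateIfPossible
```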
Member


PDB plays the same role as in other strategies, no?

What is the value that is set on vmi.Status.EvacuationNodeName ?

If I understand correctly, I'd say: On eviction request vmi.Status.EvacuationNodeName is set to ???. An external party (such as cluster-api-provider-kubevirt) is responsible for the actual evacuation.

Member Author


The descriptions in this table were copied from here

PDB plays the same role as in other strategies, no?

Yes, it does

What is the value that is set on vmi.Status.EvacuationNodeName ?

The name of the node the virt-launcher is currently scheduled on, and has to be evacuated from.
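A sketch of how this could look on a VMI that has been marked for evacuation (the node name is hypothetical):

```yaml
# Hypothetical VMI status excerpt after the admitter marks it for evacuation.
status:
  evacuationNodeName: node01   # the node the virt-launcher has to be evacuated from
```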

Member Author

@orelmisan orelmisan Mar 19, 2024


I suggest performing the change both in the code and in this doc in a follow-up.
WDYT?

| External | The VirtualMachine will be protected by a PDB and vmi.Status.EvacuationNodeName will be set on eviction. This is mainly useful for cluster-api-provider-kubevirt (capk) which needs a way for VMI’s to be blocked from eviction, yet signal capk that eviction has been called on the VMI so the capk controller can handle tearing the VMI down |

# Pod Eviction Webhook
`virt-api` serves a [validating webhook](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/) which intercepts **all** eviction requests in the cluster:
Member


Why is "all" in boldface? I believe that this is a sad implementation detail, not something we want.

Member Author


IMO it's something important to emphasize: this webhook sees all eviction requests, and there is currently no way to filter for just virt-launchers.


The webhook admits eviction requests **before** `kube-api` checks them against [Pod Disruption Budget](https://kubernetes.io/docs/concepts/workloads/pods/disruptions/#pod-disruption-budgets) objects.

In case the pod is not a `virt-launcher` pod - the eviction request is approved.
Member


Isn't it a serious problem? What if there's another application in the cluster that wants to have fancy evacuation?

Member Author


AFAIK, there could be multiple webhooks (called in a random order).
Approving the request does not mean the end of the process - there could be additional webhooks and a PDB.

There is also a KEP about this subject.

Member


Maybe you want to rephrase to something like "the eviction request is passed back to chain"

Member Author


Done.

| External | N/A | True | True | False | 429 - Eviction blocked by PDB |

To summarize:
1. The eviction request is granted only if both the webhook and the PDB allow it.
Member


Thanks.

orelmisan added a commit to orelmisan/kubevirt that referenced this pull request Mar 19, 2024
orelmisan added a commit to orelmisan/kubevirt that referenced this pull request Apr 3, 2024
Currently, it is not possible to filter eviction requests
based on pod labels [1][2].

For this reason, the pod eviction admitter has to catch all
eviction requests in the cluster and only handle eviction
requests on virt-launcher pods.

Eviction requests on non-virt-launcher pods should be approved.

Since the existing tests have global variables set by
a global BeforeEach function, it is difficult to refactor
in place.

Create a new Describe clause in order to hold the refactored
test.

Add new helper functions.

For additional details please see [3].

[1] kubernetes/kubernetes#110169 (comment)
[2] https://kubernetes.slack.com/archives/C0EG7JC6T/p1707054818877809
[3] kubevirt#11286

Signed-off-by: Orel Misan <omisan@redhat.com>
orelmisan added a commit to orelmisan/kubevirt that referenced this pull request Apr 3, 2024
The main job of the pod eviction admitter is triggering
VMI evacuation.

This is done by patching the VMI's `Status.EvacuationNodeName`
to the node name it is currently scheduled on.

The evacuation request will be approved by the admitter
but will eventually fail because of a PodDisruptionBudget
protecting the virt-launcher pod.

Create a new test to cover all the possible scenarios by setting:
- Cluster-wide eviction strategy
- VMI eviction strategy
- VMI migratable status

Remove old tests.

For additional details please see [1].

[1] kubevirt#11286

Signed-off-by: Orel Misan <omisan@redhat.com>
orelmisan added a commit to orelmisan/kubevirt that referenced this pull request Apr 3, 2024
In some scenarios, VMI evacuation cannot be triggered.

The evacuation request will be approved by the admitter
but might eventually fail because of a PodDisruptionBudget
protecting the virt-launcher pod.

Create a new test to cover all the possible scenarios by setting:
- Cluster-wide eviction strategy
- VMI eviction strategy
- VMI migratable status

Remove old tests.

For additional details please see [1].

[1] kubevirt#11286

Signed-off-by: Orel Misan <omisan@redhat.com>
orelmisan added a commit to orelmisan/kubevirt that referenced this pull request Apr 30, 2024
orelmisan added a commit to orelmisan/kubevirt that referenced this pull request Apr 30, 2024
@orelmisan
Member Author

@dankenigsberg what do you think is missing in order to move forward with the PR?

@dankenigsberg
Member

dankenigsberg what do you think is missing in order to move forward with the PR?

I think it is mergeable, but I'm no expert about the subject matter - I cannot really tell if it is all true, and I was not sure I'm qualified to give it a

/lgtm

@kubevirt-bot added the lgtm (Indicates that a PR is ready to be merged.) label on May 1, 2024
@orelmisan
Member Author

@xpivarc could you please take a look?

Member

@xpivarc xpivarc left a comment


/approve
/hold


The webhook admits eviction requests **before** `kube-api` checks them against [Pod Disruption Budget](https://kubernetes.io/docs/concepts/workloads/pods/disruptions/#pod-disruption-budgets) objects.

In case the pod is not a `virt-launcher` pod - the eviction request is approved.
Member


Maybe you want to rephrase to something like "the eviction request is passed back to chain"

@kubevirt-bot added the do-not-merge/hold (Indicates that a PR should not merge because someone has issued a /hold command.) and approved (Indicates a PR has been approved by an approver from all required OWNERS files.) labels on May 2, 2024
@kubevirt-commenter-bot

Required labels detected, running phase 2 presubmits:
/test pull-kubevirt-e2e-windows2016
/test pull-kubevirt-e2e-kind-sriov
/test pull-kubevirt-e2e-k8s-1.29-ipv6-sig-network
/test pull-kubevirt-e2e-k8s-1.28-sig-network
/test pull-kubevirt-e2e-k8s-1.28-sig-storage
/test pull-kubevirt-e2e-k8s-1.28-sig-compute
/test pull-kubevirt-e2e-k8s-1.28-sig-operator
/test pull-kubevirt-e2e-k8s-1.29-sig-network
/test pull-kubevirt-e2e-k8s-1.29-sig-storage
/test pull-kubevirt-e2e-k8s-1.29-sig-compute
/test pull-kubevirt-e2e-k8s-1.29-sig-operator

@kubevirt-bot removed the lgtm (Indicates that a PR is ready to be merged.) label on May 2, 2024
@orelmisan
Member Author

Change: Answered @xpivarc's comment.

@xpivarc
Member

xpivarc commented May 2, 2024

/hold cancel

@kubevirt-bot removed the do-not-merge/hold (Indicates that a PR should not merge because someone has issued a /hold command.) label on May 2, 2024
Document the way KubeVirt handles K8s' API-initiated eviction [1].

[1] https://kubernetes.io/docs/concepts/scheduling-eviction/api-eviction/

Signed-off-by: Orel Misan <omisan@redhat.com>
@orelmisan
Member Author

Change: Fixed typos
Thank you @enp0s3

Contributor

@enp0s3 enp0s3 left a comment


/approve

@kubevirt-bot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: enp0s3, xpivarc

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubevirt-bot added the lgtm (Indicates that a PR is ready to be merged.) label on May 8, 2024
@kubevirt-commenter-bot

Required labels detected, running phase 2 presubmits:
/test pull-kubevirt-e2e-windows2016
/test pull-kubevirt-e2e-kind-sriov
/test pull-kubevirt-e2e-k8s-1.30-ipv6-sig-network
/test pull-kubevirt-e2e-k8s-1.28-sig-network
/test pull-kubevirt-e2e-k8s-1.28-sig-storage
/test pull-kubevirt-e2e-k8s-1.28-sig-compute
/test pull-kubevirt-e2e-k8s-1.28-sig-operator
/test pull-kubevirt-e2e-k8s-1.29-sig-network
/test pull-kubevirt-e2e-k8s-1.29-sig-storage
/test pull-kubevirt-e2e-k8s-1.29-sig-compute
/test pull-kubevirt-e2e-k8s-1.29-sig-operator

@kubevirt-bot kubevirt-bot merged commit be98de8 into kubevirt:main May 8, 2024
37 checks passed
@orelmisan orelmisan deleted the handling-eviction-req-doc branch May 8, 2024 14:06
orelmisan added a commit to orelmisan/kubevirt that referenced this pull request May 9, 2024
Sreeja1725 pushed a commit to Sreeja1725/kubevirt that referenced this pull request May 17, 2024