workload-update: cancel a workload update migration #11641
base: main
dd69c95
to
4c764b1
Compare
The volume migration PR is now based on the abortion mechanism implemented in this PR. It also includes a functional test that verifies that the cancellation works properly, and it can serve as an example of this feature in use.
if migration.IsFinal() {
	continue
}
if migration.Namespace != ns {
can't/shouldn't this informer have a "namespace" index?
Could also add your own "vmi" index that indexes on namespace/name
@mhenriks I'm not sure I understand the feedback, could you please expand?
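One possible reading of the index suggestion above, as a minimal sketch: instead of scanning every migration and comparing namespaces in a loop, the migrations are grouped once under a `namespace/name` key for the VMI they target, and lookups go through that key. The `Migration` type, `VMIIndex`, `BuildVMIIndex`, and `ActiveFor` are hypothetical stand-ins written in plain Go so the example runs on its own; in the real controller this would more likely be an index function registered on the informer (client-go's `cache.Indexers`), which is not shown here.

```go
package main

import "fmt"

// Migration is a hypothetical, trimmed-down stand-in for the real migration object.
type Migration struct {
	Name      string
	Namespace string
	VMIName   string
	Final     bool
}

// vmiKey builds the "namespace/name" key the index is keyed on.
func vmiKey(namespace, name string) string {
	return namespace + "/" + name
}

// VMIIndex maps a VMI key to the migrations that target that VMI.
type VMIIndex map[string][]Migration

// BuildVMIIndex groups migrations by the VMI they target.
func BuildVMIIndex(migrations []Migration) VMIIndex {
	idx := VMIIndex{}
	for _, m := range migrations {
		k := vmiKey(m.Namespace, m.VMIName)
		idx[k] = append(idx[k], m)
	}
	return idx
}

// ActiveFor returns the non-final migrations for a single VMI
// without iterating over migrations of other VMIs or namespaces.
func (idx VMIIndex) ActiveFor(namespace, name string) []Migration {
	var out []Migration
	for _, m := range idx[vmiKey(namespace, name)] {
		if !m.Final {
			out = append(out, m)
		}
	}
	return out
}

func main() {
	idx := BuildVMIIndex([]Migration{
		{Name: "mig-a", Namespace: "ns1", VMIName: "vmi1"},
		{Name: "mig-b", Namespace: "ns1", VMIName: "vmi1", Final: true},
		{Name: "mig-c", Namespace: "ns2", VMIName: "vmi1"},
	})
	for _, m := range idx.ActiveFor("ns1", "vmi1") {
		fmt.Println(m.Name)
	}
}
```

With an index like this, the explicit `migration.Namespace != ns` check in the loop becomes unnecessary, because the key already encodes the namespace.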
4c764b1
to
ca19ee8
Compare
ca19ee8
to
b21473d
Compare
This annotation enables the deletion of a migration caused by a workload update of the VMI. Its main use is to test the abortion of an update. Users should also manually remove this annotation once the update has been aborted. Signed-off-by: Alice Frosi <afrosi@redhat.com>
The function ListWorkloadUpdateMigrations is a helper function that returns the non-final migrations caused by a workload update. Signed-off-by: Alice Frosi <afrosi@redhat.com>
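A rough sketch of what such a helper filters on, based only on what is visible in this conversation (skip final migrations, skip other namespaces, keep only migrations carrying the workload update annotation). The `Migration` type, the slice-based signature, and the annotation string are assumptions made so the example is self-contained; the real helper takes the informer store and uses `virtv1.WorkloadUpdateMigrationAnnotation`.

```go
package main

import "fmt"

// workloadUpdateMigrationAnnotation stands in for
// virtv1.WorkloadUpdateMigrationAnnotation; the string value is an assumption.
const workloadUpdateMigrationAnnotation = "kubevirt.io/workloadUpdateMigration"

// Migration is a hypothetical, trimmed-down stand-in for the real migration object.
type Migration struct {
	Name        string
	Namespace   string
	VMIName     string
	Annotations map[string]string
	Final       bool
}

// IsFinal reports whether the migration has reached a terminal state.
func (m Migration) IsFinal() bool { return m.Final }

// listWorkloadUpdateMigrations returns the non-final migrations of the given
// VMI that were created by an automated workload update.
func listWorkloadUpdateMigrations(all []Migration, vmiName, ns string) []Migration {
	var out []Migration
	for _, m := range all {
		if m.IsFinal() {
			continue
		}
		if m.Namespace != ns || m.VMIName != vmiName {
			continue
		}
		// Migrations without the annotation were not started by the updater.
		if _, ok := m.Annotations[workloadUpdateMigrationAnnotation]; !ok {
			continue
		}
		out = append(out, m)
	}
	return out
}

func main() {
	annotated := map[string]string{workloadUpdateMigrationAnnotation: ""}
	all := []Migration{
		{Name: "mig-update", Namespace: "ns1", VMIName: "vmi1", Annotations: annotated},
		{Name: "mig-manual", Namespace: "ns1", VMIName: "vmi1"},
		{Name: "mig-done", Namespace: "ns1", VMIName: "vmi1", Annotations: annotated, Final: true},
		{Name: "mig-other-ns", Namespace: "ns2", VMIName: "vmi1", Annotations: annotated},
	}
	for _, m := range listWorkloadUpdateMigrations(all, "vmi1", "ns1") {
		fmt.Println(m.Name)
	}
}
```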
9d7fc9c
to
c18c24f
Compare
The testWorkloadUpdateMigrationAbortionAnnotation cancels the migrations due to a workload update. If a VMI has this annotation, new changes shouldn't be considered until this annotation is removed. Signed-off-by: Alice Frosi <afrosi@redhat.com>
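The gating described in this commit message could look roughly like the sketch below: VMIs carrying the test annotation are set aside for migration cancellation and are not considered for new updates until the annotation is removed. The annotation key is the one quoted in the functional test commit of this PR; the `VMI` type and `splitByAbortAnnotation` are hypothetical names used only for illustration.

```go
package main

import "fmt"

// testAbortAnnotation is the annotation key quoted in this PR's commit messages.
const testAbortAnnotation = "kubevirt.io/testWorkloadUpdateMigrationAbortion"

// VMI is a hypothetical, trimmed-down stand-in for the real VMI object.
type VMI struct {
	Name        string
	Annotations map[string]string
}

// splitByAbortAnnotation separates VMIs that may receive new updates from
// VMIs whose workload update migrations should be cancelled.
func splitByAbortAnnotation(vmis []VMI) (updatable, aborting []VMI) {
	for _, vmi := range vmis {
		if _, ok := vmi.Annotations[testAbortAnnotation]; ok {
			aborting = append(aborting, vmi)
			continue
		}
		updatable = append(updatable, vmi)
	}
	return updatable, aborting
}

func main() {
	updatable, aborting := splitByAbortAnnotation([]VMI{
		{Name: "vmi-a"},
		{Name: "vmi-b", Annotations: map[string]string{testAbortAnnotation: ""}},
	})
	fmt.Println("updatable:", len(updatable), "aborting:", len(aborting))
}
```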
c18c24f
to
bd47504
Compare
/test pull-kubevirt-e2e-k8s-1.29-sig-compute-migrations
/test pull-kubevirt-e2e-k8s-1.29-sig-operator
For certain workload updates, we might want to cancel a migration triggered by the update. The workload updater checks whether the VMI has a VirtualMachineInstance*Change condition. If no such condition is present but there is a migration associated with an automated update, the migration is aborted. Signed-off-by: Alice Frosi <afrosi@redhat.com>
Signed-off-by: Alice Frosi <afrosi@redhat.com>
bd47504
to
8ce4bd5
Compare
/test pull-kubevirt-e2e-k8s-1.29-sig-operato
Need to fix some test failures in the operator suite
/test pull-kubevirt-e2e-k8s-1.29-sig-operator
The tests for the operator now pass locally.
/test pull-kubevirt-e2e-k8s-1.29-sig-operator
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: acardace
Add functional test to verify the abortion of the migration during the CPU update. The migration cancellation is triggered by the annotation kubevirt.io/testWorkloadUpdateMigrationAbortion on the VMI. Signed-off-by: Alice Frosi <afrosi@redhat.com>
8ce4bd5
to
4dfde6f
Compare
Fixed the last nit in the tests and a typo
	if isHotplugInProgress(vmi) {
		return true
	}

	return false
}

func (c *WorkloadUpdateController) shouldAbortMigration(vmi *virtv1.VirtualMachineInstance) bool {
	numMig := len(migrationutils.ListWorkloadUpdateMigrations(c.migrationInformer.GetStore(), vmi.Name, vmi.Namespace))
is it worth ensuring that we are only aborting the migration this controller started?
_, ok = migration.Annotations[virtv1.WorkloadUpdateMigrationAnnotation]
It is already the case. The function ListWorkloadUpdateMigrations checks whether the migration object has the annotation WorkloadUpdateMigrationAnnotation; otherwise the migration isn't included in the list.
I find it cleaner encapsulated in the function, and the function is also reused below
What this PR does
Before this PR:
The workload updater doesn't currently support the ability to abort a migration due to an automated workload update. Today, the workload update is mostly used for CPU and memory hotplug.
After this PR:
This change introduces a mechanism to abort a workload update.
A migration abortion is detected when the VMI doesn't have any VMI*Change condition but a migration created by an automated workload update still exists. When this holds, it means that the change condition was present (and triggered the migration) but has since been removed. This assumption simplifies the workload updater, which only needs to take care of deleting the migration.
We have also introduced a new annotation, kubevirt.io/testWorkloadUpdateMigrationAbortion, that should only be used in tests.
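The detection rule described in this section can be sketched as a small predicate. This is only an illustration under stated assumptions: the `VMI` type, the condition names in `changeConditions`, and the way the migration count is passed in are hypothetical, and the real `shouldAbortMigration` works on `virtv1.VirtualMachineInstance` and the migration informer store.

```go
package main

import "fmt"

// VMI is a hypothetical, trimmed-down stand-in for the real VMI object.
type VMI struct {
	Name       string
	Conditions []string
}

// changeConditions lists illustrative names for the VMI*Change conditions;
// the actual condition types are defined in the KubeVirt API.
var changeConditions = map[string]bool{
	"VCPUChange":   true,
	"MemoryChange": true,
}

// hasChangeCondition reports whether any change condition is set on the VMI.
func hasChangeCondition(vmi VMI) bool {
	for _, c := range vmi.Conditions {
		if changeConditions[c] {
			return true
		}
	}
	return false
}

// shouldAbortMigration applies the rule from the PR description: abort when
// no change condition remains but a workload update migration is still active.
func shouldAbortMigration(vmi VMI, activeWorkloadUpdateMigrations int) bool {
	if hasChangeCondition(vmi) {
		return false
	}
	return activeWorkloadUpdateMigrations > 0
}

func main() {
	pending := VMI{Name: "vmi-pending", Conditions: []string{"VCPUChange"}}
	reverted := VMI{Name: "vmi-reverted"}

	fmt.Println(shouldAbortMigration(pending, 1))  // change still requested
	fmt.Println(shouldAbortMigration(reverted, 1)) // change removed, migration left over
	fmt.Println(shouldAbortMigration(reverted, 0)) // nothing to abort
}
```

The predicate deliberately ignores why the condition disappeared; as the description notes, that keeps the updater's job limited to deleting the leftover migration.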
Special notes for your reviewer
This feature will be needed for volume migration. Volume migration still relies on the same workload update as CPU and memory hotplug. However, in this case, the migration might take a very long time and we want to allow users to cancel the change (for more details please check the design proposal). A draft PR for volume migration can be found in #11533
This PR is a prerequisite for introducing such a cancellation mechanism. Since the volume migration PR is complex and this is a generic mechanism, I'd like to discuss and present it separately, even though it is not yet used in the code.
Release note