-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Garbage collect finalized Migration objects #7144
Garbage collect finalized Migration objects #7144
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need expectations here to avoid failing subsequent deletes. Especially because we are potentially deleting migrations which are not triggering the controller loop in the first place.
7023ff2
to
d271278
Compare
Signed-off-by: David Vossel <davidvossel@gmail.com>
Signed-off-by: David Vossel <davidvossel@gmail.com>
Signed-off-by: David Vossel <davidvossel@gmail.com>
d271278
to
fada881
Compare
/retest |
1 similar comment
/retest |
By("Starting the VirtualMachineInstance") | ||
vmi = runVMIAndExpectLaunch(vmi, 240) | ||
|
||
for i := 0; i < 10; i++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the migration buffer size configurable? perhaps 2x size should be used here vs 10?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the buffer size isn't configurable right now. I just set it to a default of 5. The only reason i wanted a buffer at all is so we have some objects to use for debugging.
|
||
migrations, err := virtClient.VirtualMachineInstanceMigration(vmi.Namespace).List(&metav1.ListOptions{}) | ||
Expect(err).To(BeNil()) | ||
Expect(migrations.Items).To(HaveLen(5)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will all buffer lengths on all clusters be 5? Should this also reflect cluster config's setting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i hard coded the buffer length for now in the migration controller. I'd only want to expose it in a config if we have a reason to someone to modify it.
} | ||
|
||
// only keep the oldest 5 finalized migration objects | ||
garbageCollectionCount := len(finalizedMigrations) - defaultFinalizedMigrationGarbageCollectionBuffer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this something a cluster would need to realistically configure? Is 5 universally reasonable? If it's going to stay hard coded is calling it the "default" misleading?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the buffer value likely won't ever need to be configurable. This value is "per vmi" so we have a trail of the last 5 finalized migration objects per vmi that can be used for debug purposes.
Looks good overall. Added a few minor questions. |
/lgtm Deferring approval to @rmohr |
/retest |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: rmohr The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/retest |
/cherry-pick release-0.49 |
@stu-gott: once the present PR merges, I will cherry-pick it on top of release-0.49 in a new PR and assign it to you. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@davidvossel: The following test failed, say
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
@stu-gott: new pull request created: #7166 In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
related to https://bugzilla.redhat.com/show_bug.cgi?id=2021992
With the addition of the
workload updates
api, VMIs will automatically get migrated after a kubevirt update in order to run on the latest virt-launcher pod.This automation is eventually consistent, and will continually attempt to migrate VMIs until all "migratable" VMIs are running on new virt-launcher pods. In the even that a VMI continually fails to live migrate, the number of Migration objects will grow indefinitely as the system continually tries to move the VMI.
To avoid allowing the finalized migrations for a VMI grow indefinitely, this PR garbage collects all but the most recent 5 migration objects. This means we continue to have a buffer of migration objects which we can use to debug issues, while not having an indefinitely large list of migration objects weight down the system.