Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update dedicated CPUs on migration #6821

Merged
merged 13 commits into from Jan 6, 2022
Merged

Conversation

jean-edouard
Copy link
Contributor

@jean-edouard jean-edouard commented Nov 19, 2021

What this PR does / why we need it:
This PR patches a domain of a VMI that requires dedicated CPUs with the new CPU set dedicated for it on the target node
This is the continuation of the work started by #6200

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:
This PR moves a lot of code around. I understand that it complicates reviews, and I'd be willing to split off the refactoring bits into a separate PR.
I didn't do it right away because it would be a fair amount of work and pure-refactoring PRs tend to be disliked.

Release note:

Migrating VMIs that contain dedicated CPUs will now have properly dedicated CPUs on target

@kubevirt-bot kubevirt-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. labels Nov 19, 2021
@rmohr
Copy link
Member

rmohr commented Nov 20, 2021

/retest

@rmohr
Copy link
Member

rmohr commented Nov 20, 2021

@jean-edouard seems like a make generate and make is missing.

@jean-edouard jean-edouard force-pushed the pintarget branch 3 times, most recently from ecbcdc5 to 25adb69 Compare November 22, 2021 22:34
pkg/virt-handler/vm.go Outdated Show resolved Hide resolved
pkg/virt-handler/vm.go Outdated Show resolved Hide resolved
DedicatedCPUPlacement: true,
}
migratableVMI.Spec.Domain.Resources.Requests[k8sv1.ResourceMemory] = resource.MustParse("512Mi")
// TODO? Other tests use only 128Mi + the following but my test cluster doesn't do hugepages apparently
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

128Mi is a default in KubeVirtCI

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but a VMI with dedicated cores, 128Mi of ram and no hugepages get OOM killed...
Since I can't do hugepages I bumped the RAM and it did the trick.

pkg/virt-launcher/virtwrap/live-migration-source.go Outdated Show resolved Hide resolved
@vladikr
Copy link
Member

vladikr commented Nov 23, 2021

@jean-edouard can't we combine numa_placement.go, vcpu.go, vcpu_placement.go into a single file?
Functions in numa_placement.go are not significant to NUMA and just in general this is all related to the placement of vCPUs.

@jean-edouard jean-edouard force-pushed the pintarget branch 2 times, most recently from f370cb2 to 3eef5e5 Compare November 23, 2021 21:43
@jean-edouard
Copy link
Contributor Author

/test pull-kubevirt-e2e-k8s-1.22-sig-compute-migrations

1 similar comment
@jean-edouard
Copy link
Contributor Author

/test pull-kubevirt-e2e-k8s-1.22-sig-compute-migrations

@jean-edouard
Copy link
Contributor Author

/test pull-kubevirt-e2e-k8s-1.22-sig-compute-migrations

@jean-edouard
Copy link
Contributor Author

/test pull-kubevirt-e2e-k8s-1.22-sig-compute-migrations

@jean-edouard
Copy link
Contributor Author

/test pull-kubevirt-e2e-k8s-1.22-sig-compute-migrations

@jean-edouard jean-edouard changed the title [WIP] Update dedicated CPUs on migration Update dedicated CPUs on migration Nov 24, 2021
@kubevirt-bot kubevirt-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 24, 2021
@jean-edouard
Copy link
Contributor Author

omeryahud and others added 13 commits January 4, 2022 15:24
During a migration process of a VMI that requires dedicated CPUs,
this field will be populated by the target node's virt-handler,
and will be consumed by the source virt-launcher to update the VMI's
dedicated CPU set pre-migration.

Signed-off-by: Omer Yahud <oyahud@redhat.com>
For a VMI that requires dedicated CPU, virt-handler will now report
the dedicated CPU set that was assigned for that VM on the target node.

Signed-off-by: Omer Yahud <oyahud@redhat.com>
virt-launcher will now read the new CPU set of the VMI on the target node
and patch the domain prior to migration

Signed-off-by: Omer Yahud <oyahud@redhat.com>
Signed-off-by: Omer Yahud <oyahud@redhat.com>
Signed-off-by: Jed Lejosne <jed@redhat.com>
…ter XML accordingly

Signed-off-by: Jed Lejosne <jed@redhat.com>
Signed-off-by: Jed Lejosne <jed@redhat.com>
And add missing error checks
And move all vcpu code to vcpu.go

Signed-off-by: Jed Lejosne <jed@redhat.com>
Signed-off-by: Jed Lejosne <jed@redhat.com>
Signed-off-by: Jed Lejosne <jed@redhat.com>
Signed-off-by: Jed Lejosne <jed@redhat.com>
Signed-off-by: Jed Lejosne <jed@redhat.com>
…rom scratch

Signed-off-by: Jed Lejosne <jed@redhat.com>
@kubevirt-bot kubevirt-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 4, 2022
@jean-edouard
Copy link
Contributor Author

Rebased against main to fix conflict and addressed comments from last review

@jean-edouard
Copy link
Contributor Author

/retest

@jean-edouard
Copy link
Contributor Author

/test pull-kubevirt-e2e-k8s-1.22-sig-compute-migrations

@vasiliy-ul
Copy link
Contributor

Rebased against main to fix conflict and addressed comments from last review

Currently pull-kubevirt-e2e-k8s-1.22-sig-compute-migrations still fails:

Tests Suite: [Serial][rfe_id:393][crit:high][vendor:cnv-qe@redhat.com][level:system][sig-compute] VM Live Migration Starting a VirtualMachineInstance live migration cancelation should be able to cancel a migration [sig-storage][test_id:2226] with ContainerDisk

Though this seems to be another issue (not related to this PR). I also hit that failure. Tried to fix it in #6997.

/lgtm
/retest

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Jan 5, 2022
@jean-edouard
Copy link
Contributor Author

Currently pull-kubevirt-e2e-k8s-1.22-sig-compute-migrations still fails:

Tests Suite: [Serial][rfe_id:393][crit:high][vendor:cnv-qe@redhat.com][level:system][sig-compute] VM Live Migration Starting a VirtualMachineInstance live migration cancelation should be able to cancel a migration [sig-storage][test_id:2226] with ContainerDisk

Though this seems to be another issue (not related to this PR). I also hit that failure. Tried to fix it in #6997.

Thank you @vasiliy-ul for looking at fixing the flaky test, I'll review your PR!
I believe you put this PR on hold because of it, is it ok to unhold?

@vasiliy-ul
Copy link
Contributor

I believe you put this PR on hold because of it, is it ok to unhold?

Ooops, forgot to unhold.

/unhold

@kubevirt-bot kubevirt-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 5, 2022
@kubevirt-bot
Copy link
Contributor

kubevirt-bot commented Jan 5, 2022

@jean-edouard: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-kubevirt-check-tests-for-flakes f22c592 link false /test pull-kubevirt-check-tests-for-flakes

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@kubevirt-commenter-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs.
Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@kubevirt-bot kubevirt-bot merged commit f08d00e into kubevirt:main Jan 6, 2022
@stu-gott
Copy link
Member

stu-gott commented Jan 7, 2022

/cherry-pick release-0.49

@kubevirt-bot
Copy link
Contributor

@stu-gott: new pull request created: #7043

In response to this:

/cherry-pick release-0.49

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants