
[virt-controller]: consider the ParallelOutboundMigrationsPerNode when evicting VMs #8701

Merged

2 commits merged into kubevirt:main on Nov 17, 2022

Conversation

Contributor

@enp0s3 enp0s3 commented Oct 31, 2022

Signed-off-by: Igor Bezukh ibezukh@redhat.com

What this PR does / why we need it:
If the per-source-node limit on active migrations is not taken into account, we can reach the cluster-wide cap on concurrent migrations while most of those migrations are pending, because they all originate from the same source node.

This is problematic when multiple node drains occur at the same time: evictions from different source nodes should be able to start simultaneously, but they cannot, because the migration queue is already occupied by VMs from a single source node.

The solution is to observe the migration queue for active migrations from the same source node and limit the creation of new migrations to ParallelOutboundMigrationsPerNode.
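The gist of the fix can be sketched in Go. This is a simplified model, not the merged KubeVirt code: the `Migration` struct and the helper name `remainingSlotsForNode` are illustrative stand-ins for the real `VirtualMachineInstanceMigration` objects and controller logic.

```go
package main

import "fmt"

// Migration is a simplified stand-in for an active (not yet completed)
// VirtualMachineInstanceMigration, which occupies one migration slot.
type Migration struct {
	SourceNode string
}

// remainingSlotsForNode returns how many new migrations may be created for
// VMs evicted from the given node, honoring both the cluster-wide cap and
// the per-source-node cap (ParallelOutboundMigrationsPerNode).
func remainingSlotsForNode(node string, active []Migration, perCluster, perNode int) int {
	fromNode := 0
	for _, m := range active {
		if m.SourceNode == node {
			fromNode++
		}
	}
	free := perCluster - len(active)
	if nodeFree := perNode - fromNode; nodeFree < free {
		free = nodeFree
	}
	if free < 0 {
		free = 0
	}
	return free
}

func main() {
	active := []Migration{{"node-a"}, {"node-a"}, {"node-b"}}
	// Defaults: 5 per cluster, 2 outbound per node.
	fmt.Println(remainingSlotsForNode("node-a", active, 5, 2)) // node-a already at its per-node cap
	fmt.Println(remainingSlotsForNode("node-c", active, 5, 2)) // limited only by free cluster slots
}
```

With the defaults, node-a (already running 2 outbound migrations) gets 0 new slots even though 2 cluster slots remain free, leaving those slots available to other draining nodes.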

Which issue(s) this PR fixes
https://bugzilla.redhat.com/show_bug.cgi?id=2069098

Special notes for your reviewer:
In this PR I'm not aiming to refactor or improve the current eviction flow, only to concentrate on the fix.

Release note:

Consider the ParallelOutboundMigrationsPerNode when evicting VMs

@kubevirt-bot kubevirt-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. labels Oct 31, 2022
@enp0s3
Contributor Author

enp0s3 commented Nov 2, 2022

/cc @acardace @iholder-redhat Hi, can you please take a look?

Contributor

@iholder101 iholder101 left a comment

Great work @enp0s3! Thanks very much!
Left some mostly minor comments

@bbenshab

bbenshab commented Nov 7, 2022

Just FYI, I tested a temporary workaround for this issue by setting
MaxParallelMigrationsPerCluster == MaxParallelMigrationsPerOutboundNode
I hoped that this would make us drain all the VMs from one node at a time, so we could still hit the maximum number of parallel VM migrations; unfortunately, it didn't work.
So I'm really looking forward to testing this fix.

@iholder101
Contributor

Just FYI, I tested a temporary workaround for this issue by setting MaxParallelMigrationsPerCluster == MaxParallelMigrationsPerOutboundNode. I hoped that this would make us drain all the VMs from one node at a time, so we could still hit the maximum number of parallel VM migrations; unfortunately, it didn't work. So I'm really looking forward to testing this fix.

Hey!
This makes sense. With the current behavior, the first node to start evacuating creates MaxParallelMigrationsPerCluster vmim objects, so it is the only one migrating (at least at the beginning). I believe @enp0s3's changes address that problem 🤞
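The scenario above can be illustrated with a toy simulation (the `allocate` helper is hypothetical, not KubeVirt code): without a per-node cap the first draining node claims every cluster slot, while the per-node cap spreads slots across nodes.

```go
package main

import "fmt"

// allocate hands out migration slots to draining nodes in order, capping the
// total at perCluster and each node at perNode. perNode <= 0 models the old
// behavior, with no per-source-node limit applied at creation time.
func allocate(nodes []string, perCluster, perNode int) map[string]int {
	slots := map[string]int{}
	used := 0
	for _, n := range nodes {
		for used < perCluster && (perNode <= 0 || slots[n] < perNode) {
			slots[n]++
			used++
		}
	}
	return slots
}

func main() {
	nodes := []string{"node-a", "node-b", "node-c"}
	fmt.Println(allocate(nodes, 5, 0)) // old: first node takes all 5 slots
	fmt.Println(allocate(nodes, 5, 2)) // new: slots spread across the draining nodes
}
```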

@enp0s3 enp0s3 force-pushed the max-mig branch 2 times, most recently from ff0496c to 6116051 on November 7, 2022 14:47
@iholder101
Contributor

Thanks a lot @enp0s3! Great job!
/lgtm

@xpivarc PTAL

@kubevirt-bot kubevirt-bot added lgtm Indicates that a PR is ready to be merged. and removed lgtm Indicates that a PR is ready to be merged. labels Nov 7, 2022
@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Nov 7, 2022
@enp0s3 enp0s3 marked this pull request as draft November 13, 2022 15:36
@kubevirt-bot kubevirt-bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 13, 2022
@kubevirt-bot kubevirt-bot removed the lgtm Indicates that a PR is ready to be merged. label Nov 14, 2022
@enp0s3
Contributor Author

enp0s3 commented Nov 14, 2022

@xpivarc Hey. I think I addressed everything, can you PTAL?

Member

@xpivarc xpivarc left a comment

/approve
I have just a few suggestions to improve this.

// No free slots, need to wait till migrations will finish. Re-enqueue and return
if len(migrationCandidates) > 0 || len(nonMigrateable) > 0 {
Member

Don't we handle the if len(migrationCandidates) > 0 || len(nonMigrateable) > 0 on line 372?

vmisToMigrate := vmisToMigrate(node, vmisOnNode, taint)
	if len(vmisToMigrate) == 0 {
		return nil
	}

Contributor Author

@xpivarc vmisToMigrate is the list of VMIs marked for eviction because of their eviction strategy. migrationCandidates are the VMIs that are marked and meet the conditions for live migration. nonMigrateable are the VMIs that are marked but do not meet the requirements for live migration.
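The distinction can be sketched like this (a simplified model: `LiveMigratable` in the real code is a VMI condition, and the type and helper names below are illustrative, not the actual KubeVirt signatures):

```go
package main

import "fmt"

// VMI is a minimal stand-in for a VirtualMachineInstance that has been
// selected for eviction by its eviction strategy.
type VMI struct {
	Name           string
	LiveMigratable bool
}

// splitEvictionCandidates partitions the VMIs marked for eviction into those
// that can be live-migrated (migrationCandidates) and those that cannot
// (nonMigrateable).
func splitEvictionCandidates(vmisToMigrate []VMI) (migrationCandidates, nonMigrateable []VMI) {
	for _, vmi := range vmisToMigrate {
		if vmi.LiveMigratable {
			migrationCandidates = append(migrationCandidates, vmi)
		} else {
			nonMigrateable = append(nonMigrateable, vmi)
		}
	}
	return
}

func main() {
	marked := []VMI{{"vm-a", true}, {"vm-b", false}, {"vm-c", true}}
	can, cannot := splitEvictionCandidates(marked)
	fmt.Println(len(can), len(cannot)) // 2 1
}
```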

pkg/virt-controller/watch/drain/evacuation/evacuation.go (outdated; resolved)
@kubevirt-bot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: xpivarc

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubevirt-bot kubevirt-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 15, 2022
@enp0s3 enp0s3 marked this pull request as ready for review November 16, 2022 10:33
@kubevirt-bot kubevirt-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 16, 2022
If the maximum number of active migrations per source node is not
considered, we can reach the maximum capacity of concurrent
migrations in a cluster while most of the migrations are pending,
because all of them originate from the same source node.

This is problematic when multiple drains occur at the same time:
evictions from different source nodes could start simultaneously,
but they cannot, because the migration queue is occupied by VMs
from the same source node.

The solution is to observe the migration queue for active migrations
from the same source node and limit the creation of new migrations
to ParallelOutboundMigrationsPerNode.

Signed-off-by: Igor Bezukh <ibezukh@redhat.com>
@enp0s3
Contributor Author

enp0s3 commented Nov 16, 2022

/hold
looking at unit tests

@kubevirt-bot kubevirt-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 16, 2022
Some of the tests didn't consider the ParallelOutboundMigrationsPerNode
value, whose default (2) is lower than the default of
ParallelMigrationsPerCluster (5)

Signed-off-by: enp0s3 <ibezukh@redhat.com>
@enp0s3
Contributor Author

enp0s3 commented Nov 16, 2022

/unhold
Addressed unit test issues

@kubevirt-bot kubevirt-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 16, 2022
Contributor

@iholder101 iholder101 left a comment

Thanks @enp0s3! Great stuff!
I have a tiny nit; feel free to leave it to a follow-up PR
/lgtm

vmisOnNode []*virtv1.VirtualMachineInstance,
activeMigrations []*virtv1.VirtualMachineInstanceMigration) (activeMigrationsFromThisSourceNode int) {

vmiMap := make(map[string]bool)
Contributor

nit: better to use make(map[string]struct{}) which will allocate less space
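The suggested set idiom, in the context of the counting helper this review comment is attached to (a simplified sketch with illustrative types, not the merged code): `struct{}` occupies zero bytes, so a `map[string]struct{}` set allocates less per entry than `map[string]bool` while expressing membership only.

```go
package main

import "fmt"

// VMI and Migration are simplified stand-ins for the KubeVirt types.
type VMI struct {
	Name string
}

type Migration struct {
	VMIName string
}

// countActiveOutbound counts the active migrations whose VMI runs on this
// source node. The map is used purely as a set, so the zero-byte empty
// struct is a better value type than bool.
func countActiveOutbound(vmisOnNode []VMI, activeMigrations []Migration) int {
	vmiSet := make(map[string]struct{}, len(vmisOnNode))
	for _, vmi := range vmisOnNode {
		vmiSet[vmi.Name] = struct{}{}
	}
	count := 0
	for _, m := range activeMigrations {
		if _, onNode := vmiSet[m.VMIName]; onNode {
			count++
		}
	}
	return count
}

func main() {
	vmis := []VMI{{"vm-a"}, {"vm-b"}}
	migs := []Migration{{"vm-a"}, {"vm-c"}, {"vm-b"}}
	fmt.Println(countActiveOutbound(vmis, migs)) // 2
}
```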

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Nov 16, 2022
@enp0s3
Contributor Author

enp0s3 commented Nov 16, 2022

/retest-required

@kubevirt-bot
Contributor

kubevirt-bot commented Nov 16, 2022

@enp0s3: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-kubevirt-e2e-k8s-1.22-sig-storage | 6116051 | link | true | /test pull-kubevirt-e2e-k8s-1.22-sig-storage |
| pull-kubevirt-e2e-k8s-1.24-sig-storage-nonroot | 6116051 | link | true | /test pull-kubevirt-e2e-k8s-1.24-sig-storage-nonroot |

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@acardace
Member

/retest-required

@kubevirt-bot kubevirt-bot merged commit ad0a6e1 into kubevirt:main Nov 17, 2022
@enp0s3
Contributor Author

enp0s3 commented Nov 17, 2022

/cherry-pick release-0.58

@kubevirt-bot
Contributor

@enp0s3: new pull request created: #8806

In response to this:

/cherry-pick release-0.58


@bbenshab

bbenshab commented Dec 8, 2022

Results are here, and they show 533% faster migration. As the charts show, on 4.11 we were almost exclusively migrating 1-2 VMs in parallel, while on 4.12 it is 9-16 VMs (11 VMs in parallel most of the time), bringing our total migration time down from 2522 seconds to 473.
Tested on a 30-node cluster with 1000 VMs and 500 VM migrations.

liveMigrationConfig:
  parallelMigrationsPerCluster: 20
  parallelOutboundMigrationsPerNode: 4

[chart: OpenShift Virtualization 4.12.0-755 (nightly) - parallel migration results]
[chart: OpenShift Virtualization 4.11.1 - parallel migration results]

@iholder101
Contributor

Results are here, and they show 533% faster migration

Awesome job @enp0s3 and @bbenshab!
