Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[vmi-mutator-webhook]: do not inject LimitRange defaults into VMI #9054

Merged
merged 1 commit into from Jan 22, 2023

Conversation

enp0s3
Copy link
Contributor

@enp0s3 enp0s3 commented Jan 11, 2023

What this PR does / why we need it:

Currently when LimitRange is configured with defaults for compute resources such as CPU or memory,
these defaults are being injected to the VMI spec early in the VMI mutating stage.
Hence, when looking at the VMI spec we can't really know the source of information
for these values, whether they are user-defined or not.

This situation causes issues at the stage of launcher-pod rendering. The resource renderer assumes
the VMI spec values are user defined and prefers them over caclulated resource requests.

If the user defines CPU topology with CPU overcommit, the calculated CPU request is being ignored
in favor to the CPU request from the LimitRange.

For example: a VMI with configured CPU topology of 4 vCPU in spec.domain.CPU.cores
will have a resource request of 400m (num_of_cores * 1000 / AllocationRatio).
Lets take LimitRange Default CPU request as 100m.
The renderred virt-launcher pod will have 100m CPU request instead of 400m.

The proposed solution is to ignore the LimitRange defaults, giving priority
to user configuration, and later to calculated configuration.

Which issue(s) this PR fixes
https://bugzilla.redhat.com/show_bug.cgi?id=2152534

Special notes for the reviewer
If we exceed the resource limits of the LR we will
fail at the pod creation stage rather than on the VMI admission. I think this is the correct
behavior because of Type: Container. If the LimitRange would be configured on type VMI/VM for example
then perhaps it would make sense to fail on the VMI admission control.

Release note:

do not inject LimitRange defaults into VMI

@kubevirt-bot
Copy link
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@kubevirt-bot kubevirt-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Jan 11, 2023
@enp0s3
Copy link
Contributor Author

enp0s3 commented Jan 11, 2023

/test pull-kubevirt-e2e-k8s-1.25-sig-compute

@enp0s3 enp0s3 force-pushed the remove_ns_resreqs_from_vmi branch 6 times, most recently from 2b119a7 to 72dc20c Compare January 12, 2023 09:08
@enp0s3
Copy link
Contributor Author

enp0s3 commented Jan 12, 2023

/test pull-kubevirt-e2e-k8s-1.25-sig-compute

@enp0s3 enp0s3 marked this pull request as ready for review January 12, 2023 18:41
@kubevirt-bot kubevirt-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 12, 2023
@enp0s3
Copy link
Contributor Author

enp0s3 commented Jan 12, 2023

/cc @stu-gott @vladikr

@enp0s3
Copy link
Contributor Author

enp0s3 commented Jan 15, 2023

/cc @iholder101 @xpivarc @fabiand

@iholder101
Copy link
Contributor

Ah.. code deletion always feels good 😄
Great job @enp0s3!

Some other stuff that should maybe also be removed:

@vladikr
Copy link
Member

vladikr commented Jan 17, 2023

Ah.. code deletion always feels good smile Great job @enp0s3!

Some other stuff that should maybe also be removed:

Interesting that this test is passing now... @enp0s3 Is it really the case?

@enp0s3
Copy link
Contributor Author

enp0s3 commented Jan 18, 2023

Ah.. code deletion always feels good smile Great job @enp0s3!
Some other stuff that should maybe also be removed:

Interesting that this test is passing now... @enp0s3 Is it really the case?

@vladikr Actually I've already removed these tests in the PR. But the Informers and RBAC still there, so I will remove them, thanks @iholder101 !

@enp0s3
Copy link
Contributor Author

enp0s3 commented Jan 18, 2023

/retest-required

@enp0s3
Copy link
Contributor Author

enp0s3 commented Jan 19, 2023

/hold
There is an issue with operator lanes with the downgrade->upgrade test, since we deleted the RBAC for LimitRange, the dongraded virt-api cannot access the LimitRange resource for watching.

@kubevirt-bot kubevirt-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 19, 2023
@iholder101
Copy link
Contributor

/hold There is an issue with operator lanes with the downgrade->upgrade test, since we deleted the RBAC for LimitRange, the dongraded virt-api cannot access the LimitRange resource for watching.

WDYT about keeping the RBAC rules for a while and removing them in the future? Does that make sense?

@enp0s3
Copy link
Contributor Author

enp0s3 commented Jan 19, 2023

@iholder101 Yes as a workaround I can remove them in a subsequent PR.

@enp0s3
Copy link
Contributor Author

enp0s3 commented Jan 19, 2023

But, in general I will check the upgrade path, it looks suspicious, what happens when we change RBAC? I saw some logic regarding backup of RBAC rules if they are being modified, but yet it looks it doesn't help much.

Currently when LimitRange is configured with defaults for compute resources such as CPU or memory,
these defaults are being injected to the VMI spec early in the VMI mutating stage.
Hence, when looking at the VMI spec we can't really know the source of information
for these values, whether they are user-defined or not.

This situation causes issues at the stage of launcher-pod rendering. The resource renderer assumes
the VMI spec values are user defined and prefers them over caclulated resource requests.

If the user defines CPU topology with CPU overcommit, the calculated CPU request is being ignored
in favor to the CPU request from the LimitRange.

For example: a VMI with configured CPU topology of 4 vCPU in spec.domain.CPU.cores
will have a resource request of 400m (num_of_cores * 1000 / AllocationRatio).
Lets take LimitRange Default CPU request as 100m.
The renderred virt-launcher pod will have 100m CPU request instead of 400m.

The proposed solution is to ignore the LimitRange defaults, giving priority
to user configuration, and later to calculated configuration.

Signed-off-by: enp0s3 <ibezukh@redhat.com>
@enp0s3
Copy link
Contributor Author

enp0s3 commented Jan 19, 2023

@iholder101 Actually this may cause an issue with downgrades i.e. for some reason an upgrade fails and user wants to downgrade to previous version

@enp0s3
Copy link
Contributor Author

enp0s3 commented Jan 19, 2023

/unhold

@kubevirt-bot kubevirt-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 19, 2023
@enp0s3
Copy link
Contributor Author

enp0s3 commented Jan 19, 2023

/retest-required

@enp0s3
Copy link
Contributor Author

enp0s3 commented Jan 19, 2023

/cc @acardace

@acardace
Copy link
Member

/approve

Nice to see this simplification, thanks @enp0s3 !

@kubevirt-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: acardace

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubevirt-bot kubevirt-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 20, 2023
@iholder101
Copy link
Contributor

/lgtm
Great work!

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Jan 22, 2023
@enp0s3
Copy link
Contributor Author

enp0s3 commented Jan 22, 2023

/retest-required

@kubevirt-bot
Copy link
Contributor

kubevirt-bot commented Jan 22, 2023

@enp0s3: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-kubevirt-fossa 994ac49 link true /test pull-kubevirt-fossa

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@kubevirt-commenter-bot
Copy link

/retest-required
This bot automatically retries required jobs that failed/flaked on approved PRs.
Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@kubevirt-bot kubevirt-bot merged commit f510c71 into kubevirt:main Jan 22, 2023
@enp0s3
Copy link
Contributor Author

enp0s3 commented Jan 23, 2023

/cherry-pick release-0.58

@kubevirt-bot
Copy link
Contributor

@enp0s3: #9054 failed to apply on top of branch "release-0.58":

Applying: Do not inject LimitRange defaults into VMI
Using index info to reconstruct a base tree...
M	pkg/controller/virtinformers.go
M	pkg/virt-api/api.go
M	pkg/virt-api/webhooks/mutating-webhook/mutators/BUILD.bazel
M	pkg/virt-api/webhooks/mutating-webhook/mutators/namespace-limits.go
M	pkg/virt-api/webhooks/mutating-webhook/mutators/namespace-limits_test.go
M	pkg/virt-api/webhooks/mutating-webhook/mutators/vmi-mutator.go
M	pkg/virt-api/webhooks/mutating-webhook/mutators/vmi-mutator_test.go
M	tests/vmi_configuration_test.go
Falling back to patching base and 3-way merge...
Auto-merging tests/vmi_configuration_test.go
CONFLICT (content): Merge conflict in tests/vmi_configuration_test.go
Auto-merging pkg/virt-api/webhooks/mutating-webhook/mutators/vmi-mutator_test.go
CONFLICT (content): Merge conflict in pkg/virt-api/webhooks/mutating-webhook/mutators/vmi-mutator_test.go
Auto-merging pkg/virt-api/webhooks/mutating-webhook/mutators/vmi-mutator.go
CONFLICT (content): Merge conflict in pkg/virt-api/webhooks/mutating-webhook/mutators/vmi-mutator.go
CONFLICT (modify/delete): pkg/virt-api/webhooks/mutating-webhook/mutators/namespace-limits_test.go deleted in Do not inject LimitRange defaults into VMI and modified in HEAD. Version HEAD of pkg/virt-api/webhooks/mutating-webhook/mutators/namespace-limits_test.go left in tree.
CONFLICT (modify/delete): pkg/virt-api/webhooks/mutating-webhook/mutators/namespace-limits.go deleted in Do not inject LimitRange defaults into VMI and modified in HEAD. Version HEAD of pkg/virt-api/webhooks/mutating-webhook/mutators/namespace-limits.go left in tree.
Auto-merging pkg/virt-api/webhooks/mutating-webhook/mutators/BUILD.bazel
Auto-merging pkg/virt-api/api.go
Auto-merging pkg/controller/virtinformers.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 Do not inject LimitRange defaults into VMI
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherry-pick release-0.58

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants