
Set mem/cpu request and limits on hotplug attachment pod container to 1 to 1 ratio #9312

Merged
merged 3 commits into kubevirt:main from awels:set_hotplug_cpu_mem_ratio_to_1 on Mar 1, 2023

Conversation

@awels (Member) commented Feb 23, 2023

What this PR does / why we need it:
Setting the mem/cpu request-to-limit ratio to 1 allows the pod to be created under any mem/cpu request/limit ratio enforced in the namespace. Added a functional test to demonstrate that various ratios work without issue.

This is a quick fix for just the hotplug attachment pod containers from issue #9302. We should investigate a better option that doesn't waste resources in resource-constrained environments.
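For context, a minimal sketch of the idea (the function name and values below are illustrative assumptions, not the exact code merged in this PR):

import (
	k8sv1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// hotplugAttachmentPodResources is an illustrative helper: requests and
// limits share the same ResourceList, so the container always has a 1:1
// request/limit ratio and is admitted regardless of any
// maxLimitRequestRatio a LimitRange enforces in the namespace.
func hotplugAttachmentPodResources() k8sv1.ResourceRequirements {
	rl := k8sv1.ResourceList{
		k8sv1.ResourceCPU:    resource.MustParse("10m"),  // assumed value
		k8sv1.ResourceMemory: resource.MustParse("80Mi"), // assumed value
	}
	return k8sv1.ResourceRequirements{
		Requests: rl,
		Limits:   rl,
	}
}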

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:

Release note:

NONE

@kubevirt-bot added the release-note-none, dco-signoff: yes, and size/L labels on Feb 23, 2023
@mhenriks (Member) left a comment


My concern with this is mostly that it's incomplete. It does not fully address #9302.

What would a proper fix look like? Can the LimitRanges in effect be discovered? Can we detect this error and recreate the pod? Do we need to extend the KubeVirt API to allow our request/limit ratio to be configured?

And if we want to do a temp fix, should the 1:1 ratio be applied to all containers?

What do you think @vladikr @davidvossel @xpivarc?

}
}

-func hotplugContainerMinimalLimits() k8sv1.ResourceList {
+func hotplugContainerMinimalResources() k8sv1.ResourceList {
Member


I prefer maintaining the existing pattern of *MinimalRequests and *MinimalLimits functions. In this case the requests function can call the limits function. I just feel that is a pattern we will want to go back to eventually with a proper fix.
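A rough sketch of the pattern being requested, keeping the *MinimalLimits/*MinimalRequests naming (the concrete values are assumptions, not the actual code):

func hotplugContainerMinimalLimits() k8sv1.ResourceList {
	return k8sv1.ResourceList{
		k8sv1.ResourceCPU:    resource.MustParse("10m"),  // assumed value
		k8sv1.ResourceMemory: resource.MustParse("80Mi"), // assumed value
	}
}

// While the temporary 1:1 ratio is in place, the requests helper simply
// delegates to the limits helper; a later proper fix can restore distinct
// values without changing the call sites.
func hotplugContainerMinimalRequests() k8sv1.ResourceList {
	return hotplugContainerMinimalLimits()
}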

Member Author


The name seemed wrong for what it was doing, which is why I changed it. I have no issue keeping the original naming; it just didn't seem to fit this particular change.

Member Author


Changed the name as requested

@awels (Member Author) commented Feb 27, 2023

/retest-required

@@ -1158,6 +1162,205 @@ var _ = SIGDescribe("Hotplug", func() {
})
})

Context("with limit range in namespace", func() {
Member


I think a unit test would be sufficient. WDYT?

@awels (Member Author) commented Mar 1, 2023


I think a functional test is better (yes, it is slower) because it demonstrates that this actually works for various combinations of ratios in a real cluster. A unit test would only show that the ratio is indeed 1 to 1, but what we are interested in is whether the functionality (hotplugging volumes) still works when various combinations of ratios are defined in the namespace.
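For illustration, a hedged sketch of how such a functional test might install a LimitRange with a maxLimitRequestRatio in the test namespace before hotplugging a volume (the helper name, ratios, and client wiring are assumptions, not the actual test code):

import (
	"context"

	k8sv1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// createRatioLimitRange creates a LimitRange whose maxLimitRequestRatio
// constrains every container in the namespace; the hotplug attachment pod
// must be admitted under this constraint for the test to pass.
// Whole-number ratios are assumed here for simplicity.
func createRatioLimitRange(client kubernetes.Interface, namespace string, cpuRatio, memRatio int64) error {
	lr := &k8sv1.LimitRange{
		ObjectMeta: metav1.ObjectMeta{Name: "hotplug-ratio-range", Namespace: namespace},
		Spec: k8sv1.LimitRangeSpec{
			Limits: []k8sv1.LimitRangeItem{{
				Type: k8sv1.LimitTypeContainer,
				MaxLimitRequestRatio: k8sv1.ResourceList{
					k8sv1.ResourceCPU:    *resource.NewQuantity(cpuRatio, resource.DecimalSI),
					k8sv1.ResourceMemory: *resource.NewQuantity(memRatio, resource.DecimalSI),
				},
			}},
		},
	}
	_, err := client.CoreV1().LimitRanges(namespace).Create(context.Background(), lr, metav1.CreateOptions{})
	return err
}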

@awels (Member Author) commented Mar 1, 2023

There will be a follow-up PR to this one that adds an API for specifying the exact request/limit for both cpu and memory for all the different support containers. So this PR does NOT completely fix #9302, just the hotplug attachment pod container.

@xpivarc (Member) commented Mar 1, 2023

My concern with this is mostly that it's incomplete. It does not fully address #9302.

I am not sure what you mean by this.

What would a proper fix look like? Can the LimitRanges in effect be discovered? Can we detect this error and recreate the pod? Do we need to extend the KubeVirt API to allow our request/limit ratio to be configured?

And if we want to do a temp fix, should the 1:1 ratio be applied to all containers?

What do you think @vladikr @davidvossel @xpivarc?

I am not a fan of tighter integration in KubeVirt.

@awels (Member Author) commented Mar 1, 2023

My concern with this is mostly that it's incomplete. It does not fully address #9302.

I am not sure what you mean by this.

So #9302 is about more than just the hotplug attachment pod container: there are other containers in KubeVirt with a similar request/limit ratio that will break if you apply a LimitRange with a maxLimitRequestRatio. This particular PR just addresses hotplug, not the other containers. I will be making a much more extensive follow-up PR that addresses everything.

What would a proper fix look like? Can the LimitRanges in effect be discovered? Can we detect this error and recreate the pod? Do we need to extend the KubeVirt API to allow our request/limit ratio to be configured?
And if we want to do a temp fix, should the 1:1 ratio be applied to all containers?
What do you think @vladikr @davidvossel @xpivarc?

I am not a fan of tighter integration in KubeVirt.

@mhenriks (Member) commented Mar 1, 2023

Assuming other containers will be addressed in a subsequent PR.

/lgtm
/approve

@kubevirt-bot added the lgtm label on Mar 1, 2023
@kubevirt-bot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mhenriks

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubevirt-bot added the approved label on Mar 1, 2023
@xpivarc (Member) commented Mar 1, 2023

My concern with this is mostly that it's incomplete. It does not fully address #9302.

I am not sure what you mean by this.

So #9302 is about more than just the hotplug attachment pod container: there are other containers in KubeVirt with a similar request/limit ratio that will break if you apply a LimitRange with a maxLimitRequestRatio. This particular PR just addresses hotplug, not the other containers. I will be making a much more extensive follow-up PR that addresses everything.

What would a proper fix look like? Can the LimitRanges in effect be discovered? Can we detect this error and recreate the pod? Do we need to extend the KubeVirt API to allow our request/limit ratio to be configured?
And if we want to do a temp fix, should the 1:1 ratio be applied to all containers?
What do you think @vladikr @davidvossel @xpivarc?

I am not a fan of tighter integration in KubeVirt.

Oh I missed that it is a general issue.

@xpivarc (Member) left a comment


/hold

_, err = virtClient.VirtualMachineInstance(vm.Namespace).Get(context.Background(), vm.Name, &metav1.GetOptions{})
Expect(err).ToNot(HaveOccurred())
},
Entry("[Serial]1 to 1 cpu and mem ratio", float64(1), float64(1)),
Member


You need a serial label as well.

Member Author


I am not sure I follow; all the entries have a [Serial] label.

Member


We now use Ginkgo labels. Please stand by, I am searching for a reference.


Member

So you need
Entry("[Serial]1 to 1 cpu and mem ratio", Serial, float64(1), float64(1)),
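As a usage note (assuming Ginkgo v2's Serial decorator, which is what the snippet above passes to Entry), the decorator marks just that table entry as ineligible to run in parallel with other specs. A sketch of the resulting table shape, with placeholder entries:

// Illustrative shape only; the commented-out entries are hypothetical
// placeholders, not the PR's actual test cases.
DescribeTable("hotplug with a limit range in the namespace", func(cpuRatio, memRatio float64) {
	// ... create the LimitRange with the given ratios, hotplug a volume,
	// and assert the attachment pod is admitted and the volume works ...
},
	Entry("[Serial]1 to 1 cpu and mem ratio", Serial, float64(1), float64(1)),
	// further entries with other ratios would follow the same pattern
)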

Member Author


Done

@kubevirt-bot added the do-not-merge/hold label on Mar 1, 2023
Set mem/cpu request and limits on hotplug attachment pod container to 1 to 1 ratio

Setting the ratio for mem/cpu to 1 allows the pod to be created under any combination
of mem/cpu request/limit ratio. Added functional test to demonstrate various ratios
work without issue.

Signed-off-by: Alexander Wels <awels@redhat.com>
@awels force-pushed the set_hotplug_cpu_mem_ratio_to_1 branch from e043bbf to 5002793 on March 1, 2023 at 17:45
@kubevirt-bot removed the lgtm label on Mar 1, 2023
@xpivarc (Member) commented Mar 1, 2023

/hold cancel
/lgtm

@kubevirt-bot removed the do-not-merge/hold label on Mar 1, 2023
@kubevirt-bot added the lgtm label on Mar 1, 2023
@kubevirt-bot (Contributor) commented Mar 1, 2023

@awels: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name: pull-kubevirt-fossa
Commit: 5002793
Required: false
Rerun command: /test pull-kubevirt-fossa

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@awels (Member Author) commented Mar 1, 2023

/test pull-kubevirt-e2e-kind-1.23-vgpu

@kubevirt-bot merged commit 3cdcfab into kubevirt:main on Mar 1, 2023
@awels (Member Author) commented Mar 2, 2023

/cherrypick release-0.59

@kubevirt-bot (Contributor)

@awels: new pull request created: #9352

In response to this:

/cherrypick release-0.59

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@awels (Member Author) commented Mar 2, 2023

/cherrypick release-0.58

@kubevirt-bot (Contributor)

@awels: #9312 failed to apply on top of branch "release-0.58":

Applying: Set mem/cpu request and limits on hotplug attachment pod container to 1 to 1 ratio
Using index info to reconstruct a base tree...
M	pkg/virt-controller/services/renderresources.go
M	tests/storage/hotplug.go
Falling back to patching base and 3-way merge...
Auto-merging tests/storage/hotplug.go
CONFLICT (content): Merge conflict in tests/storage/hotplug.go
Auto-merging pkg/virt-controller/services/renderresources.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 Set mem/cpu request and limits on hotplug attachment pod container to 1 to 1 ratio
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherrypick release-0.58

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

awels pushed a commit to awels/kubevirt that referenced this pull request Mar 2, 2023
…_to_1

Set mem/cpu request and limits on hotplug attachment pod container to 1 to 1 ratio
awels pushed a commit to awels/kubevirt that referenced this pull request Mar 2, 2023
…_to_1

Set mem/cpu request and limits on hotplug attachment pod container to 1 to 1 ratio
awels pushed a commit to awels/kubevirt that referenced this pull request Mar 2, 2023
…_to_1

Set mem/cpu request and limits on hotplug attachment pod container to 1 to 1 ratio

Signed-off-by: Alexander Wels <awels@redhat.com>
awels pushed a commit to awels/kubevirt that referenced this pull request Mar 2, 2023
…_to_1

Set mem/cpu request and limits on hotplug attachment pod container to 1 to 1 ratio

Signed-off-by: Alexander Wels <awels@redhat.com>
awels pushed a commit to awels/kubevirt that referenced this pull request Mar 2, 2023
…_to_1

Set mem/cpu request and limits on hotplug attachment pod container to 1 to 1 ratio

Signed-off-by: Alexander Wels <awels@redhat.com>
awels pushed a commit to awels/kubevirt that referenced this pull request Mar 7, 2023
…_to_1

Set mem/cpu request and limits on hotplug attachment pod container to 1 to 1 ratio

Signed-off-by: Alexander Wels <awels@redhat.com>