Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix windows platform cpu limit does not work properly #85380

Closed
wants to merge 3 commits into from

Conversation

@wawa0210
Copy link
Contributor

wawa0210 commented Nov 16, 2019

What type of PR is this?

Uncomment only one /kind <> line, hit enter to put that in a new line, and remove leading whitespace from that line:

/kind bug

What this PR does / why we need it:

Fix windows platform cpu limit does not work properly

Since docker 18.03.1-ee-9, the Windows container has supported cpu limit, but this feature is not implemented in kubernetes code。we can use docker run --cpus 0.25 Specify the number of CPUs that can be used

moby/moby#39190
https://docs.microsoft.com/en-us/virtualization/windowscontainers/manage-containers/resource-controls

Which issue(s) this PR fixes:

Fixes #85379

Special notes for your reviewer:

I did the following test

  1. Create a deplyment and setting the cpu limit ,the yaml such as

image

  1. When the deployment scheduled done, i exec docker inspect watch the container on windows node and the results such as :

image

We can see that the container is running in process mode and that NanoCpus is set to the correct value.

  1. I do the same messure test in container by the same scripts such as
foreach ($loopnumber in 1..4){
    Start-Job -ScriptBlock{
    $result = 1
	foreach($mm in 1..2147483647){$res1=1;foreach($num in 1..2147483647){$res1=$mm*$num*1340371};$res1}
    }
}

And the results such as :
image

We can see that the cpu limit has been able to work normally.

By reading the official documentation of docker, it is found that HostConfig. NanoCPUs can be used to accurately limit the amount of cpi used in containers. docker api doc

The docker source code is as follows
https://github.com/moby/moby/blob/10866714412aea1bb587d1ad14b2ce1ba4cf4308/daemon/oci_windows.go#L408-L443

Does this PR introduce a user-facing change?:

Fix windows container cpu limit does not work

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


cc @feiskyer

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

k8s-ci-robot commented Nov 16, 2019

Hi @wawa0210. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot requested review from mtaufen and tallclair Nov 16, 2019
@k8s-ci-robot k8s-ci-robot added kind/bug and removed needs-kind labels Nov 16, 2019
Copy link
Contributor

mattjmcnaughton left a comment

/ok-to-test

Thanks for your pr @wawa0210 !

Either in your commit message or in your pr description, can you please share more about why you think this change will solve the underlying issue? It would also be great to add a unit test to ensure this functionality continues to work going forward.

Additionally, could you please add a release note?

Thanks :)

cpuCount := int64(cpuLimit.MilliValue()+999) / 1000
wc.Resources.CpuCount = cpuCount
cpuCount := int64(cpuLimit.MilliValue()+999) / 1000
wc.Resources.CpuCount = cpuCount

This comment has been minimized.

Copy link
@feiskyer

feiskyer Nov 17, 2019

Member

AFAIK, CpuCount is only supported by HyperV isolation.

cc @PatrickLang

This comment has been minimized.

Copy link
@wawa0210

wawa0210 Nov 18, 2019

Author Contributor

Just confirmed, I came to the test environment based on the process isolation level of windows container instead of hyper-v

This comment has been minimized.

This comment has been minimized.

Copy link
@wawa0210

wawa0210 Dec 10, 2019

Author Contributor

I've seen docker's processing order before. It's worth mentioning that there is no limit on the --cpus parameter in docker. This parameter seems to be able to more accurately limit the use of CPUs in the windows container.
However, in Kubernetes Windows code, this parameter is not assigned, and at the same time, cpucount is not assigned, so the existing logic cannot limit the use of windows server container CPU

This comment has been minimized.

Copy link
@wawa0210

wawa0210 Dec 10, 2019

Author Contributor

@PatrickLang The windows container parameter cpuMaximum is assigned through the NanoCPUs parameter. Docker does not provide an interface, and directly assigns cpuMaximum. Therefore, we should only assign the parameter NanoCPUs, and then call docker.

Here's precedence in Moby code
https://github.com/moby/moby/blob/2d467dc8d00790da8f4f8d1c0ab3ed7811b7958f/daemon/oci_windows.go#L413-L432

@wawa0210 wawa0210 force-pushed the wawa0210:fix/windows-cpulimit branch from fae7ab3 to 6211323 Nov 18, 2019
@k8s-ci-robot k8s-ci-robot added size/XL and removed size/XS labels Nov 18, 2019
@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

k8s-ci-robot commented Nov 18, 2019

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: wawa0210
To complete the pull request process, please assign derekwaynecarr
You can assign the PR to them by writing /assign @derekwaynecarr in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@wawa0210

This comment has been minimized.

Copy link
Contributor Author

wawa0210 commented Nov 18, 2019

/ok-to-test

Thanks for your pr @wawa0210 !

Either in your commit message or in your pr description, can you please share more about why you think this change will solve the underlying issue? It would also be great to add a unit test to ensure this functionality continues to work going forward.

Additionally, could you please add a release note?

Thanks :)

Done!

@PatrickLang

This comment has been minimized.

Copy link
Contributor

PatrickLang commented Dec 2, 2019

/sig windows
cc @marosset

@PatrickLang

This comment has been minimized.

Copy link
Contributor

PatrickLang commented Dec 2, 2019

@marosset and I are going to double check that this CRI change is consistent with what will be supported in ContainerD in a few ways:

  • NanoCores should have the same behavior on Docker & ContainerD, and be supported with both process and hyper-v isolation
  • The same order of precedence should be followed for conflicting settings (shares vs cpu count / nanocores) for both Docker & ContainerD
@wawa0210 wawa0210 closed this Dec 3, 2019
SIG-Windows automation moved this from Backlog (v1.18) to Done (v1.17) Dec 3, 2019
@wawa0210 wawa0210 reopened this Dec 3, 2019
@wawa0210

This comment has been minimized.

Copy link
Contributor Author

wawa0210 commented Dec 3, 2019

@marosset and I are going to double check that this CRI change is consistent with what will be supported in ContainerD in a few ways:

  • NanoCores should have the same behavior on Docker & ContainerD, and be supported with both process and hyper-v isolation
  • The same order of precedence should be followed for conflicting settings (shares vs cpu count / nanocores) for both Docker & ContainerD

Through testing, nanocores can be more fine-grained and more restrictive than cpu count. Nanocores support non-integer cpu, such as 2000000000 = 2 CPU cores, but cpu count can only be accurate to integers, such as 1 cpu, and cannot be fine-grained

@PatrickLang PatrickLang moved this from Done (v1.17) to Backlog (v1.18) in SIG-Windows Dec 3, 2019
@PatrickLang

This comment has been minimized.

Copy link
Contributor

PatrickLang commented Dec 3, 2019

Also related - #85856 . This shows the problem with conflicting fields and the different behaviors on Docker / ContainerD

@wawa0210

This comment has been minimized.

Copy link
Contributor Author

wawa0210 commented Dec 4, 2019

Also related - #85856 . This shows the problem with conflicting fields and the different behaviors on Docker / ContainerD

As far as I know, these two questions don't seem to be the same question. The problem with @ adelina-t seems to be that cpu limit has no priority set on containerd, which causes it to not work (docker has it set). My problem is that the cpu limit does not work properly. On the original basis, even if the cpu limit is set, the actual cpu usage will still reach 100%.

# Conflicts:
#	staging/src/k8s.io/cri-api/pkg/apis/runtime/v1alpha2/api.pb.go
@k8s-ci-robot k8s-ci-robot added size/M and removed size/XL labels Dec 4, 2019
@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

k8s-ci-robot commented Dec 4, 2019

@wawa0210: Adding label do-not-merge/contains-merge-commits because PR contains merge commits, which are not allowed in this repository.
Use git rebase to reapply your commits on top of the target branch. Detailed instructions for doing so can be found here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added size/XL and removed size/M labels Dec 4, 2019
@PatrickLang PatrickLang moved this from Backlog (v1.18) to In Progress (v1.18) in SIG-Windows Dec 6, 2019
@feiskyer

This comment has been minimized.

Copy link
Member

feiskyer commented Dec 10, 2019

Another related PR: #86101

@@ -76,7 +77,7 @@ func (m *kubeGenericRuntimeManager) generateWindowsContainerConfig(container *v1
wc.Resources.CpuMaximum = cpuMaximum
}

cpuShares := milliCPUToShares(cpuLimit.MilliValue(), isolatedByHyperv)
cpuShares := milliCPUToShares(cpuLimitMillValue, isolatedByHyperv)
if cpuShares == 0 {

This comment has been minimized.

Copy link
@PatrickLang

PatrickLang Dec 11, 2019

Contributor

Are you seeing any warnings from Docker in this case? I think that if HostConfig.CpuShares is passed to Docker, that still takes precedence over NanoCpus.

This comment has been minimized.

Copy link
@wawa0210

wawa0210 Dec 11, 2019

Author Contributor

Yeap,It seems that cannot assign values ​​to both parameters at the same time

image

@@ -682,6 +682,8 @@ message WindowsContainerResources {
int64 cpu_maximum = 3;
// Memory limit in bytes. Default: 0 (not specified).
int64 memory_limit_in_bytes = 4;
// Amount of CPUs (e.g. 2000000000 = 2 CPU cores)
int64 nano_cpus = 5;

This comment has been minimized.

Copy link
@PatrickLang

PatrickLang Dec 11, 2019

Contributor

I managed to avoid modifying CRI in #86101, but am still testing it. Can you review the approach there and let me know what you think?

This comment has been minimized.

Copy link
@wawa0210

wawa0210 Dec 11, 2019

Author Contributor

Already replay :)

@PatrickLang

This comment has been minimized.

Copy link
Contributor

PatrickLang commented Jan 14, 2020

Are you ok with closing this since you've reviewed the other approach in #86101 ? I also added tests there.

@wawa0210

This comment has been minimized.

Copy link
Contributor Author

wawa0210 commented Jan 14, 2020

Are you ok with closing this since you've reviewed the other approach in #86101 ? I also added tests there.

Done .

@wawa0210 wawa0210 closed this Jan 14, 2020
SIG-Windows automation moved this from In Progress (v1.18) to Done (v1.18) Jan 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.