The container’s CpusetCpus information needs to be added to the CRI Container definition #100906

Closed
chenyw1990 opened this issue Apr 8, 2021 · 8 comments · Fixed by #101771
Labels: kind/bug, sig/node, triage/accepted

Comments

@chenyw1990
Contributor

What happened:

When Kubernetes runs on bare metal with hundreds of containers on a node, the reconcileState method of CPUManager is executed every 10 seconds.
reconcileState calls the updateContainerCPUSet method for every container to update the container's cpusetcpus, whether or not it has changed.
This sends too many CRI requests to the container runtime, and the container runtime's CPU usage becomes high as a result.
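To put a rough number on the request volume, here is a minimal sketch of an unconditional reconcile pass. It is not the kubelet code: fakeRuntime, reconcileAll and the container count of 300 are hypothetical stand-ins chosen for illustration.

```go
package main

import "fmt"

// fakeRuntime is a hypothetical stand-in for the container runtime.
// It only counts how many update requests it receives.
type fakeRuntime struct{ updates int }

func (r *fakeRuntime) UpdateContainerCPUSet(containerID, cpus string) {
	r.updates++
}

// reconcileAll mimics an unconditional reconcile pass:
// one update request per container, changed or not.
func reconcileAll(rt *fakeRuntime, desired map[string]string) {
	for id, cpus := range desired {
		rt.UpdateContainerCPUSet(id, cpus)
	}
}

func main() {
	desired := make(map[string]string)
	for i := 0; i < 300; i++ { // assumed container count, not from the report
		desired[fmt.Sprintf("c%d", i)] = "0-3"
	}

	rt := &fakeRuntime{}
	const passesPerMinute = 6 // one pass every 10 seconds
	for p := 0; p < passesPerMinute; p++ {
		reconcileAll(rt, desired)
	}
	fmt.Println(rt.updates) // prints 1800, although no cpuset changed
}
```

Under these assumed numbers the runtime handles 1800 requests per minute that change nothing.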

What you expected to happen:

Add the cpusetcpus to the Container definition of the CRI, so that reconcileState calls updateContainerCPUSet only when the cpusetcpus of a container changes.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version): v1.19.4
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Network plugin and version (if this is a network-related bug):
  • Others:
@chenyw1990 added the kind/bug label Apr 8, 2021
@k8s-ci-robot added the needs-sig and needs-triage labels Apr 8, 2021
@chenyw1990
Contributor Author

/sig node

@k8s-ci-robot added the sig/node label and removed the needs-sig label Apr 8, 2021
@chenyw1990
Contributor Author

@dims

@zouyee
Member

zouyee commented Apr 12, 2021

/assign

@JornShen
Member

@SergeyKanzhelev
Member

Looks like a legit scalability concern.

/triage accepted
KEP: kubernetes/enhancements#693

Perhaps this needs to be added as a blocker for Topology Manager GA.

@k8s-ci-robot added the triage/accepted label and removed the needs-triage label Apr 12, 2021
@SergeyKanzhelev
Member

CC: @klueska wdyt?

@klueska
Contributor

klueska commented Apr 13, 2021

This is a known issue, we just haven't found the time to work on it. I don't think we need to add anything to the Container definition of the CRI. All of the information needed to skip the update is already available.
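One way to read "already available" is that the kubelet can remember what it last applied and compare against that, with no CRI change. The following is a minimal sketch under that assumption. cpusetReconciler, lastApplied and the update callback are hypothetical names, not the kubelet's actual types, and cpusets are compared as plain strings for simplicity.

```go
package main

import "fmt"

// cpusetReconciler is a hypothetical type that remembers the cpuset
// it last applied successfully to each container.
type cpusetReconciler struct {
	lastApplied map[string]string
	// update is a stand-in for the CRI call that sets a container's cpuset.
	update func(containerID, cpus string) error
}

// reconcile sends an update only for containers whose desired cpuset
// differs from the last one applied, and returns how many were sent.
func (r *cpusetReconciler) reconcile(desired map[string]string) (sent int) {
	for id, cpus := range desired {
		if prev, ok := r.lastApplied[id]; ok && prev == cpus {
			continue // unchanged: skip the CRI request
		}
		if err := r.update(id, cpus); err != nil {
			continue // leave the record untouched so the next pass retries
		}
		r.lastApplied[id] = cpus
		sent++
	}
	// Forget containers that are no longer present.
	for id := range r.lastApplied {
		if _, ok := desired[id]; !ok {
			delete(r.lastApplied, id)
		}
	}
	return sent
}

func main() {
	r := &cpusetReconciler{
		lastApplied: map[string]string{},
		update:      func(string, string) error { return nil },
	}
	desired := map[string]string{"c1": "0-1", "c2": "2-3"}

	fmt.Println(r.reconcile(desired)) // prints 2: first pass applies both
	fmt.Println(r.reconcile(desired)) // prints 0: nothing changed
	desired["c2"] = "4-5"
	fmt.Println(r.reconcile(desired)) // prints 1: only c2 changed
}
```

A local record like this would not notice a cpuset changed behind the kubelet's back, so whether that trade-off is acceptable is a design decision for the maintainers.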

@SergeyKanzhelev
Member

> This is a known issue, we just haven't found the time to work on it. I don't think we need to add anything to the Container definition of the CRI. All of the information needed to skip the update is already available.

created kubernetes/enhancements#2623 so it is not lost on graduation


6 participants