
Guaranteed Pods' CPUs are shared by other processes running on the same host #99895

Closed
DapengJiao opened this issue Mar 6, 2021 · 10 comments
Labels
kind/support Categorizes issue or PR as a support question. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@DapengJiao

What happened:

Guaranteed Pods' CPUs are shared by other processes running on the same host.

What you expected to happen:

A Pod running in the Guaranteed QoS class should be granted exclusive CPUs.

How to reproduce it (as minimally and precisely as possible):

  • Deploy a K8S cluster (1.20.5) on OpenStack with 3 masters and 2 workers; each worker VM has 32 vCPUs
  • Set "--cpu-manager-policy=static" and configure "--reserved-cpus=0-1" in each worker's kubelet configuration
  • Set "CPUAffinity=0-1" in the "/etc/systemd/system.conf" configuration file
  • Deploy a set of dummy "nginx" Pods with a Deployment of 20 replicas; each replica requests & limits 4 CPUs
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 20
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        resources:
          limits:
            memory: "200Mi"
            cpu: "4"
          requests:
            memory: "200Mi"
            cpu: "4"
  • After the Pods are created, log in to each worker and check its "cpu_manager_state" and process status
worker-pool1-vam0h3j8-eccd-cluster-dapeng:/home/eccd # cat /var/lib/kubelet/cpu_manager_state | jq .
{
  "policyName": "static",
  "defaultCpuSet": "0-1,34-35",
  "entries": {
    "31c83fdd-1468-4877-95e5-4d774586eb0d": {
      "nginx": "2-5"
    },
    "32ea3dd9-d886-448f-a11d-ab0ec4ba0652": {
      "nginx": "14-17"
    },
    "566567ce-a71c-44c9-9052-036b4351c056": {
      "nginx": "30-33"
    },
    "6c753f14-57f4-48d0-90a2-e6d554a3bb49": {
      "nginx": "26-29"
    },
    "a3d9c419-2e59-4ac1-87e9-879f4e9e8fc7": {
      "nginx": "10-13"
    },
    "cc18e830-0129-479a-ad15-94a71efdeb8b": {
      "nginx": "18-21"
    },
    "f2d73bc5-8264-4931-bbc0-6c8b91c3db18": {
      "nginx": "6-9"
    },
    "f9717485-d049-4296-8322-b3fab800bf90": {
      "nginx": "22-25"
    }
  },
  "checksum": 3030712563
}
worker-pool1-vam0h3j8-eccd-cluster-dapeng:/home/eccd # ps -Ao user,uid,comm,pid,pcpu,psr | awk '{if ($5!=0.0) {print}}' | awk '{if ($6!=0) {print}}' | awk '{if ($6!=20) {print}}' | awk '{if ($6!=40) {print}}' | awk '{if ($6!=60) {print}}'
USER       UID COMMAND           PID %CPU PSR
root         0 systemd             1  0.6   1
root         0 rcu_sched           8  0.2  32
root         0 ksoftirqd/1        16  0.1   1
message+   499 dbus-daemon      1158  0.3   1
root         0 docker-containe  1694  0.1   1
26          26 python           1753  0.4   1
root         0 dockerd          1795  5.5   1
root         0 diag_coll_worke  1798  0.2   1
root         0 docker-containe  2663  0.8   1
root         0 calico-node      5594  1.8  35
root         0 node-cache       5992  0.3  22
root         0 pause            6999  0.2   4
root         0 pause            7068  0.2   3
root         0 pause            7099  0.2   8
root         0 pause            7112  0.2  17
root         0 pause            7169  0.1   5
eccd      1001 systemd          9310  0.7   1
53222    53222 java            10031  0.5   4
101        101 nginx-ingress-c 10436  4.0   4
root         0 pause           12180  0.6   5
root         0 pause           12207  0.6   9
root         0 pause           12224  0.6   7
root         0 sadc            15778  3.0   1
root         0 ps              15781  200   1
9685      9685 registry        15885  0.2  14
21414    21414 java            16250  0.6   5
47040    47040 prometheus      22122  4.5   3
root         0 kworker/1:0     24405  0.1   1
root         0 node_exporter   25094  0.7   1
root         0 alertmanager    25311  0.3  23
root         0 node-cert-expor 28684  0.1  28

From the output we can see that quite a few processes are running on CPUs which belong to the guaranteed CPU sets.
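
As a cross-check (not part of the original report), the CPU affinity of any of the listed processes can be inspected directly from /proc; the PID 5594 below is just the calico-node entry from the listing above and is only an example:

# Show the CPUs a given process is allowed to run on (calico-node PID 5594 taken from the listing above, purely as an example)
grep Cpus_allowed_list /proc/5594/status
# taskset can query (and change) the same affinity
taskset -cp 5594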

Anything else we need to know?:

If we reboot the worker node, the processes that were running on the guaranteed CPU sets will be moved to the defaultCpuSet.

Environment:

  • Kubernetes version (use kubectl version):
kubectl version
Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.4", GitCommit:"ec2760d6d916781de466541a6babb4309766c995", GitTreeState:"clean", BuildDate:"2021-02-27T17:24:15Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.4", GitCommit:"ec2760d6d916781de466541a6babb4309766c995", GitTreeState:"clean", BuildDate:"2021-02-27T17:18:03Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration:
    OpenStack
  • OS (e.g: cat /etc/os-release):
NAME="SLES"
VERSION="15-SP1"
VERSION_ID="15.1"
PRETTY_NAME="SUSE Linux Enterprise Server 15 SP1"
ID="sles"
ID_LIKE="suse"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:suse:sles:15:sp1"
  • Kernel (e.g. uname -a):
    Linux director-0-eccd-cluster-dapeng 4.12.14-197.83-default #1 SMP Thu Feb 11 22:01:45 UTC 2021 (547a203) x86_64 x86_64 x86_64 GNU/Linux
  • Install tools:
  • Network plugin and version (if this is a network-related bug):
  • Others:
@DapengJiao DapengJiao added the kind/bug Categorizes issue or PR as related to a bug. label Mar 6, 2021
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Mar 6, 2021
@k8s-ci-robot
Contributor

@DapengJiao: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@DapengJiao
Author

/sig node

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Mar 6, 2021
@maxlaverse
Contributor

Hi @DapengJiao,
What you describe is how the feature is supposed to work, isn't it? (asking because of the bug label you added)

https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/

Note: System services such as the container runtime and the kubelet itself can continue to run on these exclusive CPUs. The exclusivity only extends to other pods.

The --reserved-cpus=0-1 parameter tells the Kubelet not to schedule any Pod on those CPUs, but this doesn't mean processes outside Kubernetes won't run on CPUs 0 and 1. It's "exclusive" with regard to other Pods, not with regard to any other process running on Linux (at least that's how I understood it).
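
For reference, a minimal sketch of the equivalent file-based kubelet configuration (assuming the KubeletConfiguration file is used instead of the command-line flags; the fields below are the documented equivalents of --cpu-manager-policy and --reserved-cpus):

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# static policy enables exclusive CPU assignment for Guaranteed Pods with integer CPU requests
cpuManagerPolicy: static
# CPUs kept out of the exclusive pool; the kubelet and system daemons are expected to run here
reservedSystemCPUs: "0-1"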

I suppose you actually added "CPUAffinity=0-1" in "/etc/systemd/system.conf" to achieve this, and to prevent processes that are not Kubernetes Pods from running on those cores.
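
For completeness, a rough sketch of what that setting looks like (note that older systemd versions expect a space-separated list such as CPUAffinity=0 1 rather than a range, and the setting only applies to processes started after systemd re-reads the configuration, typically after a reboot):

# /etc/systemd/system.conf
[Manager]
# Pin systemd and every service it spawns to CPUs 0-1, keeping them off the Pod CPUs
CPUAffinity=0-1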

I was wondering myself how to dedicate CPUs even more strictly. Have you tried something around isolcpus?
https://unix.stackexchange.com/questions/326579/how-to-ensure-exclusive-cpu-availability-for-a-running-process
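
As a rough sketch only (the range 2-33 is an assumption matching the exclusive CPUs in this report, and as noted further down in this thread isolcpus does not combine well with --reserved-cpus), the isolcpus approach is a kernel boot parameter:

# Appended to the kernel command line, e.g. via GRUB_CMDLINE_LINUX in /etc/default/grub
isolcpus=2-33
# The scheduler then keeps ordinary tasks off those CPUs; only explicitly pinned tasks run there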

@DapengJiao
Author

Hi @maxlaverse

Thanks for your answer.
To be honest, I wanted to label it as "question" instead of "bug".

I am aware of the statement (Note:) you pasted. But I was expecting the kubelet not to allocate exclusive CPUs from a cpuset that is already being used by other processes (the container runtime, the kubelet, or other processes from the cloud infra layer).

For isolcpus, I remember there was some discussion about that, and the conclusion was that isolcpus does not work together with --reserved-cpus. #87862

@ehashman
Member

/kind support
/remove-kind bug

@k8s-ci-robot k8s-ci-robot added kind/support Categorizes issue or PR as a support question. and removed kind/bug Categorizes issue or PR as related to a bug. labels Mar 16, 2021
@xiaoxubeii
Member

The kubelet CPU manager currently cannot keep any processes other than Pods off the exclusively assigned cores. I think you should try a workaround first.

@cynepco3hahue

The CPU manager does not guarantee that only the pod will run on a specific set of CPUs (via cgroup cpuset).

  1. Non-container processes: these should be handled via OS configuration (moving interrupts, specifying CPUAffinity for system services, ...).
  2. Pause containers: these should be solved at the CRI level, because the kubelet does not monitor the pause container (pod wrapper) at all. I provided the functionality for CRI-O (Provide functionality to start infra containers on the specified set of CPUs cri-o/cri-o#4459), but containerd should probably do something similar; see the sketch below.
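
To illustrate point 2, a sketch of the CRI-O side once that functionality is available (the option name infra_ctr_cpuset comes from cri-o/cri-o#4459; treat the exact section and syntax as an assumption to verify against your CRI-O version):

# /etc/crio/crio.conf (or a drop-in file under /etc/crio/crio.conf.d/)
[crio.runtime]
# Start all infra (pause) containers on the reserved CPUs instead of the Pod CPUs
infra_ctr_cpuset = "0-1"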

@cynepco3hahue

/close
Please feel free to open a KEP or feature request for future CPU manager improvements.

@k8s-ci-robot
Contributor

@cynepco3hahue: Closing this issue.

In response to this:

/close
Please feel free to open a KEP or feature request for future CPU manager improvements.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@maxpain

maxpain commented Jul 29, 2022

I have a pod that has two containers.
I want one container to use 1 CPU exclusively with affinity and another container to use a part of a CPU from a shared pool.
Is it possible?
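
With the static CPU manager policy this is decided per container, so a sketch along the following lines should express it (assuming the whole Pod stays in the Guaranteed QoS class, i.e. requests equal limits for every container; only the container with a whole-number CPU request gets exclusive CPUs, the fractional one stays on the shared pool):

apiVersion: v1
kind: Pod
metadata:
  name: mixed-cpu-pod   # hypothetical name
spec:
  containers:
  - name: pinned        # integer CPU request/limit: exclusive CPUs under the static policy
    image: nginx:latest
    resources:
      requests:
        cpu: "1"
        memory: "200Mi"
      limits:
        cpu: "1"
        memory: "200Mi"
  - name: shared        # fractional CPU request/limit: runs on the shared (default) cpuset
    image: nginx:latest
    resources:
      requests:
        cpu: "500m"
        memory: "200Mi"
      limits:
        cpu: "500m"
        memory: "200Mi"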
