
--kube-reserved and --system-reserved are not working #72762

Closed
y-koseki opened this issue Jan 10, 2019 · 9 comments
Labels
area/kubelet · kind/bug · lifecycle/rotten · sig/node

Comments

@y-koseki

What happened:

I ran kubelet with the following parameters:

--kube-reserved=cpu=2,memory=2Gi,ephemeral-storage=1Gi
--system-reserved=cpu=500m,memory=1Gi,ephemeral-storage=3Gi
--eviction-hard=memory.available<500Mi,nodefs.available<10%

The capacity of the k8s node VM is as follows.

Capacity:
 cpu:                16
 ephemeral-storage:  31444004Ki
 hugepages-2Mi:      0
 memory:             32780296Ki
 pods:               110
Allocatable:
 cpu:                13500m
 ephemeral-storage:  24683826743
 hugepages-2Mi:      0
 memory:             29122568Ki
 pods:               110
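
For reference, the Allocatable values above match the standard formula Allocatable = Capacity - kube-reserved - system-reserved - hard eviction thresholds. The arithmetic below is a back-of-the-envelope check derived from the flags above, not node output (the ephemeral-storage figure is approximate):

cpu:                16 - 2 - 0.5 = 13.5 cores = 13500m
memory:             32780296Ki - 2Gi (2097152Ki) - 1Gi (1048576Ki) - 500Mi (512000Ki) = 29122568Ki
ephemeral-storage:  32198660096 bytes - 1Gi - 3Gi - 10% of capacity ≈ 24.7 GB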

Problem 1

Pods can use ephemeral-storage over Allocatable.
The result of curl https://${master_name}/api/v1/nodes/${node_name}/proxy/stats/summary | jq .node.fs is as follows.

{
  "time": "2019-01-09T11:46:12Z",
  "availableBytes": 4850827264,
  "capacityBytes": 32198660096,
  "usedBytes": 27347832832,
  "inodesFree": 9475345,
  "inodes": 9504864,
  "inodesUsed": 29519
}
  • Allocatable is 24683826743 bytes.
  • usedBytes is 27347832832 bytes.

Problem 2

Pods can use CPU over Allocatable.
The result of kubectl top pods is as follows.

NAME                CPU(cores)   MEMORY(bytes)
stress-pod-cpu1-1   990m         2Mi
stress-pod-cpu1-2   970m         2Mi
stress-pod-cpu13    12811m       257Mi
test-pd-3           0m           10Mi
  • Allocatable is 13500m.
  • The total CPU across these pods is 14771m (see the one-liner below).
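
The total can be reproduced with a quick one-liner (a convenience sketch, assuming the default kubectl top pods column layout; it is not part of the original report):

kubectl top pods --no-headers | awk '{sub(/m$/, "", $2); total += $2} END {print total "m"}'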

The result of curl https://${master_name}/api/v1/nodes/${node_name}/proxy/stats/summary | jq .node.cpu is as follows.

{
  "time": "2019-01-09T11:55:45Z",
  "usageNanoCores": 14544109279,
  "usageCoreNanoSeconds": 24011030451539372
}
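
usageNanoCores is in nanocores, so this is roughly 14.54 cores, again above the 13500m Allocatable. A quick conversion, reusing the same endpoint and jq as above:

curl https://${master_name}/api/v1/nodes/${node_name}/proxy/stats/summary | jq '.node.cpu.usageNanoCores / 1000000000'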

What you expected to happen:

I expected that pods could NOT use ephemeral-storage or CPU beyond Allocatable.
It seems that --kube-reserved and --system-reserved are not working.
I also tried running kubelet with these parameters:

--kube-reserved=cpu=2,memory=2Gi,ephemeral-storage=1Gi
--system-reserved=cpu=500m,memory=1Gi,ephemeral-storage=3Gi
--eviction-hard=memory.available<500Mi,nodefs.available<10%
--kube-reserved-cgroup=/system.slice
--system-reserved-cgroup=/system.slice
--enforce-node-allocatable=pods,system-reserved,kube-reserved

However, it did not resolve the problems.

How to reproduce it (as minimally and precisely as possible):

Run kubelet with the parameters above.

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.3", GitCommit:"2bba0127d85d5a46ab4b778548be28623b32d0b0", GitTreeState:"clean", BuildDate:"2018-05-21T09:17:39Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.3", GitCommit:"2bba0127d85d5a46ab4b778548be28623b32d0b0", GitTreeState:"clean", BuildDate:"2018-05-21T09:05:37Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
  • Kernel (e.g. uname -a):
Linux {hostname} 3.10.0-514.10.2.el7.x86_64 #1 SMP Fri Mar 3 00:04:05 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools:
  • Others:
@y-koseki y-koseki added the kind/bug Categorizes issue or PR as related to a bug. label Jan 10, 2019
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Jan 10, 2019
@y-koseki
Author

/area kubelet
/sig node

@k8s-ci-robot k8s-ci-robot added area/kubelet sig/node Categorizes an issue or PR as relevant to SIG Node. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jan 10, 2019
@derekwaynecarr
Member

@y-koseki The reservation appears to have been applied correctly, judging by the allocatable capacity reported back to the scheduler. Which --cgroup-driver did you specify? Could you report the cgroupfs values you see under kubepods.slice for cpu and memory?
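
One way to read those values, as a sketch assuming cgroup v1 with the systemd cgroup driver on CentOS 7 (with the cgroupfs driver the top-level pod cgroup is /kubepods rather than kubepods.slice):

cat /sys/fs/cgroup/cpu,cpuacct/kubepods.slice/cpu.shares
cat /sys/fs/cgroup/memory/kubepods.slice/memory.limit_in_bytes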

@derekwaynecarr
Member

FYI @dashpole, have you seen this? I am not aware of anyone enforcing node allocatable in production for anything other than pods, but it's possible we have a bug that needs more investigation.

@dashpole
Contributor

Problem 1 looks like it is measuring the disk usage of the entire node, not just the usage counted against allocatable. It is also worth pointing out that the kubelet only enforces allocatable for ephemeral storage through monitoring + response (eviction), so usage by pods can temporarily exceed allocatable.
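
For comparison against Allocatable, the per-pod usage from the same summary endpoint is the relevant number rather than node.fs. A sketch, assuming each pod entry exposes an ephemeral-storage block with usedBytes:

curl https://${master_name}/api/v1/nodes/${node_name}/proxy/stats/summary | jq '[.pods[]."ephemeral-storage".usedBytes // 0] | add'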

@dashpole
Contributor

dashpole commented Feb 12, 2019

Problem 2 looks like it might be a real bug. The caveat with metrics from kubectl top is that they are 10-second averages, so I am not 100% sure. I think I might have seen something like this before but didn't dig into it. It very well could be a bug.

My first thoughts are:
Do we still set cpu.shares for the kube-reserved and system-reserved cgroups even when they are not enforced? I would think you either need to "enforce" cpu allocatable on everything (pods, kube, and system) or none of them.

Do we enforce that the kube-reserved cgroup and system-reserved cgroup have the same parent cgroup as kubepods? If not, I don't think cpu shares are correctly calculated, as cpu time is split proportionally to shares among cgroups with the same parent.
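
To illustrate that proportional split with this node's numbers (assumed values for illustration, not measured; 1024 is the systemd default cpu.shares for system.slice):

kubepods cpu.shares:     13500 * 1024 / 1000 = 13824 (from the 13500m Allocatable)
system.slice cpu.shares: 1024
Under contention, kubepods would receive 13824 / (13824 + 1024) * 16 cores ≈ 14.9 cores,
above the 13.5-core Allocatable and roughly in line with the ~14.5 cores observed.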

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 13, 2019
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jun 12, 2019
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
