
kubectl shows node memory >100%? #86499

Closed
roy-work opened this issue Dec 20, 2019 · 19 comments · Fixed by #102917
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@roy-work

roy-work commented Dec 20, 2019

What happened:

We ran the following, and got the following:

$ kubectl top nodes
NAME                       CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
aks-nodepool1-35944238-0   320m         8%     15080Mi         54%
aks-nodepool1-35944238-1   355m         9%     28048Mi         101%
aks-nodepool1-35944238-2   544m         13%    24650Mi         88%
aks-nodepool1-35944238-3   325m         8%     2170Mi          7%
aks-nodepool1-35944238-4   511m         13%    14992Mi         54%
aks-nodepool1-35944238-5   516m         13%    25332Mi         91%

How/why do we have a node reporting >100% memory usage? (There seems to be plenty of memory on the host, multiple gigabytes, as reported by the kernel's MemAvailable statistic.)

What you expected to happen:

Memory usage can't exceed 100%, no?

How to reproduce it (as minimally and precisely as possible): we unfortunately don't know

Anything else we need to know?: No swap on these VMs. We're curious which kernel memory statistic goes into computing the total for Kubernetes; it's my understanding that there are various ways to go over 100%, e.g., by summing RSS over several processes (shared and resident pages would get double-counted).

Environment:

  • Kubernetes version (use kubectl version):
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.1", GitCommit:"4485c6f18cee9a5d3c3b4e523bd27972b1b53892", GitTreeState:"clean", BuildDate:"2019-07-18T09:18:22Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.12", GitCommit:"524c3a1238422529d62f8e49506df658fa9c8b8c", GitTreeState:"clean", BuildDate:"2019-11-14T05:26:24Z", GoVersion:"go1.11.13", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration: Azure
  • OS (e.g: cat /etc/os-release): AKS instance?
  • Kernel (e.g. uname -a): (unsure, since this is Azure AKS; we don't have good access to this piece of data…)
  • Install tools: Azure?
  • Network plugin and version (if this is a network-related bug): N/A / Azure
  • Others: None
@roy-work roy-work added the kind/bug Categorizes issue or PR as related to a bug. label Dec 20, 2019
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Dec 20, 2019
@roy-work
Author

(Taking my best shot here, but if this is wrong please freely adjust it.)

/sig instrumentation

@k8s-ci-robot k8s-ci-robot added sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Dec 20, 2019
@tedyu
Contributor

tedyu commented Dec 21, 2019

Other than the 101% reporting issue, do you observe any other abnormality?

@roy-work
Author

Well, yes, the node didn't seem to be actually at 100% memory use; as mentioned, it seemed to have significant headroom.

@haosdent
Member

How about kubectl describe node aks-nodepool1-35944238-1?

@serathius
Contributor

serathius commented Feb 24, 2020

Node memory utilization is the ratio of the node's working set bytes to the node's allocatable memory.

Allocatable memory is available on the node object:
kubectl describe node aks-nodepool1-35944238-1

Node working set bytes are available from the Metrics API:
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes/aks-nodepool1-35944238-1

Please provide the results of those commands so we can distinguish whether the problem is in kubectl top or in the metrics pipeline.
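
For reference, here is a minimal way to pull those two numbers directly (assuming jq is available); kubectl top reports the second value divided by the first:

# Allocatable memory, from the node object
kubectl get node aks-nodepool1-35944238-1 -o jsonpath='{.status.allocatable.memory}'

# Working set ("usage"), from the Metrics API
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes/aks-nodepool1-35944238-1 | jq -r '.usage.memory'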

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 24, 2020
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jun 24, 2020
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@pdabrowski-it-solutions

I have a similar problem:

NAME                        CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%     
default-pool-2meb4gv3qy     357m         8%     1926Mi          72%         
secondary-pool-dqiqzzikb5   155m         3%     1061Mi          150%        
secondary-pool-kypbaua5an   82m          2%     884Mi           125% 

kubectl describe node secondary-pool-dqiqzzikb5

CreationTimestamp:  Mon, 19 Apr 2021 23:33:10 +0200
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  secondary-pool-dqiqzzikb5
  AcquireTime:     <unset>
  RenewTime:       Tue, 20 Apr 2021 22:24:21 +0200
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Mon, 19 Apr 2021 23:33:51 +0200   Mon, 19 Apr 2021 23:33:51 +0200   CalicoIsUp                   Calico is running on this node
  MemoryPressure       False   Tue, 20 Apr 2021 22:22:37 +0200   Tue, 20 Apr 2021 01:01:50 +0200   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Tue, 20 Apr 2021 22:22:37 +0200   Tue, 20 Apr 2021 01:01:50 +0200   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure          False   Tue, 20 Apr 2021 22:22:37 +0200   Tue, 20 Apr 2021 01:01:51 +0200   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Tue, 20 Apr 2021 22:22:37 +0200   Tue, 20 Apr 2021 01:01:51 +0200   KubeletReady                 kubelet is posting ready status
Addresses:
  ExternalIP:  ***
  Hostname:    secondary-pool-dqiqzzikb5
Capacity:
  cpu:                4
  ephemeral-storage:  41218368Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             1872412Ki
  pods:               110
Allocatable:
  cpu:                4
  ephemeral-storage:  37986847886
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             721436Ki
  pods:               110
System Info:
  Machine ID:                 ***
  System UUID:                ***
  Boot ID:                    ***
  Kernel Version:             3.10.0-1160.24.1.el7.x86_64
  OS Image:                   CentOS Linux 7 (Core)
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  containerd://1.4.4
  Kubelet Version:            v1.18.15
  Kube-Proxy Version:         v1.18.15

kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes/secondary-pool-dqiqzzikb5

{"kind":"NodeMetrics","apiVersion":"metrics.k8s.io/v1beta1","metadata":{"name":"secondary-pool-dqiqzzikb5","selfLink":"/apis/metrics.k8s.io/v1beta1/nodes/secondary-pool-dqiqzzikb5","creationTimestamp":"2021-04-20T20:28:21Z"},"timestamp":"2021-04-20T20:27:14Z","window":"30s","usage":{"cpu":"197390796n","memory":"1085572Ki"}}

I have no idea what's causing it or how to fix it :/

@k8s-ci-robot
Contributor

@pdabrowski-it-solutions: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

bysnupy added a commit to bysnupy/kubernetes that referenced this issue Jun 28, 2021
The memory usage result can sometimes exceed 100%, because the memory usage calculation is based on the logical "Allocatable" node memory, which is defined as "[Allocatable] = [Node Capacity] - [system-reserved] - [Hard-Eviction-Thresholds]", not the actual host total memory.
* Fix: kubernetes#86499
* Reference: kubernetes#100222
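
As a worked illustration of the formula above, using the numbers reported earlier in this thread (capacity 1872412Ki, allocatable 721436Ki; the split between system-reserved and eviction thresholds is not shown, so only the combined amount can be inferred):

[Allocatable] = [Node Capacity] - [system-reserved] - [Hard-Eviction-Thresholds]
     721436Ki =      1872412Ki  - ~1150976Ki (reserved memory plus eviction thresholds, combined)

Usage is therefore measured against 721436Ki rather than 1872412Ki, so the reported 1085572Ki of usage comes out to roughly 150%.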
bysnupy added a commit to bysnupy/kubernetes that referenced this issue Jul 8, 2021
…l node memory total usage.

If "Allocatable" is used to a node total memory size, under high memory pressure or pre-reserved memory value is bigger, the "MEMORY%" can be bigger than 100%.
For suppressing the confusing, add a option to show node real memory usage based on "Capacity".
* Reference: kubernetes#86499
@ydcool

ydcool commented Aug 6, 2021

/reopen
We have the same issue.

@k8s-ci-robot
Contributor

@ydcool: Reopened this issue.

In response to this:

/reopen
We have the same issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot reopened this Aug 6, 2021
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Aug 6, 2021
@dashpole
Contributor

/remove-lifecycle rotten
/triage accepted
/assign @serathius

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Aug 12, 2021
@GrumpyRainbow

Seeing the same issue here too.

NAME                          CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
aks-web-30425474-vmss000007   79m          4%     2357Mi          109%
CreationTimestamp:  Fri, 23 Jul 2021 09:16:07 -0500
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  aks-web-30425474-vmss000007
  AcquireTime:     <unset>
  RenewTime:       Mon, 30 Aug 2021 09:06:12 -0500
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Fri, 23 Jul 2021 09:16:41 -0500   Fri, 23 Jul 2021 09:16:41 -0500   RouteCreated                 RouteController created a route
  MemoryPressure       False   Mon, 30 Aug 2021 09:04:12 -0500   Fri, 23 Jul 2021 09:16:07 -0500   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Mon, 30 Aug 2021 09:04:12 -0500   Fri, 23 Jul 2021 09:16:07 -0500   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure          False   Mon, 30 Aug 2021 09:04:12 -0500   Fri, 23 Jul 2021 09:16:07 -0500   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Mon, 30 Aug 2021 09:04:12 -0500   Fri, 23 Jul 2021 09:16:17 -0500   KubeletReady                 kubelet is posting ready status. AppArmor enabled
Addresses:
  Hostname:    aks-web-30425474-vmss000007
  InternalIP:  ***
Capacity:
  attachable-volumes-azure-disk:  4
  cpu:                            2
  ephemeral-storage:              129900528Ki
  hugepages-1Gi:                  0
  hugepages-2Mi:                  0
  memory:                         4030180Ki
  pods:                           110
Allocatable:
  attachable-volumes-azure-disk:  4
  cpu:                            1900m
  ephemeral-storage:              119716326407
  hugepages-1Gi:                  0
  hugepages-2Mi:                  0
  memory:                         2213604Ki
  pods:                           110
System Info:
  Machine ID:                 ***
  System UUID:                ***
  Boot ID:                    ***
  Kernel Version:             5.4.0-1049-azure
  OS Image:                   Ubuntu 18.04.5 LTS
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  containerd://1.4.4+azure
  Kubelet Version:            v1.20.5
  Kube-Proxy Version:         v1.20.5
{"kind":"NodeMetrics","apiVersion":"metrics.k8s.io/v1beta1","metadata":{"name":"aks-web-30425474-vmss000007","selfLink":"/apis/metrics.k8s.io/v1beta1/nodes/aks-web-30425474-vmss000007","creationTimestamp":"2021-08-30T14:06:22Z"},"timestamp":"2021-08-30T14:05:45Z","window":"30s","usage":{"cpu":"78528126n","memory":"2413824Ki"}}

@AnthonyWC

Another data point (on AWS EKS):

It seems to happen only on smaller node types (I see it with both micro and nano; the others are medium).

NAME                                           CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
ip-x-x-x-x.us-west-2.compute.internal           117m         6%     864Mi           25%
ip-x-x-x-x.us-west-2.compute.internal            45m         2%     700Mi           128%
ip-x-x-x-x.us-west-2.compute.internal           118m         6%     1265Mi          37%

Usage is displayed correctly with kubectl describe node

Labels:             alpha.eksctl.io/nodegroup-name=nodegroup-spot-t4g-micro
                    beta.kubernetes.io/arch=arm64
                    beta.kubernetes.io/instance-type=t4g.micro
                    beta.kubernetes.io/os=linux
                    eks.amazonaws.com/capacityType=SPOT
                    eks.amazonaws.com/nodegroup-image=ami-08b947bfaa7dc4093
                    eks.amazonaws.com/sourceLaunchTemplateId=lt-0a1422f9743d31441
                    eks.amazonaws.com/sourceLaunchTemplateVersion=1
                    failure-domain.beta.kubernetes.io/region=us-west-2
                    failure-domain.beta.kubernetes.io/zone=us-west-2b
                    kubernetes.io/arch=arm64
                    kubernetes.io/os=linux
                    node.kubernetes.io/instance-type=t4g.micro
                    topology.kubernetes.io/region=us-west-2
                    topology.kubernetes.io/zone=us-west-2b
Annotations:        node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Tue, 28 Sep 2021 14:28:32 -0400
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  ip-192-168-60-44.us-west-2.compute.internal
  AcquireTime:     <unset>
  RenewTime:       Wed, 29 Sep 2021 17:23:20 -0400
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Wed, 29 Sep 2021 17:18:45 -0400   Tue, 28 Sep 2021 14:28:30 -0400   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Wed, 29 Sep 2021 17:18:45 -0400   Tue, 28 Sep 2021 14:28:30 -0400   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Wed, 29 Sep 2021 17:18:45 -0400   Tue, 28 Sep 2021 14:28:30 -0400   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True    Wed, 29 Sep 2021 17:18:45 -0400   Tue, 28 Sep 2021 14:28:52 -0400   KubeletReady                 kubelet is posting ready status
Addresses:
  <redacted>
Capacity:
  attachable-volumes-aws-ebs:  39
  cpu:                         2
  ephemeral-storage:           83864556Ki
  hugepages-1Gi:               0
  hugepages-2Mi:               0
  hugepages-32Mi:              0
  hugepages-64Ki:              0
  memory:                      968424Ki
  pods:                        4
Allocatable:
  attachable-volumes-aws-ebs:  39
  cpu:                         1930m
  ephemeral-storage:           76215832858
  hugepages-1Gi:               0
  hugepages-2Mi:               0
  hugepages-32Mi:              0
  hugepages-64Ki:              0
  memory:                      559848Ki
  pods:                        4
System Info:
  Machine ID:                 **
  System UUID:                **
  Boot ID:                    **
  Kernel Version:             5.4.141-67.229.amzn2.aarch64
  OS Image:                   Amazon Linux 2
  Operating System:           linux
  Architecture:               arm64
  Container Runtime Version:  docker://19.3.13
  Kubelet Version:            v1.21.2-eks-55daa9d
  Kube-Proxy Version:         v1.21.2-eks-55daa9d
ProviderID:                   aws:///us-west-2b/i-0a3b1e6e4c0b5a1a6
Non-terminated Pods:          (3 in total)
  Namespace                   Name                                 CPU Requests  CPU Limits   Memory Requests  Memory Limits  Age
  ---------                   ----                                 ------------  ----------   ---------------  -------------  ---
  kube-system                 aws-node-wgnm6                   10m (0%)        0 (0%)           0 (0%)               0 (0%)         26h
  kube-system                 kube-proxy-xxl86                 100m (5%)        0 (0%)           0 (0%)                0 (0%)         26h
  ##                           ##-7dcb4c47ff-t5mbp            1740m (90%)   1880m (97%)   480Mi (87%)      540Mi (98%)    26h
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                    Requests     Limits
  --------                    --------     ------
  cpu                         1850m (95%)  1880m (97%)
  memory                      480Mi (87%)  540Mi (98%)
  ephemeral-storage           0 (0%)       0 (0%)
  hugepages-1Gi               0 (0%)       0 (0%)
  hugepages-2Mi               0 (0%)       0 (0%)
  hugepages-32Mi              0 (0%)       0 (0%)
  hugepages-64Ki              0 (0%)       0 (0%)
  attachable-volumes-aws-ebs  0            0
Events:                       <none>

NodeMetrics:

  "usage": {
    "cpu": "45283753n",
    "memory": "717664Ki"
  }

@serathius
Contributor

serathius commented Sep 30, 2021

Why can node memory utilization exceed 100% (the same can happen for CPU)? Simple: because it is not the utilization of the physical device, but the utilization of the resources allocated for pods and system daemons.

How is the utilization of allocated resources calculated? Simple: it is the sum of resources used on the node divided by all resources allocated.

How does the kubelet allocate resources for pods? It takes all the resources available on the VM and subtracts the resources it reserves for itself, the kernel, etc.

How does the kubelet know how many resources it needs to reserve? The user provides them in flags.

There is no error and no Kubernetes magic that gets us over 100%; it is just a question of how you define utilization. The kubelet reserves some resources for the system, and those are not included when calculating node utilization, which means it can go above 100% when pods start using the reserved resources (this is what over-committing means in the "kubectl describe node" output).
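
For illustration, reservations like these are typically set via kubelet flags (or the equivalent fields in the kubelet config file); the values below are made-up placeholders, not taken from any node in this thread:

--system-reserved=memory=512Mi,cpu=250m
--kube-reserved=memory=512Mi,cpu=250m
--eviction-hard='memory.available<100Mi'

Allocatable is then capacity minus these reservations and eviction thresholds, and that is the denominator kubectl top uses for MEMORY%.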

Some math based on your kubectl describe node output:
Node capacity: 968424Ki
Node allocatable: 559848Ki
Node usage: 717664Ki

Node utilization: Node usage / node allocatable = 717664Ki / 559848Ki = 128%
VM utilization: Node usage / node capacity = 717664Ki / 968424Ki = 74%
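
A quick way to double-check those ratios from a shell (assuming a standard awk is available):

awk 'BEGIN { printf "%.0f%%\n", 717664/559848*100 }'   # 128%
awk 'BEGIN { printf "%.0f%%\n", 717664/968424*100 }'   # 74%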

@dashpole
Contributor

dashpole commented Oct 4, 2021

Based on my reading of the metrics-server implementation, I don't think "Node Usage" includes only usage from pods. The node_memory_working_set_bytes metric includes usage by system daemons, so I'm not sure it makes sense to compare it to allocatable, which is meant to exclude system daemons.
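
For anyone who wants to inspect this on their own cluster, one way to see the node-level working set (which covers system daemons as well as pods) is the kubelet Summary API through the API server proxy; the node name below is a placeholder, and jq is assumed to be installed:

kubectl get --raw "/api/v1/nodes/<node-name>/proxy/stats/summary" | jq '.node.memory'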

@serathius
Contributor

Hmm, I was not aware of that. This is a good point; since utilization was implemented in kubectl by a different SIG (SIG CLI), it's possible that it was not properly reviewed by other stakeholders (SIG Instrumentation / SIG Node). It would make sense to revisit what should be displayed as node utilization.

k8s-publishing-bot pushed a commit to kubernetes/kubectl that referenced this issue Nov 5, 2021
…l node memory total usage.

If "Allocatable" is used to a node total memory size, under high memory pressure or pre-reserved memory value is bigger, the "MEMORY%" can be bigger than 100%.
For suppressing the confusing, add a option to show node real memory usage based on "Capacity".
* Reference: kubernetes/kubernetes#86499

Kubernetes-commit: 862937bf1c7975d3f54ae47a2958e47f2c50150f