Need simple kubectl command to see cluster resource usage #17512
Something along the lines of (2) seems reasonable, though the UX folks would know better than me. (3) seems vaguely related to #15743 but I'm not sure they're close enough to combine.
In addition to the case above, it would be nice to see what resource utilization we're getting.
In this example, the aggregate container requests are 4.455 cores and 20.1 GiB, and there are 5 cores and 30 GiB total in the cluster.
There is:
I use the command below to get a quick view of resource usage. It is the simplest way I found.
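For instance, with metrics-server installed, `kubectl top` is the quickest view; the awk averaging step below is my own addition and assumes the default five-column output of `kubectl top nodes`:

```shell
# Per-node usage at a glance (requires metrics-server).
kubectl top nodes

# Cluster-wide averages: CPU% is column 3 and MEMORY% is column 5
# in the default `kubectl top nodes` output.
kubectl top nodes --no-headers \
  | awk '{cpu += $3; mem += $5; n++}
         END {printf "avg: %.0f%% CPU, %.0f%% memory (%d nodes)\n", cpu/n, mem/n, n}'
```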
If there were a way to "format" the output of |
Here is my hack:
@from-nibly thanks, just what I was looking for.
Yup, this is mine:
@goltermann There are no sig labels on this issue. Please add a sig label by:
@kubernetes/sig-cli-misc
You can use the command below to find the percentage CPU utilisation of your nodes:
@tomfotherby
@alok87 - Thanks for your aliases. In my case, this is what worked for me given that we use
#17512 (comment) AFAICT, there's no easy way to get a report of node CPU allocation by pod, since requests are per container in the spec. And even then, it's difficult since |
/cc @misterikkit
Getting in on this shell scripting party. I have an older cluster running the CA with scale-down disabled. I wrote this script to determine roughly how much I can scale down the cluster when it starts to bump up against its AWS route limits:
#!/bin/bash
set -e
KUBECTL="kubectl"
NODES=$($KUBECTL get nodes --no-headers -o custom-columns=NAME:.metadata.name)
function usage() {
local node_count=0
local total_percent_cpu=0
local total_percent_mem=0
local nodes="$*"
for n in $nodes; do
local requests=$($KUBECTL describe node $n | grep -A2 -E "^\\s*CPU Requests" | tail -n1)
local percent_cpu=$(echo $requests | awk -F "[()%]" '{print $2}')
local percent_mem=$(echo $requests | awk -F "[()%]" '{print $8}')
echo "$n: ${percent_cpu}% CPU, ${percent_mem}% memory"
node_count=$((node_count + 1))
total_percent_cpu=$((total_percent_cpu + percent_cpu))
total_percent_mem=$((total_percent_mem + percent_mem))
done
local avg_percent_cpu=$((total_percent_cpu / node_count))
local avg_percent_mem=$((total_percent_mem / node_count))
echo "Average usage: ${avg_percent_cpu}% CPU, ${avg_percent_mem}% memory."
}
usage $NODES
Produces output like:
There is also a pod option in the top command:
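For example (requires metrics-server; `--sort-by` needs a reasonably recent kubectl):

```shell
# Ten most CPU-hungry pods across all namespaces (plus the header line).
kubectl top pods --all-namespaces --sort-by=cpu | head -n 11

# Per-container breakdown within each pod.
kubectl top pods --containers
```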
My way to obtain the allocation, cluster-wide:
It produces something like:
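The command itself did not survive the copy above; as a hedged sketch of one way to do it, each container's CPU request can be printed with jsonpath and totaled in awk (this assumes requests are written either as millicores like `250m` or as whole cores):

```shell
# Emit every container's CPU request, one per line, then total in millicores.
kubectl get pods --all-namespaces \
  -o jsonpath='{range .items[*].spec.containers[*]}{.resources.requests.cpu}{"\n"}{end}' \
  | awk '/m$/ {sum += $1; next}   # "250m" style: already millicores
         /./  {sum += $1 * 1000}  # bare values like "1" or "0.5": whole cores
         END  {printf "total CPU requests: %dm\n", sum}'
```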
This is weird. I want to know when I'm at or nearing allocation capacity. It seems a pretty basic function of a cluster. Whether it's a statistic showing a high percentage or a textual error, how do other people know this? Do they just always use autoscaling on a cloud platform?
I authored https://github.com/dpetzold/kube-resource-explorer/ to address #3. Here is some sample output:
Hello! I created this script and am sharing it with you: https://github.com/Sensedia/open-tools/blob/master/scripts/listK8sHardwareResources.sh It compiles some of the ideas shared here. The script can be extended and may help other people get the metrics more simply. Thanks for sharing the tips and commands!
For my use case, I ended up writing a simple It's more of a convenience thing than anything else, but maybe someone else will find it useful too.
Whoa, what a huge thread, and still no proper solution from the Kubernetes team to properly calculate the current overall CPU usage of a whole cluster?
For those looking to run this on minikube, first enable the metrics-server add-on:
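For example (the add-on is named metrics-server in current minikube releases):

```shell
# Enable the bundled metrics-server add-on, give it a minute to collect
# its first scrape, then query as usual.
minikube addons enable metrics-server
kubectl top nodes
```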
If you're using Krew: kubectl krew install resource-capacity
kubectl resource-capacity
Issues go stale after 90d of inactivity. If this issue is safe to close now, please do so. Send feedback to sig-contributor-experience at kubernetes/community.
5 years and still open. I understand there are loads of tools available to check pod resource usage, but honestly, why not supply a standard one out of the box that's simple to use? Bundling Grafana and Prometheus with all the monitoring you could require would have been a godsend for my team. We wasted months experimenting with different solutions. Please, kube maintainers, give us something out of the box and close this issue!
/remove-lifecycle stale
Even with all the tools above (I currently use |
Highly recommend another great tool called k9s: https://github.com/derailed/k9s It's a separate CLI tool but uses the same config context for access and offers a lot of terminal UI utility for monitoring and managing your cluster.
From the long history of comments here, it seems everyone has different expectations, judging by the many issues and requests reported in this thread. This thread is more of a wiki now. We'd be happy to see one of these plugins proposed for upstreaming via a KEP. If someone wants to own this and bias for action with a decision, please open a KEP for discussion. /close
@eddiezane: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
In case folks are still listening in on this issue... Has anyone attempted using the standard resource usage APIs like getrusage() for software running inside containers/pods? For CPU stats it does not seem that it would be far off from what the node-level cgroup would report. Memory stats seem more problematic: it is unclear whether, say, /sys/fs/cgroup/memory/<> from inside a container really reflects memory usage correctly. Being able to monitor resource usage from within an app (and then changing behavior in the app, etc.) is a neat capability. It seems unclear when that will be available in k8s, so I am casting around for workarounds.
Another tool to see resources node-wise and namespace-wise: https://github.com/dguyhasnoname/k8s-day2-ops/tree/master/resource_calcuation/k8s-toppur
My hack (on k8s 1.18; EKS): kubectl describe nodes | grep 'Name:\|Allocated' -A 5 | grep 'Name\|memory'
Lots of gems in this thread, :) thanks all! (Wish some good writer could summarize and publish a quick sheet for it.)
@jackdpeterson answer adapted for Powershell :)
Without counting the lines
It's not perfect, but we can get a serviceable summary with:
kubectl describe nodes |
sed -n '/^Allocated /,/^Events:/ { /^ [^(]/ p; } ; /^Name: / p'
Name: ip100.k8s.computer
Resource Requests Limits
-------- -------- ------
cpu 6773m (90%) 14300m (190%)
memory 12851005952 (40%) 18577645056 (57%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Name: ip200.k8s.computer
Resource Requests Limits
-------- -------- ------
cpu 7082m (94%) 9500m (126%)
memory 26405455360 (83%) 24630806144 (77%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Name: ip300.k8s.computer
Resource Requests Limits
-------- -------- ------
cpu 7153m (95%) 8800m (117%)
memory 27759605888 (86%) 22996783232 (71%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
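The per-node percentages from that summary can also be averaged; the awk step below is my own addition and assumes the cpu/memory rows keep the `value (NN%)` shape shown above:

```shell
kubectl describe nodes \
  | sed -n '/^Allocated /,/^Events:/ { /^  [^(]/ p; }' \
  | awk '{gsub(/[()%]/, "", $3)}               # "(90%)" -> "90"
         $1 == "cpu"    {cpu += $3; n++}
         $1 == "memory" {mem += $3}
         END {printf "avg requests: %d%% CPU, %d%% memory across %d nodes\n",
              cpu/n, mem/n, n}'
```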
The script below only works in the
Updated version of this shell function:
function kusage() {
# Function returning resources usage on current kubernetes cluster
local node_count=0
local total_percent_cpu=0
local total_percent_mem=0
echo -e "NODE\t\t CPU_allocatable\t Memory_allocatable\t CPU_requests%\t Memory_requests%\t CPU_limits%\t Memory_limits%\t"
for n in $(kubectl get nodes --no-headers -o custom-columns=NAME:.metadata.name); do
local requests=$(kubectl describe node $n | grep -A2 -E "Resource" | tail -n1 | tr -d '(%)')
local abs_cpu=$(echo $requests | awk '{print $2}')
local percent_cpu=$(echo $requests | awk '{print $3}')
local node_cpu=$(echo $abs_cpu $percent_cpu | tr -d 'mKi' | awk '{print int($1/$2*100)}')
local allocatable_cpu=$(echo $node_cpu $abs_cpu | tr -d 'mKi' | awk '{print int($1 - $2)}')
local percent_cpu_lim=$(echo $requests | awk '{print $5}')
local requests=$(kubectl describe node $n | grep -A3 -E "Resource" | tail -n1 | tr -d '(%)')
local abs_mem=$(echo $requests | awk '{print $2}')
local percent_mem=$(echo $requests | awk '{print $3}')
local node_mem=$(echo $abs_mem $percent_mem | tr -d 'mKi' | awk '{print int($1/$2*100)}')
local allocatable_mem=$(echo $node_mem $abs_mem | tr -d 'mKi' | awk '{print int($1 - $2)}')
local percent_mem_lim=$(echo $requests | awk '{print $5}')
echo -e "$n\t ${allocatable_cpu}m\t\t\t ${allocatable_mem}Ki\t\t ${percent_cpu}%\t\t ${percent_mem}%\t\t\t ${percent_cpu_lim}%\t\t ${percent_mem_lim}%\t"
node_count=$((node_count + 1))
total_percent_cpu=$((total_percent_cpu + percent_cpu))
total_percent_mem=$((total_percent_mem + percent_mem))
done
local avg_percent_cpu=$((total_percent_cpu / node_count))
local avg_percent_mem=$((total_percent_mem / node_count))
echo "Average usage (requests) : ${avg_percent_cpu}% CPU, ${avg_percent_mem}% memory."
}
Users are getting tripped up by pods not being able to schedule due to resource deficiencies. It can be hard to know when a pod is pending because it just hasn't started up yet, or because the cluster doesn't have room to schedule it. http://kubernetes.io/v1.1/docs/user-guide/compute-resources.html#monitoring-compute-resource-usage helps, but isn't that discoverable (I tend to try a 'get' on a pod in pending first, and only after waiting a while and seeing it 'stuck' in pending, do I use 'describe' to realize it's a scheduling problem).
This is also complicated by system pods being in a namespace that is hidden. Users forget that those pods exist, and 'count against' cluster resources.
There are several possible fixes offhand; I don't know which would be ideal:
Develop a new pod state other than Pending to represent "tried to schedule and failed for lack of resources".
Have kubectl get po or kubectl get po -o=wide display a column to detail why something is pending (perhaps the container.state that is Waiting in this case, or the most recent event.message).
Create a new kubectl command to more easily describe resources. I'm imagining a "kubectl usage" that gives an overview of total cluster CPU and Mem, per node CPU and Mem and each pod/container's usage. Here we would include all pods, including system ones. This might be useful long term alongside more complex schedulers, or when your cluster has enough resources but no single node does (diagnosing the 'no holes large enough' problem).
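Until something like that exists, scheduling failures can at least be pulled out of events; as a sketch, the scheduler records a FailedScheduling event when no node fits a pod:

```shell
# List the events that explain why pods could not be scheduled,
# e.g. "0/3 nodes are available: 3 Insufficient cpu."
kubectl get events --all-namespaces --field-selector reason=FailedScheduling
```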