
kubernetes is CPU-hungry on minikube #48948

Closed
rcorre opened this issue Jul 14, 2017 · 37 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. sig/scalability Categorizes an issue or PR as relevant to SIG Scalability.

Comments

@rcorre
Contributor

rcorre commented Jul 14, 2017

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:

When running kubernetes 1.7 on minikube, CPU usage for the VBoxHeadless process is constantly around 100%.

What you expected to happen:

CPU usage closer to 10%, as it is when running 1.6.4

How to reproduce it (as minimally and precisely as possible):

minikube start --kubernetes-version v1.7.0
top
minikube stop
minikube start --kubernetes-version v1.6.4
top
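For a more targeted comparison, a rough sketch (assuming the sysstat pidstat utility is available on the host; VBoxHeadless is the VirtualBox VM process named in this report):

# sample CPU usage of just the VirtualBox VM process every 5 seconds
pidstat -p "$(pgrep -d, VBoxHeadless)" 5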

Anything else we need to know?:

This could be a minikube bug, but since CPU usage changes drastically between kubernetes versions (but the same minikube version), I figured it might be a kubernetes issue.

Environment:

  • Kubernetes version (use kubectl version): 1.7.0 vs 1.6.4
  • Cloud provider or hardware configuration: local
  • OS (e.g. from /etc/os-release): ArchLinux
  • Kernel (e.g. uname -a): 4.9.33-1-lts
  • Install tools: huh?
  • Others: minikube version: v0.20.0
@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Jul 14, 2017
@k8s-github-robot k8s-github-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Jul 14, 2017
@rcorre
Contributor Author

rcorre commented Jul 14, 2017

/sig scalability

Really no idea if that is the right group, but none seemed completely appropriate

@k8s-ci-robot k8s-ci-robot added the sig/scalability Categorizes an issue or PR as relevant to SIG Scalability. label Jul 14, 2017
@k8s-github-robot k8s-github-robot removed the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Jul 14, 2017
@timbunce

See kubernetes/minikube#1158.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 20, 2018
@daveoconnor

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 20, 2018
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 21, 2018
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jun 20, 2018
@timbunce

timbunce commented Jun 20, 2018

This is still a problem with minikube v0.28.0.
For example, on my macOS 10.13.5 machine, VBoxHeadless uses ~50% CPU with nothing running.
That's still far above the ~10% seen when running 1.6.4.

Could someone change the title to remove the reference to 1.7?

/remove-lifecycle stale

@timbunce

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jun 21, 2018
@GregSilverman

Same issue with v0.26.0.

@yeluolei

yeluolei commented Jul 5, 2018

Same issue on Mac with 0.28.

@michilu

michilu commented Jul 23, 2018

Same issue with v0.28.1.

@ypresto

ypresto commented Jul 27, 2018

Maybe related; an issue about the CPU usage of Docker for Mac with its built-in Kubernetes:
docker/for-mac#2601

@briandealwis

I see this high CPU (~30% at idle) on macOS, both with Docker for Mac's bundled Kubernetes and with minikube.

@lizrice described an approach for installing k8s using Vagrant without minikube, but still with VirtualBox. She noted that she was still seeing high CPU usage. minikube may not be the issue here.

@briandealwis

I tried using minikube with the new docker-machine-driver-vmware driver (kubernetes/minikube#2606) to start in VMware Fusion, to see whether this usage was tied to VirtualBox or HyperKit, and it shows the same ~30% CPU usage, though with a bit more jitter. I tried with both Kubernetes v1.10.0 and v1.11.2.
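For anyone repeating that comparison, a rough sketch (assuming the docker-machine-driver-vmware binary is on the PATH and these driver names are unchanged in the minikube version being tested):

# run the same Kubernetes version under two hypervisors to separate driver cost from Kubernetes cost
minikube start --vm-driver=vmware --kubernetes-version=v1.11.2
top -o cpu   # on macOS, watch the vmware-vmx process
minikube delete
minikube start --vm-driver=virtualbox --kubernetes-version=v1.11.2
top -o cpu   # watch VBoxHeadless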

@alsuren

alsuren commented Aug 27, 2018

In an attempt to track this down, I ran:

minikube start && eval $(minikube docker-env) && docker run --rm -it --pid host frapsoft/htop --delay=100 (using F2 -> Display options -> Hide userland process threads)

and found that kube-controller-manager is consistently using 4-5% CPU even when it's not doing anything. Full output (cmd-a cmd-c) is pasted below.

$ docker run --rm -it --pid host frapsoft/htop --delay=100


  1  [||||                                                            5.4%]   Tasks: 64; 1 running
  2  [||||                                                            5.3%]   Load average: 0.42 0.26 0.15 
  Mem[|||||||||||||||||||||||||||||||||||||||||||||||||||||||||1.23G/1.94G]   Uptime: 00:13:50
  Swp[                                                            0K/1000M]

  PID USER      PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
 2608 root       20   0  146M 81168 45888 S  4.8  4.0  0:34.14 kube-controller-manager --address=127.0.0.1 --controllers=*,bootstrapsigner,tokencleaner --se
 2575 root       20   0  428M  263M 54700 S  3.2 13.2  0:34.80 kube-apiserver --admission-control=Initializers,NamespaceLifecycle,LimitRanger,ServiceAccount
 2279 root       20   0 1141M 86676 52956 S  2.9  4.3  0:25.75 /usr/bin/kubelet --cgroup-driver=cgroupfs --kubeconfig=/etc/kubernetes/kubelet.conf --bootstr
 2662 root       20   0 10.1G 56960 30132 S  1.5  2.8  0:13.48 etcd --peer-cert-file=/var/lib/localkube/certs/etcd/peer.crt --peer-key-file=/var/lib/localku
 1865 root       20   0  413M 68624 29744 S  1.2  3.4  0:17.85 /usr/bin/dockerd -H tcp://0.0.0.0:2376 -H unix:///var/run/docker.sock --tlsverify --tlscacert
 2630 root       20   0 44900 32052 22220 S  1.1  1.6  0:07.21 kube-scheduler --address=127.0.0.1 --leader-elect=true --kubeconfig=/etc/kubernetes/scheduler
 3748 root       20   0 46164 31096 22628 S  0.3  1.5  0:01.95 /usr/local/bin/kube-proxy --config=/var/lib/kube-proxy/config.conf
 1871 root       20   0  842M 19020 13384 S  0.3  0.9  0:02.10 docker-containerd --config /var/run/docker/containerd/containerd.toml
 3333 nobody     20   0 36076 24048 14888 S  0.2  1.2  0:01.09 /sidecar --v=2 --logtostderr --probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.cluster.l
 7946 app        20   0  4088  1716   944 R  0.1  0.1  0:00.20 htop --delay=100
 4123 root       20   0 38952 23016 16760 S  0.1  1.1  0:00.56 /kube-dns --domain=cluster.local. --dns-port=10053 --config-dir=/kube-dns-config --v=2
 4348 root       20   0 37788 27920 16524 S  0.1  1.4  0:00.53 /dashboard --insecure-bind-address=0.0.0.0 --bind-address=0.0.0.0
 4214 root       20   0  146M 55372 42208 S  0.0  2.7  0:00.41 /storage-provisioner
 1692 root       20   0  535M 19312 12012 S  0.0  0.9  0:00.14 /usr/bin/rkt api-service
 3245 root       20   0 32056 18768 13664 S  0.0  0.9  0:00.16 /dnsmasq-nanny -v=2 -logtostderr -configDir=/etc/k8s/dns/dnsmasq-nanny -restartDnsmasq=true -
 2625 root       20   0 12228  8328  3708 S  0.0  0.4  0:00.29 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containe
 7931 root       20   0  8928  3296  2812 S  0.0  0.2  0:00.04 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containe
F1Help  F2Setup F3SearchF4FilterF5Tree  F6SortByF7Nice -F8Nice +F9Kill  F10Quit

I assume that the next step is to follow https://github.com/kubernetes/community/blob/master/contributors/devel/profiling.md and instrument each of the processes in turn, to find out what they're doing when they should be idle?
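One hedged sketch of what that profiling could look like (assuming --profiling is left at its default of true on these components, and that the controller-manager still serves pprof on its local insecure port 10252, as its --address=127.0.0.1 flag above suggests):

# 30-second CPU profile of the apiserver, fetched through kubectl
kubectl get --raw '/debug/pprof/profile?seconds=30' > apiserver.pprof
go tool pprof -top apiserver.pprof

# the controller-manager binds its pprof endpoint to 127.0.0.1 inside the VM,
# so fetch that one from a `minikube ssh` session instead, e.g.:
#   curl -o /tmp/controller-manager.pprof 'http://localhost:10252/debug/pprof/profile?seconds=30'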

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 25, 2018
@Laski

Laski commented Dec 7, 2018

/remove-lifecycle stale

This is still happening to me in v0.30.0

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 7, 2018
@timbunce

timbunce commented Dec 8, 2018

@rcorre, or someone with permission, please edit the title to remove the reference to 1.7.

@rcorre rcorre changed the title 1.7 is CPU-hungry on minikube kubernetes is CPU-hungry on minikube Dec 8, 2018
@Laski

Laski commented Dec 10, 2018

Is there any information I can give to help debug this?

@erulabs
Contributor

erulabs commented Feb 25, 2019

Can someone from sig/scalability take a look at this? I would love to help out - can test, reproduce, explore, report... This is becoming quite a blocker for local Kubernetes development!

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 26, 2019
@antonmarin

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 27, 2019
@leahciMic

leahciMic commented Jun 17, 2019

Wondering what the best way is to develop locally for Kubernetes? The base load and slowness of minikube are a deal breaker for rapid development, especially when using a laptop.

I have thought about:

  1. Running it on a server dedicated to this. Running it on a server on my network could be a potential solution, but it has its own issues, like making it harder to update Docker images or to develop from a different network (for example, a coffee shop).

  2. Running services on my machine (elasticsearch, mysql, redis, etc.) and using them in development, then using containerized services in production. I have already discounted this idea, as I would like my development environment to match my production environment.

  3. Running with --vm-driver=none.

But all of these options have their own problems.

I assume that, with the popularity of Kubernetes and minikube, there must be a way to run a local cluster without the level of resource consumption that many people are seeing.

Is there an alternative to minikube that has lower overheads?

Basically, an idle mysql server on my laptop is no big deal, and I don't notice it. An idling minikube cluster (without any services running, mind you) is definitely felt.

What can I do to help?

@dims
Member

dims commented Jun 17, 2019

@leahciMic did you try kind? https://github.com/kubernetes-sigs/kind
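For reference, a minimal sketch of trying it out (assuming the kind binary and a working Docker daemon are already installed):

# create a single-node cluster that runs inside a Docker container, then point kubectl at it
kind create cluster --name dev
kubectl cluster-info --context kind-dev

# tear it down when finished
kind delete cluster --name dev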

@almike

almike commented Aug 14, 2019

I got 30-40% constant usage with minikube, 10-20% with microk8s.
I am new to Kubernetes and not sure where to start, but I could dig through some logs to help.
It would be nice to see this solved, as we need to run ingress locally on Mac and the only way to do so is to have Kubernetes running.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 12, 2019
@cobalamin

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 12, 2019
@StefPac

StefPac commented Nov 24, 2019

I am experiencing the same problem with minikube v1.5.2 on Darwin 10.14.6
and Kubernetes v1.16.2 on Docker 18.09.9, using Oracle VM VirtualBox Headless Interface 6.0.14.

@ankurkanoria

Similar issue; this vastly hinders local development agility. I set up a new barebones single-node development cluster using kubeadm (v1.17) inside a reasonably beefy VM (4 cores, 4 GB RAM, with KVM tuning like raw disk and virtio). Idle-state CPU usage is around 30% and spikes to 50-60% during container creation/deletion. The controller manager and scheduler often time out waiting for the apiserver and go into CrashLoopBackOff (40+ restarts after 10 hours of roughly idling). I see similar issues across a range of hardware configurations on different Linux distributions.

While searching around, I came across quite a few Stack Overflow posts and GitHub issues describing similar observations. While I understand that Kubernetes is optimized for running at scale, this does, non-trivially, hurt local development.

What would be great is some documentation pointing us towards configuration changes that reduce resource consumption at the expense of cluster reliability, e.g. health-check/poll frequencies. As developers working on local machines, many of us would be happy to take that tradeoff.
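As one hedged illustration of that kind of tuning, minikube can pass component flags through --extra-config; the flag names below are real kubelet and kube-controller-manager flags, but the values are only examples and deliberately trade responsiveness for lower idle CPU (on a kubeadm cluster the same settings would go through the kubelet configuration file and the controllerManager extraArgs):

# stretch the busiest polling loops on an otherwise idle development cluster
minikube start \
  --extra-config=kubelet.housekeeping-interval=1m \
  --extra-config=kubelet.sync-frequency=5m \
  --extra-config=controller-manager.node-monitor-period=30s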

@nyrahul

nyrahul commented Feb 16, 2020

Same issue (100% CPU for VBoxHeadless).

  • minikube v1.6.1 on Ubuntu 18.04 arch x86_64
  • Linux Kernel 5.3.0
  • K8s v1.17.0
  • vboxheadless v6.1.2

Attached is the perf report for VBoxHeadless (recorded over a few seconds while VBoxHeadless was at 100% CPU).
vboxheadless-perf-report.txt

@sunshine69

Hm, reading this thread is a bit painful. People keep posting and reporting, but it seems no one cares; the issue just keeps going stale.

I'll probably run away from the whole kubeXXX ecosystem altogether over behaviour like this.

Oh well, maybe the attitude is: submit a patch or shut up?

@sunshine69

sunshine69 commented Mar 8, 2020

And if someone asks why I have nothing better to do than post here, you can see below. But looking at the history above makes me not even bother to report anything...

Wake up, people. Otherwise you lose users, your project becomes less and less popular, and it eventually gets phased out...

top - 09:48:30 up 1 day, 12:29,  1 user,  load average: 0.38, 0.79, 0.94
Tasks: 340 total,   1 running, 339 sleeping,   0 stopped,   0 zombie
%Cpu(s):  3.2 us,  1.2 sy,  0.0 ni, 94.6 id,  0.8 wa,  0.0 hi,  0.1 si,  0.0 st
MiB Mem :  15891.9 total,   1112.2 free,   6238.8 used,   8540.9 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.   8825.4 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                   
 3333 libvirt+  20   0 8822744   2.7g  20904 S     26.2  17.3   3:25.87 qemu-system-x86                           
 3474 stevek    20   0 1202132  98308  52932 S   2.3   0.6   9:46.95 Xorg                                      
 4153 stevek    20   0  975464  53264  37600 S   2.0   0.3   0:07.13 gnome-terminal-                           
32747 stevek    20   0 9003620 244924  90080 S   1.3   1.5   1:20.64 chrome                                    
 3748 stevek    20   0 3863612 283392  98096 S   1.0   1.7  14:22.97 gnome-shell                               
 5367 stevek    20   0 1756812 528440 314792 S   0.7   3.2  26:00.70 chrome                         

@tstromberg

FWIW, minikube's CPU overhead has dropped substantially since v1.8: a reduction of 19% in 2020 alone.

On a modern developer machine like a MacBook Pro, minikube with hyperkit now consumes roughly 6% of available system CPU (30-40% of a single CPU core): https://docs.google.com/spreadsheets/d/1qzgVsZ9y0zqCjoQlN_LGJH3MUMqrVPexzNhdB2jzBqU/edit#gid=1614668143

It isn't perfect, but I feel like this issue can be closed, or at least moved to the minikube repo. In 2020, we'll be focusing on reducing usage in the apiserver and etcd, which is where most of minikube's CPU cycles are now spent.

@ravihugo

ravihugo commented Jul 2, 2020

@tstromberg That is great news. I hope the apiserver/etcd usage can be lowered!

@oomichi
Member

oomichi commented Sep 4, 2020

As noted in #48948 (comment), it seems fine to close this issue at this time.
In addition, this issue is about minikube. It is better to open an issue on https://github.com/kubernetes/minikube/issues to get more attention from minikube experts if it needs to be revisited.

/close

@k8s-ci-robot
Contributor

@oomichi: Closing this issue.

In response to this:

As noted in #48948 (comment), it seems fine to close this issue at this time.
In addition, this issue is about minikube. It is better to open an issue on https://github.com/kubernetes/minikube/issues to get more attention from minikube experts if it needs to be revisited.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@Pictor13

Pictor13 commented Apr 10, 2021

It isn't perfect, but I feel like this issue can be closed, or at least moved to the minikube repo. In 2020, we'll be focusing on reducing usage in the apiserver and etcd, which is where most of minikube's CPU cycles are now spent.

Thank you for your work, guys!
Sorry for commenting on a closed issue @tstromberg, but I was wondering if there's a specific Kubernetes issue on the tracker that we can follow for updates on these improvements to the apiserver and etcd.
I tried searching the GitHub issues, but couldn't find the right keywords 😅
