
kubernetes is CPU-hungry on minikube #48948

Closed
rcorre opened this issue Jul 14, 2017 · 37 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. sig/scalability Categorizes an issue or PR as relevant to SIG Scalability.

Comments

@rcorre
Contributor

rcorre commented Jul 14, 2017

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:

When running kubernetes 1.7 on minikube, CPU usage for the VBoxHeadless process is constantly around 100%.

What you expected to happen:

CPU usage closer to 10%, as it is when running 1.6.4

How to reproduce it (as minimally and precisely as possible):

minikube start --kubernetes-version v1.7.0
top
minikube stop
minikube start --kubernetes-version v1.6.4
top
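For a more targeted comparison, a rough sketch (assuming the sysstat pidstat utility is available on the host; VBoxHeadless is the VirtualBox VM process named in this report):

# sample CPU usage of just the VirtualBox VM process every 5 seconds
pidstat -p "$(pgrep -d, VBoxHeadless)" 5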

Anything else we need to know?:

This could be a minikube bug, but since CPU usage changes drastically between kubernetes versions (but the same minikube version), I figured it might be a kubernetes issue.

Environment:

  • Kubernetes version (use kubectl version): 1.7.0 vs 1.6.4
  • Cloud provider or hardware configuration: local
  • OS (e.g. from /etc/os-release): ArchLinux
  • Kernel (e.g. uname -a): 4.9.33-1-lts
  • Install tools: huh?
  • Others: minikube version: v0.20.0
@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Jul 14, 2017
@k8s-github-robot k8s-github-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Jul 14, 2017
@rcorre
Contributor Author

rcorre commented Jul 14, 2017

/sig scalability

Really no idea if that is the right group, but none seemed completely appropriate

@k8s-ci-robot k8s-ci-robot added the sig/scalability Categorizes an issue or PR as relevant to SIG Scalability. label Jul 14, 2017
@k8s-github-robot k8s-github-robot removed the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Jul 14, 2017
@timbunce

See kubernetes/minikube#1158.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 20, 2018
@daveoconnor

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 20, 2018
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 21, 2018
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jun 20, 2018
@timbunce

timbunce commented Jun 20, 2018

This is still a problem with minikube v0.28.0.
For example, on my macOS 10.13.5 machine, VBoxHeadless uses ~50% CPU with nothing running.
That's still far above the ~10% seen when running 1.6.4.

Could someone change the title to remove the reference to 1.7?

/remove-lifecycle stale

@timbunce

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jun 21, 2018
@GregSilverman

Same issue with v0.26.0.

@yeluolei

yeluolei commented Jul 5, 2018

Same issue on Mac with 0.28.

@michilu

michilu commented Jul 23, 2018

Same issue with v0.28.1.

@ypresto

ypresto commented Jul 27, 2018

Maybe related; an issue about the CPU usage of Docker for Mac with its built-in Kubernetes:
docker/for-mac#2601

@briandealwis

I see this high CPU (~30% at idle) on macOS, both with Docker for Mac's bundled Kubernetes and with minikube.

@lizrice described an approach for installing k8s using Vagrant without minikube, but still with VirtualBox. She noted that she was still seeing high CPU usage. minikube may not be the issue here.

@briandealwis

I tried using minikube with the new docker-machine-driver-vmware driver (kubernetes/minikube#2606) to start in VMware Fusion, to see whether this usage was tied to VirtualBox or HyperKit, and it shows the same ~30% CPU usage, though with a bit more jitter. I tried with both Kubernetes v1.10.0 and v1.11.2.
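For anyone repeating that comparison, a rough sketch (assuming the docker-machine-driver-vmware binary is on the PATH and these driver names are unchanged in the minikube version being tested):

# run the same Kubernetes version under two hypervisors to separate driver cost from Kubernetes cost
minikube start --vm-driver=vmware --kubernetes-version=v1.11.2
top -o cpu   # on macOS, watch the vmware-vmx process
minikube delete
minikube start --vm-driver=virtualbox --kubernetes-version=v1.11.2
top -o cpu   # watch VBoxHeadless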

@alsuren

alsuren commented Aug 27, 2018

In an attempt to track this down, I ran:

minikube start && eval $(minikube docker-env) && docker run --rm -it --pid host frapsoft/htop --delay=100 (using F2 -> Display options -> Hide userland process threads)

and found that kube-controller-manager is consistently using 4-5% CPU even when it's not doing anything. Full output (cmd-a cmd-c) is pasted below.

$ docker run --rm -it --pid host frapsoft/htop --delay=100


  1  [||||                                                            5.4%]   Tasks: 64; 1 running
  2  [||||                                                            5.3%]   Load average: 0.42 0.26 0.15 
  Mem[|||||||||||||||||||||||||||||||||||||||||||||||||||||||||1.23G/1.94G]   Uptime: 00:13:50
  Swp[                                                            0K/1000M]

  PID USER      PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
 2608 root       20   0  146M 81168 45888 S  4.8  4.0  0:34.14 kube-controller-manager --address=127.0.0.1 --controllers=*,bootstrapsigner,tokencleaner --se
 2575 root       20   0  428M  263M 54700 S  3.2 13.2  0:34.80 kube-apiserver --admission-control=Initializers,NamespaceLifecycle,LimitRanger,ServiceAccount
 2279 root       20   0 1141M 86676 52956 S  2.9  4.3  0:25.75 /usr/bin/kubelet --cgroup-driver=cgroupfs --kubeconfig=/etc/kubernetes/kubelet.conf --bootstr
 2662 root       20   0 10.1G 56960 30132 S  1.5  2.8  0:13.48 etcd --peer-cert-file=/var/lib/localkube/certs/etcd/peer.crt --peer-key-file=/var/lib/localku
 1865 root       20   0  413M 68624 29744 S  1.2  3.4  0:17.85 /usr/bin/dockerd -H tcp://0.0.0.0:2376 -H unix:///var/run/docker.sock --tlsverify --tlscacert
 2630 root       20   0 44900 32052 22220 S  1.1  1.6  0:07.21 kube-scheduler --address=127.0.0.1 --leader-elect=true --kubeconfig=/etc/kubernetes/scheduler
 3748 root       20   0 46164 31096 22628 S  0.3  1.5  0:01.95 /usr/local/bin/kube-proxy --config=/var/lib/kube-proxy/config.conf
 1871 root       20   0  842M 19020 13384 S  0.3  0.9  0:02.10 docker-containerd --config /var/run/docker/containerd/containerd.toml
 3333 nobody     20   0 36076 24048 14888 S  0.2  1.2  0:01.09 /sidecar --v=2 --logtostderr --probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.cluster.l
 7946 app        20   0  4088  1716   944 R  0.1  0.1  0:00.20 htop --delay=100
 4123 root       20   0 38952 23016 16760 S  0.1  1.1  0:00.56 /kube-dns --domain=cluster.local. --dns-port=10053 --config-dir=/kube-dns-config --v=2
 4348 root       20   0 37788 27920 16524 S  0.1  1.4  0:00.53 /dashboard --insecure-bind-address=0.0.0.0 --bind-address=0.0.0.0
 4214 root       20   0  146M 55372 42208 S  0.0  2.7  0:00.41 /storage-provisioner
 1692 root       20   0  535M 19312 12012 S  0.0  0.9  0:00.14 /usr/bin/rkt api-service
 3245 root       20   0 32056 18768 13664 S  0.0  0.9  0:00.16 /dnsmasq-nanny -v=2 -logtostderr -configDir=/etc/k8s/dns/dnsmasq-nanny -restartDnsmasq=true -
 2625 root       20   0 12228  8328  3708 S  0.0  0.4  0:00.29 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containe
 7931 root       20   0  8928  3296  2812 S  0.0  0.2  0:00.04 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containe
F1Help  F2Setup F3SearchF4FilterF5Tree  F6SortByF7Nice -F8Nice +F9Kill  F10Quit

I assume that the next step is to follow https://github.com/kubernetes/community/blob/master/contributors/devel/profiling.md and instrument each of the processes in turn, to find out what they're doing when they should be idle?
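One hedged sketch of what that profiling could look like (assuming --profiling is left at its default of true on these components, and that the controller-manager still serves pprof on its local insecure port 10252, as its --address=127.0.0.1 flag above suggests):

# 30-second CPU profile of the apiserver, fetched through kubectl
kubectl get --raw '/debug/pprof/profile?seconds=30' > apiserver.pprof
go tool pprof -top apiserver.pprof

# the controller-manager binds its pprof endpoint to 127.0.0.1 inside the VM,
# so fetch that one from a `minikube ssh` session instead, e.g.:
#   curl -o /tmp/controller-manager.pprof 'http://localhost:10252/debug/pprof/profile?seconds=30'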

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 25, 2018
@Laski

Laski commented Dec 7, 2018

/remove-lifecycle stale

This is still happening to me in v0.30.0

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 7, 2018
@timbunce

timbunce commented Dec 8, 2018

@rcorre, or someone with permission, please edit the title to remove the reference to 1.7.

@rcorre rcorre changed the title 1.7 is CPU-hungry on minikube kubernetes is CPU-hungry on minikube Dec 8, 2018
@Laski

Laski commented Dec 10, 2018

Is there any information I can give to help debug this?

@erulabs
Contributor

erulabs commented Feb 25, 2019

Can someone from sig/scalability take a look at this? I would love to help out - can test, reproduce, explore, report... This is becoming quite a blocker for local Kubernetes development!

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 26, 2019
@antonmarin

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 27, 2019
@leahciMic

leahciMic commented Jun 17, 2019

Wondering what the best way is to develop locally for Kubernetes? The base load and slowness of minikube are a deal breaker for rapid development, especially when using a laptop.

I have thought about:

  1. Running it on a server dedicated to this. Running it on a server on my network could be a potential solution, but it has its own issues, like making it harder to update Docker images or to develop from a different network (for example, a coffee shop).

  2. Running services on my machine (elasticsearch, mysql, redis, etc.) and using them in development, then using containerized services in production. I have already discounted this idea, as I would like my development environment to match my production environment.

  3. Running with --vm-driver=none.

But all of these options have their own problems.

I assume that, with the popularity of Kubernetes and minikube, there must be a way to run a local cluster without the level of resource consumption that many people are seeing.

Is there an alternative to minikube that has lower overheads?

Basically, an idle mysql server on my laptop is no big deal, and I don't notice it. An idling minikube cluster (without any services running, mind you) is definitely felt.

What can I do to help?

@dims
Member

dims commented Jun 17, 2019

@leahciMic did you try kind? https://github.com/kubernetes-sigs/kind
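For reference, a minimal sketch of trying it out (assuming the kind binary and a working Docker daemon are already installed):

# create a single-node cluster that runs inside a Docker container, then point kubectl at it
kind create cluster --name dev
kubectl cluster-info --context kind-dev

# tear it down when finished
kind delete cluster --name dev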

@almike

almike commented Aug 14, 2019

I got 30-40% constant usage with minikube, 10-20% with microk8s.
I am new to Kubernetes and not sure where to start, but I could dig through some logs to help.
It would be nice to see this solved, as we need to run ingress locally on Mac and the only way to do so is to have Kubernetes running.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 12, 2019
@cobalamin

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 12, 2019
@StefPac

StefPac commented Nov 24, 2019

I am experiencing the same problem with minikube v1.5.2 on Darwin 10.14.6
and Kubernetes v1.16.2 on Docker 18.09.9, using Oracle VM VirtualBox Headless Interface 6.0.14.

@ankurkanoria

Similar issue; this vastly hinders local development agility. I set up a new barebones single-node development cluster using kubeadm (v1.17) inside a reasonably beefy VM (4 cores, 4 GB RAM, with KVM tuning like raw disk and virtio). Idle-state CPU usage is around 30% and spikes to 50-60% during container creation/deletion. The controller manager and scheduler often time out waiting for the apiserver and go into CrashLoopBackOff (40+ restarts after 10 hours of roughly idling). I see similar issues across a range of hardware configurations on different Linux distributions.

While searching around, I came across quite a few Stack Overflow posts and GitHub issues describing similar observations. While I understand that Kubernetes is optimized for running at scale, this does, non-trivially, hurt local development.

What would be great is some documentation pointing us towards configuration changes that reduce resource consumption at the expense of cluster reliability, e.g. health-check/poll frequencies. As developers working on local machines, many of us would be happy to take that tradeoff.
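As one hedged illustration of that kind of tuning, minikube can pass component flags through --extra-config; the flag names below are real kubelet and kube-controller-manager flags, but the values are only examples and deliberately trade responsiveness for lower idle CPU (on a kubeadm cluster the same settings would go through the kubelet configuration file and the controllerManager extraArgs):

# stretch the busiest polling loops on an otherwise idle development cluster
minikube start \
  --extra-config=kubelet.housekeeping-interval=1m \
  --extra-config=kubelet.sync-frequency=5m \
  --extra-config=controller-manager.node-monitor-period=30s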

@nyrahul

nyrahul commented Feb 16, 2020

Same issue (100% CPU for VBoxHeadless).

  • minikube v1.6.1 on Ubuntu 18.04 arch x86_64
  • Linux Kernel 5.3.0
  • K8s v1.17.0
  • vboxheadless v6.1.2

Attached is the perf report for VBoxHeadless (recorded over a few seconds while VBoxHeadless was at 100% CPU).
vboxheadless-perf-report.txt

@sunshine69

Hm, reading this thread is a bit painful. People keep posting and reporting, but it seems no one cares; the issue just keeps going stale.

I'll probably run away from the whole kubeXXX ecosystem altogether over behaviour like this.

Oh well, maybe the attitude is: submit a patch or shut up?

@sunshine69

sunshine69 commented Mar 8, 2020

And if someone asks why I have nothing better to do than post here, you can see below. But looking at the history above makes me not even bother to report anything...

Wake up, people. Otherwise you lose users, your project becomes less and less popular, and it eventually gets phased out...

top - 09:48:30 up 1 day, 12:29,  1 user,  load average: 0.38, 0.79, 0.94
Tasks: 340 total,   1 running, 339 sleeping,   0 stopped,   0 zombie
%Cpu(s):  3.2 us,  1.2 sy,  0.0 ni, 94.6 id,  0.8 wa,  0.0 hi,  0.1 si,  0.0 st
MiB Mem :  15891.9 total,   1112.2 free,   6238.8 used,   8540.9 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.   8825.4 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                   
 3333 libvirt+  20   0 8822744   2.7g  20904 S     26.2  17.3   3:25.87 qemu-system-x86                           
 3474 stevek    20   0 1202132  98308  52932 S   2.3   0.6   9:46.95 Xorg                                      
 4153 stevek    20   0  975464  53264  37600 S   2.0   0.3   0:07.13 gnome-terminal-                           
32747 stevek    20   0 9003620 244924  90080 S   1.3   1.5   1:20.64 chrome                                    
 3748 stevek    20   0 3863612 283392  98096 S   1.0   1.7  14:22.97 gnome-shell                               
 5367 stevek    20   0 1756812 528440 314792 S   0.7   3.2  26:00.70 chrome                         

@tstromberg

FWIW, minikube's CPU overhead has dropped substantially since v1.8: a reduction of 19% in 2020 alone.

On a modern developer machine like a MacBook Pro, minikube with hyperkit now consumes roughly 6% of available system CPU (30-40% of a single CPU core): https://docs.google.com/spreadsheets/d/1qzgVsZ9y0zqCjoQlN_LGJH3MUMqrVPexzNhdB2jzBqU/edit#gid=1614668143

It isn't perfect, but I feel like this issue can be closed, or at least moved to the minikube repo. In 2020, we'll be focusing on reducing usage in the apiserver and etcd, which is where most of minikube's CPU cycles are now spent.

@ravihugo

ravihugo commented Jul 2, 2020

@tstromberg That is great news. I hope the apiserver/etcd usage can be lowered!

@oomichi
Member

oomichi commented Sep 4, 2020

As noted in #48948 (comment), it seems fine to close this issue at this time.
In addition, this issue is about minikube. It is better to open an issue on https://github.com/kubernetes/minikube/issues to get more attention from minikube experts if it needs to be revisited.

/close

@k8s-ci-robot
Contributor

@oomichi: Closing this issue.

In response to this:

As noted in #48948 (comment), it seems fine to close this issue at this time.
In addition, this issue is about minikube. It is better to open an issue on https://github.com/kubernetes/minikube/issues to get more attention from minikube experts if it needs to be revisited.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@Pictor13

Pictor13 commented Apr 10, 2021

It isn't perfect, but I feel like this issue can be closed, or at least moved to the minikube repo. In 2020, we'll be focusing on reducing usage in the apiserver and etcd, which is where most of minikube's CPU cycles are now spent.

Thank you for your work, guys!
Sorry for commenting on a closed issue @tstromberg, but I was wondering if there's a specific Kubernetes issue on the tracker that we can follow for updates on these improvements to the apiserver and etcd.
I tried searching the GitHub issues, but couldn't find the right keywords 😅
