
Reopen for issue 75565 (high CPU utilization in Docker for mac desktop) #100415

Closed
bodhi-one opened this issue Mar 19, 2021 · 17 comments
Labels
kind/support Categorizes issue or PR as a support question.
sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery.
sig/node Categorizes an issue or PR as relevant to SIG Node.
triage/accepted Indicates an issue or PR is ready to be actively worked on.
triage/needs-information Indicates an issue needs more information in order to work on it.

Comments

@bodhi-one

bodhi-one commented Mar 19, 2021

#75565

Was this issue ever resolved? It is closed but the symptom is still occurring on Docker for Mac Desktop with Kubernetes 1.19.7 built in.

What happened:

Docker for Mac Desktop version 3.2.2 (61853)

With Kubernetes activated in Docker for Mac Desktop and 4 CPUs allocated, Docker shows ~30% CPU utilization while idle with no pods running.
With Kubernetes activated in Docker for Mac Desktop and 8 CPUs allocated, Docker shows ~60-100% CPU utilization while idle with no pods running.
With Kubernetes not activated, Docker shows ~7% while idle with no pods running.

What you expected to happen:

Kubernetes to not show high CPU utilization.

How to reproduce it (as minimally and precisely as possible):

Turn on Kubernetes in Docker for Mac Desktop.

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version): 1.19.7
  • Cloud provider or hardware configuration: MacBook Pro 16 (2020)
  • OS (e.g: cat /etc/os-release): macOS Big Sur
  • Kernel (e.g. uname -a): 20.3.0 Darwin Kernel Version 20.3.0: Thu Jan 21 00:07:06 PST 2021; root:xnu-7195.81.3~1/RELEASE_X86_64 x86_64
  • Install tools:
  • Network plugin and version (if this is a network-related bug):
  • Others:
@bodhi-one bodhi-one added the kind/bug Categorizes issue or PR as related to a bug. label Mar 19, 2021
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Mar 19, 2021
@neolit123
Member

high CPU utilization in Docker for mac desktop

This could be a Docker for Mac problem.
Are you seeing the same problem on a Linux machine?

/sig node

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Mar 21, 2021
@bodhi-one
Author

bodhi-one commented Mar 22, 2021

Per the original report (#75565), the issue appeared to be in Kubernetes, perhaps related to polling multiple times per second.

"Idle master constantly burns CPU/disk polling itself #75565"

"Foritus commented on Mar 25, 2019
The disk load appears to be entirely caused by etcd, presumably a second-order effect from the api server chatting to it a lot?"

Surely this cannot have been closed as chatty 'by design'? With 8 CPUs allocated to Docker for Mac Desktop, the battery life, heat, and CPU utilization are not practical.

Is it possible to work with the Docker for Mac team and add a hook/switch/something that would tell etcd polling to enter a 'mobile' or CPU-friendly mode and lessen whatever polling is being done here?

I do not currently have a Linux laptop to test this on.

@bodhi-one
Author

In docker/for-mac#3065

stephen-turner commented on Jul 7, 2020 • 

I'm going to close this ticket. We are in the middle of a big investigation about Mac CPU usage (see docker/roadmap#12). However, this specific ticket is an upstream issue in Kubernetes, kubernetes/kubernetes#75565, which we cannot fix in Docker Desktop. Other single-node Kubernetes distributions (minikube, kind, etc.) suffer from the same problem. It needs to be fixed upstream. Sorry that we can't do anything about it here.

Of course, this doesn't affect the other CPU tickets we have open that are within Docker's domain. We are aware of them, accept responsibility for them, and are actively looking at them.

There does seem to be unclarity around whose issue this is. The Docker for Mac team is saying Kubernetes. If there are tuning properties for etcd, and this issue manifests only when Kubernetes in Docker for Mac is active, and the issue worsens as more CPUs are allocated to Docker for Mac, how can we prove whether the issue is in the Kubernetes runtime or the Docker runtime? The Docker for Mac team says it is in the Kubernetes runtime. Are there etcd tuning parameters which Docker for Mac can/should set?

What additional information can I collect to help clarify the root cause and then move forward to a resolution?
Is there a common forum/meeting that k8s team and Docker for mac team has or can create to get a solution?

@lavalamp
Member

I'll repeat my comment from the other issue:

Yes, unfortunately scaling down to small systems has not been a priority. The good news is that there's probably a ton of low-hanging fruit for anyone interested in working on that.

E.g. if you just tune the lease flags on controller manager & scheduler you could probably reduce idle traffic significantly.
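For illustration only (not part of the original comment): the "lease flags" referred to here are presumably the leader-election flags that kube-controller-manager and kube-scheduler both expose. A hedged sketch with illustrative values; the flag names are real, the values are not recommendations:

```shell
# Leader-election lease flags on kube-controller-manager and kube-scheduler.
# Defaults are 15s / 10s / 2s; raising them reduces idle lease-renewal
# traffic to the API server, at the cost of slower failover after a crash.
# Values below are illustrative, not tuned recommendations.
kube-controller-manager \
  --leader-elect-lease-duration=60s \
  --leader-elect-renew-deadline=40s \
  --leader-elect-retry-period=10s

kube-scheduler \
  --leader-elect-lease-duration=60s \
  --leader-elect-renew-deadline=40s \
  --leader-elect-retry-period=10s
```

On a single-node setup where failover does not matter, leader election can also simply be disabled with `--leader-elect=false`.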

As far as whose "issue" this is: the Kubernetes runtime actually is doing work even when the system is "idle", in order to make sure all components are still alive and ready, etc.

Possibly you could compare the same install on different platforms to see if actually we somehow are triggering a docker for mac issue, but I kind of doubt that.

@BenTheElder
Member

I'm sorry this is probably not what you wanted to hear, but ...

There does seem to be unclarity around whose issue this is.

Kubernetes is not really responsible for the performance of any particular proprietary distro. If specific performance problems are reported, our contributors may take their time to look at them, or anyone (including new contributors) may attempt to contribute improvements. But a general complaint like this about a specific product is not very actionable for us.

SIG Scalability tends to work on this for large clusters, but there is not an organized group for small clusters.

Are there etcd tuning parameters which Docker for mac can/should set?

etcd is its own project and has documentation for this:
https://etcd.io/docs/v3.4.0/tuning/
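As an editorial aside (not from the thread): the tuning document linked above centers on etcd's time parameters. A sketch of the relevant flags; the flag names are real etcd flags, the values are illustrative only:

```shell
# etcd time-parameter flags from the tuning documentation.
# Raising the heartbeat interval and election timeout (both in ms) reduces
# how often an idle single-node etcd does periodic work, at the cost of
# slower failure detection. The election timeout should be several times
# the heartbeat interval. Values are examples, not recommendations.
etcd \
  --heartbeat-interval=500 \
  --election-timeout=5000 \
  --snapshot-count=10000
```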

What additional information can I collect to help clarify the root cause and then move forward to a resolution?

It sounds like you've already indicated that this is an issue with etcd, which would either be a bug in etcd or in its configuration in Docker for Mac.

Is there a common forum/meeting that k8s team and Docker for mac team has or can create to get a solution?

There isn't really one singular "k8s team", Kubernetes is an open governance project owned by the CNCF with contributors from all over the world working independently or at many different employers. The project is organized into SIGs that manage different portions of the code, but even that is voluntary. Each SIG has meetings. https://www.kubernetes.dev/resources/community-groups/

If the docker for mac team have an actionable Kubernetes bug, they should file it.

@lavalamp
Member

Re: etcd: it's almost certainly the case that the rest of the cluster is using etcd, rather than there being a "performance issue" with it.

The root cause is almost certainly that "idle" k8s clusters are not actually idle and this is by design so that they can recover rapidly from faults; k8s is not optimized for single-system use. There are many flags that could be configured to improve idle performance, but that's not an "issue" unless our docs are insufficiently clear (which they probably are).

@bodhi-one
Author

bodhi-one commented Mar 31, 2021

I don't have access to kube-scheduler on Docker for Mac. Is it an issue that the Docker for Mac packaging didn't implement/provide this, or is it buried somewhere in the k8s runtime where we can get at it for tuning?

@BenTheElder
Member

Usually kube-scheduler runs as a pod, so you can see the flags, logs, etc. using the normal pod tooling (kubectl get, kubectl logs, etc.).
Docker may choose to run it in a way that you cannot debug/tune it; probably Docker Inc. should be tuning this for you.
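For illustration (editorial addition): the "normal pod tooling" mentioned above might look like this. The pod name `kube-scheduler-docker-desktop` is an assumption based on the usual static-pod naming convention (component name plus node name) and may differ on your install:

```shell
# List the control-plane pods to find the scheduler's actual pod name.
kubectl -n kube-system get pods

# Inspect its logs (pod name assumed; substitute the one listed above).
kubectl -n kube-system logs kube-scheduler-docker-desktop

# Show the flags the scheduler was started with, via the pod spec.
kubectl -n kube-system get pod kube-scheduler-docker-desktop -o yaml
```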

@SergeyKanzhelev
Member

/remove-sig node
/sig api-machinery

since the issue is not with the node configuration, but rather API chattiness

/kind support
/remove-kind bug
/triage needs-information

As commented above, this issue looks like a support ticket asking for help tuning Kubernetes for a specific use case, rather than a bug report. It also lacks information on how exactly Kubernetes was installed and configured, which any contributor would need to try to reproduce and troubleshoot it, if anybody were interested. I understand the desire to run everything on a single box for things like testing. And we do, with kind, for example. I don't recall serious issues with kind, but maybe that's because I run it on Linux? Anyway, Kubernetes does not use issues on this repo for support requests. If you have a question about how to use Kubernetes or need to debug a specific issue, please visit our forums.

/close

@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. kind/support Categorizes issue or PR as a support question. triage/needs-information Indicates an issue needs more information in order to work on it. labels Jun 24, 2021
@ehashman
Member

/remove-kind bug
/close

@k8s-ci-robot k8s-ci-robot removed the kind/bug Categorizes issue or PR as related to a bug. label Jun 24, 2021
@k8s-ci-robot
Contributor

@ehashman: Closing this issue.

In response to this:

/remove-kind bug
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@fedebongio
Contributor

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jul 15, 2021
@fedebongio
Contributor

/triage accepted

@SergeyKanzhelev
Member

/triage accepted

Did you mean to re-open the issue? Is there additional information on what's needed?

@BenTheElder
Member

BenTheElder commented Jul 15, 2021

This is just something api-machinery does to remove issues from a search for issues to triage. They do this as a group in a frequent meeting; fede's account is often the one enacting the commands, but it's not fede personally.

@lavalamp
Member

What Ben said :)

since the issue is not with the node configuration, but rather API chattiness

This is just fundamentally how k8s works, there is zero chance it will get changed.

(There may still be some misconfiguration that makes a particular setup especially chatty, e.g. continually health checking something because it's not healthy.)
