
Reopen for issue 75565 (high CPU utilization in Docker for mac desktop) #100415

Closed
bodhi-one opened this issue Mar 19, 2021 · 17 comments
Labels
kind/support Categorizes issue or PR as a support question.
sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery.
sig/node Categorizes an issue or PR as relevant to SIG Node.
triage/accepted Indicates an issue or PR is ready to be actively worked on.
triage/needs-information Indicates an issue needs more information in order to work on it.

Comments

@bodhi-one

bodhi-one commented Mar 19, 2021

#75565

Was this issue ever resolved? It is closed but the symptom is still occurring on Docker for Mac Desktop with Kubernetes 1.19.7 built in.

What happened:

Docker for Mac Desktop version 3.2.2 (61853)

With Kubernetes activated in Docker for Mac Desktop and 4 CPUs allocated, Docker shows ~30% CPU utilization while idle with no pods running.
With Kubernetes activated in Docker for Mac Desktop and 8 CPUs allocated, Docker shows ~60-100% CPU utilization while idle with no pods running.
With Kubernetes not activated, Docker shows ~7% while idle with no pods running.

What you expected to happen:

Kubernetes to not show high CPU utilization.

How to reproduce it (as minimally and precisely as possible):

Turn on Kubernetes in Docker for Mac Desktop.

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version): 1.19.7
  • Cloud provider or hardware configuration: MacBook Pro 16 (2020)
  • OS (e.g: cat /etc/os-release): macOS Big Sur
  • Kernel (e.g. uname -a): 20.3.0 Darwin Kernel Version 20.3.0: Thu Jan 21 00:07:06 PST 2021; root:xnu-7195.81.3~1/RELEASE_X86_64 x86_64
  • Install tools:
  • Network plugin and version (if this is a network-related bug):
  • Others:
@bodhi-one bodhi-one added the kind/bug Categorizes issue or PR as related to a bug. label Mar 19, 2021
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Mar 19, 2021
@neolit123
Member

high CPU utilization in Docker for mac desktop

This could be a Docker for Mac problem.
Are you seeing the same problem on a Linux machine?

/sig node

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Mar 21, 2021
@bodhi-one
Author

bodhi-one commented Mar 22, 2021

Per the original report (#75565), the issue appeared to be in Kubernetes, perhaps related to polling multiple times per second.

"Idle master constantly burns CPU/disk polling itself #75565"

"Foritus commented on Mar 25, 2019
The disk load appears to be entirely caused by etcd, presumably a second-order effect from the api server chatting to it a lot?"

Surely this cannot have been closed as chatty 'by design'? With 8 CPUs allocated to Docker for Mac Desktop, the battery life, heat, and CPU utilization are not practical.

Is it possible to work with the Docker for Mac team and add a hook/switch/something that would tell etcd polling to enter a 'mobile' or CPU-friendly mode and lessen whatever polling is being done here?

I do not currently have a Linux laptop to test this on.

@bodhi-one
Author

In docker/for-mac#3065

stephen-turner commented on Jul 7, 2020 • 

I'm going to close this ticket. We are in the middle of a big investigation about Mac CPU usage (see docker/roadmap#12). However, this specific ticket is an upstream issue in Kubernetes, kubernetes/kubernetes#75565, which we cannot fix in Docker Desktop. Other single-node Kubernetes distributions (minikube, kind, etc.) suffer from the same problem. It needs to be fixed upstream. Sorry that we can't do anything about it here.

Of course, this doesn't affect the other CPU tickets we have open that are within Docker's domain. We are aware of them, accept responsibility for them, and are actively looking at them.

There does seem to be unclarity around whose issue this is. The Docker for Mac team is saying Kubernetes. If there are tuning properties for etcd, and this issue manifests only when Kubernetes in Docker for Mac is active, and the issue worsens as more CPUs are allocated to Docker for Mac, how can we prove whether the issue is in the Kubernetes runtime or the Docker runtime? The Docker for Mac team says it is in the Kubernetes runtime. Are there etcd tuning parameters which Docker for Mac can/should set?

What additional information can I collect to help clarify the root cause and then move forward to a resolution?
Is there a common forum/meeting that k8s team and Docker for mac team has or can create to get a solution?

@lavalamp
Member

I'll repeat my comment from the other issue:

Yes, unfortunately scaling down to small systems has not been a priority. The good news is that there's probably a ton of low-hanging fruit for anyone interested in working on that.

E.g. if you just tune the lease flags on controller manager & scheduler you could probably reduce idle traffic significantly.
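For illustration only (not part of the original comment): the "lease flags" referred to here are presumably the leader-election flags that kube-controller-manager and kube-scheduler both expose. A hedged sketch with illustrative values; the flag names are real, the values are not recommendations:

```shell
# Leader-election lease flags on kube-controller-manager and kube-scheduler.
# Defaults are 15s / 10s / 2s; raising them reduces idle lease-renewal
# traffic to the API server, at the cost of slower failover after a crash.
# Values below are illustrative, not tuned recommendations.
kube-controller-manager \
  --leader-elect-lease-duration=60s \
  --leader-elect-renew-deadline=40s \
  --leader-elect-retry-period=10s

kube-scheduler \
  --leader-elect-lease-duration=60s \
  --leader-elect-renew-deadline=40s \
  --leader-elect-retry-period=10s
```

On a single-node setup where failover does not matter, leader election can also simply be disabled with `--leader-elect=false`.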

As far as whose "issue" this is: the Kubernetes runtime actually is doing work even when the system is "idle", in order to make sure all components are still alive and ready, etc.

Possibly you could compare the same install on different platforms to see if actually we somehow are triggering a docker for mac issue, but I kind of doubt that.

@BenTheElder
Member

I'm sorry this is probably not what you wanted to hear, but ...

There does seem to be unclarity around whose issue this is.

Kubernetes is not really responsible for the performance of any particular proprietary distro. If specific performance problems are reported, our contributors may take their time to look at them, or anyone (including new contributors) may attempt to contribute improvements. But a general complaint like this about a specific product is not very actionable for us.

SIG Scalability tends to work on this for large clusters, but there is not an organized group for small clusters.

Are there etcd tuning parameters which Docker for mac can/should set?

etcd is its own project and has documentation for this:
https://etcd.io/docs/v3.4.0/tuning/
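As an editorial aside (not from the thread): the tuning document linked above centers on etcd's time parameters. A sketch of the relevant flags; the flag names are real etcd flags, the values are illustrative only:

```shell
# etcd time-parameter flags from the tuning documentation.
# Raising the heartbeat interval and election timeout (both in ms) reduces
# how often an idle single-node etcd does periodic work, at the cost of
# slower failure detection. The election timeout should be several times
# the heartbeat interval. Values are examples, not recommendations.
etcd \
  --heartbeat-interval=500 \
  --election-timeout=5000 \
  --snapshot-count=10000
```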

What additional information can I collect to help clarify the root cause and then move forward to a resolution?

It sounds like you've already indicated that this is an issue with etcd, which would either be a bug in etcd or in its configuration in Docker for Mac.

Is there a common forum/meeting that k8s team and Docker for mac team has or can create to get a solution?

There isn't really one singular "k8s team", Kubernetes is an open governance project owned by the CNCF with contributors from all over the world working independently or at many different employers. The project is organized into SIGs that manage different portions of the code, but even that is voluntary. Each SIG has meetings. https://www.kubernetes.dev/resources/community-groups/

If the docker for mac team have an actionable Kubernetes bug, they should file it.

@lavalamp
Member

Re: etcd: it's almost certainly the case that the rest of the cluster is using etcd, rather than there being a "performance issue" with it.

The root cause is almost certainly that "idle" k8s clusters are not actually idle and this is by design so that they can recover rapidly from faults; k8s is not optimized for single-system use. There are many flags that could be configured to improve idle performance, but that's not an "issue" unless our docs are insufficiently clear (which they probably are).

@bodhi-one
Author

bodhi-one commented Mar 31, 2021

I don't have access to kube-scheduler on Docker for Mac. Is it an issue that the Docker for Mac packaging didn't implement/provide this, or is it buried somewhere in the k8s runtime where we can get at it for tuning?

@BenTheElder
Member

Usually kube-scheduler runs as a pod, so you can see the flags, logs, etc. using the normal pod tooling (kubectl get, kubectl logs, etc.).
Docker may choose to run it in a way that you cannot debug/tune it; probably Docker Inc. should be tuning this for you.
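For illustration (editorial addition): the "normal pod tooling" mentioned above might look like this. The pod name `kube-scheduler-docker-desktop` is an assumption based on the usual static-pod naming convention (component name plus node name) and may differ on your install:

```shell
# List the control-plane pods to find the scheduler's actual pod name.
kubectl -n kube-system get pods

# Inspect its logs (pod name assumed; substitute the one listed above).
kubectl -n kube-system logs kube-scheduler-docker-desktop

# Show the flags the scheduler was started with, via the pod spec.
kubectl -n kube-system get pod kube-scheduler-docker-desktop -o yaml
```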

@SergeyKanzhelev
Member

/remove-sig node
/sig api-machinery

since the issue is not with the node configuration, but rather API chattiness

/kind support
/remove-kind bug
/triage needs-information

As commented above, this issue looks like a support ticket asking for help tuning Kubernetes for a specific use case, rather than a bug report. It also lacks information on how exactly Kubernetes was installed and configured, which any contributor would need to try to reproduce and troubleshoot it, if anybody were interested. I understand the desire to run everything on a single box for things like testing. And we do, with kind, for example. I don't recall serious issues with kind, but maybe that's because I run it on Linux? Anyway, Kubernetes does not use issues on this repo for support requests. If you have a question about how to use Kubernetes or need to debug a specific issue, please visit our forums.

/close

@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. kind/support Categorizes issue or PR as a support question. triage/needs-information Indicates an issue needs more information in order to work on it. labels Jun 24, 2021
@ehashman
Member

/remove-kind bug
/close

@k8s-ci-robot k8s-ci-robot removed the kind/bug Categorizes issue or PR as related to a bug. label Jun 24, 2021
@k8s-ci-robot
Contributor

@ehashman: Closing this issue.

In response to this:

/remove-kind bug
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@fedebongio
Contributor

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jul 15, 2021
@fedebongio
Contributor

/triage accepted

@SergeyKanzhelev
Member

/triage accepted

Did you mean to re-open the issue? Is there additional information on what's needed?

@BenTheElder
Member

BenTheElder commented Jul 15, 2021

This is just something api-machinery does to remove issues from a search for issues to triage. They do this as a group in a frequent meeting; fede's account is often the one enacting the commands, but it's not fede personally.

@lavalamp
Member

What Ben said :)

since the issue is not with the node configuration, but rather API chattiness

This is just fundamentally how k8s works, there is zero chance it will get changed.

(There may still be some misconfiguration that makes a particular setup especially chatty, e.g. continually health checking something because it's not healthy.)
