
kube-proxy constantly syncing/restoring iptables rules, consuming CPU resources #1158

Closed
atombender opened this issue Feb 20, 2017 · 46 comments
Labels
area/performance (Performance related issues), co/xhyve, kind/bug (Categorizes issue or PR as related to a bug)

Comments

@atombender

atombender commented Feb 20, 2017

Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT

Minikube version (use minikube version): 0.16.0

Environment:

  • OS (e.g. from /etc/os-release): macOS 10.12.3
  • VM Driver (e.g. cat ~/.minikube/machines/minikube/config.json | grep DriverName): xhyve
  • ISO version (e.g. cat ~/.minikube/machines/minikube/config.json | grep ISO): minikube-v1.0.6.iso
  • Install tools: -
  • Others: ingress addon enabled.

What happened:

localkube is consuming about 12% CPU constantly, even though no container is actively running. On host, docker-machine-driver-xhyve is consuming ~30% CPU.

Nothing much is actually running:

$ kubectl get --all-namespaces pods            
NAMESPACE     NAME                             READY     STATUS    RESTARTS   AGE
kube-system   default-http-backend-27qh8       1/1       Running   0          5h
kube-system   kube-addon-manager-minikube      1/1       Running   0          5h
kube-system   kube-dns-v20-k1vv5               3/3       Running   0          5h
kube-system   kubernetes-dashboard-scg1j       1/1       Running   0          5h
kube-system   nginx-ingress-controller-mzn9l   1/1       Running   0          5h

Verified with ps that no container is using that CPU: It's all localkube. According to ps, localkube has also allocated 10GB of virtual memory.

Here is a gist with ps output, plus output from running journalctl -f for half a minute or so. It mostly shows it asking about some containers that no longer exist.

Nothing is being emitted to any container logs.

Problem re-occurs if I kill the process.

Here's an strace log (-fF -s10000 -tt). Looks like it's constantly spawning iptables, iptables-save and iptables-restore.
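For anyone else digging in: a quick way to quantify the churn is to count iptables-family execve calls in an strace log. A sketch only; the heredoc sample below stands in for a real log, which you would capture with something like strace -f -e trace=execve -p followed by the localkube pid:

```shell
# Sample strace lines standing in for a real capture of localkube
cat > /tmp/localkube.strace <<'EOF'
[pid 19500] execve("/sbin/iptables", ["iptables", "-w2", "-N", "KUBE-SERVICES", "-t", "filter"], ...)
[pid 19508] execve("/sbin/iptables-save", ["iptables-save", "-t", "filter"], ...)
[pid 19510] execve("/sbin/iptables-restore", ["iptables-restore", "--noflush", "--counters"], ...)
EOF
# Count how often iptables/iptables-save/iptables-restore get spawned
grep -c 'execve("/sbin/iptables' /tmp/localkube.strace
```

On a real trace over a fixed interval, this count divided by the interval gives a rough forks-per-second figure to compare across versions.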

What you expected to happen:

I expected localkube not to use much CPU when the system is idle.

How to reproduce it (as minimally and precisely as possible):

No idea. I did install Helm (the tiller controller), but I uninstalled it.

r2d4 added the kind/bug label on Feb 21, 2017
@eden
Contributor

eden commented Mar 3, 2017

This is most likely due to this issue with kube-proxy. The fix hasn't yet been released in the latest non-beta version of kubernetes.

Unfortunately, trying to force localkube to use the userspace proxy by passing --extra-config=proxy.Mode=userspace to minikube start does not work. That would have been a temporary workaround until kube-proxy is fixed.

@dimpavloff

@eden kubernetes/kubernetes#26637 seems to be an issue only with multiple schedulers and controller-managers so could it be something else?

@eden
Contributor

eden commented Mar 3, 2017

@dimpavloff Initially, I thought the same thing, but after increasing the verbosity of minikube's logs to 10, I see this in localkube's logs nearly continuously:

Mar 03 10:22:33 minikube localkube[3315]: I0303 10:22:33.052801    3315 proxier.go:805] Syncing iptables rules
Mar 03 10:22:33 minikube localkube[3315]: I0303 10:22:33.095894    3315 proxier.go:1103] Port "nodePort for kube-system/kubernetes-dashboard:" (:30000/tcp) was open before and is still needed
Mar 03 10:22:33 minikube localkube[3315]: I0303 10:22:33.096150    3315 proxier.go:1311] Restoring iptables rules: *filter
Mar 03 10:22:33 minikube localkube[3315]: I0303 10:22:33.103277    3315 proxier.go:798] syncProxyRules took 50.469613ms
Mar 03 10:22:33 minikube localkube[3315]: I0303 10:22:33.103304    3315 proxier.go:567] OnEndpointsUpdate took 50.53347ms for 5 endpoints

There's a lot going on in kubernetes/kubernetes#26637, but this fix in particular looks like it might help: kubernetes/kubernetes#41223

@keimoon

keimoon commented Mar 24, 2017

@eden kubernetes fixed this issue in v1.5.4, so when will minikube support this version?

@eden
Contributor

eden commented Mar 24, 2017

You should be able to get minikube to start using 1.5.4 by starting it with the --kubernetes-version option, but that version must first be built and uploaded. You can see what's available here. Right now 1.5.4 doesn't look like it's in the list.

You can use 1.6, too, if you're ok with a beta, and when I did, I noticed the iptables issue is gone. However, I noticed something new that seems to cause a small amount of continuous CPU use (although anecdotally not as much). I had to switch back because I wasn't ready for some other changes that 1.6 introduced.

@r2d4
Contributor

r2d4 commented Mar 24, 2017

Two points

  • The leader election polling is because of the external hostpath provisioner we've implemented. This is going to be in all future versions, although leader election is a bit overkill for a single-node cluster.

  • We can and should ship a 1.5.4 and 1.5.5. I can do that today. I'm a little reluctant to make those the default since it's going to lead to a bunch of rebasing once we merge the 1.6 branch, but we will make them available.

@mezis

mezis commented Apr 4, 2017

Confirming this issue still exists with Minikube running kubernetes 1.6.0, on a fresh install:

$ minikube version
minikube version: v0.17.1

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.1", GitCommit:"b0b7a323cc5a4a2019b2e9520c21c7830b7f708e", GitTreeState:"clean", BuildDate:"2017-04-03T23:37:53Z", GoVersion:"go1.8", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.0", GitCommit:"fff5156092b56e6bd60fff75aad4dc9de6b6ef37", GitTreeState:"dirty", BuildDate:"1970-01-01T00:00:00Z", GoVersion:"go1.7", Compiler:"gc", Platform:"linux/amd64"}

top reports localkube using an average of 4%, with occasional docker usage spikes.

@lefeverd

lefeverd commented Apr 4, 2017

Same CPU issue here, though I'm not sure whether it's related to the original one.
The VM is always using around 20% CPU, even with no containers running other than the system ones.

VM Driver: xhyve
Minikube:

$ minikube version
minikube version: v0.17.1

Kubernetes:

$ kubectl version  
Client Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.0", GitCommit:"fff5156092b56e6bd60fff75aad4dc9de6b6ef37", GitTreeState:"clean", BuildDate:"2017-03-28T19:15:41Z", GoVersion:"go1.8", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.0", GitCommit:"fff5156092b56e6bd60fff75aad4dc9de6b6ef37", GitTreeState:"dirty", BuildDate:"1970-01-01T00:00:00Z", GoVersion:"go1.7", Compiler:"gc", Platform:"linux/amd64"}

Logs:
It seems kube-addon-manager-minikube is consuming a lot; I see the same error happening again and again in the logs:

2017-04-04T19:26:46.261306567Z WRN: == Failed to execute /usr/local/bin/kubectl  apply --namespace=kube-system -f /etc/kubernetes/addons     --prune=true -l kubernetes.io/cluster-service=true --recursive >/dev/null at 2017-04-04T19:26:46+0000. 0 tries remaining. ==
2017-04-04T19:26:51.264005597Z WRN: == Kubernetes addon reconcile completed with errors at 2017-04-04T19:26:51+0000 ==
2017-04-04T19:26:51.766514723Z error: no objects passed to create
2017-04-04T19:26:51.769507638Z INFO: == Kubernetes addon ensure completed at 2017-04-04T19:26:51+0000 ==
2017-04-04T19:27:35.408150389Z error: error pruning namespaced object extensions/v1beta1, Kind=HorizontalPodAutoscaler: the server could not find the requested resource

EDIT
Apparently the issue above is due to the addon manager version not being compatible with Kubernetes 1.6.0 (see kubernetes/kubernetes#43755).
I tried again with the default Kubernetes version:

Client Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.0", GitCommit:"fff5156092b56e6bd60fff75aad4dc9de6b6ef37", GitTreeState:"clean", BuildDate:"2017-03-28T19:15:41Z", GoVersion:"go1.8", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.3", GitCommit:"029c3a408176b55c30846f0faedf56aae5992e9b", GitTreeState:"clean", BuildDate:"1970-01-01T00:00:00Z", GoVersion:"go1.7.3", Compiler:"gc", Platform:"linux/amd64"}

No more errors in the logs, but CPU usage is still high.

@racerpeter

For what it's worth, I was able to confirm that the iptables churn is no longer an issue with minikube 0.18.0, which includes kubernetes/kubernetes#41223.

I can also confirm that the fix only cut CPU usage by about half (I was hovering around 25-30% on the osx host, now down to 10-15%), but it's not iptables, as evidenced by the lack of iptables-related output from strace-ing localkube.

@VolCh

VolCh commented May 14, 2017

Freshly installed minikube & kubectl on Ubuntu 17.04 with VirtualBox 5.1.18_Ubuntur114002:
minikube version: v0.19.0

Client Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.3", GitCommit:"0480917b552be33e2dba47386e51decb1a211df6", GitTreeState:"clean", BuildDate:"2017-05-10T15:48:59Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"}

Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.0", GitCommit:"fff5156092b56e6bd60fff75aad4dc9de6b6ef37", GitTreeState:"clean", BuildDate:"2017-05-09T23:19:49Z", GoVersion:"go1.7.1", Compiler:"gc", Platform:"linux/amd64"}

Nothing started, nothing configured, just minikube start, and it's consuming ~30% CPU on a Core i7.

@francisu

Happens with these versions:

minikube version: v0.19.1

Client Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.6", GitCommit:"7fa1c1756d8bc963f1a389f4a6937dc71f08ada2", GitTreeState:"clean", BuildDate:"2017-06-16T18:34:20Z", GoVersion:"go1.7.6", Compiler:"gc", Platform:"darwin/amd64"}

Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.4", GitCommit:"d6f433224538d4f9ca2f7ae19b252e6fcb66a3ae", GitTreeState:"clean", BuildDate:"2017-05-30T22:03:41Z", GoVersion:"go1.7.3", Compiler:"gc", Platform:"linux/amd64"}

Just did a minikube start and it's taking about 15-18% of my CPU.

@stela

stela commented Jul 3, 2017

It got much worse with minikube v0.20.0 and kubernetes 1.7.0:

$ minikube start --memory=6144 --vm-driver=xhyve --kubernetes-version v1.7.0
...
$ minikube version
minikube version: v0.20.0

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.0", GitCommit:"d3ada0119e776222f11ec7945e6d860061339aad", GitTreeState:"clean", BuildDate:"2017-06-30T09:51:01Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.0", GitCommit:"d3ada0119e776222f11ec7945e6d860061339aad", GitTreeState:"clean", BuildDate:"2017-06-30T10:17:58Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

$ minikube ssh
$ top
  PID USER      PR  NI    VIRT    RES  %CPU %MEM     TIME+ S COMMAND                                                                                                                           
 3208 root      20   0 10.829g 404.7m  50.7  6.8   2:30.70 S localkube

My mac's Activity Monitor claims docker-machine-driver-xhyve is using about 110% CPU; the 50% reported by top is relative to two CPUs. docker ps on minikube reports only kubernetes system processes, and "kubectl get all" also indicates I didn't run any pods of my own.

@francisu

francisu commented Jul 3, 2017

With 0.19.1 here is a snippet from my logs:

Jul 03 17:35:03 minikube localkube[3600]: E0703 17:35:03.698261    3600 kuberuntime_image.go:106] ListImages failed: rpc error: code = 2 desc = Cannot connect to the Docker daemon. Is the docker daemon running on this host?
Jul 03 17:35:03 minikube localkube[3600]: W0703 17:35:03.698266    3600 image_gc_manager.go:176] [imageGCManager] Failed to update image list: rpc error: code = 2 desc = Cannot connect to the Docker daemon. Is the docker daemon running on this host?
Jul 03 17:35:03 minikube localkube[3600]: E0703 17:35:03.919380    3600 remote_runtime.go:163] ListPodSandbox with filter "nil" from runtime service failed: rpc error: code = 2 desc = Cannot connect to the Docker daemon. Is the docker daemon running on this host?
Jul 03 17:35:03 minikube localkube[3600]: E0703 17:35:03.919406    3600 kuberuntime_sandbox.go:185] ListPodSandbox failed: rpc error: code = 2 desc = Cannot connect to the Docker daemon. Is the docker daemon running on this host?
Jul 03 17:35:03 minikube localkube[3600]: E0703 17:35:03.919413    3600 generic.go:198] GenericPLEG: Unable to retrieve pods: rpc error: code = 2 desc = Cannot connect to the Docker daemon. Is the docker daemon running on this host?
Jul 03 17:35:04 minikube localkube[3600]: E0703 17:35:04.920018    3600 remote_runtime.go:163] ListPodSandbox with filter "nil" from runtime service failed: rpc error: code = 2 desc = Cannot connect to the Docker daemon. Is the docker daemon running on this host?
Jul 03 17:35:04 minikube localkube[3600]: E0703 17:35:04.920046    3600 kuberuntime_sandbox.go:185] ListPodSandbox failed: rpc error: code = 2 desc = Cannot connect to the Docker daemon. Is the docker daemon running on this host?
Jul 03 17:35:04 minikube localkube[3600]: E0703 17:35:04.920054    3600 generic.go:198] GenericPLEG: Unable to retrieve pods: rpc error: code = 2 desc = Cannot connect to the Docker daemon. Is the docker daemon running on this host?
Jul 03 17:35:05 minikube localkube[3600]: E0703 17:35:05.920799    3600 remote_runtime.go:163] ListPodSandbox with filter "nil" from runtime service failed: rpc error: code = 2 desc = Cannot connect to the Docker daemon. Is the docker daemon running on this host?
Jul 03 17:35:05 minikube localkube[3600]: E0703 17:35:05.920826    3600 kuberuntime_sandbox.go:185] ListPodSandbox failed: rpc error: code = 2 desc = Cannot connect to the Docker daemon. Is the docker daemon running on this host?
Jul 03 17:35:05 minikube localkube[3600]: E0703 17:35:05.920834    3600 generic.go:198] GenericPLEG: Unable to retrieve pods: rpc error: code = 2 desc = Cannot connect to the Docker daemon. Is the docker daemon running on this host?
Jul 03 17:35:05 minikube localkube[3600]: E0703 17:35:05.928964    3600 kubelet.go:2079] Container runtime not ready: RuntimeReady=false reason:DockerDaemonNotReady message:docker: failed to get docker version: Cannot connect to the Docker daemon. Is the docker daemon running on this host?
Jul 03 17:35:06 minikube localkube[3600]: I0703 17:35:06.265710    3600 kubelet.go:1752] skipping pod synchronization - [container runtime is down]
Jul 03 17:35:06 minikube localkube[3600]: E0703 17:35:06.922085    3600 remote_runtime.go:163] ListPodSandbox with filter "nil" from runtime service failed: rpc error: code = 2 desc = Cannot connect to the Docker daemon. Is the docker daemon running on this host?
Jul 03 17:35:06 minikube localkube[3600]: E0703 17:35:06.922114    3600 kuberuntime_sandbox.go:185] ListPodSandbox failed: rpc error: code = 2 desc = Cannot connect to the Docker daemon. Is the docker daemon running on this host?
Jul 03 17:35:06 minikube localkube[3600]: E0703 17:35:06.922123    3600 generic.go:198] GenericPLEG: Unable to retrieve pods: rpc error: code = 2 desc = Cannot connect to the Docker daemon. Is the docker daemon running on this host?
Jul 03 17:35:07 minikube localkube[3600]: E0703 17:35:07.922878    3600 remote_runtime.go:163] ListPodSandbox with filter "nil" from runtime service failed: rpc error: code = 2 desc = Cannot connect to the Docker daemon. Is the docker daemon running on this host?
Jul 03 17:35:07 minikube localkube[3600]: E0703 17:35:07.922908    3600 kuberuntime_sandbox.go:185] ListPodSandbox failed: rpc error: code = 2 desc = Cannot connect to the Docker daemon. Is the docker daemon running on this host?
Jul 03 17:35:07 minikube localkube[3600]: E0703 17:35:07.922917    3600 generic.go:198] GenericPLEG: Unable to retrieve pods: rpc error: code = 2 desc = Cannot connect to the Docker daemon. Is the docker daemon running on this host?

@wstrange

wstrange commented Jul 5, 2017

Also seeing very high system CPU time on 1.7.0 / MacOS / VirtualBox. With 2 CPUs allocated, each is at 50% System CPU. VBox on the mac is chewing up 150% CPU.

@zacheryph

MacOS / VBox / Kube 1.6.4. minikube start -v 3. Kube 1.7 seems to bring the iptables overhead back (and a LOT more CPU usage). 1.6.4 only shows the below repeatedly; I also see eviction manager output every once in a while between these.

Jul 05 19:14:02 minikube localkube[3498]: I0705 19:14:02.157508    3498 config.go:95] Calling handler.OnEndpointsUpdate()
Jul 05 19:14:02 minikube localkube[3498]: I0705 19:14:02.157562    3498 healthcheck.go:223] Not saving endpoints for unknown healthcheck "kube-system/kubernetes-dashboard"
Jul 05 19:14:02 minikube localkube[3498]: I0705 19:14:02.157576    3498 healthcheck.go:223] Not saving endpoints for unknown healthcheck "kube-system/kube-dns"
Jul 05 19:14:03 minikube localkube[3498]: I0705 19:14:03.083730    3498 wrap.go:75] GET /api/v1/namespaces/kube-system/endpoints/kube-controller-manager: (988.922µs) 200 [[localkube/v1.6.4 (linux/amd64) kubernetes/$Format/leader-election] 127.0.0.1:32964]
Jul 05 19:14:03 minikube localkube[3498]: I0705 19:14:03.087432    3498 wrap.go:75] PUT /api/v1/namespaces/kube-system/endpoints/kube-controller-manager: (2.64199ms) 200 [[localkube/v1.6.4 (linux/amd64) kubernetes/$Format/leader-election] 127.0.0.1:32964]
Jul 05 19:14:03 minikube localkube[3498]: I0705 19:14:03.088801    3498 config.go:95] Calling handler.OnEndpointsUpdate()
Jul 05 19:14:03 minikube localkube[3498]: I0705 19:14:03.089178    3498 healthcheck.go:223] Not saving endpoints for unknown healthcheck "kube-system/kube-dns"
Jul 05 19:14:03 minikube localkube[3498]: I0705 19:14:03.089329    3498 healthcheck.go:223] Not saving endpoints for unknown healthcheck "kube-system/kubernetes-dashboard"
Jul 05 19:14:04 minikube localkube[3498]: I0705 19:14:04.161904    3498 wrap.go:75] GET /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (1.45693ms) 200 [[localkube/v1.6.4 (linux/amd64) kubernetes/$Format/leader-election] 127.0.0.1:32964]
Jul 05 19:14:04 minikube localkube[3498]: I0705 19:14:04.168189    3498 wrap.go:75] PUT /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (5.547657ms) 200 [[localkube/v1.6.4 (linux/amd64) kubernetes/$Format/leader-election] 127.0.0.1:32964]
Jul 05 19:14:04 minikube localkube[3498]: I0705 19:14:04.168948    3498 config.go:95] Calling handler.OnEndpointsUpdate()

@zacheryph

MacOS / VBox / Kube 1.7.0. minikube --kubernetes-version v1.7.0 -v 3. The part that caught my eye in v1.7.0 is the following output: it seems the runner has a [min] sync period of 0. Maybe this got fixed earlier and is a regression in 1.7? I tried tracking down how this gets set. It appears to be settable via --extra-config, but I couldn't figure out how to set it.

Jul 05 19:26:56 minikube localkube[3483]: I0705 19:26:56.190243    3483 proxier.go:991] Syncing iptables rules
Jul 05 19:26:56 minikube localkube[3483]: I0705 19:26:56.417815    3483 bounded_frequency_runner.go:221] sync-runner: ran, next possible in 0s, periodic in 0s

@zacheryph

I don't think this makes much of a difference either but I see the same results with qemu on Debian 9 as well. Just wanted to give an idea that this appears to be minikube/localkube related and not MacOS related.

@stela

stela commented Jul 6, 2017

@zacheryph great research :) There is a KubeProxyIPTablesConfiguration.MinSyncPeriod setting, see https://godoc.org/k8s.io/kubernetes/pkg/apis/componentconfig#KubeProxyIPTablesConfiguration

// minSyncPeriod is the minimum period that iptables rules are refreshed (e.g. '5s', '1m',
// '2h22m').
MinSyncPeriod metav1.Duration

When I typed dmesg, the kernel logs showed a lot of repeated networking-related entries, so I guess it makes sense. I didn't find the location of any other logs, though.

The KubeProxyIPTablesConfiguration struct is used in the KubeProxyConfiguration struct, as a field named IPTables. To set KubeProxyConfiguration, the key should start with proxy.
As far as I can understand, the way to set it would then be:

--extra-config=proxy.IPTables.SyncPeriod=1m --extra-config=proxy.IPTables.MinSyncPeriod=1m

in order to run the sync every minute instead of continuously. However, the CPU usage stays high for me even with those extra options :( I do see the options being passed on to localkube, though.
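To make the key construction described above concrete: the flattened --extra-config key is just the component name joined with the dot-separated struct field path. An illustrative sketch only; the actual resolution happens inside minikube:

```shell
# component + dot-joined field path + value = extra-config key (illustration)
component="proxy"
field_path="IPTables.MinSyncPeriod"
value="1m"
echo "--extra-config=${component}.${field_path}=${value}"
```

This prints --extra-config=proxy.IPTables.MinSyncPeriod=1m, matching the flag shape used above.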

I copied an strace binary into the minikube VM; running it against the localkube process with the -f flag shows it's continuously forking off iptables. Examples:

[pid 19500] execve("/sbin/iptables", ["iptables", "-w2", "-N", "KUBE-SERVICES", "-t", "filter"], [/* 12 vars */] <unfinished ...>
[pid 19501] execve("/sbin/iptables", ["iptables", "-w2", "-N", "KUBE-SERVICES", "-t", "nat"], [/* 12 vars */] <unfinished ...>
[pid 19502] execve("/sbin/iptables", ["iptables", "-w2", "-C", "INPUT", "-t", "filter", "-m", "comment", "--comment", "kubernetes service portals", "-j", "KUBE-SERVICES"], [/* 12 vars */] <unfinished ...>
[pid 19503] execve("/sbin/iptables", ["iptables", "-w2", "-C", "OUTPUT", "-t", "filter", "-m", "comment", "--comment", "kubernetes service portals", "-j", "KUBE-SERVICES"], [/* 12 vars */] <unfinished ...>
[pid 19504] execve("/sbin/iptables", ["iptables", "-w2", "-C", "OUTPUT", "-t", "nat", "-m", "comment", "--comment", "kubernetes service portals", "-j", "KUBE-SERVICES"], [/* 12 vars */] <unfinished ...>
...
[pid 19508] execve("/sbin/iptables-save", ["iptables-save", "-t", "filter"], [/* 12 vars */] <unfinished ...>
...
[pid 19510] execve("/sbin/iptables-restore", ["iptables-restore", "--noflush", "--counters"], [/* 12 vars */] <unfinished ...>

It also performs quite a few HTTP GET requests against various APIs:

[pid 26767] write(112, "GET /v2/keys/registry/services/e"..., 206 <unfinished ...>
[pid 26828] write(168, "GET /v1.23/containers/json?all=1"..., 228) = 228
[pid 26834] write(168, "GET /v1.23/containers/json?all=1"..., 185 <unfinished ...>
...
[pid 21278] write(262, "GET /v2/keys/registry/services/e"..., 168 <unfinished ...>

@stela

stela commented Jul 6, 2017

I tried setting most of the duration settings described at https://godoc.org/k8s.io/kubernetes/pkg/apis/componentconfig to 1 minute, but had no success in reducing CPU usage.

@oliverbestmann

oliverbestmann commented Jul 14, 2017

--extra-config=proxy.IPTables.SyncPeriod=1m --extra-config=proxy.IPTables.MinSyncPeriod=1m
This did not work for me either; localkube complained that it was not able to parse 1m as an integer.

Setting the options to 5 seconds by specifying it in nanoseconds helped:

  --extra-config=proxy.IPTables.SyncPeriod.Duration=5000000000 \
  --extra-config=proxy.IPTables.MinSyncPeriod.Duration=3000000000
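Since the parser here takes integer nanoseconds rather than Go-style duration strings like 1m, the values are just seconds times 10^9. A quick shell check of the literals above (seconds_to_ns is a throwaway helper, not a minikube command):

```shell
# Convert whole seconds to the integer-nanosecond form these flags expect
seconds_to_ns() { echo $(($1 * 1000000000)); }
seconds_to_ns 5   # SyncPeriod value used above    -> 5000000000
seconds_to_ns 3   # MinSyncPeriod value used above -> 3000000000
```

The same arithmetic gives 30000000000 for a 30-second period.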

@stela

stela commented Jul 14, 2017

@oliverbestmann huge thanks! That worked for me too; the laptop fan is now quiet again. According to https://kubernetes.io/docs/admin/kube-proxy/ the default SyncPeriod is meant to be 30 seconds, so I set it to that in nanoseconds. localkube now consumes "just" 14% according to "ps uaxS" (S includes the child processes), similar to the original bug description.

@r2d4
Contributor

r2d4 commented Jul 14, 2017

I'll set it to the upstream defaults. This regression probably happened because of the new way of configuring kube-proxy; there is no longer a default config struct that we can init. I'll make sure the other defaults are properly set as well.

r2d4 self-assigned this on Jul 14, 2017
r2d4 added a commit to r2d4/minikube that referenced this issue Jul 14, 2017
Set some kube-proxy defaults that got unset through the new way of
configuring kube-proxy. Adding more delay to the iptables syncing reduces
idle CPU load a lot.

See
kubernetes#1158 (comment)
@r2d4
Contributor

r2d4 commented Jul 14, 2017

I've sent #1699 to make those kube-proxy options the default for minikube.

Once that is merged, I would like to set some benchmarks that we would like to hit to make this issue more concrete.

Something like

With default options:

  • idling, localkube consumes less than 15% of CPU and Memory on average

I'm not sure exactly how we could set limits on the driver binary or minikube, since those will probably fluctuate a lot more running on different platforms.

@stela

stela commented Jul 18, 2017

@r2d4 Nice that #1699 fixes the defaults, but since values like "1m" didn't work for overriding the Durations while raw nanoseconds did, is there a time-unit parsing or documentation issue as well?

@r2d4
Contributor

r2d4 commented Jul 18, 2017

@stela I've added an issue here #1712. It should be a relatively simple fix

@stela

stela commented Jul 21, 2017

@r2d4 Thanks!

@daveoconnor

This workaround doesn't seem to work for me:

minikube start --kubernetes-version v1.7.0 --vm-driver xhyve --extra-config=proxy.IPTables.SyncPeriod.Duration=5000000000 --extra-config=proxy.IPTables.MinSyncPeriod.Duration=3000000000

Does this need to be done with more recent code than v0.20.0?

I'm still seeing ~60-80%, ~30%, ~30% CPU usage on the 3 VBoxHeadless processes I have.

@alanbrent

alanbrent commented Jul 25, 2017

@daveoconnor I'm not sure, but perhaps this requires a newer ISO than the current release uses by default. This workaround works for me, and I'm using a newer ISO:

$ minikube config view | grep iso-url
- iso-url: https://storage.googleapis.com/minikube/iso/minikube-v0.22.0.iso

@daveoconnor

daveoconnor commented Jul 25, 2017

@alanbrent Thanks for the response.

$ minikube config view gives me nothing back.

I tried using

$ minikube start --kubernetes-version v1.7.0 \
--iso-url=https://storage.googleapis.com/minikube/iso/minikube-v0.22.0.iso \
--vm-driver xhyve \
--extra-config=proxy.IPTables.SyncPeriod.Duration=5000000000 \
--extra-config=proxy.IPTables.MinSyncPeriod.Duration=3000000000

As per the documentation at https://kubernetes.io/docs/getting-started-guides/minikube/#using-rkt-container-engine. The output I got was:

Starting local Kubernetes v1.7.0 cluster...
Starting VM...
Moving files into cluster...
Setting up certs...
Starting cluster components...
Connecting to cluster...
Setting up kubeconfig...
Kubectl is now configured to use the cluster.

No message of downloading a new ISO so I'm guessing it's already using that.

After trying

$ minikube config set iso-url https://storage.googleapis.com/minikube/iso/minikube-v0.22.0.iso
$ minikube start --kubernetes-version v1.7.0 \
--vm-driver xhyve \
--extra-config=proxy.IPTables.SyncPeriod.Duration=5000000000 \
--extra-config=proxy.IPTables.MinSyncPeriod.Duration=3000000000

The startup output was the same, and I can't be sure the CPU load is any different. It's not spiking quite as high, but I'm not sure that's not a coincidence.

@stela

stela commented Jul 25, 2017

@daveoconnor I think I had to run minikube delete before any new settings from minikube start would take effect (beware: this will of course wipe your kubernetes installation, containers and all).

@daveoconnor

daveoconnor commented Jul 25, 2017

@stela Thanks, good idea. I'm not seeing much change. Here's how that went:

EDIT: removed xhyve driver section because I realised I shouldn't be using it. Sorry if that caused confusion.

$ minikube delete
$ minikube start --kubernetes-version v1.7.0 \
 --extra-config=proxy.IPTables.SyncPeriod.Duration=5000000000 \
 --extra-config=proxy.IPTables.MinSyncPeriod.Duration=3000000000
Starting local Kubernetes v1.7.0 cluster...
Starting VM...
Moving files into cluster...
Setting up certs...
Starting cluster components...
Connecting to cluster...
Setting up kubeconfig...
Kubectl is now configured to use the cluster.

CPU processes still at 40-70%, 20-30%, 20-30%.

@philipn

philipn commented Aug 14, 2017

FWIW, I am seeing significantly lower CPU usage (from ~100% utilization down to ~20%) with minikube v0.21.0 (and no custom settings). Killing all of the localkube and k8s services brings VirtualBox down to 8% or so.

@atombender
Author

It's been 8 months, anything happening with this?

I'm not seeing any difference with the above flags (I've confirmed that localkube is using them) on Minikube 0.22.2, Kubernetes 1.7.5. localkube is still using 6-12% CPU on the VM, and docker-machine-driver-xhyve is using about 20% on the host. As a result, the fan on my machine runs constantly, and it's just not a very pleasant developer experience.

@oliverbestmann

CPU usage of around 15 to 20 percent is after those flags were applied as defaults. A little tracing of the minikube binary reveals that most of the time is spent in cgroup/cadvisor stats, getting the CPU usage of the containers, and in the various APIs queried internally by localkube or one of its components.

@alanbrent

alanbrent commented Oct 5, 2017

I'm not sure if this is a workable solution for everyone, but I've solved this problem by switching to the kubeadm bootstrapper. You can do so either by passing the CLI flag when you start (minikube start --bootstrapper=kubeadm) or by setting the configuration parameter globally (minikube config set bootstrapper kubeadm).

Please note that in order for this to be effective (I believe) you'll need to minikube delete first.

@eden
Contributor

eden commented Oct 19, 2017

@alanbrent just tried this with minikube 0.22.3, and the VM is still consuming around 18%-20% on the host. There's no localkube to blame anymore, but I see various kubernetes components collectively consuming around 5-20% CPU on the VM itself.

It does not appear that kubeadm is helping things here.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

k8s-ci-robot added the lifecycle/stale label on Jan 17, 2018
@eden
Contributor

eden commented Jan 17, 2018

/remove-lifecycle stale

k8s-ci-robot removed the lifecycle/stale label on Jan 17, 2018
@koniiiik

koniiiik commented Apr 10, 2018

With minikube v0.25.2, running kubernetes v1.9.4, I still saw localkube being the top CPU hog, oscillating between about 40 and 90% CPU usage inside the VM (this is on Linux with VirtualBox, on an i7-8550U). Never mind, please disregard that: the machine was somehow stuck in the lowest-power state at just 400 MHz CPU clock speed; with proper CPU frequency scaling it remains at much more reasonable levels, around 2-5%. Nevertheless, it maintains CPU usage at levels just high enough to keep the CPU from entering low-power states, which drains laptop batteries without doing anything. OTOH, I get that k8s is designed for DC environments rather than laptops…

Attaching a strace of localkube (localkube.strace.gz); looks like it's constantly polling stuff inside /sys/fs/cgroup/; and a fragment of the journal from the VM (minikube-journal.gz) where I don't really see anything interesting…

@cmbernard333

I am affected by this issue as well. Running minikube with the virtualbox driver consistently drives the CPU well over 100%.

@hsyed

hsyed commented Jun 26, 2018

So I'm looking at hyperkit thrashing at 30% CPU in the idle state on a Mac Pro.

When I minikube ssh into the VM and use top, the control plane components all stay below 2%. I don't know how accurate the CPU time-slicing information is inside a hyperkit VM, but it makes me think the hyperkit process should remain below 10%; gut feeling says 4%.

I've been using the mac "Activity Monitor" when judging the behaviour of the hyperkit process. Using htop paints a different picture: it gives what I expect the timing reports to be, with the CPU idling at 2% on average.

When I was doing systems programming, "Activity Monitor" was off about the virtual memory metrics, so I can completely believe it is just flat-out wrong. Could one of the systems programmers here shed some light on the discrepancies between "Activity Monitor" and htop? Which one do we trust?

A simple guide on correctly profiling and interpreting the vm process and the processes inside the vm would help a lot in diagnosing and providing feedback on such issues.

@hsyed

hsyed commented Jun 26, 2018

Disregard the last one; sudo htop is in line with Activity Monitor :(

@rafalrusin

Same here. Using Ubuntu, minikube v0.25.2, kubernetes 1.7.5, VirtualBox. It's idling at 40% CPU with nothing installed.

tstromberg changed the title from "localkube consumes CPU when system is 'idle'" to "kube-proxy constantly syncing/restoring iptables rules, consuming CPU resources" on Sep 20, 2018
@tstromberg
Contributor

The iptables issues were fixed long ago. Please re-open other performance issues as new bugs.
