
memory leak in kubelet 1.12.5 #73587

Closed
szuecs opened this issue Jan 31, 2019 · 35 comments · Fixed by #74636
Labels: kind/bug, priority/important-soon, sig/api-machinery, sig/node
Milestone: v1.14

@szuecs (Member) commented Jan 31, 2019

What happened:
After upgrading to Kubernetes 1.12.5 we observe failing nodes, caused by kubelet eating up all of the node's memory after some time.

[screenshot omitted]

I use the image k8s.gcr.io/hyperkube:v1.12.5 to run kubelet on 102 clusters, and for about a week we have seen some nodes leaking memory, caused by kubelet.

I investigated some of these kubelets with strace and pprof.
Within 3 seconds of running strace I saw >= 50 openat() calls to the same file from the same thread ID (pid) of kubelet:

    50 [pid 12416] openat(10, "__pycache__", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_DIRECTORY|O_NOFOLLOW) = 8
    54 [pid 12368] openat(6, "__pycache__", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_DIRECTORY|O_NOFOLLOW) = 4
    72 [pid 12368] openat(8, "__pycache__", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_DIRECTORY|O_NOFOLLOW) = 4
    80 [pid 12368] openat(4, "__pycache__", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_DIRECTORY|O_NOFOLLOW) = 3
    85 [pid 12416] openat(8, "__pycache__", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_DIRECTORY|O_NOFOLLOW) = 6

Profiling kubelet with pprof shows that client-go metrics and compression take up most of the compute time.

root@ip-172-31-10-50:~# go tool pprof http://localhost:10248/debug/pprof/heap
Fetching profile from http://localhost:10248/debug/pprof/heap
Saved profile in /root/pprof/pprof.hyperkube.localhost:10248.alloc_objects.alloc_space.inuse_objects.inuse_space.002.pb.gz
Entering interactive mode (type "help" for commands)
(pprof) top
2406.93MB of 2451.55MB total (98.18%)
Dropped 2863 nodes (cum <= 12.26MB)
Showing top 10 nodes out of 34 (cum >= 2411.39MB)
      flat  flat%   sum%        cum   cum%
 2082.07MB 84.93% 84.93%  2082.07MB 84.93%  k8s.io/kubernetes/vendor/github.com/beorn7/perks/quantile.newStream (inline)
  311.65MB 12.71% 97.64%  2398.72MB 97.84%  k8s.io/kubernetes/vendor/github.com/prometheus/client_golang/prometheus.newSummary
   10.71MB  0.44% 98.08%  2414.43MB 98.49%  k8s.io/kubernetes/vendor/github.com/prometheus/client_golang/prometheus.(*MetricVec).getOrCreateMetricWithLabelValues
    2.50MB   0.1% 98.18%  2084.57MB 85.03%  k8s.io/kubernetes/vendor/github.com/beorn7/perks/quantile.NewTargeted
         0     0% 98.18%  2412.06MB 98.39%  k8s.io/kubernetes/cmd/kubelet/app.startKubelet.func1
         0     0% 98.18%  2412.06MB 98.39%  k8s.io/kubernetes/pkg/kubelet.(*Kubelet).HandlePodAdditions
         0     0% 98.18%  2412.06MB 98.39%  k8s.io/kubernetes/pkg/kubelet.(*Kubelet).Run
         0     0% 98.18%  2412.06MB 98.39%  k8s.io/kubernetes/pkg/kubelet.(*Kubelet).syncLoop
         0     0% 98.18%  2412.06MB 98.39%  k8s.io/kubernetes/pkg/kubelet.(*Kubelet).syncLoopIteration
         0     0% 98.18%  2411.39MB 98.36%  k8s.io/kubernetes/pkg/kubelet/pod.(*basicManager).AddPod

Memory profile PNG: [image omitted]

The reflector metrics seem to be the problem:

curl -s localhost:10255/metrics | sed 's/{.*//' | sort | uniq -c | sort -nr
  25749 reflector_watch_duration_seconds
  25749 reflector_list_duration_seconds
  25749 reflector_items_per_watch
  25749 reflector_items_per_list
   8583 reflector_watches_total
   8583 reflector_watch_duration_seconds_sum
   8583 reflector_watch_duration_seconds_count
   8583 reflector_short_watches_total
   8583 reflector_lists_total
   8583 reflector_list_duration_seconds_sum
   8583 reflector_list_duration_seconds_count
   8583 reflector_last_resource_version
   8583 reflector_items_per_watch_sum
   8583 reflector_items_per_watch_count
   8583 reflector_items_per_list_sum
   8583 reflector_items_per_list_count
    165 storage_operation_duration_seconds_bucket
     51 kubelet_runtime_operations_latency_microseconds
     44 rest_client_request_latency_seconds_bucket
     33 kubelet_docker_operations_latency_microseconds
     17 kubelet_runtime_operations_latency_microseconds_sum
     17 kubelet_runtime_operations_latency_microseconds_count
     17 kubelet_runtime_operations

What you expected to happen:

I expect kubelet not to need this much memory.

How to reproduce it (as minimally and precisely as possible):

I don't know

Anything else we need to know?:

One of the affected clusters has only 120 Pods, 3 of which are in CrashLoopBackOff state; one of them has been crash-looping for 6 days on the affected node that was investigated.

Environment:

  • Kubernetes version (use kubectl version): v1.12.5
  • Cloud provider or hardware configuration: AWS
  • OS (e.g. from /etc/os-release):
# cat /etc/os-release
NAME="Container Linux by CoreOS"
ID=coreos
VERSION=1800.7.0
VERSION_ID=1800.7.0
BUILD_ID=2018-08-15-2254
PRETTY_NAME="Container Linux by CoreOS 1800.7.0 (Rhyolite)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://issues.coreos.com"
COREOS_BOARD="amd64-usr"
@szuecs added the kind/bug label on Jan 31, 2019
@k8s-ci-robot added the needs-sig label on Jan 31, 2019
@szuecs (Member, Author) commented Jan 31, 2019

/sig node

@k8s-ci-robot added the sig/node label and removed the needs-sig label on Jan 31, 2019
@Shnatsel commented:
@kubernetes/sig-node-bugs

@k8s-ci-robot (Contributor) commented:
@Shnatsel: Reiterating the mentions to trigger a notification:
@kubernetes/sig-node-bugs

In response to this:

@kubernetes/sig-node-bugs

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@aermakov-zalando commented Jan 31, 2019
I think it might be a side effect of #71731, but I couldn't find any numbers in there. The comments mention an offline discussion; was this brought up there?

@aermakov-zalando commented:
What happens (I only investigated it very briefly) is that kubelet creates a bunch of reflectors for every pod: secret mounts, configmap mounts, token mounts, etc. Each reflector registers a number of histograms, gauges, and so on, and even when the reflector is stopped and removed, the metrics are never removed. After a few days on a busy cluster you end up with millions of metrics bloating kubelet and everything else that uses the reflectors.
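
To make that pattern concrete, here is a minimal, self-contained Go sketch of the failure mode being described. It is illustrative only, not the actual client-go code; the function name and the label are invented. Each "reflector" registers its own Summary under a unique identity, nothing ever unregisters it, and each Summary (with classic objectives) carries its own quantile streams, which is consistent with what the heap profile above attributes to beorn7/perks/quantile.newStream.

package main

import (
	"fmt"

	"github.com/prometheus/client_golang/prometheus"
)

// startReflectorMetrics mimics the leaky pattern: each new reflector
// (one per secret/configmap/token mount, per pod) registers its own
// Summary, distinguished by a unique const label, and nothing ever
// calls prometheus.Unregister when the reflector is stopped.
func startReflectorMetrics(reflectorName string) prometheus.Summary {
	s := prometheus.NewSummary(prometheus.SummaryOpts{
		Name:        "reflector_list_duration_seconds",
		Help:        "How long a list call took.",
		Objectives:  map[float64]float64{0.5: 0.05, 0.9: 0.01, 0.99: 0.001},
		ConstLabels: prometheus.Labels{"name": reflectorName}, // unique per reflector
	})
	prometheus.MustRegister(s) // stays in the default registry forever
	return s
}

func main() {
	// Simulate pods (and their volume reflectors) coming and going:
	// the reflectors are gone, but their metrics objects remain.
	for i := 0; i < 100000; i++ {
		startReflectorMetrics(fmt.Sprintf("pod-volume-reflector-%d", i))
	}
	// The default registry now holds 100000 live Summary objects,
	// each with its own quantile stream allocations.
}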

@aermakov-zalando commented:
This has been broken since v1.12.0, so it's probably unrelated to #71731.

@dims (Member) commented Jan 31, 2019

Correct, @aermakov-zalando. That PR is only in v1.14.0-alpha.2, v1.14.0-alpha.1, and master.

@szuecs (Member, Author) commented Jan 31, 2019

For everyone who finds this issue and needs a patch to disable the reflector metrics:

diff --git c/pkg/util/reflector/prometheus/prometheus.go i/pkg/util/reflector/prometheus/prometheus.go
index 958a0007cd..63657e9c55 100644
--- c/pkg/util/reflector/prometheus/prometheus.go
+++ i/pkg/util/reflector/prometheus/prometheus.go
@@ -85,8 +85,6 @@ func init() {
 	prometheus.MustRegister(watchDuration)
 	prometheus.MustRegister(itemsPerWatch)
 	prometheus.MustRegister(lastResourceVersion)
-
-	cache.SetReflectorMetricsProvider(prometheusMetricsProvider{})
 }
 
 type prometheusMetricsProvider struct{}

@feiskyer (Member) commented:
@szuecs Do you have the same metrics after disabling the ReflectorMetricsProvider?

@szuecs (Member, Author) commented Jan 31, 2019

no, it drops the reflector metrics

@yujuhong (Contributor) commented:
@wojtek-t is this indirectly caused by switching to watch-based managers?

@yujuhong added this to the v1.14 milestone on Jan 31, 2019
@yujuhong added the priority/important-soon label on Jan 31, 2019
@wojtek-t (Member) commented Feb 1, 2019

Yeah, switching to watch resulted in more extensive use of reflectors.
It seems those metrics come directly from the reflector (I wasn't aware of them), so it's highly probable that this change caused it.

I don't think we really need those metrics, so if we could switch them off in kubelets, that should solve the problem.

@wojtek-t (Member) commented Feb 1, 2019

#73624 has been sent out for review to fix this.
We should probably cherry-pick it to previous releases if we agree that's the right way to solve it.

@wojtek-t (Member) commented Feb 1, 2019

Actually, I realized that I don't fully understand the problem.
It shouldn't be a problem in default Kubernetes (what we build from this repo), because those metrics are not switched on for the kubelet:

  • the metrics factory defaults to a noop implementation
  • the prometheus metrics need to be explicitly linked and initialized, as they are in the apiserver, e.g.:
    _ "k8s.io/kubernetes/pkg/util/reflector/prometheus" // for reflector metric registration
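
For anyone unfamiliar with the mechanism behind that second bullet: the blank import works because Go runs a package's init() functions when the package is linked into a binary. A rough sketch of the shape follows; it is illustrative only, and the real package wires up a metrics provider in addition to registering collectors.

// Package reflectormetrics enables its metrics purely as a side effect of
// being imported; there is no function for the consumer to call.
package reflectormetrics

import "github.com/prometheus/client_golang/prometheus"

var listsTotal = prometheus.NewCounter(prometheus.CounterOpts{
	Name: "reflector_lists_total",
	Help: "Total number of reflector list calls.",
})

// init runs automatically in any binary that links this package, even via a
// blank import such as
//
//	_ "k8s.io/kubernetes/pkg/util/reflector/prometheus" // for reflector metric registration
//
// so whether the metrics are active depends on which imports a given binary
// happens to pull in.
func init() {
	prometheus.MustRegister(listsTotal)
}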

@aermakov-zalando commented Feb 1, 2019

Wouldn't it be better to rewrite the reflector metrics so they're aggregated in a better way rather than relying on people not accidentally enabling them? Or at least put a huge warning on top saying "this will leak memory like crazy" just so the same situation doesn't repeat in other code using client-go?
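
One possible shape for "aggregated in a better way", sketched here purely for illustration (this is not what client-go ended up doing, and the names are invented): a single shared vector registered once, keyed by a small, bounded label such as the resource type, so the number of series stays constant no matter how many reflectors are created and destroyed.

package reflectormetrics

import "github.com/prometheus/client_golang/prometheus"

// One shared, pre-registered histogram vector for all reflectors in the
// process. The label set is bounded (resource type), so the series count
// does not grow with the number of pods or reflectors.
var listDuration = prometheus.NewHistogramVec(
	prometheus.HistogramOpts{
		Name:    "reflector_list_duration_seconds",
		Help:    "List call latency, aggregated across reflectors.",
		Buckets: prometheus.DefBuckets,
	},
	[]string{"resource"},
)

func init() {
	prometheus.MustRegister(listDuration)
}

// ObserveList is what an individual reflector would call; there is no
// per-reflector registration, hence nothing to clean up on shutdown.
func ObserveList(resource string, seconds float64) {
	listDuration.WithLabelValues(resource).Observe(seconds)
}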

@wojtek-t (Member) commented Feb 1, 2019

@aermakov-zalando - whether the metrics are reasonable or not is a separate issue (and not really sig-node related).
One thing that I would like to confirm: do you customize k8s yourself? I would like to make sure that the issue you faced is in your setup and not in upstream k8s.

@mikkeloscar (Contributor) commented Feb 1, 2019

Our production configuration is this one: https://github.com/zalando-incubator/kubernetes-on-aws/tree/beta

Kubelet config: https://github.com/zalando-incubator/kubernetes-on-aws/blob/85f3ddea60900a7880d03e8c9c5ed7b158b100c7/cluster/node-pools/worker-default/userdata.clc.yaml#L144-L207

We run with an image called registry.opensource.zalan.do/teapot/hyperkube:v1.12.5, but it's just a mirror of k8s.gcr.io/hyperkube:v1.12.5, so it is upstream Kubernetes.

@szuecs (Member, Author) commented Feb 1, 2019

I tried to figure out why these metrics are even there, and I found issues from the past showing leaks already in older versions:
#52121
#53485

It sounds like we need a postmortem to keep this from happening again.

I can't find any issue in kubernetes or in client-go that could reasonably explain why these were introduced in the first place. I guess I just didn't find it, and I hope someone can point out where the decision came from.

@shrinandj commented:
I am running Kubernetes v1.12.3 and I don't see reflector metrics being used by kubelet (curl -s localhost:10255/metrics | sed 's/{.*//' | sort | uniq -c | sort -nr | grep reflector shows nothing).

FWIW, the cluster is created using kops.

Any idea how the reflector metrics get enabled?

@derekwaynecarr (Member) commented:
@sjenning and I will also look into this and see what we can find.

@aermakov-zalando commented:
I would also suggest not using imports for side effects because this usually creates hard to debug and non-obvious problems like this.

@feiskyer (Member) commented:
I would also suggest not using imports for side effects because this usually creates hard to debug and non-obvious problems like this.

@aermakov-zalando So you mean disable the reflector metrics totally?

@aermakov-zalando commented Feb 12, 2019
No, I suggest changing the code so that the end users (kubelet, apiserver, etc.) would have to explicitly enable the metrics by calling a function, instead of having this happen as a side effect of an import statement.
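
A sketch of what that explicit opt-in could look like, in contrast to the init()-based version above. The function name EnableReflectorMetrics and the package layout are hypothetical, not an existing client-go API.

package reflectormetrics

import "github.com/prometheus/client_golang/prometheus"

var watchesTotal = prometheus.NewCounter(prometheus.CounterOpts{
	Name: "reflector_watches_total",
	Help: "Total number of reflector watch calls.",
})

// EnableReflectorMetrics registers the reflector collectors with the given
// registerer. Importing this package does nothing by itself; a binary has to
// opt in explicitly from its main(), for example:
//
//	reflectormetrics.EnableReflectorMetrics(prometheus.DefaultRegisterer)
func EnableReflectorMetrics(r prometheus.Registerer) {
	r.MustRegister(watchesTotal)
}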

@feiskyer (Member) commented:
explicitly enable the metrics by calling a function instead of having this happen as a side effect of an import statement.

LGTM. @wojtek-t @derekwaynecarr @yujuhong WDYT? We should probably do this for both kubelet and control plane services (apiserver and controller-manager).

@wojtek-t (Member) commented:
I'm fine with that.

@yujuhong (Contributor) commented:
I'm fine with that.

+1

@derekwaynecarr (Member) commented:
+1

@logicalhan (Member) commented:
Why wouldn't we just disable this metric wholesale? Enabling this flag would basically be the same thing as enabling a memory leak, no?

@lavalamp (Member) commented:
/assign @logicalhan

@lavalamp (Member) commented:
Han points out that, since the metric has a random suffix each time, it's not very useful for monitoring anyway.
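
For context on why a suffix like that makes the series hard to use: if each reflector derives its metric identity from a per-instance disambiguator (sketched below with an atomic counter; the real client-go naming scheme may differ), every new reflector and every process restart produces a brand-new series that never receives another sample, so nothing can be meaningfully graphed or aggregated over time.

package reflectormetrics

import (
	"fmt"
	"sync/atomic"
)

var reflectorCounter int64

// metricsNameFor returns a unique identity for a reflector's metrics.
// Because the suffix changes on every reflector creation, each reflector
// becomes a distinct, short-lived time series in the monitoring system.
func metricsNameFor(callsite string) string {
	n := atomic.AddInt64(&reflectorCounter, 1)
	return fmt.Sprintf("%s-%d", callsite, n)
}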

@lavalamp (Member) commented:
/sig api-machinery

@k8s-ci-robot added the sig/api-machinery label on Feb 26, 2019
openshift-publish-robot pushed a commit to openshift/kubernetes that referenced this issue Mar 1, 2019
…emory leak

ref: kubernetes#73587
ref: kubernetes#74636

Origin-commit: 01380498b02d6dee75e52d9ce54e9a5dffef24fb
openshift-publish-robot pushed a commit to openshift/kubernetes-client-go that referenced this issue Mar 1, 2019
…emory leak

ref: kubernetes/kubernetes#73587
ref: kubernetes/kubernetes#74636

Origin-commit: 01380498b02d6dee75e52d9ce54e9a5dffef24fb

Kubernetes-commit: f77a2c16c80223249ead526ca12caa6962117888
deads2k pushed a commit to deads2k/kubernetes that referenced this issue Mar 14, 2019
…emory leak

ref: kubernetes#73587
ref: kubernetes#74636

Origin-commit: 01380498b02d6dee75e52d9ce54e9a5dffef24fb
openshift-publish-robot pushed a commit to openshift/kubernetes-client-go that referenced this issue Mar 15, 2019
…emory leak

ref: kubernetes/kubernetes#73587
ref: kubernetes/kubernetes#74636

Origin-commit: 01380498b02d6dee75e52d9ce54e9a5dffef24fb

Kubernetes-commit: fd85bbcb7e0922b8889c85fad1f5f2d4ca7a3fa7