
Memory leak in Kubernetes discovery #4095

Closed
grobie opened this issue on Apr 17, 2018 · 3 comments


grobie commented Apr 17, 2018

Bug Report

What did you do?

Ran Prometheus v2.2.1 plus latest race patches (v2.2.1...f8dcf9b) for several days.

What did you expect to see?

Stable memory footprint.

What did you see instead? Under which circumstances?

Memory leak, daily crashes.

[Screenshot: memory usage graph, 2018-04-17 16:32:47]

Environment

  • System information:

    Linux 4.4.10+soundcloud x86_64

  • Prometheus version:

    custom built from f8dcf9b

  • Prometheus configuration file:

    22 jobs (17 kubernetes_sd from a big cluster, 1 ec2_sd, 2 dns_sd, 2 static_config)

Profiles

pprof top 10 comparison

Showing nodes accounting for 1743.66MB, 36.60% of 4764.52MB total
Dropped 220 nodes (cum <= 23.82MB)
      flat  flat%   sum%        cum   cum%
  502.03MB 10.54% 10.54%   453.03MB  9.51%  github.com/prometheus/prometheus/vendor/github.com/ugorji/go/codec.(*jsonDecDriver).DecodeString
  333.68MB  7.00% 17.54%    27.52MB  0.58%  github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/chunkenc.(*bstream).writeBits
  327.53MB  6.87% 24.41%    29.50MB  0.62%  github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.(*memSeries).cut
 -294.53MB  6.18% 18.23%  -294.53MB  6.18%  github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/chunkenc.NewXORChunk (inline)
  262.15MB  5.50% 23.74%   269.65MB  5.66%  github.com/prometheus/prometheus/vendor/k8s.io/client-go/pkg/api/v1.codecSelfer1234.decResourceList
 -216.12MB  4.54% 19.20%  -216.12MB  4.54%  github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/chunkenc.(*bstream).writeByte (inline)
 -147.56MB  3.10% 16.10%  -147.56MB  3.10%  github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/chunkenc.(*bstream).writeBit (inline)
  134.66MB  2.83% 18.93%   134.66MB  2.83%  reflect.unsafe_NewArray
  132.13MB  2.77% 21.70%   776.30MB 16.29%  github.com/prometheus/prometheus/vendor/k8s.io/client-go/pkg/api/v1.codecSelfer1234.decSliceContainer
  130.57MB  2.74% 24.44%   144.37MB  3.03%  github.com/prometheus/prometheus/scrape.(*scrapeLoop).append
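
A diff like this can be reproduced by saving two heap profiles some time apart and comparing them with go tool pprof -base heap-0.pb.gz heap-1.pb.gz. A minimal sketch in Go, assuming a local Prometheus on the default port (it serves the standard net/http/pprof endpoints under /debug/pprof):

// Sketch: snapshot the heap profile of a running Prometheus twice, then
// diff the two files with `go tool pprof -base heap-0.pb.gz heap-1.pb.gz`.
// Assumes a local Prometheus on the default port; adjust the URL as needed.
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

func snapshot(url, path string) error {
	resp, err := http.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()

	_, err = io.Copy(f, resp.Body)
	return err
}

func main() {
	const url = "http://localhost:9090/debug/pprof/heap"
	for i := 0; i < 2; i++ {
		if i > 0 {
			time.Sleep(30 * time.Minute) // let the leak accumulate between snapshots
		}
		path := fmt.Sprintf("heap-%d.pb.gz", i)
		if err := snapshot(url, path); err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
	}
}

In the resulting comparison, rows with negative values are allocations that shrank between the two snapshots; the leak candidates are the ones that only grow.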

@grobie changed the title from “Memory leak in Prometheus Kubernetes discovery” to “Memory leak in Kubernetes discovery” on Apr 17, 2018

beorn7 commented Apr 25, 2018

Built from 2cbba4e now (finally a version without known data races, so we can rule out races as a source of chaos). Still seeing the memory leak. New interesting finding: the K8s pod targets were not updated (Prometheus kept trying to scrape pods that no longer existed and never scraped the new ones). I could fix that by sending SIGHUP, which also brought the escalating memory usage back to normal levels.

This suggests that something in the K8s SD is getting stuck.
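
For illustration only (this is not Prometheus's actual discovery code): one way an SD pipeline can be stuck and leaking at the same time is a fan-out that blocks on a channel nobody reads anymore. Each blocked send pins a goroutine and its payload, so targets go stale while the heap keeps growing, and only tearing the pipeline down and rebuilding it, as a SIGHUP reload does, releases the memory. A minimal sketch:

// Minimal sketch of a stuck fan-out (illustrative; not Prometheus code):
// every update is sent from its own goroutine, but the consumer has gone
// away, so each goroutine blocks forever and pins its payload.
package main

import (
	"fmt"
	"runtime"
	"time"
)

// targetGroup stands in for a service-discovery update payload.
type targetGroup struct {
	targets []string
}

func main() {
	updates := make(chan targetGroup) // unbuffered: a send blocks until received

	for i := 0; i < 50000; i++ {
		tg := targetGroup{targets: make([]string, 100)}
		go func() { updates <- tg }() // nobody receives: blocks forever
	}

	time.Sleep(time.Second) // let the goroutines start and block
	runtime.GC()

	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	fmt.Printf("blocked goroutines: %d, live heap: %d MiB\n",
		runtime.NumGoroutine(), m.HeapAlloc/(1<<20))
}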

beorn7 commented Apr 30, 2018

Fixed by #4117.
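
For context, and purely illustrative (this is not the actual change in #4117): the robust client-go pattern for keeping pod targets fresh is an informer, which re-establishes broken watch connections and periodically resyncs on its own, instead of relying on a single long-lived watch. A minimal sketch against a recent client-go:

// Illustrative informer-based pod discovery (not the actual fix in #4117).
// An informer transparently re-establishes broken watches and periodically
// resyncs, so target updates keep flowing.
package main

import (
	"fmt"
	"time"

	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/fields"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	lw := cache.NewListWatchFromClient(
		client.CoreV1().RESTClient(), "pods", "default", fields.Everything())

	_, controller := cache.NewInformer(lw, &v1.Pod{}, 10*time.Minute,
		cache.ResourceEventHandlerFuncs{
			AddFunc: func(obj interface{}) {
				if p, ok := obj.(*v1.Pod); ok {
					fmt.Println("pod added:", p.Name)
				}
			},
			DeleteFunc: func(obj interface{}) {
				fmt.Println("pod deleted") // obj may be a tombstone, not a *v1.Pod
			},
		})

	stop := make(chan struct{})
	controller.Run(stop) // blocks; re-watches and resyncs internally
}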

lock bot commented Mar 22, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked and limited conversation to collaborators on Mar 22, 2019
