Memory issues on short block sizes #4952
Comments
lstanczak commented on Dec 4, 2018
Pprof memory profile from tests:
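(For reference, a heap profile like this can be collected from a running Prometheus server through its built-in pprof endpoint; the address below assumes the default listen port.)

```sh
# Grab and inspect a heap profile from Prometheus (default port 9090 assumed)
go tool pprof http://localhost:9090/debug/pprof/heap
```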
krasi-georgiev added the component/local storage label on Dec 17, 2018
A duplicate of prometheus/tsdb#480; we are discussing it there.
krasi-georgiev closed this on Dec 26, 2018
kacper-jackiewicz commented on Dec 4, 2018
Bug Report
We are testing Prometheus with Thanos to gather metrics from multiple OpenShift clusters. On each cluster we have a significant number of ephemeral pods which are scraped by Prometheus. During a test that resembles our real-time metrics almost 1:1, we encounter a strange memory-consumption pattern. We are able to process millions of metrics when changes to a particular label occur every ~2 hours, while for changes that happen more often we see a constant increase in memory consumption. We tried lowering the retention time to 15 minutes and the block size down to 5 minutes, but this didn't help much, as we were still seeing the increase in memory consumption even after the retention period. Further investigation showed that short block sizes together with short retention degraded the performance of writing blocks to disk: after 30 minutes of the test, flushing a 5-minute block took longer than 10 minutes, plus roughly 3 minutes for head GC.
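(For context, this kind of tuning maps onto Prometheus's TSDB flags; a rough sketch of such an invocation is below. The flag names exist in Prometheus 2.x, the paths are placeholders, and the values are simply the ones described above.)

```sh
# Illustrative Prometheus launch for the short-retention / short-block test
# (paths are placeholders; min/max-block-duration are hidden but accepted flags in 2.x)
prometheus \
  --config.file=prometheus.yml \
  --storage.tsdb.path=/data \
  --storage.tsdb.retention=15m \
  --storage.tsdb.min-block-duration=5m \
  --storage.tsdb.max-block-duration=5m
```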
After locking the label to change only every 2 hours, we could easily use longer blocks and retention periods, as memory consumption was stable: it peaked at 16 GB with an average of 11 GB for 3M metrics per minute. With the label changing every minute and 1.5M metrics per minute, Prometheus crashed at ~100 GB after ~30 minutes.
In both cases the ratio of distinct metrics to the number of blocks was constant, which raises the question: why does it behave like this, and how can we tune Prometheus to perform better with high-cardinality labels (e.g. cron job id / pod name)? We are able to access and aggregate over such metrics in Thanos, so Prometheus would only need to scrape and preserve the data for the time needed to flush a block to disk.
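(One commonly suggested tweak, applicable only if the offending label can be dropped before ingestion, which may not fit the Thanos aggregation use case above, is to strip it at scrape time with metric_relabel_configs. A minimal sketch, with a hypothetical label and job name, might look like this:)

```yaml
# Sketch: drop a high-churn label at scrape time so it does not inflate series cardinality.
# "cronjob_id" and the job name are hypothetical examples, not taken from this report.
scrape_configs:
  - job_name: "openshift-pods"
    metric_relabel_configs:
      - action: labeldrop
        regex: cronjob_id
```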
What did you expect to see?
We expected to see a proportional relation between block size, the number of distinct metrics (cardinality / data points), retention, and memory consumption.
What did you see instead? Under which circumstances?
A significant increase in memory consumption for metrics with a high-cardinality / variable label.
Environment
OpenShift v3.9 with live data
RHEL 6 VM with artificial data (Linux 2.6.32-696.18.7.el6.x86_64 x86_64)
Prometheus version:
Tested with 2.3, 2.4, and 2.5
Prometheus configuration file:
3000 jobs, 1-minute scrape interval, 1-3M metrics (total)
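(The configuration file itself is not attached; a minimal sketch matching the parameters above, with placeholder job names and targets, would look roughly like this:)

```yaml
global:
  scrape_interval: 1m      # 1-minute scrape interval, as described above
  evaluation_interval: 1m

scrape_configs:
  # ~3000 jobs of this shape in the real configuration; names and targets are placeholders
  - job_name: "job-0001"
    static_configs:
      - targets: ["target-0001:8080"]
  - job_name: "job-0002"
    static_configs:
      - targets: ["target-0002:8080"]
```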