K8s Prometheus container gets OOM killed every 5-10 minutes #5019
Comments
I'm trying to check for the heavy queries (https://www.robustperception.io/which-are-my-biggest-metrics), but eventually Prometheus just tries to drop metrics, and the logs only show:
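For reference, the kind of query that article suggests for finding which metric names hold the most time series (a sketch; it can itself be heavy on a large TSDB):

```
topk(10, count by (__name__)({__name__=~".+"}))
```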
It feels like you have metrics with high cardinality. What's the graph for
Yep, you seem to have metrics with unbounded cardinality. 8M time series is a lot for Prometheus and would require roughly 64GB just to hold them in memory (on the order of 8 KB per in-memory series). Maybe try enabling the scrape configurations one after the other to find the culprit.
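A minimal sketch of how that bisection could look in prometheus.yml (the job names here are placeholders, not taken from this issue):

```yaml
# prometheus.yml (sketch): re-enable one job at a time and watch memory usage
scrape_configs:
  - job_name: kubernetes-nodes          # hypothetical job, left enabled first
    kubernetes_sd_configs:
      - role: node
  # - job_name: kubernetes-pods         # hypothetical job, commented out for now
  #   kubernetes_sd_configs:
  #     - role: pod
```

Watching a series-count gauge such as `prometheus_tsdb_head_series` after re-enabling each job shows which one contributes the bulk of the series.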
Thanks! I am checking the configurations one by one; hopefully I can find the one that caused the heavy memory usage. Will let you know the result.
Turns out it was this job. Thanks for pointing me in the right direction.
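For anyone hitting the same thing, one option besides disabling the job is to drop the high-cardinality series at scrape time with `metric_relabel_configs`; a sketch, where the job name and metric pattern are placeholders rather than the actual values from this issue:

```yaml
scrape_configs:
  - job_name: offending-job                      # placeholder, not the actual job name
    kubernetes_sd_configs:
      - role: pod
    metric_relabel_configs:
      # drop the high-cardinality series at scrape time, before they reach the TSDB
      - source_labels: [__name__]
        regex: some_high_cardinality_metric.*    # placeholder pattern
        action: drop
```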


alanh0vx commented Dec 20, 2018
Bug Report
What did you do?
Running a Prometheus container (v2.5.0), deployed using k8s.
There are 5 nodes in the k8s cluster, with around 80 pods running.
What did you expect to see?
prometheus running smoothly
What did you see instead? Under which circumstances?
The K8s Prometheus container gets OOM killed every 10 minutes.
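Whether the kill comes from the kubelet or from the node running out of memory depends on the pod's memory limit; a sketch of where that limit lives in the deployment, with illustrative values that are not taken from this report:

```yaml
# deployment spec fragment, values illustrative only
containers:
  - name: prometheus
    image: prom/prometheus:v2.5.0
    resources:
      requests:
        memory: 4Gi
      limits:
        memory: 8Gi   # the container is OOM-killed once usage exceeds this limit
```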
Environment
prometheus v2.5.0
k8s: v1.11.0
cluster node: 32G RAM
cluster node OS: Linux 4.4.0-139-generic x86_64
pod description: