Investigate RAM consumption during crash recovery #2139
beorn7 self-assigned this Oct 31, 2016
beorn7 referenced this issue Nov 14, 2016
Merged: Fix possible memory leak by defer inside loop #2184
Random observation: A beefy Prometheus server seemed to ramp up its RAM usage while rebuilding the metrics index (xxx metrics queued for indexing).
Wild guess: If LevelDB gets a lot of updates, it might run into trouble cleaning up and hog too much RAM.
I have decided not to tackle the LevelDB issues. This will be hairy at best, and it is going away in v2.0 anyway.
beorn7 referenced this issue Apr 3, 2017
Merged: storage: Evict unused chunk.Descs in crash recovery #2561
beorn7 closed this Apr 3, 2017
beorn7 referenced this issue Aug 8, 2017
Closed: Crash recovery uses too much memory compared to target-heap-size #3038
estahn referenced this issue Nov 7, 2018
Closed: Crash recovery OOM kills prometheus-server container #4833
lock bot commented Mar 23, 2019
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
lock bot locked and limited conversation to collaborators Mar 23, 2019
beorn7 commented Oct 31, 2016
We have received occasional reports of servers OOMing during crash recovery.
Obviously, the checkpoint has to be loaded in its entirety, but if more is loaded from disk, that could explain the OOMing, as no series maintenance or chunk eviction is running. After a quick check, I could only see chunk descs being loaded. In extreme cases, even the relatively small chunk descs might cause an OOM, so unloading chunk descs will definitely be a way to reduce RAM usage during crash recovery.
But there might be other code paths where chunks might be loaded. This has to be investigated more thoroughly.
Obviously, having #447 in place would come in handy.
@matthiasr as discussed earlier today.
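
The idea of unloading chunk descs during crash recovery can be illustrated with a minimal sketch. This is not the Prometheus implementation (which is written in Go) and does not reflect how #2561 works. `ChunkDesc`, `RecoveryDescCache`, `recover`, and the `max_descs` limit are all hypothetical names, chosen only to show how a cap on in-memory descriptors bounds memory when no regular eviction is running.

```python
from collections import OrderedDict
from dataclasses import dataclass


@dataclass
class ChunkDesc:
    """Hypothetical stand-in for a chunk descriptor: metadata only, no samples."""
    series: str
    first_time: int
    last_time: int


class RecoveryDescCache:
    """Holds at most max_descs descriptors, evicting the oldest-loaded ones first."""

    def __init__(self, max_descs: int):
        if max_descs < 1:
            raise ValueError("max_descs must be at least 1")
        self.max_descs = max_descs
        self._descs = OrderedDict()
        self.evicted = 0  # number of descriptors dropped so far

    def load(self, desc: ChunkDesc) -> None:
        # Assumption: (series, first_time) identifies a descriptor uniquely.
        key = (desc.series, desc.first_time)
        self._descs[key] = desc
        self._descs.move_to_end(key)
        # Evict immediately instead of waiting for a maintenance loop,
        # since no such loop is running during recovery.
        while len(self._descs) > self.max_descs:
            self._descs.popitem(last=False)
            self.evicted += 1

    def __len__(self) -> int:
        return len(self._descs)


def recover(descs, max_descs: int) -> RecoveryDescCache:
    """Simulates loading descriptors from disk under a fixed memory budget."""
    cache = RecoveryDescCache(max_descs)
    for desc in descs:
        cache.load(desc)
    return cache
```

For example, feeding 1000 distinct descriptors through `recover` with `max_descs=100` leaves 100 in memory and evicts the other 900. A real server would need to decide which descriptors are actually unused rather than evicting purely by load order.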