Create a child process with high oom score to avoid OOMKiller #48429
Description
(This is an experimental and controversial feature, but it may be worth trying.)
An issue
The ClickHouse process can be killed by the OOMKiller, and currently there is no good way to avoid it. Usually, total memory usage is limited to max_server_memory_usage_to_ram_ratio * available_memory, with the ratio set to 0.9. However, there are cases where even lowering max_server_memory_usage_to_ram_ratio to 0.7 does not help.
Current approach
- The global memory tracker tracks (almost) all allocations.
- Every second, total memory usage is adjusted to match the process RSS.
- Every memory allocation is checked against (last RSS + delta of allocations).
The idea is:
- We use a system metric, so we are safe even if our memory accounting is not accurate.
- We apply the delta on top, so that very fast memory allocation is still handled.
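The scheme above can be illustrated with a toy model. This is only a sketch of the idea, not ClickHouse's actual tracker: the class name `MemoryTracker`, its methods, and the exception type are hypothetical, and the RSS value is passed in by the caller rather than read from the system.

```python
import threading


class MemoryLimitExceeded(Exception):
    """Raised when a tracked allocation would push the estimate over the limit."""


class MemoryTracker:
    """Toy model: estimate = last sampled RSS + net allocations since that sample."""

    def __init__(self, limit_bytes):
        self.limit_bytes = limit_bytes
        self.last_rss = 0  # set by the periodic sync
        self.delta = 0     # net tracked allocations since the last sync
        self._lock = threading.Lock()

    def sync_with_rss(self, rss_bytes):
        # Called roughly once per second: trust the system metric
        # and start counting the delta from zero again.
        with self._lock:
            self.last_rss = rss_bytes
            self.delta = 0

    def alloc(self, size):
        # Every allocation is checked against (last RSS + delta).
        with self._lock:
            if self.last_rss + self.delta + size > self.limit_bytes:
                raise MemoryLimitExceeded(f"cannot allocate {size} bytes")
            self.delta += size

    def free(self, size):
        with self._lock:
            self.delta -= size

    def estimate(self):
        with self._lock:
            return self.last_rss + self.delta
```

In this model, memory that bypasses `alloc` (for example a direct mmap) stays invisible until the next `sync_with_rss` call, which is the window in which the process can overshoot its limit.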
Why the current approach does not work well
- There is a race between internal and system memory accounting.
- There are places that block memory exceptions.
- There is code that mmaps memory directly (so we don't account for it).
Feature
Let's create a child process that consumes a relatively small amount of memory and has a high oom score.
Watch this process. If the child process is killed:
- Enable a special mode, e.g. throw exceptions for every non-blocked allocation, try to kill heavy processes, etc.
- Collect and dump all helpful introspection data: running queries, memory usage, stack traces, etc.
- Restart the child process.
Most likely the child process only needs to write 1000 (the maximum value the kernel accepts) to /proc/self/oom_score_adj.
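A minimal sketch of such a canary process, assuming Linux and a plain fork. The function names `spawn_canary` and `canary_was_killed` are hypothetical and only illustrate the mechanism. A real server would do this in C++ and would need to take care when forking a multithreaded process.

```python
import os
import signal

OOM_SCORE_ADJ_MAX = 1000  # the kernel accepts values from -1000 to 1000


def spawn_canary():
    """Fork a tiny child that marks itself as the preferred OOM victim."""
    pid = os.fork()
    if pid != 0:
        return pid  # parent: hand back the canary's pid
    try:
        try:
            with open("/proc/self/oom_score_adj", "w") as f:
                f.write(str(OOM_SCORE_ADJ_MAX))
        except OSError:
            pass  # /proc not available (e.g. not Linux); the canary still runs
        while True:
            signal.pause()  # sleep until a signal arrives
    finally:
        os._exit(0)  # never fall back into the parent's code


def canary_was_killed(pid):
    """Block until the canary exits; return True if it died from SIGKILL."""
    _, status = os.waitpid(pid, 0)
    return os.WIFSIGNALED(status) and os.WTERMSIG(status) == signal.SIGKILL
```

A supervisor would call `spawn_canary()`, block in `canary_was_killed()` on a dedicated thread, switch to the special mode and dump diagnostics when it returns true, and then spawn a new canary. Note that SIGKILL alone does not prove the OOM killer was responsible, since any sufficiently privileged process can send it.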
Alternative solutions
TODO
(use cgroups v2 or something; not always possible/convenient)