Setting a ulimit on virtual memory causes compaction failures and subsequent data corruption #5135
Comments
May be somewhat related to #4392.
#4392 looks like there just isn’t enough space to mmap all the blocks required? My virtual memory hard ulimit is unlimited, but a soft ulimit is important as Prometheus isn’t the only thing running on this machine. Does anyone know what dictates when blocks are munmap’d?
Any time Prometheus opens a Block: compaction opens Blocks, db.reload() when the TSDB is opened at startup opens Blocks, cleaning tombstones does too, etc.
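To make the address-space cost concrete, here is a minimal sketch (assuming a Linux host; illustrative only, not the actual tsdb code) of the kind of read-only mapping that opening a block performs. The mapping reserves virtual address space for the whole file up front, even though pages are only faulted in when read, which is why a virtual-memory ulimit can be hit while resident memory stays low.

```go
// Illustrative only; not the actual tsdb implementation.
// Opening a block maps its index and chunk files read-only, so the virtual
// address space used grows with the total on-disk size of the open blocks.
package main

import (
	"fmt"
	"os"
	"syscall"
)

// mmapFile maps an entire file read-only and returns the mapping.
func mmapFile(path string) ([]byte, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	fi, err := f.Stat()
	if err != nil {
		return nil, err
	}
	// The whole mapping counts against RLIMIT_AS immediately, even though
	// pages are only faulted in when the data is actually read.
	return syscall.Mmap(int(f.Fd()), 0, int(fi.Size()),
		syscall.PROT_READ, syscall.MAP_SHARED)
}

func main() {
	// e.g. <data-dir>/<block-ulid>/chunks/000001
	data, err := mmapFile(os.Args[1])
	if err != nil {
		// Under a tight virtual-memory ulimit this fails with ENOMEM,
		// which is what surfaces as compaction/reload errors.
		fmt.Fprintln(os.Stderr, "mmap failed:", err)
		os.Exit(1)
	}
	defer syscall.Munmap(data)
	fmt.Printf("mapped %d bytes\n", len(data))
}
```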
krasi-georgiev added the component/local storage and kind/more-info-needed labels and removed the kind/more-info-needed label on Jan 25, 2019.
It is safe to delete all but one of the overlapping blocks with the 1548325500000-1548325800000 range, as these are duplicates. I don't think any data is corrupted; the db just couldn't be reloaded because it ran out of memory. We have a long-standing PR to add a scan command to the tsdb CLI tool, and if it gets merged it will be easier to recover after such cases.
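Until such a scan command exists, the duplicates can be found by hand: each block directory carries a meta.json recording the block's minTime and maxTime. A hedged sketch of a standalone helper (hypothetical, not the proposed tsdb scan command) that lists every block's range so duplicates can be spotted before moving directories aside:

```go
// Hypothetical helper, not part of Prometheus: list each block's time range
// so duplicate/overlapping ranges can be identified before removing any
// directories. Prefer moving directories aside over deleting them outright.
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// blockMeta mirrors the fields of interest in a block's meta.json.
type blockMeta struct {
	ULID    string `json:"ulid"`
	MinTime int64  `json:"minTime"`
	MaxTime int64  `json:"maxTime"`
}

func main() {
	dataDir := os.Args[1] // the --storage.tsdb.path directory
	metas, err := filepath.Glob(filepath.Join(dataDir, "*", "meta.json"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, metaPath := range metas {
		b, err := os.ReadFile(metaPath)
		if err != nil {
			fmt.Fprintln(os.Stderr, metaPath, err)
			continue
		}
		var m blockMeta
		if err := json.Unmarshal(b, &m); err != nil {
			fmt.Fprintln(os.Stderr, metaPath, err)
			continue
		}
		// Blocks printing the same minTime-maxTime range are the duplicates
		// described above; keep one and move the others out of the data dir.
		fmt.Printf("%s  %s  %d-%d\n", filepath.Dir(metaPath), m.ULID, m.MinTime, m.MaxTime)
	}
}
```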
The current code is already meant to handle overlaps gracefully; did that break?
This one, @krasi-georgiev? prometheus/tsdb#320. Thanks, I'll have a read through. Are all blocks mmap'd at startup? I'm trying to understand the memory requirements so I can play nicely with others.
Yes, everything is mmap'd at startup.
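A rough, hedged way to estimate that footprint (a standalone sketch, not part of Prometheus): sum the on-disk size of every block directory in the data dir, since those files are what get mapped. The head block, WAL replay, and the extra headroom compaction needs come on top of this figure.

```go
// Hypothetical helper, not part of Prometheus: approximate the virtual
// address space needed to mmap all persisted blocks by summing their
// on-disk size (head/WAL memory and compaction headroom are extra).
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

func main() {
	dataDir := os.Args[1] // the --storage.tsdb.path directory
	metas, err := filepath.Glob(filepath.Join(dataDir, "*", "meta.json"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	var total int64
	for _, meta := range metas {
		// Add up the index and chunk files of each block directory.
		walkErr := filepath.Walk(filepath.Dir(meta), func(_ string, info os.FileInfo, err error) error {
			if err != nil {
				return err
			}
			if !info.IsDir() {
				total += info.Size()
			}
			return nil
		})
		if walkErr != nil {
			fmt.Fprintln(os.Stderr, walkErr)
		}
	}
	fmt.Printf("approx. mmap footprint of %d blocks: %.2f GiB\n",
		len(metas), float64(total)/(1<<30))
}
```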
@sandyteenan yes, sorry, forgot to link it. We run benchmarks occasionally, and it seems that during compaction Prometheus needs around 10% more memory. You can look at the latest bench test here: the Prometheus dashboard shows jumps of around 50%, but the node exporter dashboards show jumps of around 10%. Not sure why that is, but 10% looks close to what we can expect.
sandyteenan commented Jan 24, 2019 (edited)
Proposal
Use case. Why is this important?
Running out of memory during compaction causes data corruption.
Bug Report
What did you do?
Set a ulimit to restrict the virtual memory available to Prometheus on a shared host. In our main instance the ulimit is currently 40GB; for repro purposes it was set to 230000 bytes (a sketch for checking the limit a process actually inherits follows below).
To recreate the issue quickly, the parameters --storage.tsdb.min-block-duration 5m and --storage.tsdb.max-block-duration 5m were used.
Scrape intervals were also set low (2s) for the purposes of the repro.
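In a shell, ulimit -v adjusts the soft RLIMIT_AS (virtual address space) limit that Prometheus then inherits. A minimal, hedged sketch for confirming which soft and hard limits a process actually runs under (a standalone helper, not part of Prometheus; assumes a Linux host):

```go
// Hypothetical helper, not part of Prometheus: print the soft and hard
// RLIMIT_AS (virtual address space) limits the current process runs under.
// In bash, "ulimit -v" adjusts the same limit (expressed in kilobytes)
// for the shell and everything it starts afterwards.
package main

import (
	"fmt"
	"os"
	"syscall"
)

func main() {
	var lim syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_AS, &lim); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// A soft limit well below the combined size of the TSDB blocks means
	// opening blocks (and therefore compaction and reload) can fail.
	fmt.Printf("RLIMIT_AS soft=%d hard=%d\n", lim.Cur, lim.Max)
}
```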
What did you expect to see?
Clean termination of the prometheus process when an OOM condition is encountered.
What did you see instead? Under which circumstances?
Corruption of data directories, resulting in an inability to restart prometheus.
Rapid creation of new folders in the data directory (see the attached prometheus-2.6.0-datadircreationafteroom.txt).
Environment
Linux 3.10.0-957.1.3.el7.x86_64 x86_64
prometheus, version 2.6.0 (branch: HEAD, revision: dbd1d58)
build user: root@bf5760470f13
build date: 20181217-15:14:46
go version: go1.11.3
```yaml
global:
  scrape_interval: 2s
scrape_configs:
  - static_configs:
  - static_configs:
  - static_configs:
```
prometheus-2.6.0-ulimit-compactionfailed.txt
prometheus-2.6.0-datacorruptionaftercompactionerrors.txt
prometheus-2.6.0-pprofheap.txt