Excessive WAL and block overlap #5476

Open
mknapphrt opened this Issue Apr 17, 2019 · 2 comments

mknapphrt commented Apr 17, 2019

Bug Report

(I don't know if this is really a bug or just a misunderstanding of WAL truncation on my part.)

What did you do?
No changes were made; I was just running Prometheus.

What did you expect to see?
When compaction is triggered and blocks are created, the WAL segments that now overlap with those blocks would be deleted.

What did you see instead? Under which circumstances?
There are several hours of WAL data still overlapping with the blocks. Taken at the same time:

$ ls -lah | tail
drwxr-xr-x   3 prom  prom     6 Apr 17 03:41 01D8N1GPC7F6SNHZYZA5CJ89HZ
drwxr-xr-x   3 prom  prom     6 Apr 17 05:40 01D8N8CDM7VGSACTSMNGB1KA2T
drwxr-xr-x   3 prom  prom     6 Apr 17 07:40 01D8NF85A9H74HAB2KYG8YWT8T
drwxr-xr-x   3 prom  prom     6 Apr 17 09:44 01D8NP3W4KKJPEKY4W5KKPH8P2
drwxr-xr-x   3 prom  prom     6 Apr 17 11:44 01D8NWZKCTQPH4DM108T93TKXJ
drwxr-xr-x   3 prom  prom     6 Apr 17 13:46 01D8P3VAMR9CBSZTQ036J1221B
drwxr-xr-x   3 prom  prom     4 Apr 17 15:00 01D8PAQ1WWBFNR3T05VRFH574Y.tmp
$ ls -lah wal | head
total 136G
drwxr-xr-x   3 prom  prom 2.0K Apr 17 15:04 .
drwxr-xr-x 177 prom  prom  178 Apr 17 15:00 ..
-rw-r--r--   1 prom  prom 128M Apr 17 09:51 00642783
-rw-r--r--   1 prom  prom 128M Apr 17 09:51 00642784
-rw-r--r--   1 prom  prom 128M Apr 17 09:51 00642785
-rw-r--r--   1 prom  prom 128M Apr 17 09:51 00642786
-rw-r--r--   1 prom  prom 128M Apr 17 09:51 00642787
-rw-r--r--   1 prom  prom 128M Apr 17 09:51 00642788
-rw-r--r--   1 prom  prom 128M Apr 17 09:52 00642789

Should there be this much overlap? Is there a flag or setting that can control how much overlap there is? I don't really see the point in this redundancy in data storage. Any help or suggestions would be appreciated. Thanks

Environment

  • Prometheus version: 2.6.1

cstyan commented Apr 17, 2019

@mknapphrt it depends on how many segments you have. I suspect you have quite a lot, given how many segments show up in just that one pasted listing for a 1-2 minute time period.

When the TSDB Head takes the last 2h of data and a truncation is then performed, at most the first 1/3 of the segment files will be removed. A checkpoint is created with records from those segments for series that are still active, and then an attempt is made to delete those segment files.

https://github.com/prometheus/tsdb/blob/master/head.go#L556-L561
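
As a rough illustration of the "at most the first 1/3" rule, the arithmetic might look like the sketch below. The function name `truncationUpperBound` is hypothetical, the handling of the newest segment and of very short WALs is an assumption, and the segment indices in `main` are made up (about 1088 segments, which is roughly what 136G of 128M files would come to). The linked head.go is the authority on the exact logic.

```go
package main

import "fmt"

// truncationUpperBound (hypothetical name) sketches the "first third" rule:
// given the indices of the first and last WAL segments on disk, it returns
// the highest segment index one truncation would checkpoint and delete.
// ok is false when there are too few segments for a truncation to do anything.
func truncationUpperBound(first, last int) (upTo int, ok bool) {
	last-- // assumption: the newest segment is still being written, so skip it
	if last < 0 {
		return 0, false
	}
	// Only the lower third of the remaining range is considered.
	upTo = first + (last-first)/3
	if upTo <= first {
		return 0, false
	}
	return upTo, true
}

func main() {
	// Made-up indices for illustration only.
	first, last := 0, 1087
	if upTo, ok := truncationUpperBound(first, last); ok {
		removed := upTo - first + 1
		fmt.Printf("segments %d..%d removed (%d of %d)\n",
			first, upTo, removed, last-first+1)
	} else {
		fmt.Println("too few segments, nothing removed")
	}
}
```

Under this reading, each truncation leaves about two thirds of the segments in place, so a WAL with many segments would keep overlapping the persisted blocks for several hours.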

cstyan commented Apr 17, 2019

Note: I'm assuming you haven't messed with any of the tsdb flags :) if you have, please let us know.
