MaxBlockDuration is 31 days when only using size based retention configuration #6857
Comments
That seems right to me; we want the default to be 31d in that case.
I currently have a size-based cutoff (50GB), and the first retention clear-out deleted almost all my metric history, because the largest block contained almost all of it. Maybe 50GB is small compared to the usual case, but it's quite unexpected to lose almost all of my history. Clearly I can set both a time-based and a size-based retention to 'fix' this, as it will force smaller block sizes, but that's not obvious. Maybe it's just a case of updating the docs to make this clear?
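To illustrate the workaround mentioned above, both cutoffs can be set together. The flag names below are the standard Prometheus 2.x retention flags; the 50GB figure comes from this comment, and the 30d time limit is only an example:

```
prometheus \
  --storage.tsdb.retention.size=50GB \
  --storage.tsdb.retention.time=30d
```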
If you want to keep 30d of history, you're going to need ~60d of disk space given how everything works. Changing the retention period doesn't really change that.
@brian-brazil - I don't think that is true. Surely if storage.tsdb.retention.time = 34d, then this bit of code from cmd/prometheus/main.go comes into effect (see the sketch after this comment).

I have exactly the same problem as @richardwilko. I am currently only using size-based retention and I see blocks created which are as big as 45% of my retention policy, whereas ideally a block would never be larger than 10%. Would it not be possible to implement a similar strategy for the max block size as is currently done for database retention, i.e. consider both size and length and take whichever limit is hit first? I am going to take a look and see if I can put together a PR for that, but obviously that would not be worth doing if you don't think this is a valid approach.
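For reference, here is a minimal standalone sketch of the defaulting behaviour being described. The helper name maxBlockDuration and its signature are made up for illustration; the real logic lives in cmd/prometheus/main.go and operates on cfg.tsdb:

```go
package main

import (
	"fmt"
	"time"
)

// maxBlockDuration approximates the defaulting logic discussed above:
// the 31d default is only scaled down to 10% of the retention period when a
// *time* retention is configured. With size-only retention, retentionTime is
// zero and the full 31d default is kept. Illustrative sketch, not the
// upstream code.
func maxBlockDuration(retentionTime time.Duration) time.Duration {
	const defaultMax = 31 * 24 * time.Hour
	if retentionTime != 0 && retentionTime/10 < defaultMax {
		return retentionTime / 10
	}
	return defaultMax
}

func main() {
	fmt.Println(maxBlockDuration(34 * 24 * time.Hour)) // 81h36m0s (~3.4d), scaled to 10%
	fmt.Println(maxBlockDuration(0))                   // 744h0m0s (31d), size-only retention keeps the default
}
```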
@richardwilko - my current workaround is to set …
The problem is that if there's a size configured but no time, we have no idea what the size translates to in time terms.
So it's not possible to work out how much data has been written to a block and stop writing once you hit 10% of your storage.tsdb.retention.size? Does that mean that the storage.tsdb.retention.size setting is intended to only ever be used in conjunction with storage.tsdb.retention.time?
That's not how compaction works; once we've chosen to compact, we work series by series rather than in time slices.
Reviewing this at the bug scrub, we agreed both that 31 days is a very big block for most people and that Prometheus can't easily target a size in bytes. Some suggestions came up in discussion:
@bboreham I know everybody loves their own bugs the most, but while reading your suggestions I remembered running into and reporting yet another issue in relation to size-based retention: #11112. My issue is more about running out of disk space doing …
cfg.tsdb.MaxBlockDuration defaults to the full 31 days, rather than 10% of it, when only size-based retention is used.
The bug is on line 313 of cmd/prometheus/main.go: the 10% scaling is only applied when the retention duration is non-zero, but the retention duration is zero when only storage.tsdb.retention.size is set.
Currently an issue on master