Add Zstandard compression option for wlog #11666
Conversation
Force-pushed from 5aa4204 to e05ce6b
Oop, thanks for pointing that out @roidelapluie! I think I'll leave it for now, as I'm not sure it's clear-cut whether including zstd as an option is even desirable. If we do decide we'd like to, I'll revisit and get that fixed up.
How does WAL replay compare with this compression?
/prombench main
I am curious to see how it affects write latency.
I agree, we need to keep the flags backwards compatible. But I think it's great to add better compression options. The most important improvements to me are on-disk size and startup speed.
/prombench cancel
No noticeable difference in write latency, and no spikes.
Benchmark cancel is in progress.
Yay, cool to see that there's some interest in this! As far as the flag change goes, I'm thinking I'll go back and revert it to keep backwards compatibility. I'll also work on grabbing some benchmark numbers for WAL replay (is this also what plays into startup speed?).
Oop, and just thought of this regarding the prombench run - this branch still defaults to snappy, so I'm not sure it would have shown much of a difference. I'll also plan on temporarily defaulting it to zstd and re-running prombench to see what it looks like.
Yup, WAL replay speed would be interesting to compare. We chose snappy originally due to the decompress speed.
It might also be interesting to try
🤦 oh right. Yes, let's use zstd by default for now. We can revert it back later after prombench is done with it. You can hardcode it for the test.
Force-pushed from b650fad to bddd6b8
/prombench main
@leizor is not an org member nor a collaborator and cannot execute benchmarks.
/prombench main
/prombench cancel
Things are looking fine. CPU usage is a tiny bit higher, but nothing concerning. The BenchmarkLoadWAL results are not promising, though. This is comparing snappy and zstd.
Benchmark cancel is in progress.
Yes, zstd is slower and more CPU-heavy than snappy. Snappy was originally chosen to provide reasonable compression with minimal overhead. One interesting option is extending our snappy implementation with klauspost's S2.
I did throw together a benchmark for S2 after I was disappointed with how zstd performed. I might've done something wrong, though, as it turned out to be slower than snappy still. I'll add it back in so we can double-check!
I'm not surprised that zstd is slower, and that's fine. It's still a good option to have for users who are willing to trade off CPU / speed for better compression. I think this feature is worth having, but maybe we stick with snappy as the default for now until we get some more experience with zstd.
Alright, in that case, sounds good! I'll roll back the commit setting zstd as the default and resolve the merge conflicts; then this PR can be considered ready for review. That way we can address the other compression options in follow-up PRs.
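Keeping snappy as the default while exposing zstd behind a flag could be sketched roughly as below. The names `WALCompressionType` and `ParseCompressionType` are hypothetical, chosen for illustration; they are not the PR's actual identifiers.

```go
package main

import "fmt"

// WALCompressionType enumerates the supported WAL codecs.
// Hypothetical type; the real flag wiring lives in Prometheus' config.
type WALCompressionType string

const (
	CompressionNone   WALCompressionType = "none"
	CompressionSnappy WALCompressionType = "snappy" // default: cheap, fast decompress
	CompressionZstd   WALCompressionType = "zstd"   // opt-in: smaller on disk, more CPU
)

// ParseCompressionType validates a flag value. An empty value falls back
// to snappy so existing setups keep their current behavior.
func ParseCompressionType(s string) (WALCompressionType, error) {
	switch s {
	case "":
		return CompressionSnappy, nil
	case "none", "snappy", "zstd":
		return WALCompressionType(s), nil
	default:
		return "", fmt.Errorf("unknown WAL compression type %q", s)
	}
}

func main() {
	c, err := ParseCompressionType("")
	fmt.Println(c, err) // snappy <nil>
	_, err = ParseCompressionType("lz4")
	fmt.Println(err != nil) // true
}
```

Validating at flag-parse time keeps an unknown codec a startup error rather than a silent fallback.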
Force-pushed from bddd6b8 to 4920330
Hmmm, I mostly hear complaints about how slow the WAL replay is, not how huge it is. I am not sure we want to add zstd. @SuperQ, I might not be exposed to the relevant audience; do you hear from many users who want this and will be OK with this trade-off? I can post to the prometheus-developers mailing list and get feedback if needed.
Yes, I hear about WAL replay being slow. But I don't think decompression is really a major source of the slowdown. For example, one source of WAL replay slowdown happens when Prometheus gets into a crash loop (OOM, etc.) and generates many (12+ hours) of WAL files. Prometheus then tries to replay all of these in a single go, never stopping to compact them out. So the problem gets worse, because it's trying to play everything back into memory no matter how old it is. I'm pretty sure this is still a problem.
True, I understand that we should be able to do something about very long WALs, but we are mixing different things here. Zstd won't fix that (the memory problem). Replay of a 2-3h WAL is still not the best it can be. Do we really want to try a much slower WAL replay with zstd, which will make the 10-12h WAL problem worse by delaying the OOM that is eventually gonna happen?
cc @SuperQ @codesome @jesusvazquez can we come to a conclusion here on whether we want this?
I would like to see support for zstd: higher compression ratios with only a bit more overhead.
Signed-off-by: Justin Lei <justin.lei@grafana.com>
Force-pushed from 4920330 to 85492c1
I would not like to block this one, so since @SuperQ feels strongly about this, we can try it out! It is behind a flag anyway.
I concur with @codesome, we are fine having this under a flag. I'll have a thorough review of the test side of this tomorrow.
One thing I have noticed is that the lib we're importing for the zstd compression, https://github.com/klauspost/compress, also has a snappy implementation claiming to be more performant than golang/snappy. I think this is something to investigate in a different issue/PR 👍
This took me a bit longer than expected, but I have reviewed the changes and I think the test coverage is good enough, especially considering this is only enabled optionally and snappy remains the default compression algorithm.
One note of clarification is that there are no plans to change the default algorithm, it will continue to be snappy for now.
Nice work @leizor and everyone else that came by 💪
I would love to see benchmark comparisons of
Agreed!
#11223
This PR experiments with using the Zstandard compression algorithm as an option for the WAL alongside the existing option of compressing using Snappy.
In order to compare, I've converted about 16GB of Snappy-compressed WAL into Zstandard-compressed WAL and non-compressed WAL:
Benchmarks w/ the new Zstandard compression option added in:
In summary, it looks like zstd yielded a ~37.7% improvement in storage over snappy but is ~76.2% slower. I suspect throughput is more important than storage size for this application so I've left snappy as the default compression method, but in cases where the opposite trade-off is desired, this PR allows the compression method to be switched to zstd.
Signed-off-by: Justin Lei <justin.lei@grafana.com>