Find right combination of parameters for ZSTD best_compression #108863

Open
salvatore-campagna opened this issue May 21, 2024 · 1 comment

Comments

@salvatore-campagna
Contributor

salvatore-campagna commented May 21, 2024

Description

Following the investigation we conducted after introducing ZSTD, we agreed that we need to find a better set of parameters for the best_compression codec, which is the codec we are going to use in LogsDB. We need to try settings that are less aggressive about reducing storage footprint. The goal is to find the "sweet spot" that keeps the storage benefits without sacrificing latency, so as to avoid regressions in query latency, dashboard loading, and so on.

This means trying a different set of parameters, starting with a smaller block size, perhaps 128k or 64k. Ideally we would keep the compression level as it is, but we might need to change it too. Keep in mind that these parameters affect both the CPU and memory usage of ZSTD, and we need to measure both. We want to avoid a "sweet spot" that is good for query latency and storage footprint but too hungry for memory and/or CPU, since higher (hardware) resource usage has consequences for costs.
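To make the trade-off concrete, here is a minimal sketch of a block-size sweep. It is not the Elasticsearch codec and not a Rally benchmark: `zlib` from the Python standard library stands in for ZSTD so the script runs without extra dependencies, the log sample is synthetic, and the helper names (`make_log_sample`, `sweep_block_sizes`) and the candidate block sizes are assumptions chosen for illustration. Each block is compressed independently, which roughly mirrors how a stored-fields block must be decompressed as a whole to read one document.

```python
import time
import zlib


def make_log_sample(n_lines: int) -> bytes:
    """Build synthetic, repetitive log-like data (purely illustrative)."""
    lines = [
        f"2024-05-21T12:00:{i % 60:02d}Z host-{i % 7} INFO "
        f"request id={i} status=200 took={i % 250}ms\n"
        for i in range(n_lines)
    ]
    return "".join(lines).encode("utf-8")


def sweep_block_sizes(data: bytes, block_sizes, level: int = 6):
    """Compress `data` in independent blocks; report ratio and timings per size.

    zlib is a stand-in for ZSTD here, so absolute numbers are not meaningful;
    only the shape of the measurement is.
    """
    results = {}
    for block_size in block_sizes:
        blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]

        start = time.perf_counter()
        compressed = [zlib.compress(block, level) for block in blocks]
        compress_s = time.perf_counter() - start

        start = time.perf_counter()
        restored = b"".join(zlib.decompress(c) for c in compressed)
        decompress_s = time.perf_counter() - start

        # Sanity check: the round trip must reproduce the input exactly.
        assert restored == data

        results[block_size] = {
            "ratio": len(data) / sum(len(c) for c in compressed),
            "compress_s": compress_s,
            "decompress_s": decompress_s,
            "blocks": len(blocks),
        }
    return results


if __name__ == "__main__":
    sample = make_log_sample(50_000)
    candidates = [64 * 1024, 128 * 1024, 256 * 1024]  # hypothetical candidates
    for size, r in sweep_block_sizes(sample, candidates).items():
        print(
            f"{size // 1024:>4} KiB  blocks={r['blocks']:>3}  ratio={r['ratio']:.2f}  "
            f"compress={r['compress_s'] * 1000:.1f} ms  "
            f"decompress={r['decompress_s'] * 1000:.1f} ms"
        )
```

A real evaluation would swap in the actual ZSTD binding and real LogsDB data, and would add the memory measurement that this sketch leaves out.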

For CPU usage, indexing throughput and search latency are good indicators of where we stand. We also need to track memory usage, which is missing from our Rally benchmarks at the moment. Maybe we could track it by attaching a profiler.

@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)
