DBENGINE: support ZSTD compression #17244

Merged: 6 commits merged on Mar 25, 2024

@ktsaou ktsaou commented Mar 24, 2024

  • split dbengine compression into separate files
  • add support for ZSTD compression
  • automatically select ZSTD when available
  • when compression fails, or when compression generates a bigger payload, fall back to uncompressed pages
  • handle decompression errors by treating them as read errors
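
The last two bullets can be sketched as follows. This is a minimal illustration, not the dbengine code: it is written in Python, `zlib` stands in for ZSTD/LZ4, and the names `store_page`, `load_page`, `PageReadError` and the two tag constants are hypothetical. Treating an equal-sized result as "no gain" is also an assumption of this sketch.

```python
import zlib
from typing import Callable, Tuple

# Hypothetical page tags; the real on-disk format is not shown in this PR text.
PAGE_UNCOMPRESSED = 0
PAGE_COMPRESSED = 1


class PageReadError(Exception):
    """Raised when a stored page cannot be read back."""


def store_page(raw: bytes,
               compress: Callable[[bytes], bytes] = zlib.compress) -> Tuple[int, bytes]:
    """Return (tag, payload), falling back to the raw bytes when compression
    fails or does not make the page smaller."""
    try:
        packed = compress(raw)
    except Exception:
        # Compression failure is not fatal: keep the page uncompressed.
        return PAGE_UNCOMPRESSED, raw
    if len(packed) >= len(raw):
        # Compression did not help: keep the page uncompressed.
        return PAGE_UNCOMPRESSED, raw
    return PAGE_COMPRESSED, packed


def load_page(tag: int, payload: bytes) -> bytes:
    """Return the raw page, reporting a decompression failure as a read error."""
    if tag == PAGE_UNCOMPRESSED:
        return payload
    try:
        return zlib.decompress(payload)
    except zlib.error as exc:
        raise PageReadError("page could not be decompressed") from exc
```

The point of the design is that the caller only ever sees two outcomes on write (a compressed page or an uncompressed one) and a single error type on read, so a bad page is handled the same way as any other unreadable page.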

Statistics:

Measured using the samples of the dbengine unit test, which generates 184549376 samples (about 185 million).

| Page Type | Compression | .ndf Files (Bytes) | .ndf Files (MiB) | Bytes per Sample | Percentage of Baseline | Comments |
|---|---|---|---|---|---|---|
| Array 32-bit (raw) | none | 760369152 | 725.14 | 4.12 | 100% | the baseline |
| Array 32-bit | lz4 | 762527744 | 727.20 | 4.12 | 100%+ | slightly bigger |
| Array 32-bit | zstd | 582234112 | 555.26 | 3.15 | 76.6% | saved 23.4% |
| Gorilla 32-bit | none | 342605824 | 326.73 | 1.86 | 45.1% | saved 54.9% |
| Gorilla 32-bit | lz4 | 35827712 | 34.17 | 0.19 | 4.7% | saved 95.3% |
| Gorilla 32-bit | zstd | 31072256 | 29.63 | 0.17 | 4.1% | saved 95.9% |
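
For anyone who wants to re-derive the MiB, per-sample and percentage columns from the raw byte counts, a small Python sketch follows. The helper name `stats` and its dictionary keys are illustrative only; the byte counts and the sample count are the ones reported above.

```python
SAMPLES = 184_549_376          # samples generated by the unit test
BASELINE_BYTES = 760_369_152   # Array 32-bit (raw), no compression


def stats(size_bytes: int,
          baseline_bytes: int = BASELINE_BYTES,
          samples: int = SAMPLES) -> dict:
    """Derive the table's computed columns from the size of the .ndf files."""
    return {
        "mib": size_bytes / 2**20,
        "bytes_per_sample": size_bytes / samples,
        "pct_of_baseline": 100.0 * size_bytes / baseline_bytes,
    }


if __name__ == "__main__":
    rows = {
        "Array 32-bit + zstd": 582_234_112,
        "Gorilla 32-bit + none": 342_605_824,
        "Gorilla 32-bit + lz4": 35_827_712,
        "Gorilla 32-bit + zstd": 31_072_256,
    }
    for name, size in rows.items():
        s = stats(size)
        print(f"{name}: {s['mib']:.2f} MiB, "
              f"{s['bytes_per_sample']:.2f} B/sample, "
              f"{s['pct_of_baseline']:.1f}% of baseline")
```

The "saved" figure in the comments column is simply 100 minus the percentage of baseline.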

Keep in mind that the test samples are incremental values, which are optimal for Gorilla compression. Even so, it is evident that Gorilla pages compress much better than raw samples.
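
A rough way to see why incremental values are such a favorable case: Gorilla-style encoders build on XORing each value with the previous one, and for a counting sequence that XOR stream is extremely repetitive. The Python sketch below only illustrates that first step and then hands the result to a general-purpose compressor. It is not Netdata's Gorilla implementation, `zlib` stands in for LZ4/ZSTD, and the helper names are hypothetical.

```python
import struct
import zlib
from typing import List


def pack_raw(values: List[int]) -> bytes:
    """Store the values as a plain little-endian array of 32-bit integers."""
    return struct.pack(f"<{len(values)}I", *values)


def pack_xor(values: List[int]) -> bytes:
    """Store each value XORed with its predecessor (the first one with 0)."""
    out = []
    prev = 0
    for v in values:
        out.append(v ^ prev)
        prev = v
    return struct.pack(f"<{len(out)}I", *out)


def unpack_xor(blob: bytes) -> List[int]:
    """Invert pack_xor by accumulating the XOR stream."""
    values = []
    prev = 0
    for (x,) in struct.iter_unpack("<I", blob):
        prev ^= x
        values.append(prev)
    return values


if __name__ == "__main__":
    samples = list(range(1000, 1000 + 4096))  # incrementing test values
    raw = pack_raw(samples)
    xored = pack_xor(samples)
    print("raw array:      ", len(raw), "->", len(zlib.compress(raw)), "bytes")
    print("XOR-with-prev:  ", len(xored), "->", len(zlib.compress(xored)), "bytes")
```

Real metric data is noisier than a counter, so the gap seen here should be read as an upper bound on what this effect contributes, not as a prediction for production data.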

In practice, Netdata can achieve 0.6 bytes per sample with the configuration in the second row (Array 32-bit with LZ4). We still need to see what happens in practice with the last one (Gorilla 32-bit with ZSTD)!

Also, the unit test runs faster with Gorilla. I guess Gorilla significantly reduces the memory bandwidth required, or improves CPU cache efficiency.

@github-actions github-actions bot added area/database area/build Build system (autotools and cmake). labels Mar 24, 2024
@ktsaou ktsaou merged commit f1c26d0 into netdata:master Mar 25, 2024
142 of 145 checks passed
stelfrag pushed a commit to stelfrag/netdata that referenced this pull request Mar 26, 2024
* extract dbengine compression to separate files

* added ZSTD support in dbengine

* automatically select best compression

* handle decompression errors

* eliminate fatals from compression algorithms; fallback to uncompressed pages if compression fails or generates bigger data

* have the unit test generate many data files

(cherry picked from commit f1c26d0)
@stelfrag stelfrag mentioned this pull request Mar 26, 2024
Ferroin pushed a commit that referenced this pull request Mar 27, 2024