DBENGINE: support ZSTD compression #17244

ktsaou · 2024-03-24T18:46:08Z

split dbengine compression to separate files
add support for ZSTD compression
automatically select ZSTD when available
when compression fails, or when it generates a bigger payload, fall back to uncompressed pages
handle decompression errors by treating them as read errors
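
The fallback and error-handling rules above can be sketched roughly as follows. This is only an illustration, not the dbengine code: zlib from the Python standard library stands in for ZSTD/LZ4, and the names `store_page`, `load_page`, `PageReadError` and the two page tags are hypothetical.

```python
import zlib

# Hypothetical page tags for this sketch; not Netdata's on-disk format.
UNCOMPRESSED = 0
COMPRESSED = 1


class PageReadError(Exception):
    """Raised when a stored page cannot be decoded."""


def store_page(payload: bytes) -> tuple[int, bytes]:
    """Compress a page, keeping the raw bytes when compression does not help."""
    try:
        packed = zlib.compress(payload)  # zlib stands in for ZSTD/LZ4 here
    except zlib.error:
        return UNCOMPRESSED, payload  # compression failed: store the page raw
    if len(packed) >= len(payload):
        return UNCOMPRESSED, payload  # compressed output is not smaller
    return COMPRESSED, packed


def load_page(kind: int, stored: bytes) -> bytes:
    """Return the original page; a decompression failure surfaces as a read error."""
    if kind == UNCOMPRESSED:
        return stored
    try:
        return zlib.decompress(stored)
    except zlib.error as exc:
        raise PageReadError("page could not be decompressed") from exc
```

With this shape the writer never aborts on a compressor problem, and the reader reports a corrupt compressed page the same way it would report any other unreadable page.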

Statistics:

Using the samples of the dbengine unit test.
The number of samples generated is 184549376 (about 185 million).

Page Type	Compression	.ndf Files, Bytes (MiB)	Bytes per Sample	% of Baseline	Comments
Array 32-bit (raw)	none	760369152 (725.14)	4.12	100%	the baseline
Array 32-bit	lz4	762527744 (727.20)	4.13	100.3%	slightly bigger
Array 32-bit	zstd	582234112 (555.26)	3.15	76.6%	saved 23.4%
Gorilla 32-bit	none	342605824 (326.73)	1.86	45.1%	saved 54.9%
Gorilla 32-bit	lz4	35827712 (34.17)	0.19	4.7%	saved 95.3%
Gorilla 32-bit	zstd	31072256 (29.63)	0.17	4.1%	saved 95.9%
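
For anyone who wants to re-derive the last columns, here is a small sketch that recomputes them from the byte counts and the sample count reported above. The helper name `summarize` is made up for this example; bytes per sample is the file size divided by the number of samples, and the percentage is the file size relative to the uncompressed Array 32-bit baseline.

```python
SAMPLES = 184_549_376  # samples generated by the dbengine unit test

# (page type, compression) -> total size of the .ndf files in bytes
NDF_BYTES = {
    ("Array 32-bit", "none"): 760_369_152,
    ("Array 32-bit", "lz4"): 762_527_744,
    ("Array 32-bit", "zstd"): 582_234_112,
    ("Gorilla 32-bit", "none"): 342_605_824,
    ("Gorilla 32-bit", "lz4"): 35_827_712,
    ("Gorilla 32-bit", "zstd"): 31_072_256,
}


def summarize(ndf_bytes, samples, baseline_key):
    """Map each row to (MiB, bytes per sample, percent of the baseline size)."""
    baseline = ndf_bytes[baseline_key]
    return {
        key: (
            round(size / 2**20, 2),          # size in MiB
            round(size / samples, 2),        # bytes per sample
            round(100 * size / baseline, 1)  # percent of baseline
        )
        for key, size in ndf_bytes.items()
    }


if __name__ == "__main__":
    rows = summarize(NDF_BYTES, SAMPLES, ("Array 32-bit", "none"))
    for (page_type, codec), (mib, per_sample, pct) in rows.items():
        print(f"{page_type:15} {codec:5} {mib:8.2f} MiB {per_sample:5.2f} B/sample {pct:6.1f}%")
```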

Keep in mind that the test samples are incrementing values, which are optimal for Gorilla compression. Even so, it is clear that Gorilla pages compress much better than raw samples.
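
To illustrate why incrementing values are such a favorable case, here is a tiny sketch of the XOR step that Gorilla-style encoders are built around. It is not the dbengine implementation, only the general idea: each value is XOR-ed with its predecessor, and the fewer significant bits remain, the less has to be stored. The function name `xor_bit_lengths` is hypothetical.

```python
def xor_bit_lengths(values):
    """Significant bits left after XOR-ing each 32-bit value with its predecessor."""
    return [(prev ^ cur).bit_length() for prev, cur in zip(values, values[1:])]


if __name__ == "__main__":
    # Consecutive counters differ only in their lowest bits.
    print(xor_bit_lengths([1000, 1001, 1002, 1003, 1004]))
    # Two values that differ in the top bit leave all 32 bits significant.
    print(xor_bit_lengths([0x00000001, 0x80000000]))
```

Real metrics sit somewhere between these two extremes, which is why the numbers in the table should be read as a best case.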

In practice, Netdata achieves about 0.6 bytes per sample with the configuration in the second row (Array 32-bit with LZ4). We still need to see how the last row (Gorilla 32-bit with ZSTD) behaves in practice!

Also, the unit test runs faster with Gorilla. I guess Gorilla significantly reduces the memory bandwidth required, or improves CPU cache efficiency.


* extract dbengine compression to separate files
* added ZSTD support in dbengine
* automatically select best compression
* handle decompression errors
* eliminate fatals from compression algorithms; fallback to uncompressed pages if compression fails or generates bigger data
* have the unit test generate many data files

(cherry picked from commit f1c26d0)

ktsaou added 2 commits March 24, 2024 20:31

extract dbengine compression to separate files

5010f80

added ZSTD support in dbengine

1cc5ebc

ktsaou requested review from Ferroin, vkalintiris and thiagoftsm as code owners March 24, 2024 18:46

github-actions bot added the area/database and area/build labels Mar 24, 2024

ktsaou added 3 commits March 24, 2024 21:43

automatically select best compression

b51e424

handle decompression errors

d8587e7

eliminate fatals from compression algorithms; fallback to uncompresse…

29514c8

…d pages if compression fails or generates bigger data

github-actions bot added the area/daemon label Mar 24, 2024

have the unit test generate many data files

cedb9a6

github-actions bot removed the area/daemon label Mar 24, 2024

ktsaou merged commit f1c26d0 into netdata:master Mar 25, 2024
142 of 145 checks passed

stelfrag mentioned this pull request Mar 26, 2024

Patch release 1.45.1 #17260

Merged

ktsaou commented Mar 24, 2024 • edited by vkalintiris