expose zstd compression #1326

heinerstilz · 2022-08-24T14:03:13Z

Database name

PostgreSQL

Pull request description

Describe what this PR fixes

The zstd compression library is already supported internally in WAL-G. It seems to have been disabled because of a data corruption bug in zstd that has since been resolved (DataDog/zstd#39).
It has been mentioned that it might be good to bring zstd back (see discussion on #300).

This PR:

* exposes the zstd compression option to the user
* upgrades zstd to a more recent version
* adjusts some docs.

In our measurements with PostgreSQL and WAL-G on a ~270 GB dataset, zstd was significantly less CPU intensive than brotli:

* 1.6x less real time and 1.25x less user time backing up/compressing with WALG_UPLOAD_CONCURRENCY=8
* 1.6x less real time and 2.8x less user time restoring/decompressing.

Compression ratio was around 2.4x for both.
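For anyone who wants to try a comparable setup, a minimal sketch of the environment might look like the following. `WALG_COMPRESSION_METHOD` is WAL-G's existing setting for choosing a compressor; that it accepts `zstd` assumes a build that includes this PR. The concurrency value is the one used in the measurements, and everything else about the environment (storage, credentials) is left out.

```shell
# Select zstd as the compressor (assumes a WAL-G build that includes this PR).
export WALG_COMPRESSION_METHOD=zstd
# Upload concurrency used in the backup measurement quoted above.
export WALG_UPLOAD_CONCURRENCY=8
```

WAL-G reads these variables from the environment, so a subsequent `wal-g backup-push` run in the same shell should pick them up.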

An independent measurement also seems to support zstd's performance advantage over brotli: https://peazip.github.io/fast-compression-benchmark-brotli-zstandard.html#:~:text=Comparing%20Brotli%20and%20Zstandard%20extraction,twice%20as%20fast%20as%20Brotli.

Please provide steps to test this PR

Which of the existing tests would make the most sense to adapt for zstd?

* expose upgraded zstd compressor
* mention zstd in docs
* zstd 1.5.2 + patches
* remove unused lines in go.sum

x4m · 2022-08-24T17:02:22Z

FWIW I saw one report of brotli corruption: a file properly decrypted by gpg, but unextractable. Probably due to cosmic rays or something.
Zstd is a great codec, the best on the Pareto frontier. It would be very good to expose it. But...

@heinerstilz do you validate your backups? Will you use Zstd? We need someone who will warn us if something is still wrong with Zstd.

heinerstilz · 2022-08-29T13:37:59Z

We are implementing our backup solution for PostgreSQL, and the idea is to go with zstd unless we find an issue with it.
We'll likely soon have a large volume of frequent Postgres backups in production, and usable restores are needed on a regular basis.
So should zstd still corrupt data, there is quite a good chance we'll notice (and of course flag it here).

unlock zstd compression (wal-g#1)

a663323


heinerstilz requested a review from a team as a code owner August 24, 2022 14:03

serprex approved these changes Aug 24, 2022


x4m merged commit 25616a7 into wal-g:master Aug 29, 2022
