Add zstd compression #1278

Merged: 64 commits from add-zstd-compression into dev, Apr 5, 2023
Conversation

danlaine (Collaborator)

Still needs resolution on the following:

  • Do we want to keep compression as a binary enabled/disabled setting, or allow the user to specify a compression type?
  • For the existing unit tests, should we use zstd, or duplicate the tests for both gzip and zstd?

Note this changes metric names from [message type]_compress_time / [message type]_decompress_time to zstd_[message type]_compress_time and gzip_[message type]_compress_time. Grafana dashboards will need to be updated accordingly.

Note this deprecates the --network-compression-enabled flag in favor of the new --network-compression-type flag.
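As a rough illustration of what the new flag's value could map to in code, here is a minimal sketch assuming a hypothetical Type enum and TypeFromString parser; the names and accepted strings are assumptions, not taken from this PR:

```go
package compression

import (
	"errors"
	"fmt"
)

// Type is a hypothetical enum for the value passed to
// --network-compression-type.
type Type byte

const (
	TypeNone Type = iota
	TypeGzip
	TypeZstd
)

var errUnknownCompressionType = errors.New("unknown compression type")

// TypeFromString sketches how the flag's string value could be parsed;
// the accepted strings here are assumptions.
func TypeFromString(s string) (Type, error) {
	switch s {
	case "none":
		return TypeNone, nil
	case "gzip":
		return TypeGzip, nil
	case "zstd":
		return TypeZstd, nil
	default:
		return TypeNone, fmt.Errorf("%w: %q", errUnknownCompressionType, s)
	}
}
```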

Why this should be merged

zstd compression appears significantly faster than gzip, and marginally better at compressing messages.

How this works

Adds a new zstd Compressor. The default behavior is still gzip. zstd is forbidden (via config) until v1.10.0. We need to update this so that it's forbidden until the network upgrade time passes, but there is no variable in the code for that yet.
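For orientation, a minimal sketch of what a zstd-backed Compressor could look like, using github.com/klauspost/compress/zstd. The actual PR may use a different zstd library; the Compressor interface, constructor name, and maxSize handling here are assumptions based on the review excerpts below:

```go
package compression

import (
	"fmt"

	"github.com/klauspost/compress/zstd"
)

// zstdCompressor is a sketch of a Compressor backed by zstd.
type zstdCompressor struct {
	maxSize int64
	encoder *zstd.Encoder
	decoder *zstd.Decoder
}

// NewZstdCompressor returns a compressor that refuses to produce
// decompressed output larger than maxSize.
func NewZstdCompressor(maxSize int64) (*zstdCompressor, error) {
	// A nil writer/reader is fine when only EncodeAll/DecodeAll are used.
	encoder, err := zstd.NewWriter(nil)
	if err != nil {
		return nil, err
	}
	decoder, err := zstd.NewReader(nil)
	if err != nil {
		return nil, err
	}
	return &zstdCompressor{
		maxSize: maxSize,
		encoder: encoder,
		decoder: decoder,
	}, nil
}

func (z *zstdCompressor) Compress(msg []byte) ([]byte, error) {
	if int64(len(msg)) > z.maxSize {
		return nil, fmt.Errorf("msg length (%d) > max size (%d)", len(msg), z.maxSize)
	}
	return z.encoder.EncodeAll(msg, nil), nil
}

func (z *zstdCompressor) Decompress(msg []byte) ([]byte, error) {
	decompressed, err := z.decoder.DecodeAll(msg, nil)
	if err != nil {
		return nil, err
	}
	// Note: this naive sketch checks the size only after full decompression;
	// the review thread below discusses bounding the work during
	// decompression instead.
	if int64(len(decompressed)) > z.maxSize {
		return nil, fmt.Errorf("decompressed msg too large: (%d) > (%d)", len(decompressed), z.maxSize)
	}
	return decompressed, nil
}
```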

How this was tested

Existing and new unit tests.
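On the open question above about duplicating tests, a hedged sketch of a table-driven round-trip test that could run against both compressors; the constructor names, the 2 MiB limit, and the anonymous interface are assumptions, not this PR's test code:

```go
package compression

import (
	"bytes"
	"testing"
)

// TestCompressorRoundTrip sketches a table-driven test that exercises
// compress/decompress round-trips for multiple compressor implementations.
func TestCompressorRoundTrip(t *testing.T) {
	const maxSize = 2 * 1024 * 1024 // 2 MiB; an assumed limit

	zc, err := NewZstdCompressor(maxSize)
	if err != nil {
		t.Fatal(err)
	}

	compressors := map[string]interface {
		Compress([]byte) ([]byte, error)
		Decompress([]byte) ([]byte, error)
	}{
		"zstd": zc,
		// "gzip": NewGzipCompressor(maxSize), // assumed constructor
	}

	msg := bytes.Repeat([]byte("avalanchego p2p message "), 1024)

	for name, c := range compressors {
		t.Run(name, func(t *testing.T) {
			compressed, err := c.Compress(msg)
			if err != nil {
				t.Fatal(err)
			}
			decompressed, err := c.Decompress(compressed)
			if err != nil {
				t.Fatal(err)
			}
			if !bytes.Equal(msg, decompressed) {
				t.Fatalf("round trip mismatch: got %d bytes, want %d", len(decompressed), len(msg))
			}
		})
	}
}
```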

Resolved review threads on message/messages.go and utils/compression/zstd_compressor.go.

@StephenButtolph (Contributor) left a comment:


LGTM - up to you if you want to add the log or not.

@StephenButtolph StephenButtolph added this to the v1.10.0 (Cortina) milestone Apr 4, 2023
Comment on lines +54 to +55
if int64(len(decompressed)) > z.maxSize {
return nil, fmt.Errorf("%w: (%d) > (%d)", ErrDecompressedMsgTooLarge, len(decompressed), z.maxSize)
@joshua-kim (Contributor) commented on Apr 5, 2023:

Why do we return an error here? At this point we've already decompressed the payload, so it seems like a waste to drop this message. I think it makes sense to limit the size of the thing we're compressing or the size of the thing we're decompressing, but it feels strange to limit the size of the result of the decompression.

Contributor:

If we don't do this error handling, then I think we can also skip the weird extra byte allocation in the earlier line.

Contributor:

We did not decompress the message in this case. We stopped decompressing the message because it may have been a zip bomb.
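For context on the zip-bomb concern, a minimal illustrative sketch (not the PR's actual code) of bounded decompression with a streaming zstd reader: decompression is capped at maxSize+1 bytes, so an over-sized payload is rejected without being fully expanded, and the one extra byte is plausibly the "extra byte allocation" mentioned above.

```go
package compression

import (
	"bytes"
	"fmt"
	"io"

	"github.com/klauspost/compress/zstd"
)

// decompressBounded is an illustrative helper: it reads at most maxSize+1
// decompressed bytes, so a zip bomb is never fully expanded in memory
// before the size check fires.
func decompressBounded(compressed []byte, maxSize int64) ([]byte, error) {
	reader, err := zstd.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return nil, err
	}
	defer reader.Close()

	// Read one byte more than allowed. If we get maxSize+1 bytes back, the
	// payload is too large and we abort without decompressing the remainder.
	limited := io.LimitReader(reader, maxSize+1)
	decompressed, err := io.ReadAll(limited)
	if err != nil {
		return nil, err
	}
	if int64(len(decompressed)) > maxSize {
		return nil, fmt.Errorf("decompressed msg too large: (%d) > (%d)", len(decompressed), maxSize)
	}
	return decompressed, nil
}
```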

@StephenButtolph StephenButtolph merged commit 529d7be into dev Apr 5, 2023
14 checks passed
@StephenButtolph StephenButtolph deleted the add-zstd-compression branch April 5, 2023 23:47
@g1mv commented on Apr 6, 2023:

During hypersdk load testing, I found that avalanchego spends the majority of time performing compression-related tasks (zstd should help a ton here):

Showing nodes accounting for 16600ms, 71.55% of 23200ms total
Dropped 486 nodes (cum <= 116ms)
Showing top 10 nodes out of 233
      flat  flat%   sum%        cum   cum%
    3040ms 13.10% 13.10%     7990ms 34.44%  compress/flate.(*compressor).deflate
    2500ms 10.78% 23.88%     3170ms 13.66%  compress/flate.(*decompressor).huffSym
    2160ms  9.31% 33.19%     2160ms  9.31%  runtime.memmove
    1820ms  7.84% 41.03%     1820ms  7.84%  crypto/sha256.block
    1730ms  7.46% 48.49%     1730ms  7.46%  runtime/internal/syscall.Syscall6
    1550ms  6.68% 55.17%     1670ms  7.20%  github.com/golang/snappy.encodeBlock
    1460ms  6.29% 61.47%     2290ms  9.87%  compress/flate.(*compressor).findMatch
     820ms  3.53% 65.00%      820ms  3.53%  compress/flate.matchLen (inline)
     760ms  3.28% 68.28%      760ms  3.28%  compress/flate.(*dictDecoder).writeByte
     760ms  3.28% 71.55%      760ms  3.28%  runtime.memclrNoHeapPointers

Out of interest, do you have a sample of the data you're compressing/decompressing (anything available, even a bulk network dump)? After a peek at the code, it does seem to be mainly network messaging data, if I am not mistaken?

@danlaine (Collaborator, Author) commented on Apr 6, 2023:

> During hypersdk load testing, I found that avalanchego spends the majority of time performing compression-related tasks (zstd should help a ton here): [profile output and question quoted above]

Can't speak to the composition of messages during the test Patrick mentioned, but yes, this compression is only used for P2P messages.

hexfusion pushed a commit to hexfusion/avalanchego that referenced this pull request Apr 11, 2023
Labels
enhancement (New feature or request), networking (This involves networking)
6 participants