
Use pure Go zstd implementation #1162

Closed
narqo opened this issue Dec 16, 2019 · 11 comments
Labels
kind/enhancement Something could be better.

Comments

@narqo

narqo commented Dec 16, 2019

What version of Go are you using (go version)?

$ go version
go version go1.13.4 darwin/amd64

What version of Badger are you using?

latest master

Does this issue reproduce with the latest master?

Yes

As mentioned in #1081 and #1104, it was suggested that BadgerDB switch to a pure Go implementation of zstd (github.com/klauspost/compress) after it went out of beta.

Version 1.9.4 is no longer marked as beta, refer to https://github.com/klauspost/compress/tree/v1.9.4/zstd#status

STABLE - there may always be subtle bugs, a wide variety of content has been tested and the library is actively used by several projects. This library is being continuously fuzz-tested, kindly supplied by fuzzit.dev.

@jarifibrahim jarifibrahim added the kind/enhancement Something could be better. label Dec 18, 2019
@jarifibrahim
Contributor

We're working on this @narqo. I'll try to make this change as soon as possible.

@jarifibrahim
Contributor

Hi @narqo , we've decided to not use the pure go based ZSTD because of performance issues. Please see #1176 (comment) and the benchmarks in #1176 (comment) .

You can build Badger with CGO_ENABLED=0 to use it without CGO, in which case Snappy becomes the default compression algorithm.
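For example, a minimal cgo-free build might look like this (a sketch assuming a standard Go toolchain; with cgo disabled, Badger falls back to Snappy as described above):

```shell
# Build without cgo; Snappy becomes the default compression.
CGO_ENABLED=0 go build ./...

# Cross-compiling also becomes straightforward once cgo is off.
CGO_ENABLED=0 GOOS=linux GOARCH=arm64 go build ./...
```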

@johanbrandhorst

The discussion in #1176 implies to me that this decision should be reconsidered.

@mvdan
Contributor

mvdan commented Jun 18, 2020

I'm also interested in seeing this reopened. I have a package that imports badger/v2, and the indirect dependency on DataDog/zstd is really unfortunate. It takes a long time to build all that C code, and because it's a single package, it means a huge bottleneck in my build time.

@jarifibrahim jarifibrahim reopened this Jun 18, 2020
@jarifibrahim
Contributor

I did some benchmarks against both libraries using the Badger benchmark write tool (https://github.com/dgraph-io/badger/blob/master/badger/cmd/write_bench.go; compression is disabled by default, so you'll have to enable it), and here's what I found:

writing 100 million key-values, approx. 22 GB of data
Datadog/zstd
	Elapsed (wall clock) time (h:mm:ss or m:ss): 6:23.08
Klauspost/compress
	Elapsed (wall clock) time (h:mm:ss or m:ss): 7:50.20

Datadog/zstd logs

(run master with compression enabled)

badger 2020/06/19 18:21:19 INFO: Running for level: 0
badger 2020/06/19 18:21:25 INFO: LOG Compact 0->1, del 11 tables, add 10 tables, took 5.635393792s
badger 2020/06/19 18:21:25 INFO: Compaction for level: 0 DONE
badger 2020/06/19 18:21:25 INFO: Force compaction on level 0 done
2020/06/19 18:21:25 DB.Close. Error: <nil>. Time taken to close: 8.799659465s
	Command being timed: "go run main.go benchmark write -m 100 --dir ./100m -l"
	User time (seconds): 1585.33
	System time (seconds): 46.84
	Percent of CPU this job got: 426%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 6:23.08
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 3015956
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 18
	Minor (reclaiming a frame) page faults: 656889
	Voluntary context switches: 11779130
	Involuntary context switches: 125223
	Swaps: 0
	File system inputs: 2832
	File system outputs: 107017448
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0

Klauspost/compress

(run code in https://github.com/dgraph-io/badger/tree/ibrahim/klauspost-compress with compression enabled)

badger 2020/06/19 18:10:28 INFO: LOG Compact 1->2, del 9 tables, add 9 tables, took 5.11088109s
badger 2020/06/19 18:10:28 INFO: Compaction for level: 1 DONE
badger 2020/06/19 18:10:28 INFO: Got compaction priority: {level:0 score:1.73 dropPrefix:[]}
2020/06/19 18:10:28 DB.Close. Error: <nil>. Time taken to close: 24.191639289s
	Command being timed: "go run main.go benchmark write -m 100 --dir ./100m -l"
	User time (seconds): 3697.65
	System time (seconds): 57.36
	Percent of CPU this job got: 798%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 7:50.20
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 3046388
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 1
	Minor (reclaiming a frame) page faults: 735124
	Voluntary context switches: 15336179
	Involuntary context switches: 600206
	Swaps: 0
	File system inputs: 32
	File system outputs: 114008192
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0

@johanbrandhorst @mvdan I understand that CGO is evil and should not be used, but do you have any strong reasons for switching to a pure Go based library? I'd like to move away from CGO, but the performance difference is what makes me reluctant to make this change. I'd love to know your thoughts.

@mvdan
Contributor

mvdan commented Jun 19, 2020

For those wondering what the code looks like, it's https://github.com/dgraph-io/badger/compare/ibrahim/klauspost-compress. I'll leave it to @klauspost to comment if the use of his pure Go version could be improved or if it's equivalent to the cgo version.

It might be impossible for pure Go compression to beat well-optimized C code, even with the cgo cost of DataDog/zstd. But still, we could use this issue to track progress or reevaluate the situation every now and then. To me, this is a tradeoff: I would gladly take code that is 5% slower but doesn't require cgo and all of its drawbacks, for example. I get that you're seeing a difference larger than a few percent, but I imagine that gap can be made smaller over time.

@klauspost

klauspost commented Jun 19, 2020

A real-world test is great. However, this one doesn't seem too real-world.

You seem to be writing random (incompressible) data. That is a very, very limited test of a compressor, and I assume the data isn't really compressed at all?
The Go zstd will always try to entropy-compress the literals, even if no matches can be found. I will, however, try to make this dynamic, so that in the fastest mode it is applied automatically only where it helps.

For this test you can use the zstd.WithNoEntropyCompression(true) option. For me that doubles the speed with random 4K blocks. I can't remember the semantics used by native zstd.

You should disable CRC. It is disabled by default in DataDog's library, IIRC. That's about a 5% improvement for me.
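Taken together, the two options mentioned above would look something like this with klauspost/compress (a sketch; the import of github.com/klauspost/compress/zstd and the surrounding setup are assumed):

```go
// Encoder tuned for speed on possibly-incompressible blocks.
enc, err := zstd.NewWriter(nil,
	zstd.WithEncoderLevel(zstd.SpeedFastest),
	zstd.WithNoEntropyCompression(true), // skip entropy coding of literals
	zstd.WithEncoderCRC(false),          // match DataDog's default of no checksum
)
if err != nil {
	// handle err
}
compressed := enc.EncodeAll(src, nil)
```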

Honestly, overall, if it is disabled by default I don't really see a problem. It could also at least be used as a fallback: you can use the cgo build tag to enable or disable it.
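The build-tag fallback could be sketched as two files selected by the cgo build constraint (file, package, and function names here are hypothetical, not Badger's actual layout):

```go
// File compress_cgo.go (used when cgo is available):
//go:build cgo

package compress

import "github.com/DataDog/zstd"

// zstdCompress delegates to the C implementation via cgo.
func zstdCompress(dst, src []byte) ([]byte, error) {
	return zstd.CompressLevel(dst, src, 1)
}

// File compress_nocgo.go (pure Go fallback for CGO_ENABLED=0 builds):
//go:build !cgo

package compress

import "github.com/klauspost/compress/zstd"

var encoder, _ = zstd.NewWriter(nil)

// zstdCompress uses the pure Go encoder; EncodeAll appends to dst.
func zstdCompress(dst, src []byte) ([]byte, error) {
	return encoder.EncodeAll(src, dst[:0]), nil
}
```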

Edit: 2x faster random data in fastest mode: klauspost/compress#270

@jarifibrahim
Contributor

I made the changes @klauspost suggested and I see very different results now. I made the following two changes:

  1. The keys being inserted are 32 bytes long and contain integers (not random, prefixed with zeros):
     if err := batch.Set([]byte(fmt.Sprintf("%032d", i)), value); err != nil {
  2. Set WithNoEntropyCompression(true) and EOptions.WithEncoderCRC(false).
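The key scheme in change 1 can be sketched as a small standalone program (makeKey is a hypothetical helper; the benchmark tool calls fmt.Sprintf inline):

```go
package main

import "fmt"

// makeKey builds a 32-byte key from a counter, zero-padded so that
// consecutive keys share a long common prefix and compress well,
// unlike random keys.
func makeKey(i int) []byte {
	return []byte(fmt.Sprintf("%032d", i))
}

func main() {
	k := makeKey(42)
	fmt.Printf("len=%d key=%s\n", len(k), k)
	// prints: len=32 key=00000000000000000000000000000042
}
```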

Here's what I see now

writing 100 million key-values approx 18 GB of data (earlier it was 22 GB)
Datadog/zstd
	Elapsed (wall clock) time (h:mm:ss or m:ss): 6:58.19
Klauspost/compress
	Elapsed (wall clock) time (h:mm:ss or m:ss): 5:40.24

I'm assuming the performance improvement is mostly because the data is no longer random. In real use cases, the data won't be random.

On master

badger 2020/06/20 12:33:35 INFO: Compaction for level: 1 DONE
badger 2020/06/20 12:33:35 INFO: Got compaction priority: {level:0 score:1.73 dropPrefix:[]}
2020/06/20 12:33:35 DB.Close. Error: <nil>. Time taken to close: 1m36.995748667s
	Command being timed: "go run main.go benchmark write -m 100 --dir ./100m -l"
	User time (seconds): 1074.89
	System time (seconds): 24.20
	Percent of CPU this job got: 262%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 6:58.19
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 3437000
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 42
	Minor (reclaiming a frame) page faults: 281705
	Voluntary context switches: 11946311
	Involuntary context switches: 91562
	Swaps: 0
	File system inputs: 4288
	File system outputs: 52720896
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0

On ibrahim/klauspost-compress branch

badger 2020/06/20 12:41:07 INFO: Got compaction priority: {level:0 score:1.73 dropPrefix:[]}
2020/06/20 12:41:07 DB.Close. Error: <nil>. Time taken to close: 12.916605732s
	Command being timed: "go run main.go benchmark write -m 100 --dir ./100m-k -l"
	User time (seconds): 1010.72
	System time (seconds): 20.27
	Percent of CPU this job got: 303%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 5:40.24
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 2568748
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 2
	Minor (reclaiming a frame) page faults: 309196
	Voluntary context switches: 9775613
	Involuntary context switches: 76950
	Swaps: 0
	File system inputs: 56
	File system outputs: 52799208
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0

I think we should switch over to the pure Go based implementation. Badger compactions (which compress data) happen in the background, and they can be affected by multiple factors; the compression algorithm wouldn't be the bottleneck.
@klauspost IIUC, I can decompress the data that was created by DataDog/zstd using klauspost/compress (which means they're compatible), is this correct?

@mvdan I understand your point but, just out of curiosity, why wouldn't you use CGO in your code?
CGO comes with an associated cost, but if there were a significant performance gain, say 2x from using some C++ library, would you still prefer a Go based one? If so, please help me understand why. I'm sorry if this question sounds too trivial; I don't have a lot of experience with CGO.

I also noticed that the number of major (requiring I/O) page faults is much higher in the case of DataDog/zstd. Is this also a side effect of CGO? I would have expected the page faults to be similar in both cases, since they're performing the same kind of operation.

@klauspost

klauspost commented Jun 20, 2020

IIUC, I can decompress the data that was created by DataDog/zstd using klauspost/compress (which means they're compatible), is this correct?

Yes, they are compatible both ways. The only exception is 0 bytes of input, which gives 0 bytes of output with the Go zstd. But you already have the zstd.WithZeroFrames(true) option, which wraps 0 bytes in a header so it can be fed to DD zstd. This will of course only be relevant when downgrading.
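As a sketch, enabling that option on the Go encoder looks like this (import of github.com/klauspost/compress/zstd assumed):

```go
// With WithZeroFrames, encoding empty input still emits a valid
// (empty) zstd frame that the cgo DataDog decoder can read.
enc, _ := zstd.NewWriter(nil, zstd.WithZeroFrames(true))
frame := enc.EncodeAll(nil, nil) // non-empty output even for zero-byte input
```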

I am fuzz-testing the change above. It will have much less compression impact than completely disabling entropy coding, but will handle the random input blocks better. I would probably leave out WithNoEntropyCompression and upgrade to the next version, which will select this automatically when it makes sense.

number of major page fails

The dd library allocates a lot. That could be why. Go zstd does not allocate for 4K blocks after a few runs.

why wouldn't you use CGO in your code?

Compiling is much slower. Cross-compilation is a pain to set up, or impossible for some. Deployment requires shared-library dependencies, whereas plain Go is just the executable. cgo is also inherently less secure, since none of the C code has the security features the Go runtime provides.

@klauspost

FYI, I just released v1.10.10 which will automatically disable entropy coding on likely incompressible data in fastest mode.

@minhaj-shakeel

Github issues have been deprecated.
This issue has been moved to discuss. You can follow the conversation there and also subscribe to updates by changing your notification preferences.
