
Remove mallocs from Compress/Decompress #31

Merged (3 commits) on Mar 30, 2018

Conversation

@valyala (Contributor) commented Mar 21, 2018

Go allocates a copy of slice headers when they are passed to C functions.
See golang/go#24450 for details.

Avoid this by passing unsafe.Pointer as C.uintptr_t into C functions.
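
For illustration, a minimal sketch of this calling convention (the compress_wrapper shim and its signature are hypothetical stand-ins, not the functions changed in this PR): the slice data addresses are passed as C.uintptr_t, so the cgo call has no Go pointer or slice arguments to copy.

```go
package zstdsketch

/*
#include <stddef.h>
#include <stdint.h>

// Hypothetical shim: it takes raw addresses as uintptr_t so the Go call site
// does not pass Go slices into C. A real shim would cast the addresses back to
// pointers and call the actual C compressor (e.g. ZSTD_compress).
static size_t compress_wrapper(uintptr_t dst, size_t dstCap, uintptr_t src, size_t srcLen) {
	(void)dst; (void)dstCap; (void)src; (void)srcLen;
	return 0;
}
*/
import "C"

import (
	"runtime"
	"unsafe"
)

// compress illustrates the calling convention only: no slice headers cross the
// cgo boundary, so no per-call copies of them are allocated.
func compress(dst, src []byte) int {
	n := C.compress_wrapper(
		C.uintptr_t(uintptr(unsafe.Pointer(&dst[0]))), C.size_t(len(dst)),
		C.uintptr_t(uintptr(unsafe.Pointer(&src[0]))), C.size_t(len(src)),
	)
	// Keep the slices reachable until the C call has returned, since the
	// uintptr conversion hides them from the garbage collector.
	runtime.KeepAlive(dst)
	runtime.KeepAlive(src)
	return int(n)
}
```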

Benchmark results:

$ PAYLOAD=ZSTD_LICENSE go test -run=111 -bench='k[CD]' -benchmem -count=10

name             old time/op    new time/op    delta
Compression-4      21.2µs ± 5%    20.8µs ± 6%      ~     (p=0.190 n=10+10)
Decompression-4    7.37µs ± 5%    6.93µs ± 6%    -5.95%  (p=0.001 n=10+10)

name             old speed      new speed      delta
Compression-4    62.5MB/s ± 5%  63.7MB/s ± 6%      ~     (p=0.190 n=10+10)
Decompression-4   180MB/s ± 5%   191MB/s ± 6%    +6.35%  (p=0.001 n=10+10)

name             old alloc/op   new alloc/op   delta
Compression-4       96.0B ± 0%      0.0B       -100.00%  (p=0.000 n=10+10)
Decompression-4     96.0B ± 0%      0.0B       -100.00%  (p=0.000 n=10+10)

name             old allocs/op  new allocs/op  delta
Compression-4        4.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
Decompression-4      4.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
@Viq111 (Collaborator) left a comment

lgtm.
I did some simple tests and benchmarks and it looks good.

I just want to benchmark against bigger payloads over the weekend.
Do you mind making the tiny change from the other comment?

@@ -139,6 +139,8 @@ func BenchmarkStreamCompression(b *testing.B) {
 		if err != nil {
 			b.Fatalf("Failed writing to compress object: %s", err)
 		}
+		// Prevent unbounded buffer growth.
+		intermediate.Reset()
@Viq111 (Collaborator):

I would do the Reset() outside the for loop though, because it skews the results if it's in the loop.

@valyala (Contributor, Author):

The Reset must be in the loop, since otherwise:

  • The performance of the code inside the loop would depend on the value of b.N. This contradicts the benchmarking rules and makes the results non-deterministic. See https://dave.cheney.net/2013/06/30/how-to-write-benchmarks-in-go .
  • The amount of memory required for each benchmark run would be proportional to b.N. This may lead to an out-of-memory error on a system with little memory.

The bytes.Buffer.Reset skews the benchmark results by only a few nanoseconds. That is noise compared with the roughly 20 microseconds spent on the data compression at line 138.
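
A minimal sketch of the loop shape being described, with compressTo as a hypothetical placeholder for the stream-compression step (it is not part of this package's API); the point is only where Reset() sits:

```go
package main

import (
	"bytes"
	"io"
	"testing"
)

// compressTo stands in for the compression call that the benchmark measures.
func compressTo(w io.Writer, payload []byte) error {
	_, err := w.Write(payload)
	return err
}

func BenchmarkStreamCompressionSketch(b *testing.B) {
	payload := bytes.Repeat([]byte("sample payload "), 128)
	var intermediate bytes.Buffer

	b.SetBytes(int64(len(payload)))
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		if err := compressTo(&intermediate, payload); err != nil {
			b.Fatalf("Failed writing to compress object: %s", err)
		}
		// Resetting inside the loop keeps the per-iteration cost independent
		// of b.N and prevents the intermediate buffer from growing without bound.
		intermediate.Reset()
	}
}
```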

@valyala (Contributor, Author) left a comment

> I just want to benchmark against bigger payloads over the weekend.

This makes little sense, since the time required for compressing bigger payloads will be much larger compared with the time spent on the (eliminated) memory allocations. This patch improves performance only for small payloads.

@Viq111 (Collaborator) commented Mar 30, 2018

Yes, sorry, I meant testing on a large payload corpus over the weekend (vs. a single bigger payload).
Tested with about 1.5 GB of our data (average payload size is a few kB) and verified it. The performance difference is pretty negligible for those "big" payloads.
