
Added Ctx compress/decompress #83

Merged · 3 commits · Jun 17, 2020

Conversation

merlimat (Contributor)

Motivation

According to https://facebook.github.io/zstd/zstd_manual.html#Chapter4, when doing repeated compression/decompression operations it's recommended to reuse a context object so that internal state is preserved:

  When compressing many times,
  it is recommended to allocate a context just once,
  and re-use it for each successive compression operation.
  This will make workload friendlier for system's memory.
  Note : re-using context is just a speed / resource optimization.
         It doesn't change the compression ratio, which remains identical.
  Note 2 : In multi-threaded environments,
         use one different context per thread for parallel execution.

While this could be done similarly through the stream API, that doesn't come naturally when compressing a []byte, and it doesn't offer a way to pass a dst buffer for the result.

Modifications

To expose ZSTD_compressCCtx() and ZSTD_decompressDCtx(), this adds a Ctx interface:

type Ctx interface {
	// Compress src into dst.  If you have a buffer to use, you can pass it to
	// prevent allocation.  If it is too small, or if nil is passed, a new buffer
	// will be allocated and returned.
	Compress(dst, src []byte) ([]byte, error)

	// CompressLevel is the same as Compress but you can pass a compression level
	CompressLevel(dst, src []byte, level int) ([]byte, error)

	// Decompress src into dst.  If you have a buffer to use, you can pass it to
	// prevent allocation.  If it is too small, or if nil is passed, a new buffer
	// will be allocated and returned.
	Decompress(dst, src []byte) ([]byte, error)

	io.Closer
}

Example:

ctx := zstd.NewCtx()

out1, err := ctx.Compress(nil, input1)
out2, err := ctx.Compress(nil, input2)
// ...

ctx.Close()
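
For parallel workloads, the manual's Note 2 quoted above suggests one context per thread. A minimal sketch of what that could look like with this API (the payloads, worker count, and error handling here are made up for illustration):

package main

import (
	"log"
	"sync"

	"github.com/DataDog/zstd"
)

func main() {
	// Hypothetical payloads; in practice these come from the application.
	inputs := [][]byte{
		[]byte("first payload"),
		[]byte("second payload"),
		[]byte("third payload"),
	}
	results := make([][]byte, len(inputs))

	const workers = 2
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			// One Ctx per goroutine, per the manual's Note 2; a single
			// context is not meant to be shared across threads.
			ctx := zstd.NewCtx()
			for i := w; i < len(inputs); i += workers {
				out, err := ctx.Compress(nil, inputs[i])
				if err != nil {
					log.Println("compress:", err) // sketch only; real code would propagate this
					continue
				}
				results[i] = out
			}
		}(w)
	}
	wg.Wait()
	log.Printf("compressed %d payloads", len(results))
}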

Microbenchmark

BenchmarkCtxCompression
BenchmarkCtxCompression-16         	     207	   5189899 ns/op	 345.87 MB/s
BenchmarkCtxDecompression
    BenchmarkCtxDecompression: zstd_ctx_test.go:166: Reduced from 1795030 to 119090
    BenchmarkCtxDecompression: zstd_ctx_test.go:166: Reduced from 1795030 to 119090
    BenchmarkCtxDecompression: zstd_ctx_test.go:166: Reduced from 1795030 to 119090
BenchmarkCtxDecompression-16       	    1548	    679185 ns/op	2642.92 MB/s
BenchmarkStreamCompression
BenchmarkStreamCompression-16      	    5766	    230298 ns/op	7794.38 MB/s
BenchmarkStreamDecompression
BenchmarkStreamDecompression-16    	    1048	   1578035 ns/op	1137.51 MB/s
BenchmarkCompression
BenchmarkCompression-16            	     150	   7384646 ns/op	 243.08 MB/s
BenchmarkDecompression
    BenchmarkDecompression: zstd_test.go:189: Reduced from 1795030 to 119090
    BenchmarkDecompression: zstd_test.go:189: Reduced from 1795030 to 119090
    BenchmarkDecompression: zstd_test.go:189: Reduced from 1795030 to 119090
BenchmarkDecompression-16          	    1357	    783583 ns/op	2290.80 MB/s

The benchmark shows that reusing the context improves the overall performance:

  • Compression : 243 MB/s --> 345 MB/s
  • Decompression: 2290 MB/s --> 2642 MB/s
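
For reference, a minimal sketch of how such a context-reuse benchmark could be shaped. This is not the code in zstd_ctx_test.go, just the general pattern with a made-up payload and a hypothetical benchmark name; it also reuses the dst buffer so steady-state iterations avoid allocations:

package zstd_test

import (
	"bytes"
	"testing"

	"github.com/DataDog/zstd"
)

func BenchmarkCtxCompressionSketch(b *testing.B) {
	// Made-up, compressible payload standing in for the real test fixture.
	payload := bytes.Repeat([]byte("some fairly compressible input "), 1<<15)

	ctx := zstd.NewCtx()
	dst, err := ctx.Compress(nil, payload) // size dst once; later calls reuse it
	if err != nil {
		b.Fatal(err)
	}

	b.SetBytes(int64(len(payload)))
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		if dst, err = ctx.Compress(dst, payload); err != nil {
			b.Fatal(err)
		}
	}
}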

@Viq111 (Collaborator) left a comment

Thanks for your contribution @merlimat!
This looks good, I just added some comments around SetFinalizer vs a Close pattern.

Thanks as well for adding the tests and benchmarks; they really highlight the usefulness of this addition.

zstd_ctx.go (outdated)
}

func (c *ctx) Close() error {
	if err := getError(int(C.ZSTD_freeCCtx(c.cctx))); err != nil {
Viq111 (Collaborator)

I think we should probably try to free both and return the first error if any:

err1 := getError(int(C.ZSTD_freeCCtx(c.cctx)))
err2 := getError(int(C.ZSTD_freeDCtx(c.dctx)))
if err1 != nil {
  return err1
}
return err2

That way you don't prevent the second context from being freed if the first free fails.

This should also probably be gated by a boolean so we don't call freeCCtx twice on the same pointer.

As pointed out earlier, though, I think a finalizer might be a better fit in this particular case: all we do is free memory, it's more user-friendly, it's not the end of the world if the finalizer is never called, and you're sure it's only called once.
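
For context, here's the rough shape of the SetFinalizer approach being discussed. This is only a sketch, with hypothetical type and function names; the real cgo fields and calls are left as comments since they can't be included in a self-contained snippet:

package zstdsketch

import "runtime"

// ctx mirrors the shape of the wrapper type above; the real struct holds
// *C.ZSTD_CCtx / *C.ZSTD_DCtx pointers instead of these placeholders.
type ctx struct {
	cctx uintptr // placeholder for *C.ZSTD_CCtx
	dctx uintptr // placeholder for *C.ZSTD_DCtx
}

// newCtx allocates the underlying contexts and registers a finalizer so the
// C memory is released when the Go value becomes unreachable.
func newCtx() *ctx {
	c := &ctx{
		// cctx: C.ZSTD_createCCtx(),
		// dctx: C.ZSTD_createDCtx(),
	}
	runtime.SetFinalizer(c, finalizeCtx)
	return c
}

// The runtime runs a finalizer at most once per object, which sidesteps the
// double-free concern; a boolean guard is only needed if an explicit Close
// is kept alongside the finalizer.
func finalizeCtx(c *ctx) {
	// C.ZSTD_freeCCtx(c.cctx)
	// C.ZSTD_freeDCtx(c.dctx)
}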

merlimat (Contributor, Author)

The only thing with the finalizer function is that I don't know what to do with the error codes, since we're not bubbling them up. Is there any convention on how to log errors?

Viq111 (Collaborator)

Yes, that's a good point.
Usually what I've seen in the Go ecosystem is that libraries use the standard log package (examples: zookeeper, sarama).
But most of the time the main program uses another logging system (logrus, zap), so it's hard to provide standard logs.

For the current case the error is not very actionable (if the free fails), so I think we are OK not logging it.
We'll need to revisit this, though, if we ever need to add additional logging.

merlimat and others added 2 commits June 16, 2020 12:55
Co-authored-by: Vianney Tran <vianney.tran@datadoghq.com>
merlimat (Contributor, Author)

@Viq111 Thanks for the feedback, I've changed it to use a finalizer.

@Viq111 (Collaborator) left a comment

Thanks for your contribution, this looks great!


Viq111 commented Jun 16, 2020

I'll merge it tomorrow morning EST (since it's nearing the end of the business day now).

The CircleCI run gives the following benchmark results:
https://app.circleci.com/pipelines/github/DataDog/zstd/38/workflows/16e654d5-bced-4e74-af20-a20fd83e05c9/jobs/122/steps

BenchmarkCtxCompression-36         	      10	 147802605 ns/op	  67.46 MB/s
BenchmarkCtxDecompression-36       	     100	  15460314 ns/op	 644.91 MB/s
--- BENCH: BenchmarkCtxDecompression-36
    zstd_ctx_test.go:158: Reduced from 9970564 to 3402985
    zstd_ctx_test.go:158: Reduced from 9970564 to 3402985
BenchmarkStreamCompression-36      	      10	 154451392 ns/op	  64.55 MB/s
BenchmarkStreamDecompression-36    	     100	  19827954 ns/op	 502.85 MB/s
BenchmarkCompression-36            	      10	 143375341 ns/op	  69.54 MB/s
BenchmarkDecompression-36          	     100	  16590654 ns/op	 600.97 MB/s
--- BENCH: BenchmarkDecompression-36
    zstd_test.go:189: Reduced from 9970564 to 3402985
    zstd_test.go:189: Reduced from 9970564 to 3402985
PASS

So we still see the improvement, but not as much, since mr is a big payload.

For a small payload (I took zstd.go) on my machine:

ᐅ go test -bench . -run None
goos: darwin
goarch: amd64
pkg: github.com/DataDog/zstd
BenchmarkCtxCompression-16         	   30886	     38165 ns/op	 110.84 MB/s
BenchmarkCtxDecompression-16       	  158637	      7432 ns/op	 569.15 MB/s
BenchmarkCompression-16            	   31852	     40295 ns/op	 104.98 MB/s
BenchmarkDecompression-16          	  160329	      7761 ns/op	 545.05 MB/s
