
Reusing/resetting a brotli.Writer instance #1132

Open
ankon opened this issue Feb 12, 2024 · 2 comments

Comments

ankon commented Feb 12, 2024

We saw good performance improvements in our Go program when using a sync.Pool to cache and reset gzip writers rather than creating them anew each time.

I was wondering whether the same would be true for the brotli writer, too, but that one doesn't seem to have a trivial "reset this" method. On the underlying C level it looks like an encoder state can be "created"/"destroyed", but there are functions to "initialize" and "cleanup" as well.

Before I try to hack this together: Do you think that pooling/reusing of the encoder state could have a measurable/visible performance impact?

eustas (Collaborator) commented Feb 12, 2024

Initialize and cleanup are sort-of internal. Could you point me to the changes regarding gzip, so I can investigate the source of the performance improvements?
I would assume that performance is impacted by JNI roundtrips. If that is true, then we could think of something to reuse/refurbish instances.

ankon (Author) commented Feb 13, 2024

So for gzip, a lot of places [1], [2] suggest this approach (pseudo-Go here):

```go
writerPool := &sync.Pool{
	New: func() any {
		// gzip.NewWriterLevel only fails for an invalid level.
		writer, _ := gzip.NewWriterLevel(nil, g.Level)
		return writer
	},
}

// later, when wrapping an existing output writer
var w io.Writer
// Grab a possibly pooled gzip writer
writer := writerPool.Get().(*gzip.Writer)
// Reset its internal state and point it at w
writer.Reset(w)
// ... write and Close(), then return it for reuse:
// writerPool.Put(writer)
```

Running some benchmarks of our own (see [1] for an idea) showed that the number of allocations went down quite a bit, and I think that is because internal caches and buffers get reused.
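For illustration, here is a minimal, self-contained sketch of that allocation comparison using only the standard library's gzip. The payload, level, and the use of testing.AllocsPerRun as the measuring tool are my assumptions for the sketch, not what we run in production:

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"sync"
	"testing"
)

// allocsComparison measures allocations per compression for a fresh
// gzip.Writer versus a pooled one that is Reset between uses.
func allocsComparison() (fresh, pooled float64) {
	payload := bytes.Repeat([]byte("some moderately sized payload "), 64)

	// Fresh writer per compression: the deflate state is
	// re-allocated every time.
	fresh = testing.AllocsPerRun(50, func() {
		w, _ := gzip.NewWriterLevel(io.Discard, gzip.BestSpeed)
		w.Write(payload)
		w.Close()
	})

	// Pooled writer: Reset retargets the writer and reuses its
	// internal buffers.
	pool := &sync.Pool{
		New: func() any {
			w, _ := gzip.NewWriterLevel(nil, gzip.BestSpeed)
			return w
		},
	}
	pooled = testing.AllocsPerRun(50, func() {
		w := pool.Get().(*gzip.Writer)
		w.Reset(io.Discard)
		w.Write(payload)
		w.Close()
		pool.Put(w)
	})
	return fresh, pooled
}

func main() {
	fresh, pooled := allocsComparison()
	fmt.Printf("fresh: %.0f allocs/op, pooled: %.0f allocs/op\n", fresh, pooled)
}
```

The pooled path should report far fewer allocations per operation, since the expensive deflate compressor state survives across uses.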

See also the documentation of gzip.Writer's Reset method.
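To make the Reset semantics concrete, a small sketch (stdlib gzip only; the sample messages are arbitrary) showing one writer retargeted at a fresh buffer per message, with each output still decompressing correctly:

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// compressDecompress reuses one gzip.Writer for every message,
// calling Reset to retarget it at a fresh buffer each time, then
// decompresses each result to show the output is still valid.
func compressDecompress(msgs []string) []string {
	w, _ := gzip.NewWriterLevel(nil, gzip.BestCompression)
	out := make([]string, 0, len(msgs))
	for _, msg := range msgs {
		var buf bytes.Buffer
		w.Reset(&buf) // discard old state, keep internal buffers
		io.WriteString(w, msg)
		w.Close()

		r, _ := gzip.NewReader(&buf)
		b, _ := io.ReadAll(r)
		r.Close()
		out = append(out, string(b))
	}
	return out
}

func main() {
	for _, s := range compressDecompress([]string{"first payload", "second payload"}) {
		fmt.Println(s)
	}
}
```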
