
[Performance Improvement] Pool gzip.Writer and bytes.Buffer #19

Merged
merged 6 commits on Mar 14, 2024

Conversation

@pakio (Contributor) commented Feb 28, 2024

Summary

Added sync.Pool to pool gzip.Writer and bytes.Buffer for better performance.

Background

While investigating a profile report of my application, I noticed that the transport client does not pool the gzip writer and buffer.
As discussed in this Reddit thread and many other places, pooling them lets us reuse already-allocated memory rather than allocating and freeing it on every request.
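
For illustration, the pattern being referred to is roughly the sketch below. It shows the general sync.Pool technique, not this PR's actual code; the identifiers bufPool, gzipPool and compressBody are made up for the example.

```go
// Minimal sketch of pooling a bytes.Buffer and a gzip.Writer with sync.Pool.
// Illustrative only; names and structure are not taken from this PR.
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"strings"
	"sync"
)

// bufPool reuses output buffers across requests.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// gzipPool reuses gzip writers; each writer is Reset onto a real buffer
// before use, so the initial destination is just a placeholder.
var gzipPool = sync.Pool{
	New: func() any { return gzip.NewWriter(io.Discard) },
}

// compressBody gzips the request body using pooled objects. The caller is
// responsible for returning buf to bufPool once the bytes have been sent.
func compressBody(body io.Reader) (*bytes.Buffer, error) {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()

	zw := gzipPool.Get().(*gzip.Writer)
	zw.Reset(buf) // point the reused writer at the reused buffer
	defer gzipPool.Put(zw)

	if _, err := io.Copy(zw, body); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf, nil
}

func main() {
	buf, err := compressBody(strings.NewReader(`{"query":{"match_all":{}}}`))
	if err != nil {
		panic(err)
	}
	fmt.Println("compressed size:", buf.Len())
	bufPool.Put(buf) // hand the buffer back after the request body has been consumed
}
```

Reset reuses the gzip writer's internal state and the buffer's backing array, which is where the allocation savings in the benchmark below come from.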

Performance Comparison

These are the benchmark results on my machine (Intel(R) Core(TM) i5-8259U CPU @ 2.30GHz).

Compress body

                          ns/op    B/op     allocs/op
Before                    129744   815064   32
After (without pooling)   117462   815014   31
After (with pooling)       20462     1152   12

bench result

Running tool: /usr/local/bin/go test -benchmem -run=^$ -bench ^BenchmarkTransport$ github.com/elastic/elastic-transport-go/v8/elastictransport

goos: darwin
goarch: amd64
pkg: github.com/elastic/elastic-transport-go/v8/elastictransport
cpu: Intel(R) Core(TM) i5-8259U CPU @ 2.30GHz

=== RUN   BenchmarkTransport/Compress_body_(pool:_false)
BenchmarkTransport/Compress_body_(pool:_false)
BenchmarkTransport/Compress_body_(pool:_false)-8                    9450            117462 ns/op          815014 B/op         31 allocs/op
=== RUN   BenchmarkTransport/Compress_body_(pool:_true)
BenchmarkTransport/Compress_body_(pool:_true)
BenchmarkTransport/Compress_body_(pool:_true)-8                    57690             20462 ns/op            1152 B/op         12 allocs/op

====================
=== before the change ===
====================

=== RUN   BenchmarkTransport/Compress_body
BenchmarkTransport/Compress_body
BenchmarkTransport/Compress_body-8                    9697            129744 ns/op          815064 B/op         32 allocs/op

@pakio changed the title from "[Performance Improvement] Pool and reuse gzip.Writer and bytes.Buffer" to "[Performance Improvement] Pool gzip.Writer and bytes.Buffer" on Feb 29, 2024
@pakio (Contributor, Author) commented Feb 29, 2024

Hi @Anaethelion, would you mind reviewing this PR whenever you are available?
I've been using this library via go-elasticsearch in production, and this change would really help our application run more stably in a high-traffic environment.

@Anaethelion self-assigned this on Feb 29, 2024
@Anaethelion self-requested a review on March 7, 2024 11:38
@Anaethelion (Contributor)

Hi @pakio

I've taken some time to review and evaluate this PR.
While this seems like a worthy improvement for some use cases, it turns out that this could hurt a lot of others and create contention around the pool, which could lead to OOM.

I'm not going to merge this in its current state; however, I'm interested to know if you have more examples of real-world use cases beyond the benchmark. How does it help your application be more stable? What kind of usage do you have where this helps? Search or bulk indexing?

We still have the option of maintaining the current behavior and adding the pool as an optional behavior.

@pakio (Contributor, Author) commented Mar 10, 2024

Hi @Anaethelion, thanks for your feedback!

While this seems like a worthy improvement for some use cases, it turns out that this could hurt a lot of others and create contention around the pool, which could lead to OOM.

Understood. Just to clarify, does this mean you've actually experienced OOM with sync.Pool or with this implementation, or is this a general concern?

however, I'm interested to know if you have more examples of real-world use cases beyond the benchmark

Sure, here are some examples of sync.Pool being used together with gzip.NewWriter:

grpc-go: link
prometheus golang client: link

Our team has also been using this pattern on our platform for more than a year, and it has been operating without any problems.

How does it help your application be more stable? What kind of usage do you have where this helps? Search or bulk indexing?

Our team uses go-elasticsearch for both a data indexing pipeline and a search application.
While exploring the CPU time profile of our search application, I found that, interestingly, more than 5% of the time was spent in elastictransport.Perform, specifically allocating memory for copying bytes to buffers. The size of the requests we send is generally around 3~5KB. Unfortunately we don't have a profile for the indexing pipeline, but it uses esutil.BulkIndexer to batch-send requests with a threshold of 10MB, so I suspect the impact there would be larger.

We still have the option of maintaining the current behavior and adding the pool as an optional behavior.

I'm happy to make this change, as well as add a more practical test case using an actual JSON file, but first I would like to understand your concern regarding OOM when using sync.Pool so that I can properly address it.

@Anaethelion (Contributor)

Understood. Just to clarify, does this mean you've actually experienced OOM with sync.Pool or with this implementation, or is this a general concern?

This is just a general concern around edge cases when pushing the BulkIndexer to its limits; I have a feeling this could lead to memory exhaustion if the machine that runs the indexing tries to do other tasks and the pool has grown too big.
My point is, it could break existing usages and I'd rather not do that. :)

Sure, here are some examples of sync.Pool being used together with gzip.NewWriter:

Thank you!

Our team uses go-elasticsearch for both a data indexing pipeline and a search application. While exploring the CPU time profile of our search application, I found that, interestingly, more than 5% of the time was spent in elastictransport.Perform, specifically allocating memory for copying bytes to buffers. The size of the requests we send is generally around 3~5KB. Unfortunately we don't have a profile for the indexing pipeline, but it uses esutil.BulkIndexer to batch-send requests with a threshold of 10MB, so I suspect the impact there would be larger.

Got you. It matches what I've benchmarked locally as well. The greater the payload, the greater the gain, which makes sense. At the same time, for search-heavy use cases I've found this could slow down the client depending on the compression that has been set up.

I'm happy to make this change, as well as add a more practical test case using an actual JSON file, but first I would like to understand your concern regarding OOM when using sync.Pool so that I can properly address it.

That concern only led me to consider making this more pluggable, at least at first; then, in time or for a new major version, we can flip the switch and make it the default. But as much as we can, I'd like not to break existing behavior.

@pakio (Contributor, Author) commented Mar 12, 2024

I have a feeling this could lead to memory exhaustion if the machine that runs the indexing tries to do other tasks and the pool has grown too big.

That sounds like a fair concern.
I will update the PR soon to make it optional, and I'll ping you when it's ready!

@pakio (Contributor, Author) commented Mar 12, 2024

I've made a few changes to my implementation:

  • Moved the gzip compression logic to a separate file
  • Added a gzipCompressor interface, with simpleGzipCompressor and pooledGzipCompressor implementing it (see the sketch after this list)
  • Added Config.PoolCompressor to determine which compressor the client uses
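
To make the shape of that design concrete, here is a rough sketch. The names gzipCompressor, simpleGzipCompressor, pooledGzipCompressor and Config.PoolCompressor come from the list above; the method bodies, the package name and the newGzipCompressor constructor are illustrative assumptions, not the PR's actual code.

```go
// Sketch of the pluggable compressor design described above (assumed shapes).
package gzippool

import (
	"bytes"
	"compress/gzip"
	"io"
	"sync"
)

// gzipCompressor abstracts over the pooled and non-pooled implementations.
type gzipCompressor interface {
	compress(body io.Reader) (*bytes.Buffer, error)
}

// simpleGzipCompressor keeps the original behavior: a fresh buffer and
// writer are allocated for every request.
type simpleGzipCompressor struct{}

func (simpleGzipCompressor) compress(body io.Reader) (*bytes.Buffer, error) {
	buf := new(bytes.Buffer)
	zw := gzip.NewWriter(buf)
	if _, err := io.Copy(zw, body); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf, nil
}

// pooledGzipCompressor reuses buffers and writers via sync.Pool.
type pooledGzipCompressor struct {
	bufPool    sync.Pool
	writerPool sync.Pool
}

func newPooledGzipCompressor() *pooledGzipCompressor {
	c := &pooledGzipCompressor{}
	c.bufPool.New = func() any { return new(bytes.Buffer) }
	c.writerPool.New = func() any { return gzip.NewWriter(io.Discard) }
	return c
}

func (c *pooledGzipCompressor) compress(body io.Reader) (*bytes.Buffer, error) {
	buf := c.bufPool.Get().(*bytes.Buffer)
	buf.Reset()

	zw := c.writerPool.Get().(*gzip.Writer)
	zw.Reset(buf) // point the reused writer at the reused buffer
	defer c.writerPool.Put(zw)

	if _, err := io.Copy(zw, body); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	// In the real client the buffer would be handed back to bufPool once the
	// compressed request body has been sent.
	return buf, nil
}

// newGzipCompressor picks an implementation based on the config flag,
// keeping today's allocation behavior as the default.
func newGzipCompressor(poolCompressor bool) gzipCompressor {
	if poolCompressor {
		return newPooledGzipCompressor()
	}
	return simpleGzipCompressor{}
}
```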

I also found two bugs in the benchmark: 1. the reader was always at the end of the cursor except on the first loop, and 2. tp was recreated on every iteration, which means the pool was not utilized properly. I fixed both (roughly as sketched below) and updated the benchmark results in the description.
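
For reference, the corrected benchmark shape would look roughly like this. It is a hedged sketch, not the repository's actual benchmark: nopTransport is a made-up stub round tripper, the Config field values are only an assumption about the public API, and the commented-out PoolCompressor line stands in for the option described above.

```go
// Sketch of a benchmark loop with the two fixes applied (assumed, not the
// repository's actual benchmark code).
package elastictransport_test

import (
	"bytes"
	"io"
	"net/http"
	"net/url"
	"strings"
	"testing"

	"github.com/elastic/elastic-transport-go/v8/elastictransport"
)

// nopTransport is a stub round tripper so the benchmark never touches the network.
type nopTransport struct{}

func (nopTransport) RoundTrip(*http.Request) (*http.Response, error) {
	return &http.Response{
		StatusCode: http.StatusOK,
		Body:       io.NopCloser(strings.NewReader("{}")),
	}, nil
}

func BenchmarkCompressBody(b *testing.B) {
	payload := bytes.Repeat([]byte(`{"field":"value"}`), 1000)
	body := bytes.NewReader(payload)

	u, _ := url.Parse("http://localhost:9200")

	// Fix 2: create the transport once, outside the loop, so the pool is
	// actually reused across iterations.
	tp, err := elastictransport.New(elastictransport.Config{
		URLs:                []*url.URL{u},
		Transport:           nopTransport{},
		CompressRequestBody: true,
		// PoolCompressor:   true, // the option added in this PR
	})
	if err != nil {
		b.Fatal(err)
	}

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		// Fix 1: rewind the reader, otherwise every iteration after the
		// first sends an already-consumed (empty) body.
		body.Seek(0, io.SeekStart)

		req, _ := http.NewRequest(http.MethodPost, "/_search", body)
		res, err := tp.Perform(req)
		if err != nil {
			b.Fatal(err)
		}
		res.Body.Close()
	}
}
```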

ptal when you get a chance! @Anaethelion

@Anaethelion (Contributor) left a comment

LGTM, thank you @pakio for the great addition!
