sync: Pool example suggests incorrect usage #23199
Comments
I should also note that if #22950 is done, then usages like this will cause large buffers to be pinned forever, since this example has a steady state of …
Here's an even worse situation than earlier (suggested by @bcmills):

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"sync"
	"time"
)

func main() {
	pool := sync.Pool{New: func() interface{} { return new(bytes.Buffer) }}

	processRequest := func(size int) {
		b := pool.Get().(*bytes.Buffer)
		time.Sleep(500 * time.Millisecond) // Simulate processing time
		b.Grow(size)
		pool.Put(b)
		time.Sleep(1 * time.Millisecond) // Simulate idle time
	}

	// Simulate a steady stream of infrequent large requests.
	go func() {
		for {
			processRequest(1 << 28) // 256MiB
		}
	}()

	// Simulate a storm of small requests.
	for i := 0; i < 1000; i++ {
		go func() {
			for {
				processRequest(1 << 10) // 1KiB
			}
		}()
	}

	// Continually run a GC and track the allocated bytes.
	var stats runtime.MemStats
	for i := 0; ; i++ {
		runtime.ReadMemStats(&stats)
		fmt.Printf("Cycle %d: %dB\n", i, stats.Alloc)
		time.Sleep(time.Second)
		runtime.GC()
	}
}
```

Rather than a single one-off large request, let there be a steady stream of occasional large requests intermixed with a large number of small requests. As this snippet runs, the heap keeps growing over time. The large requests are "poisoning" the pool such that most of the small requests eventually pin a large-capacity buffer under the hood.
Yikes. My goal in adding the example was to try to show the easiest-to-understand use case for a Pool.
The solution, of course, is to put only buffers with small byte slices back into the pool:

```go
if b.Cap() <= 1<<12 { // 4KiB; don't recycle buffers that have grown large
	pool.Put(b)
}
```
Alternatively, you could use an array of …
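The suggestion above is cut off, but one common variant of that idea (my reading, not necessarily what the commenter meant) is to keep a small set of pools bucketed by size class, so that every element within a given pool has the same cost. A rough sketch, with names and size classes made up for illustration:

```go
package main

import (
	"fmt"
	"sync"
)

// Hypothetical size classes: each pool only ever holds buffers of one
// capacity, so all elements within a pool cost the same.
var sizeClasses = []int{1 << 10, 1 << 12, 1 << 16}

var pools = func() []*sync.Pool {
	ps := make([]*sync.Pool, len(sizeClasses))
	for i, size := range sizeClasses {
		size := size // capture the per-iteration value for the closure
		ps[i] = &sync.Pool{New: func() interface{} { return make([]byte, size) }}
	}
	return ps
}()

// getBuf returns a buffer of at least n bytes from the smallest fitting class,
// falling back to a plain allocation for oversized requests.
func getBuf(n int) []byte {
	for i, size := range sizeClasses {
		if n <= size {
			return pools[i].Get().([]byte)[:n]
		}
	}
	return make([]byte, n) // too big for any class; let the GC handle it
}

// putBuf recycles a buffer only if its capacity exactly matches a class,
// dropping anything else on the floor.
func putBuf(b []byte) {
	for i, size := range sizeClasses {
		if cap(b) == size {
			pools[i].Put(b[:size])
			return
		}
	}
}

func main() {
	b := getBuf(1500) // served from the 4KiB class
	fmt.Println(len(b), cap(b))
	putBuf(b)
}
```

getBuf rounds a request up to the smallest fitting class, and putBuf only recycles buffers whose capacity matches a class exactly, so no single pool ever accumulates elements of mixed cost.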
There are many possible solutions: the important thing is to apply one of them. A related problem can arise with goroutine stacks in conjunction with "worker pools", depending on when and how often the runtime reclaims large stacks. (IIRC that has changed several times over the lifetime of the Go runtime, so I'm not sure what the current behavior is.) If you have a pool of worker goroutines executing callbacks that can vary significantly in stack usage, you can end up with all of the workers consuming very large stacks even if the overall fraction of large tasks remains very low.
Do you have any suggestions for better use cases we could include in the example that are reasonably compact?

Maybe the solution is not to recommend a sync.Pool at all anymore? That's my understanding from a comment I read about how the GC makes it more or less useless.
Would changing the example to use an array (fixed size) rather than a slice solve this problem?
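A fixed-size element would indeed make every item in the pool cost the same, at the price of a hard upper bound on what the pool can serve. A minimal sketch of that approach (the 64KiB bound and the names are assumptions for illustration, not from the thread):

```go
package main

import (
	"fmt"
	"sync"
)

const bufSize = 64 << 10 // 64KiB: an arbitrary fixed element size

// Every element is a *[bufSize]byte, so each pool entry pins exactly the same
// amount of memory no matter which request used it last.
var arrayPool = sync.Pool{
	New: func() interface{} { return new([bufSize]byte) },
}

func handle(payload []byte) {
	if len(payload) > bufSize {
		// Oversized requests get a one-off allocation that the GC reclaims;
		// they never touch the pool, so they cannot poison it.
		tmp := make([]byte, len(payload))
		copy(tmp, payload)
		fmt.Println("processed", len(tmp), "bytes (unpooled)")
		return
	}

	buf := arrayPool.Get().(*[bufSize]byte)
	defer arrayPool.Put(buf)

	n := copy(buf[:], payload)
	fmt.Println("processed", n, "bytes (pooled)")
}

func main() {
	handle([]byte("hello"))
}
```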
I legitimately don't think there ever was a time to generally recommend sync.Pool.

Sorry to interject randomly, but I saw this thread on Twitter and have strong opinions on this feature.
We would certainly like to get to this point, and the GC has improved a lot, but for high-churn allocations with obvious lifetimes and no need for zeroing, …
That's clearly true, but even right now it's partly by chance that these examples are eventually dropping the large buffers. And in the more realistic stochastic-mix example, it's not clear to me that #22950 would make it any better or worse. I agree with @dsnet's original point that we should document that …
We've used sync.Pool for dealing with network packets, and others do too (such as lucas-clemente/quic-go), because for those use cases you gain performance. However, in those cases … We did the same even for structs, for example when parsing packets into a struct.
I guess a minimal example of such a usage would be:

```go
package main

import (
	"net"
	"sync"
)

const MaxPacketSize = 4096

var bufPool = sync.Pool{
	New: func() interface{} {
		return make([]byte, MaxPacketSize)
	},
}

func process(outChan chan []byte) {
	for data := range outChan {
		// process data

		// Re-slice to maximum capacity and return it
		// for re-use. This is important to guarantee that
		// all calls to Get() will return a buffer of
		// length MaxPacketSize.
		bufPool.Put(data[:MaxPacketSize])
	}
}

func reader(conn net.PacketConn, outChan chan []byte) {
	for {
		data := bufPool.Get().([]byte)
		n, _, err := conn.ReadFrom(data)
		if err != nil {
			break
		}
		outChan <- data[:n]
	}
	close(outChan)
}

func main() {
	N := 3
	var wg sync.WaitGroup
	outChan := make(chan []byte, N)

	wg.Add(N)
	for i := 0; i < N; i++ {
		go func() {
			process(outChan)
			wg.Done()
		}()
	}

	wg.Add(1)
	conn, err := net.ListenPacket("udp", "localhost:10001")
	if err != nil {
		panic(err.Error())
	}
	go func() {
		reader(conn, outChan)
		wg.Done()
	}()

	wg.Wait()
}
```

But of course, whether this is going to be faster than relying on the GC depends on how many packets per second you have to deal with, what exactly you do with the data, and so on. In the real world you'd benchmark the GC and sync.Pool variants and compare the two. At the time we wrote our code there was significant time spent allocating new buffers, and using a scheme like the above we managed to increase the throughput. Of course, one should re-benchmark this with every update to the GC.
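To make the benchmarking advice concrete, a comparison could look roughly like this sketch (the names are placeholders of mine; real results depend entirely on the workload):

```go
package bufpool_test

import (
	"sync"
	"testing"
)

const maxPacketSize = 4096

var pool = sync.Pool{
	New: func() interface{} { return make([]byte, maxPacketSize) },
}

var sink []byte

// BenchmarkAlloc allocates a fresh buffer per "packet" and lets the GC reclaim it.
func BenchmarkAlloc(b *testing.B) {
	for i := 0; i < b.N; i++ {
		buf := make([]byte, maxPacketSize)
		buf[0] = 1
		sink = buf // keep the buffer alive so the allocation is not elided
	}
}

// BenchmarkPool reuses buffers through a sync.Pool.
func BenchmarkPool(b *testing.B) {
	for i := 0; i < b.N; i++ {
		buf := pool.Get().([]byte)
		buf[0] = 1
		pool.Put(buf[:maxPacketSize])
	}
}
```

Run with `go test -bench=. -benchmem` to compare both time and allocations per operation.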
Change https://golang.org/cl/136035 mentions this issue: |
Change https://golang.org/cl/136115 mentions this issue: |
Change https://golang.org/cl/136116 mentions this issue: |
The current usage of sync.Pool is leaky because it stores an arbitrary sized buffer into the pool. However, sync.Pool assumes that all items in the pool are interchangeable from a memory cost perspective. Due to the unbounded size of a buffer that may be added, it is possible for the pool to eventually pin arbitrarily large amounts of memory in a live-lock situation.

As a simple fix, we just set a maximum size that we permit back into the pool. We do not need to fix the use of a sync.Pool in scan.go since the free method has always enforced a maximum capacity since the first commit of the scan logic.

Fixes #27740
Updates #23199

Change-Id: I875278f7dba42625405df36df3e9b028252ce5e3
Reviewed-on: https://go-review.googlesource.com/136116
Reviewed-by: Bryan C. Mills <bcmills@google.com>
Run-TryBot: Bryan C. Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
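For illustration, the pattern this commit describes (enforcing a maximum capacity in a free method before returning an object to the pool) looks roughly like the sketch below; the type, threshold, and names are placeholders, not the actual fmt code:

```go
package main

import "sync"

const maxRecycledCap = 64 << 10 // 64KiB: an assumed threshold, not fmt's actual value

type printer struct {
	buf []byte
}

var printerPool = sync.Pool{
	New: func() interface{} { return new(printer) },
}

func newPrinter() *printer {
	p := printerPool.Get().(*printer)
	p.buf = p.buf[:0] // reuse the backing array, discard old contents
	return p
}

// free returns the printer to the pool unless its buffer has grown too large,
// in which case the whole object is left for the GC so the pool never pins
// an arbitrarily large allocation.
func (p *printer) free() {
	if cap(p.buf) > maxRecycledCap {
		return
	}
	printerPool.Put(p)
}

func main() {
	p := newPrinter()
	p.buf = append(p.buf, "hello"...)
	p.free()
}
```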
We (HelloFresh) hit a similar issue to this with one of our services, where a number of large logs using …
I added the example here and frankly I think the best course of action is to remove it. I'm not happy with the number of errors that have been reported, and this class of error is particularly difficult/frustrating to debug. I think you are right that we should make …
The operation of sync.Pool assumes that the memory cost of each element is approximately the same in order to be efficient. This property can be seen by the fact that Pool.Get returns you a random element, and not the one that has "the greatest capacity" or whatnot. In other words, from the perspective of the Pool, all elements are more or less the same.

However, the Pool example stores bytes.Buffer objects, which have an underlying []byte of varying capacity depending on how much of the buffer is actually used.

Dynamically growing an unbounded buffer can cause a large amount of memory to be pinned and never be freed in a live-lock situation. Consider the following:
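The snippet referred to here is not shown; a sketch consistent with the description that follows (an initial burst of 256MiB requests totalling 2.5GiB, then a stream of roughly 1KiB requests, with allocation stats printed every GC cycle) would be something like the code below. The exact goroutine counts and sleeps are assumptions.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"sync"
	"time"
)

func main() {
	pool := sync.Pool{New: func() interface{} { return new(bytes.Buffer) }}

	processRequest := func(size int) {
		b := pool.Get().(*bytes.Buffer)
		time.Sleep(500 * time.Millisecond) // Simulate processing time
		b.Grow(size)
		pool.Put(b)
	}

	// An initial burst of large requests: 10 x 256MiB = 2.5GiB.
	for i := 0; i < 10; i++ {
		go processRequest(1 << 28)
	}
	time.Sleep(time.Second) // Let the large buffers land in the pool.

	// A steady stream of small requests that keep recycling pool entries.
	for i := 0; i < 10; i++ {
		go func() {
			for {
				processRequest(1 << 10) // 1KiB
			}
		}()
	}

	// Continually run a GC and track the allocated bytes.
	var stats runtime.MemStats
	for i := 0; ; i++ {
		runtime.ReadMemStats(&stats)
		fmt.Printf("Cycle %d: %dB\n", i, stats.Alloc)
		time.Sleep(time.Second)
		runtime.GC()
	}
}
```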
Depending on timing, the above snippet takes around 35 GC cycles for the initial set of large requests (2.5GiB) to finally be freed, even though each of the subsequent writes only uses around 1KiB. This can happen in a real server handling lots of small requests, where large buffers allocated by some prior request end up being pinned for a long time, since they are not in the Pool long enough to be collected.

The example claims to be based on fmt usage, but I'm not convinced that fmt's usage is correct. It is susceptible to the live-lock problem described above. I suspect this hasn't been an issue in most real programs since fmt.PrintX is typically not used to write very large strings. However, other applications of sync.Pool may certainly have this issue.

I suggest we fix the example to store elements of fixed size and document this.
\cc @kevinburke @LMMilewski @bcmills