runtime: clearpools causes excessive STW1 time #22331
Comments
/cc @aclements

Change https://golang.org/cl/166957 mentions this issue.

Change https://golang.org/cl/166958 mentions this issue.

Change https://golang.org/cl/166959 mentions this issue.

Change https://golang.org/cl/166960 mentions this issue.
This is the first step toward fixing multiple issues with sync.Pool. This adds a fixed size, lock-free, single-producer, multi-consumer queue that will be used in the new Pool stealing implementation.

For #22950, #22331.

Change-Id: I50e85e3cb83a2ee71f611ada88e7f55996504bb5
Reviewed-on: https://go-review.googlesource.com/c/go/+/166957
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
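To give a feel for the kind of queue this change describes, here is a simplified, illustrative sketch of a fixed-size, single-producer, multi-consumer lock-free ring. It is not the code from the CL (the real queue packs head and tail into a single atomic word and also lets the producer pop from its own end); names and sizes are made up.

```go
// Sketch of a fixed-size SPMC lock-free queue (illustration only).
package main

import (
	"fmt"
	"sync/atomic"
)

// spmcQueue is a ring buffer. Only one goroutine may call Push;
// any number of goroutines may call PopTail.
type spmcQueue[T any] struct {
	head  atomic.Uint64        // next slot to write; advanced only by the producer
	tail  atomic.Uint64        // next slot to read; advanced by consumers via CAS
	slots []atomic.Pointer[T]  // ring storage
}

func newSPMCQueue[T any](size int) *spmcQueue[T] {
	return &spmcQueue[T]{slots: make([]atomic.Pointer[T], size)}
}

// Push adds v at the head and reports whether there was room.
// Must only be called by the single producer.
func (q *spmcQueue[T]) Push(v T) bool {
	h, t := q.head.Load(), q.tail.Load()
	if h-t == uint64(len(q.slots)) {
		return false // full
	}
	q.slots[h%uint64(len(q.slots))].Store(&v)
	// The atomic head store publishes the slot write to consumers.
	q.head.Store(h + 1)
	return true
}

// PopTail removes and returns the oldest element; ok is false if empty.
// Safe to call from many goroutines concurrently.
func (q *spmcQueue[T]) PopTail() (v T, ok bool) {
	for {
		t, h := q.tail.Load(), q.head.Load()
		if t == h {
			return v, false // empty
		}
		p := q.slots[t%uint64(len(q.slots))].Load()
		// head and tail are monotonic counters, so a successful CAS means
		// no other consumer claimed this slot and the producer has not yet
		// reused it (it cannot wrap onto this index until tail moves past t).
		if q.tail.CompareAndSwap(t, t+1) {
			return *p, true
		}
	}
}

func main() {
	q := newSPMCQueue[string](8)
	q.Push("a")
	q.Push("b")
	if v, ok := q.PopTail(); ok {
		fmt.Println(v) // a
	}
}
```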
This adds a dynamically sized, lock-free, single-producer, multi-consumer queue that will be used in the new Pool stealing implementation. It's built on top of the fixed-size queue added in the previous CL.

For #22950, #22331.

Change-Id: Ifc0ca3895bec7e7f9289ba9fb7dd0332bf96ba5a
Reviewed-on: https://go-review.googlesource.com/c/go/+/166958
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
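The dynamically sized queue chains fixed-size segments together, growing by adding a larger segment when the current one fills (if memory serves, the real implementation doubles the segment size). The toy below only shows that shape: it is single-goroutine, with plain slices standing in for the lock-free rings above, and omits all of the synchronization that makes the real structure safe for concurrent stealing.

```go
// Sketch of a "chain of growing segments" queue (shape only, no concurrency).
package main

import "fmt"

// segment stands in for one fixed-size lock-free ring.
type segment struct {
	vals []any
	next *segment
}

type chainQueue struct {
	head *segment // newest segment; the producer pushes here
	tail *segment // oldest segment; consumers would steal from here
}

// pushHead appends v, allocating a segment twice as large as the current
// head segment when it fills up, so capacity grows without bound.
func (q *chainQueue) pushHead(v any) {
	if q.head == nil {
		s := &segment{vals: make([]any, 0, 8)}
		q.head, q.tail = s, s
	}
	if len(q.head.vals) == cap(q.head.vals) {
		s := &segment{vals: make([]any, 0, 2*cap(q.head.vals))}
		q.head.next = s
		q.head = s
	}
	q.head.vals = append(q.head.vals, v)
}

// popTail removes the oldest element, dropping drained segments as it goes.
func (q *chainQueue) popTail() (any, bool) {
	for q.tail != nil {
		if len(q.tail.vals) > 0 {
			v := q.tail.vals[0]
			q.tail.vals = q.tail.vals[1:]
			return v, true
		}
		if q.tail.next == nil {
			return nil, false // only the (empty) newest segment remains
		}
		q.tail = q.tail.next
	}
	return nil, false
}

func main() {
	var q chainQueue
	for i := 0; i < 20; i++ {
		q.pushHead(i)
	}
	v, _ := q.popTail()
	fmt.Println(v) // 0
}
```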
This adds two benchmarks that will highlight two problems in Pool that we're about to address.

The first benchmark measures the impact of large Pools on GC STW time. Currently, STW time is O(# of items in Pools), and this benchmark demonstrates 70µs STW times.

The second benchmark measures the impact of fully clearing all Pools on each GC. Typically this is a problem in heavily-loaded systems because it causes a spike in allocation. This benchmark stresses this by simulating an expensive "New" function, so the cost of creating new objects is reflected in the ns/op of the benchmark.

For #22950, #22331.

Change-Id: I0c8853190d23144026fa11837b6bf42adc461722
Reviewed-on: https://go-review.googlesource.com/c/go/+/166959
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
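As a rough illustration of the second kind of benchmark, the sketch below pairs a sync.Pool that has a deliberately expensive New function with a goroutine that keeps forcing GCs, so every pool clearing shows up as extra construction work in the reported ns/op. The benchmark name, the amount of simulated work, and the GC-hammering goroutine are all invented here; the real benchmarks added by CL 166959 live in sync's tests.

```go
package sync_test

import (
	"runtime"
	"sync"
	"sync/atomic"
	"testing"
)

func BenchmarkPoolExpensiveNewSketch(b *testing.B) {
	var newCalls int64
	p := sync.Pool{
		New: func() any {
			atomic.AddInt64(&newCalls, 1)
			// Simulate an object that is costly to construct.
			buf := make([]byte, 1<<16)
			for i := range buf {
				buf[i] = byte(i)
			}
			return &buf
		},
	}

	// Force GCs in the background so the pool keeps getting cleared and
	// New keeps getting called while the benchmark loop runs.
	stop := make(chan struct{})
	defer close(stop)
	go func() {
		for {
			select {
			case <-stop:
				return
			default:
				runtime.GC()
			}
		}
	}()

	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			x := p.Get().(*[]byte)
			p.Put(x)
		}
	})
	// Report how often the expensive constructor actually ran.
	b.ReportMetric(float64(atomic.LoadInt64(&newCalls))/float64(b.N), "New/op")
}
```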
Our application uses a large number of goroutines, mutexes, and defers. Using Go 1.8.3 on linux/amd64, we observed that the STW1 time is dominated by the `runtime.clearpools` function in mgc.go. Here is an excerpt from the GC trace:

[GC trace excerpt omitted]

By instrumenting the runtime, we determined that, out of the 16ms STW1 time, 9.3ms was spent in the loop that disconnects the `sched.sudogcache` linked list, and 6.7ms was spent in the loop that disconnects the linked lists in `sched.deferpool`. These O(n) loops end up causing a very long pause.

I understand the reason for originally introducing the loops in #9110. However, looking at tip around the call sites of `releaseSudog` and `freedefer`, I couldn't see a case where a released object would still be referenced by a live pointer somewhere else in the system. Are these loops still really necessary?

FWIW, I replaced the loops with a simple zeroing of the heads of the linked lists, and test/fixedbugs/issue9110.go still passed.
I propose making these loops optional to avoid the excessive STW1 time, and enabling them only with a GODEBUG option for debugging purposes.
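To make the proposal concrete, here is a small runnable toy contrasting the two behaviors. The `node` type and the `clearpoollinks` GODEBUG name are invented for illustration; the real change would live in `runtime.clearpools` in mgc.go, where the existing #9110 unlinking loop would sit behind the debug knob and the default path would simply drop the list heads.

```go
package main

import (
	"fmt"
	"os"
	"strings"
	"time"
)

// node stands in for a runtime sudog or deferred-call record kept on a
// central free list.
type node struct {
	next *node
	buf  [64]byte
}

// clearCache models the clearpools step. With unlinkAll set (the
// hypothetical GODEBUG=clearpoollinks=1 mode), every link is broken as in
// the current #9110 loop, which is O(n); otherwise the head is simply
// dropped, which is O(1) during STW.
func clearCache(head **node, unlinkAll bool) {
	if unlinkAll {
		var n, next *node
		for n = *head; n != nil; n = next {
			next = n.next
			n.next = nil
		}
	}
	*head = nil
}

func main() {
	unlinkAll := strings.Contains(os.Getenv("GODEBUG"), "clearpoollinks=1")

	// Build a large free list, comparable to what a busy server accumulates.
	var head *node
	for i := 0; i < 1_000_000; i++ {
		head = &node{next: head}
	}

	start := time.Now()
	clearCache(&head, unlinkAll)
	fmt.Printf("unlinkAll=%v, clearing took %v\n", unlinkAll, time.Since(start))
}
```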