runtime: GC pacing exhibits strange behavior with a low GOGC #37927
Change https://golang.org/cl/223937 mentions this issue.
@pijng Would you be willing to try out https://golang.org/cl/223937 and see if it helps?
@gopherbot Please open a backport to 1.14. The comment above suggests that we should do this to fix GOGC in 1.14.
Backport issue(s) opened: #37928 (for 1.14). Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases.
@mknyszek Should this be for 1.14.1, or can it wait until 1.14.2?
@mknyszek I did the same tests as before. Attaching the log just in case. If additional actions or logs are needed, I'm happy to help.
I believe we hit this issue in a recent upgrade of a high-QPS production service to Go 1.14. On startup, this service makes a few very large heap allocations and then dials down GOGC. During normal request processing it makes a high rate of smaller allocations. We found that it is now OOMing where previously it did not. I can write a constrained example program to demonstrate this behavior if it would help.
@y3llowcake That would be great! Let me know if you need any help; I think I've nailed down the conditions for reproducing this, but it's a little tricky to construct the right benchmark. @ianlancetaylor After talking to some folks (and you), it looks like the 1.14.1 release is very close and we shouldn't delay it any more. We'll get this in 1.14.2.
I didn't get a chance to write up that program today, but figured I would at least share the snippet below. It's taken directly from the service in question and is the function we are using to adjust GOGC.
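The snippet itself did not survive in this copy of the thread. As a rough stand-in, here is a minimal sketch of the kind of GOGC-adjusting helper being described, using runtime/debug.SetGCPercent; the adjustGOGC name, the default value of 20, and the SERVICE_GOGC environment-variable override are illustrative assumptions, not the original code.

```go
package main

import (
	"os"
	"runtime/debug"
	"strconv"
)

// adjustGOGC lowers GOGC once startup allocations are done, so that the
// steady-state heap stays small. The default of 20 and the SERVICE_GOGC
// override are placeholders for whatever policy the real service uses.
func adjustGOGC() {
	target := 20
	if s := os.Getenv("SERVICE_GOGC"); s != "" {
		if v, err := strconv.Atoi(s); err == nil && v > 0 {
			target = v
		}
	}
	debug.SetGCPercent(target)
}

func main() {
	// ... perform the large startup allocations here ...
	adjustGOGC()
	// ... start serving requests ...
}
```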
Change https://golang.org/cl/225637 mentions this issue.
… maxTriggerRatio

Currently, the capping logic for the GC trigger ratio is such that if gcpercent is low, we may end up setting the trigger ratio far too high, breaking the promise of SetGCPercent and GOGC as a trade-off knob (we won't start a GC early enough, and we will use more memory).

This change modifies the capping logic for the trigger ratio by scaling the minTriggerRatio with gcpercent the same way we scale maxTriggerRatio.

For #37927.
Fixes #37928.

Change-Id: I2a048c1808fb67186333d3d5a6bee328be2f35da
Reviewed-on: https://go-review.googlesource.com/c/go/+/223937
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
(cherry picked from commit d1ecfcc)
Reviewed-on: https://go-review.googlesource.com/c/go/+/225637
Reviewed-by: David Chase <drchase@google.com>
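As a rough illustration of what this change describes (not the verbatim CL), the floor on the trigger ratio is scaled by gcpercent/100 in the same way the cap already is; the capTriggerRatioFixed wrapper and its parameter names are for exposition only.

```go
// capTriggerRatioFixed sketches the capping logic after the fix described in
// the commit message above: both the 0.6 floor and the 0.95 cap are scaled by
// gcpercent/100, so with gcpercent == 3 the floor becomes 0.018 rather than 0.6.
func capTriggerRatioFixed(triggerRatio float64, gcpercent int32) float64 {
	if gcpercent >= 0 {
		scalingFactor := float64(gcpercent) / 100
		maxTriggerRatio := 0.95 * scalingFactor
		if triggerRatio > maxTriggerRatio {
			triggerRatio = maxTriggerRatio
		}
		minTriggerRatio := 0.6 * scalingFactor
		if triggerRatio < minTriggerRatio {
			triggerRatio = minTriggerRatio
		}
	}
	return triggerRatio
}
```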
I have a service on 1.14.2, also with very high QPS, and started experiencing this after upgrading.
@y3llowcake did 1.14.2 solve your issues?
@edaniels when you say "this", do you mean you set GOGC < 100 and you see a spike in memory use? If you're still seeing problems, please file a new issue (if you haven't already), since the patch here seemed to solve a specific problem (which appears to have been the same problem as #37525). More detail would also be helpful.
We're seeing hosts with about 61.3 GB of total memory run out of memory. Here's a memory dump when there's about 5 GB left:
In Go 1.13.x, we had virtually identical QPS and allocation rates. It appears that when the allocation rate is sustained at a high level for long enough, we end up running out of memory even though there are objects available to be GC'd.
@edaniels Thanks for the details. Could you please file a new issue? Now that you're sharing details, it'll be a lot easier to reference all of them if they're concentrated on a new issue, rather than continuing the conversation on a closed (and possibly unrelated) one. A few questions:
- Do you set GOGC < 100? I'm trying to determine whether you might be experiencing the same problem as the one described in this issue.
- How did you determine that there were objects available to be GC'd?
- What platform are you running on?
Getting a GC trace and scavenge trace would probably be the most useful thing now (e.g. by running with GODEBUG=gctrace=1,scavtrace=1; see the sketch below).
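A rough sketch of one way to capture those traces follows. The ./yourservice binary name and the gctrace.log output file are placeholders; GODEBUG is read at process startup, so it has to be set in the environment of the service itself.

```go
package main

import (
	"log"
	"os"
	"os/exec"
)

func main() {
	// The GC and scavenger traces are emitted on stderr; collect them in a file.
	out, err := os.Create("gctrace.log")
	if err != nil {
		log.Fatal(err)
	}
	defer out.Close()

	cmd := exec.Command("./yourservice") // placeholder binary name
	cmd.Env = append(os.Environ(), "GODEBUG=gctrace=1,scavtrace=1")
	cmd.Stdout = os.Stdout
	cmd.Stderr = out
	if err := cmd.Run(); err != nil {
		log.Fatal(err)
	}
}
```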
1.14.2 fixed the issue for me, thank you.
As of Go 1.14, GC pacing changed a bit to alleviate issues with an increased allocation rate, related to #35112.
Unfortunately, that pacing change was made in error. Currently, the code for capping the trigger ratio in `gcSetTriggerRatio` looks like the logic sketched below.
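The original code block is missing from this copy of the issue; the following is a paraphrase of the Go 1.14 capping behavior described in the next paragraph, with a fixed 0.6 floor and a 0.95*gcpercent/100 cap. The capTriggerRatio wrapper is for exposition only, not the actual runtime code.

```go
// capTriggerRatio sketches the Go 1.14 capping logic: the floor is a fixed
// 0.6 regardless of gcpercent, while the cap is scaled by gcpercent/100.
// With gcpercent == 3 the cap is 0.0285, far below the 0.6 floor.
func capTriggerRatio(triggerRatio float64, gcpercent int32) float64 {
	const minTriggerRatio = 0.6
	if triggerRatio < minTriggerRatio {
		// Clamp up to the floor, regardless of how low gcpercent is.
		triggerRatio = minTriggerRatio
	} else if gcpercent >= 0 {
		// Otherwise, cap at 0.95 * gcpercent/100 so the mutator assist
		// ratio isn't infinity.
		maxTriggerRatio := 0.95 * float64(gcpercent) / 100
		if triggerRatio > maxTriggerRatio {
			triggerRatio = maxTriggerRatio
		}
	}
	return triggerRatio
}
```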
Consider the case where `gcpercent == 3`, as in the Go 1.14 log found here. If the input trigger ratio is < 0.6, then we'll set it to 0.6... but if it's greater, then we'll always set it to 0.95*3/100, or 0.0285. Given that `gcpercent` is 3, the latter is the correct behavior. In fact, when we see a 0.6 in a `gcpacertrace`, we also see a spike in heap use compared to Go 1.13. This is not good, since we're not respecting the trade-off.

I think the fix is to scale the minTriggerRatio just like the maxTriggerRatio, though it probably stops making sense to have a minimum at some point (it gets so close to 0 anyway).
We should fix this and probably backport it to Go 1.14. This breaks the promise of GOGC in some cases, which I think counts as a correctness bug?
CC @aclements @randall77