
runtime: heap target increased by significantly more than GOGC should allow #67592

Closed
MikeMitchellWebDev opened this issue May 22, 2024 · 8 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime.

Comments


MikeMitchellWebDev commented May 22, 2024

Go version

go version go1.22.0 darwin/amd64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='amd64'
GOBIN=''
GOCACHE='/Users/mm/Library/Caches/go-build'
GOENV='/Users/mm/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMODCACHE='/Users/mm/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='darwin'
GOPATH='/Users/mm/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/local/go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/local/go/pkg/tool/darwin_amd64'
GOVCS=''
GOVERSION='go1.22.0'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='/usr/bin/clang'
CXX='clang++'
CGO_ENABLED='1'
GOMOD='/Users/mm/go/src/github.com/MikeMitchellWebDev/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -arch x86_64 -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/5y/wtzzmjlj5v52pg7wr8ptbg_m0000gp/T/go-build327897342=/tmp/go-build -gno-record-gcc-switches -fno-common'

What did you do?

GODEBUG=gctrace=1,gcpacer=1 GOGC=3 ./myprogram

What did you see happen?

The heap target steadily increased by 3% per cycle, and then suddenly by more than 100% at gc 122.

gc 118 @10.223s 1%: 0.11+5.8+0.076 ms clock, 0.44+0.18/4.1/2.8+0.30 ms cpu, 349->350->341 MB, 351 MB goal, 0 MB stacks, 0 MB globals, 4 P
pacer: assist ratio=+1.228282e+000 (scan 1 MB in 350->352 MB) workers=1++0.000000e+000
pacer: 25% CPU (25 exp.) for 1393992+15296+488544 B work (1851744 B exp.) in 367747888 B -> 368612400 B (∆goal -643077, cons/mark +2.721960e-001)
gc 119 @10.411s 1%: 0.70+10+0.10 ms clock, 2.8+0.20/9.9/7.6+0.41 ms cpu, 350->351->342 MB, 352 MB goal, 0 MB stacks, 0 MB globals, 4 P
pacer: assist ratio=+1.225035e+000 (scan 1 MB in 351->352 MB) workers=1++0.000000e+000
pacer: 26% CPU (25 exp.) for 1412112+15912+488544 B work (1897832 B exp.) in 368469776 B -> 368682136 B (∆goal -1336846, cons/mark +2.721960e-001)
gc 120 @10.533s 1%: 0.10+2.7+0.044 ms clock, 0.41+0.13/2.5/3.5+0.17 ms cpu, 351->351->343 MB, 352 MB goal, 0 MB stacks, 0 MB globals, 4 P
pacer: assist ratio=+1.230030e+000 (scan 1 MB in 351->353 MB) workers=1++0.000000e+000
pacer: 26% CPU (25 exp.) for 1429360+14784+488544 B work (1916568 B exp.) in 369003952 B -> 369732384 B (∆goal -829715, cons/mark +2.721960e-001)
gc 121 @10.598s 1%: 0.10+2.6+0.066 ms clock, 0.41+0.17/2.3/3.6+0.26 ms cpu, 351->352->343 MB, 353 MB goal, 0 MB stacks, 0 MB globals, 4 P
pacer: assist ratio=+2.949048e+001 (scan 1 MB in 861->861 MB) workers=1++0.000000e+000
gc 122 @10.670s 1%: 0.085+2.3+0.019 ms clock, 0.34+0.15/2.1/3.3+0.078 ms cpu, 861->861->853 MB, 861 MB goal, 0 MB stacks, 0 MB globals, 4 P
pacer: assist ratio=+1.093284e+000 (scan 4 MB in 877->879 MB) workers=1++0.000000e+000
pacer: 27% CPU (25 exp.) for 2531768+15024+488544 B work (1965256 B exp.) in 920500992 B -> 920976848 B (∆goal -1321715, cons/mark +3.051929e-001)

What did you expect to see?

The heap goal to consistently increase by about 3% per GC cycle.
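For reference, the steady-state heap goal implied by GOGC is roughly live heap × (1 + GOGC/100). A small sketch of that arithmetic (the helper name is mine, not a runtime API) reproduces the trace above: 341 MiB live with GOGC=3 yields a goal of about 351 MiB, matching the gc 118 line.

```go
package main

import "fmt"

// heapGoal computes the steady-state heap goal implied by GOGC:
// goal = live + live*GOGC/100. This mirrors the documented pacer
// formula, not the runtime's exact internal computation.
func heapGoal(liveBytes, gogc uint64) uint64 {
	return liveBytes + liveBytes*gogc/100
}

func main() {
	live := uint64(341) << 20 // 341 MiB live heap, as in gc 118 above
	fmt.Printf("GOGC=3 goal: %d MiB\n", heapGoal(live, 3)>>20) // prints 351
}
```

By this formula, a jump in the goal can only come from a jump in the live heap, which is the crux of the discussion below.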

@MikeMitchellWebDev MikeMitchellWebDev changed the title import/path: issue title runtime: heap target increased by significantly more than GOGC should allow May 22, 2024
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label May 22, 2024
@mknyszek (Contributor)

It's hard to tell what's going wrong without a way to reproduce or see what myprogram is doing. Are you able to share the code or some code that consistently reproduces the issue? FWIW, to my knowledge, a dramatic spike like this hasn't been reported before (in production or even a test), which makes me curious about what the program is actually doing.

Note that generally, it is possible for the program to overrun the heap goal by at least a little bit. However that overrun is typically visible in the gctrace output (Y in X->Y->Z), whereas this appears to be a change in heap goal that comes from apparently nowhere, akin to if SetGCPercent was called in between GC 121 and GC 122. (Not saying that's what happened, just providing an example of how such output might show up.)
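For anyone unfamiliar with it, debug.SetGCPercent is the runtime-level equivalent of the GOGC environment variable; it returns the previous setting, so a change can be applied and reverted. A minimal sketch (the wrapper function is illustrative, not part of the runtime API):

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// setAndRestore temporarily applies a new GOGC value via
// debug.SetGCPercent, then restores the previous setting.
// SetGCPercent returns the value that was in effect before the call.
func setAndRestore(pct int) (prev, restored int) {
	prev = debug.SetGCPercent(pct)      // same effect as GOGC=pct
	restored = debug.SetGCPercent(prev) // put the old value back; returns pct
	return prev, restored
}

func main() {
	prev, _ := setAndRestore(3)
	fmt.Println("GOGC before the temporary change:", prev)
}
```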

@MikeMitchellWebDev (Contributor, Author) commented May 23, 2024

I can't share code, but I can tell you exactly what I did. I used the go-git library to read the Linux repository into memory (it has more than 1 million commits). This issue happened every time: I retried several times after noticing the spike, so it wasn't a random occurrence. It should be said that setting GOGC=3 is totally unnecessary for reading the repo into memory; the program actually performed much better with the default GOGC.

Here's the rest of the gctrace output from gc 123. You can see the heap goal continues incrementing by 3% after that one dramatic spike.

My computer has 8GiB of RAM and GOMAXPROCS was 4.

gc 123 @14.599s 1%: 0.085+4.7+0.045 ms clock, 0.34+0.44/3.7/7.6+0.18 ms cpu, 877->878->856 MB, 879 MB goal, 0 MB stacks, 0 MB globals, 4 P
pacer: assist ratio=+1.095070e+000 (scan 3 MB in 879->881 MB) workers=1++0.000000e+000
pacer: 28% CPU (25 exp.) for 2600392+15024+488544 B work (3035336 B exp.) in 922007824 B -> 922351632 B (∆goal -2428012, cons/mark +3.051929e-001)
gc 124 @14.905s 1%: 0.075+5.2+0.058 ms clock, 0.30+0.71/5.1/3.8+0.23 ms cpu, 879->879->857 MB, 881 MB goal, 0 MB stacks, 0 MB globals, 4 P
pacer: assist ratio=+1.095246e+000 (scan 3 MB in 881->883 MB) workers=1++0.000000e+000
pacer: 25% CPU (25 exp.) for 2668848+21304+488544 B work (3103960 B exp.) in 923814224 B -> 924136768 B (∆goal -2511486, cons/mark +3.051929e-001)
gc 125 @15.199s 1%: 0.083+9.1+0.10 ms clock, 0.33+0.12/8.9/8.3+0.42 ms cpu, 881->881->859 MB, 883 MB goal, 0 MB stacks, 0 MB globals, 4 P
pacer: assist ratio=+1.093189e+000 (scan 3 MB in 882->884 MB) workers=1++0.000000e+000
pacer: 25% CPU (25 exp.) for 2748424+15720+488544 B work (3178696 B exp.) in 924900032 B -> 925290176 B (∆goal -2517584, cons/mark +3.051929e-001)
gc 126 @15.429s 1%: 0.10+5.7+0.064 ms clock, 0.41+0.16/4.8/5.7+0.25 ms cpu, 882->882->860 MB, 884 MB goal, 0 MB stacks, 0 MB globals, 4 P
pacer: assist ratio=+1.094132e+000 (scan 3 MB in 883->886 MB) workers=1++0.000000e+000
pacer: 25% CPU (25 exp.) for 2930792+15736+488544 B work (3252688 B exp.) in 926862544 B -> 927727544 B (∆goal -2107847, cons/mark +3.051929e-001)

@seankhliao seankhliao added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label May 23, 2024
@MikeMitchellWebDev (Contributor, Author)

@seankhliao According to GitHub, it seems the WaitingForInfo label was added at the same time I provided the requested info.

@mknyszek (Contributor)

Given the additional logs, I am skeptical that this is a GC bug. My current hypothesis is that your application is making a single ~500 MiB allocation (or maybe a handful of very large, ~100 MiB allocations concurrently). It does appear that the GC thinks your live heap is actually 856 MiB in size. If this was some kind of strange overrun bug, I would expect the following GC to realize that the vast majority of that memory isn't actually needed and drop it immediately. From that perspective, GOGC is completely working as intended. The jump appears to happen precisely because your application just needs that much more memory, period. The single large allocation would also explain why the heap doesn't grow smoothly and instead simply jumps.
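The hypothesis above is easy to check in isolation: with a low GOGC, a single large allocation that stays live pushes the heap goal (visible as MemStats.NextGC) up in one step, because the goal tracks the live heap. A minimal sketch, with illustrative sizes:

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// retain keeps the allocation reachable so it survives the forced GC.
var retain []byte

// goalAfterAlloc allocates n bytes, runs a GC, and returns the
// resulting heap goal as reported by MemStats.NextGC.
func goalAfterAlloc(n int) uint64 {
	retain = make([]byte, n)
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.NextGC
}

func main() {
	debug.SetGCPercent(3) // mimic GOGC=3
	small := goalAfterAlloc(1 << 20)   // ~1 MiB live
	large := goalAfterAlloc(200 << 20) // one big allocation stays live
	fmt.Printf("goal with ~1 MiB live: %d MiB\n", small>>20)
	fmt.Printf("goal after a 200 MiB allocation: %d MiB\n", large>>20)
	// The goal jumps by roughly the size of the live allocation,
	// exactly as GOGC intends: the goal follows the live heap.
}
```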

As a result, I suspect there isn't actually anything actionable here. I'd recommend taking a look at a heap profile (the inuse_space sample_index) to confirm this, but I think I'm going to leave the WaitingForInfo label up so that the issue automatically closes out unless we can either rule this out, or we get some more corroborating evidence that this is actually a runtime issue.

@MikeMitchellWebDev (Contributor, Author)

I took a profile. Here are the inuse_space and alloc_space:

[attachment: april-27-gogc=3-memheap-pprof001]
[attachment: pprof001]

@mknyszek (Contributor)

That profile appears to have been taken at the end of program execution, since the inuse_space profile has very little in it. You'd need to acquire a profile when the heap actually grows.
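A heap profile can be captured programmatically at the moment the heap grows, rather than only at exit. A minimal sketch using runtime/pprof (where in the program to call it, and how to trigger it at the right moment, is left to the reader):

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"runtime/pprof"
)

// captureHeapProfile forces a GC so the profile reflects currently live
// objects, then writes the heap profile into an in-memory buffer.
func captureHeapProfile() (*bytes.Buffer, error) {
	runtime.GC() // get up-to-date statistics for inuse_space
	var buf bytes.Buffer
	if err := pprof.Lookup("heap").WriteTo(&buf, 0); err != nil {
		return nil, err
	}
	return &buf, nil
}

func main() {
	// ... run the workload until the heap grows, then:
	buf, err := captureHeapProfile()
	if err != nil {
		panic(err)
	}
	fmt.Println("heap profile bytes:", buf.Len())
	// Inspect with: go tool pprof -sample_index=inuse_space <binary> <profile>
}
```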

@MikeMitchellWebDev (Contributor, Author)

[attachment: inuse_profile_may_24]
Here you go. I think it just confirms everything you've said thus far. The dramatic spike occurs at the beginning of the application.

gc 119 @3.304s 1%: 0.097+1.9+0.006 ms clock, 0.38+0.095/1.8/2.8+0.026 ms cpu, 350->351->343 MB, 352 MB goal, 0 MB stacks, 0 MB globals, 4 P
gc 120 @3.362s 1%: 0.062+1.6+0.018 ms clock, 0.24+0.092/1.5/3.0+0.072 ms cpu, 350->352->343 MB, 353 MB goal, 0 MB stacks, 0 MB globals, 4 P
gc 121 @3.419s 1%: 0.11+2.3+0.005 ms clock, 0.44+0.37/2.1/3.1+0.021 ms cpu, 351->351->343 MB, 354 MB goal, 0 MB stacks, 0 MB globals, 4 P
gc 122 @3.451s 1%: 0.091+1.7+0.017 ms clock, 0.36+0.096/1.5/2.6+0.069 ms cpu, 859->859->853 MB, 859 MB goal, 0 MB stacks, 0 MB globals, 4 P
gc 123 @8.516s 0%: 0.32+6.5+0.023 ms clock, 1.2+1.6/4.3/8.1+0.093 ms cpu, 875->876->856 MB, 879 MB goal, 0 MB stacks, 0 MB globals, 4 P
gc 124 @8.744s 0%: 0.12+20+0.006 ms clock, 0.49+1.6/19/16+0.025 ms cpu, 876->876->857 MB, 881 MB goal, 0 MB stacks, 0 MB globals, 4 P

@seankhliao seankhliao removed the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Jun 15, 2024
@seankhliao (Member)

Sounds like the GC is working correctly.

@seankhliao seankhliao closed this as not planned Won't fix, can't repro, duplicate, stale Jun 15, 2024