runtime.malg: memory leak #66564

Closed
@cdvr1993

Go version

go1.22

Output of go env in your module/workspace:

GO111MODULE='off'
GOARCH='amd64'
GOBIN=''
GOCACHE='/home/user/.cache/go-build'
GOENV='/home/user/.config/go/env'
GOEXE=''
GOEXPERIMENT='nocoverageredesign'
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/user/go/go-code-sparse/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/user/go/go-code-sparse'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/home/user/.cache/bazel/_bazel_cdvr/0afdc4359d915c9cfae96806c9f27c6f/external/go_sdk'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/home/user/.cache/bazel/_bazel_cdvr/0afdc4359d915c9cfae96806c9f27c6f/external/go_sdk/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.22.1 X:nocoverageredesign'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD=''
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build4070154476=/tmp/go-build -gno-record-gcc-switches'

What did you do?

We recently noticed in profiles from our production servers that goroutines accounted for considerably high memory utilization (50MB), yet the goroutine profiles showed nothing unusual. After a while we discovered the cause: there had been a spike of goroutines several hours earlier.

We replicated the scenario with the following code:

package main

import (
        "fmt"
        "net/http"
        _ "net/http/pprof"
        "runtime"
        "sync"
        "time"
)

func main() {
        // Start pprof server
        go func() {
                http.ListenAndServe("localhost:8080", nil)
        }()

        // Create a WaitGroup to wait for all goroutines to finish
        var wg sync.WaitGroup

        // Start 100,000 goroutines
        size := 100_000
        wg.Add(size)
        for i := 0; i < size; i++ {
                go func() {
                        compute()
                        wg.Done()
                        // Block until every goroutine has called Done, so all of them are alive at once
                        wg.Wait()
                }()
        }

        // Wait for all goroutines to finish
        wg.Wait()

        fmt.Println("Done")

        // Run GC every second forever
        for {
                runtime.GC()
                time.Sleep(time.Second)
        }
}

// compute performs some simple computation
func compute() {
        for i := 0; i < 1000; i++ {
                _ = i * i
        }
}

I ran it like:

GOMAXPROCS=2 GODEBUG=gctrace=1 go run /tmp/main.go

What did you see happen?

GC trace
gc 512 @521.755s 0%: 0.020+25+0.002 ms clock, 0.041+0/12/23+0.005 ms cpu, 43->43->43 MB, 88 MB goal, 0 MB stacks, 0 MB globals, 2 P (forced)
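
The live heap reported in the trace can also be checked from inside the process. The following is a minimal sketch, not part of the original reproduction: the helper names `heapAfterGC` and `spike` are hypothetical, and it relies only on `runtime.GC`, `runtime.ReadMemStats`, and `runtime.NumGoroutine`. It compares the live heap before and after a goroutine spike; the exact numbers will vary by Go version and machine.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// heapAfterGC forces a collection and returns the live heap size in bytes.
func heapAfterGC() uint64 {
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapAlloc
}

// spike starts n goroutines that all stay alive until the last one has
// started, then waits until every one of them has signalled completion.
func spike(n int) {
	var started, finished sync.WaitGroup
	started.Add(n)
	finished.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer finished.Done()
			started.Done()
			started.Wait() // hold the goroutine until the peak is reached
		}()
	}
	finished.Wait()
}

func main() {
	before := heapAfterGC()
	spike(100_000)
	after := heapAfterGC()
	fmt.Printf("goroutines now: %d\n", runtime.NumGoroutine())
	fmt.Printf("live heap before: %d KiB, after: %d KiB\n", before/1024, after/1024)
}
```

If the behavior described in this issue applies, the "after" figure should stay well above the "before" figure even though the goroutine count has dropped back to a handful.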

Heap
Screenshot 2024-03-27 at 10 34 30 AM
Screenshot 2024-03-27 at 10 34 50 AM

Goroutines
Screenshot 2024-03-27 at 10 26 38 AM

It seems the issue comes from:

https://sourcegraph.com/github.com/golang/go@go1.22.1/-/blob/src/runtime/mgcmark.go?L316-318

I see that the GC frees the stacks of dead Gs that still have one, but the Gs themselves are never freed once their stacks are gone; they are just appended to the no-stack free list:

	lock(&sched.gFree.lock)
	sched.gFree.noStack.pushAll(q)
	unlock(&sched.gFree.lock)
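
To make the suspected behavior concrete, here is a toy model of it. This is not the runtime's code: the types `g` and `freeLists` and their methods are hypothetical stand-ins, and the model only mirrors what is described above, namely that stacks get released while the descriptors are moved to a list that never shrinks.

```go
package main

import "fmt"

// g is a hypothetical stand-in for the runtime's goroutine descriptor.
type g struct {
	stack []byte // nil once the stack has been released
}

// freeLists mimics the two free lists described above.
type freeLists struct {
	stack   []*g // dead Gs that still own a stack
	noStack []*g // dead Gs whose stack has been released
}

// release models a goroutine exiting: its descriptor is kept for reuse.
func (f *freeLists) release(gp *g) { f.stack = append(f.stack, gp) }

// gcFreeStacks models the GC step quoted above: stacks are dropped, but
// the descriptors themselves are only moved to noStack, never discarded.
func (f *freeLists) gcFreeStacks() {
	for _, gp := range f.stack {
		gp.stack = nil
	}
	f.noStack = append(f.noStack, f.stack...)
	f.stack = nil
}

func main() {
	var f freeLists
	for i := 0; i < 1000; i++ {
		f.release(&g{stack: make([]byte, 2048)})
	}
	f.gcFreeStacks()
	fmt.Println(len(f.stack), len(f.noStack)) // prints "0 1000"
}
```

In this model, nothing ever removes entries from `noStack`, so its length tracks the peak number of goroutines that have existed rather than the current number.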

What did you expect to see?

I'm not sure whether this is a known issue or intended behavior, but the only way for us to reclaim that memory is to restart the process.
