gccgo: stack memory leak if create large number of goroutines in short time #16271

Open
fasheng opened this issue Jul 5, 2016 · 10 comments

@fasheng
commented Jul 5, 2016

Please answer these questions before submitting your issue. Thanks!

  1. What version of Go are you using (go version)?
$ go version
go version go1.6.1 gccgo (GCC) 6.1.1 20160602 linux/amd64
  2. What operating system and processor architecture are you using (go env)?

It's gcc-go 6.1.1-2 under Arch Linux.

$ uname -a
Linux fa-host 4.6.3-1-ARCH #1 SMP PREEMPT Fri Jun 24 21:19:13 CEST 2016 x86_64 GNU/Linux

$ go env
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/usr/lib/go:/usr/share/gocode:/usr/lib/go/site"
GORACE=""
GOROOT="/usr"
GOTOOLDIR="/usr/lib/gcc/x86_64-pc-linux-gnu/6.1.1"
GO15VENDOREXPERIMENT="1"
CC="/usr/bin/gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0"
CXX="/usr/bin/g++"
CGO_ENABLED="1"
  3. What did you do?

After several days of debugging a memory leak in our Go
application, I found that the issue only occurs when the
program is compiled with gccgo, and that it is related to
goroutine stack memory rather than the heap. Here is a simple
demo to reproduce it:

package main

import "time"
import "fmt"
import "runtime/debug"

func main() {
    worker := func() {
        time.Sleep(3 * time.Second)
    }
    for i := 0; i < 5000; i++ {
        go worker()
    }
    time.Sleep(10 * time.Second)
    fmt.Println("try to release memory")
    debug.FreeOSMemory()
    time.Sleep(3600 * 12 * time.Second)
}

For example, save it as test.go and run:

go build -compiler gccgo test.go
./test
  4. What did you expect to see?

The memory used should be released after "try to release
memory" is printed.
  5. What did you see instead?

The memory (about 100 MB) is still in use and appears never
to be released, even though I waited a long time (about 1
hour). So I think this is a leak specific to stack memory.

BTW: Everything is OK when compiled with go/gc.

@ianlancetaylor

Contributor

commented Feb 2, 2018

In typical use goroutine stacks are allocated by libgcc, outside the Go heap, and are attached to a goroutine. If we ever released goroutines we would release the stacks, but we don't. So they stick around forever. I don't know if there is a good way to fix this.

@zhengcai

commented Apr 18, 2018

I wonder, is there any way to reclaim the memory used by the already terminated goroutines?

We are using go-grpc right now, and each RPC request results in a new goroutine being created. Because of this problem, our program runs out of memory pretty quickly when there are a lot of RPC requests. Is there any way to force gcc-go to release the goroutines? Thanks.

@ianlancetaylor

Contributor

commented Apr 18, 2018

@zhengcai The state is unchanged: there is no way to reclaim the memory used by goroutines. But goroutines are reused, so I'm not sure how you could run out of memory even with a lot of goroutines. What processor? What OS? What linker?

@amandeepgautam

commented Aug 23, 2018

@ianlancetaylor will there be any work on this in the near future? Are there any proposals to fix this?

I am using grpc with gccgo 8.2 on AIX. The problem we are facing is that grpc spawns a new goroutine for each request it receives. As many as 100 requests can arrive at a time, and sometimes even more. With that many requests, memory usage reaches 2-3 GB once the software has been running for 3-4 days.

@ianlancetaylor

Contributor

commented Aug 23, 2018

I do not know of anybody planning to work on this at present. Do you want to take a look?

@amandeepgautam

commented Aug 23, 2018

@ianlancetaylor sure, I would love to. But this would be the first time I have worked on something like this, so I would need guidance. Let me know if that could work.

@ianlancetaylor

Contributor

commented Aug 23, 2018

Thanks. I'm always happy to answer questions.

@amandeepgautam

commented Aug 23, 2018

@ianlancetaylor which files should I start looking at? I am not at all familiar with the gcc code. Are there any other places where we employ garbage collection strategies?

I guess the first step is to put forward a proposal, right?

@ianlancetaylor

Contributor

commented Aug 23, 2018

You don't need to put forward a formal proposal, just a sketch of what you want to do.

As I understand it (and I may not) the complaint here is that a program may sometimes create a large number of goroutines, which can consume a lot of memory, and you want some mechanism to make those goroutines go away and thus reclaim the memory. To be clear, if that number of goroutines is required again, memory use will go up again. So all we are talking about here is some way to discard unused goroutines. Right?

If that is the problem then the solution is going to be something along the lines of having the sysmon function periodically look at the sched.gfree list, and decide to release some of the goroutines on that list. Then we will need to make sure that when we release a goroutine, we also free its stack.
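
The idea can be illustrated with a toy model. This is not libgo code: the types g and freeList, the method names, and the keep threshold are all made up for illustration, and the real runtime would have to return each stack to whatever allocated it rather than just dropping a slice.

```go
package main

import "fmt"

// g is a stand-in for a runtime goroutine descriptor. stack models the
// separately allocated stack that must be freed along with it.
type g struct {
	id    int
	stack []byte
}

// freeList is a toy model of a scheduler's list of idle goroutines.
type freeList struct {
	idle []*g
}

// put returns a finished goroutine to the list for later reuse.
func (f *freeList) put(gp *g) { f.idle = append(f.idle, gp) }

// get reuses an idle goroutine if one exists, otherwise allocates a new one.
func (f *freeList) get(nextID, stackSize int) *g {
	if n := len(f.idle); n > 0 {
		gp := f.idle[n-1]
		f.idle[n-1] = nil
		f.idle = f.idle[:n-1]
		return gp
	}
	return &g{id: nextID, stack: make([]byte, stackSize)}
}

// trim keeps at most keep idle goroutines and releases the rest, dropping
// each released goroutine's stack as well. It returns the bytes released.
func (f *freeList) trim(keep int) (released int) {
	if keep < 0 {
		keep = 0
	}
	for len(f.idle) > keep {
		n := len(f.idle) - 1
		gp := f.idle[n]
		released += len(gp.stack)
		gp.stack = nil // a real runtime would hand the stack back to its allocator
		f.idle[n] = nil
		f.idle = f.idle[:n]
	}
	return released
}

func main() {
	var fl freeList
	const stackSize = 8 << 10 // arbitrary size chosen for the model

	// Simulate a burst of 5000 goroutines that have all finished.
	for i := 0; i < 5000; i++ {
		fl.put(&g{id: i, stack: make([]byte, stackSize)})
	}

	// A periodic monitor would call trim repeatedly; here it runs once.
	released := fl.trim(64)
	fmt.Printf("idle goroutines kept: %d, bytes released: %d\n", len(fl.idle), released)
}
```

Keeping a small number of idle goroutines rather than freeing all of them avoids reallocating stacks for every new goroutine under steady load. The right threshold and trimming interval are open design questions.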

@amandeepgautam

commented Aug 23, 2018

Yes, the idea is to discard unused goroutines. I will try to find some time and start looking at it.
