CGO keeps memory increasing in a goroutine and got killed by OOM. #53440

Closed
hwiorn opened this issue Jun 18, 2022 · 4 comments
Labels
FrozenDueToAge WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.

Comments

@hwiorn

hwiorn commented Jun 18, 2022

What version of Go are you using (go version)?

$ go version
go version go1.18.3 linux/amd64

I have already tested 1.17, 1.18, 1.19beta1, and 1.18.3 (currently the latest).

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

This issue happens on Pop!_OS 22.04 with go-1.18.3 and in an Ubuntu 22.04 Docker container with go-1.18.

$ lsb_release -a
No LSB modules are available.
Distributor ID: Pop
Description:    Pop!_OS 22.04 LTS
Release:        22.04
Codename:       jammy
$ docker images | grep ubuntu | grep 22.04
ubuntu        22.04                 5ccefbfc0416   3 months ago    78MB
go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/gglee/.cache/go-build"
GOENV="/home/gglee/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/gglee/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/gglee/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.18.3"
GCCGO="gccgo"
GOAMD64="v1"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build3489892870=/tmp/go-build -gno-record-gcc-switches"

What did you do?

I use gRPC in a Go server and call deep learning inference through cgo. I have already tested the DL library and a C++ TCP server, and there is no memory leak there. This issue happens when memory is allocated with cgo inside a goroutine.

I found a similar issue on Stack Overflow.

I don't know whether this issue is a bug or not; the behavior is really unclear to me. I have been trying to solve it, but for now I feel the only solution is to rewrite the implementation in C++ or Rust.

package main

/*
   #include <stdlib.h>

   void * inferenceWorker1Malloc(long size) {
     return malloc(size);
   }

   void inferenceWorker1Free(void * ptr) {
     free(ptr);
   }
*/
import "C"

import (
	"fmt"
	"os"
	"sync"
	"time"
	"unsafe"
)

const prepareInputCount = 1000
const limitChannels = 10

func inferenceWorker1() {
	ptrs := make([]unsafe.Pointer, prepareInputCount)

	//for j := 0; j < 1000; j++ {
	for {
		// Receive a data from the channel
		time.Sleep(1 * time.Millisecond)

		// Prepare input data and DL decoder
		for i := 0; i < prepareInputCount; i++ {
			//ptrs[i] = C.malloc(16 * (1 << (i % 20))) // OOM killed
			ptrs[i] = C.inferenceWorker1Malloc(16 * (1 << (i % 20))) // OOM killed
		}

		// Inference
		//time.Sleep(time.Duration(rand.Intn(10)+1) * time.Millisecond)
		time.Sleep(1 * time.Millisecond)

		// Remove allocations
		for i := 0; i < prepareInputCount; i++ {
			// C.free(ptrs[i])
			C.inferenceWorker1Free(ptrs[i])
		}
	}
}

func main() {
	var wg sync.WaitGroup

	// Process channel requests
	for i := 0; i < limitChannels; i++ {
		wg.Add(1)

		// FIXME: This goroutine keeps increasing memory until reaching OOM
		go func() {
			defer wg.Done()
			inferenceWorker1()
		}()
	}

	// NOTE: No memory leak without a goroutine
	//inferenceWorker1()

	// NOTE: You can see the memory leak in this step
	wg.Wait()
	b := make([]byte, 1)
	fmt.Println("Press any key ..")
	os.Stdin.Read(b)
}
$ time go run main.go
signal: killed

real    8m19.677s
user    11m58.797s
sys     5m22.965s

Memory usage

while true; do
echo $(LC_ALL=C date) \| $(ps -eo pid,%cpu,rss,%mem,args | grep go-buil[d]) \| $(free -h | grep Mem)
sleep 1
done |& tee memusage-$(date +%Y%m%d_%H%M%S).log

memusage-20220618_211256.log
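
As a side note (not part of the original report), the RSS reported by ps includes memory obtained through C.malloc, which the Go runtime does not track. A small sketch that logs the runtime's own counters makes it easier to tell whether the growth is in the Go heap or the C heap:

package main

// Minimal sketch: log the Go runtime's memory accounting once per second.
// Memory allocated via C.malloc never appears in these counters, so a rising
// RSS with flat HeapAlloc/HeapSys points at the C heap rather than the Go heap.

import (
	"log"
	"runtime"
	"time"
)

func main() {
	var m runtime.MemStats
	for range time.Tick(time.Second) {
		runtime.ReadMemStats(&m)
		log.Printf("HeapAlloc=%d MiB HeapSys=%d MiB Sys=%d MiB",
			m.HeapAlloc>>20, m.HeapSys>>20, m.Sys>>20)
	}
}

In the reproduction above, the same loop could be started in its own goroutine before wg.Wait().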

What did you expect to see?

When the cgo calls are done, the Go runtime should reduce the memory usage.

What did you see instead?

Memory keeps increasing and the Go process gets killed by the OOM killer. The Docker container restarts repeatedly under heavy request load.

Similar issues

@ianlancetaylor
Contributor

That program allocates a lot of memory. Each iteration of the loop in inferenceWorker1 allocates 838860000 bytes. You are running 10 of those loops in parallel, so 8388600000 bytes, which is 0x1f3ffe0c0, or about 7.8G. The allocations are different sizes, so there is going to be fragmentation in the C heap, driving up memory use. And, of course, the Go heap takes up memory too. So the memory usage you see reported, around 30G, doesn't seem entirely out of line with what the program is doing.

What is your ulimit -v value? If I change the program to always allocate 16 * (1 << 19) bytes to avoid fragmentation and use ulimit -v 20971520, the program seems to run forever, using about 20G of virtual memory space.
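
For reference, a minimal sketch of that fixed-size experiment, assuming the same cgo helpers as in the reproduction above (the worker name inferenceWorker1Fixed is made up here); run it under e.g. ulimit -v 20971520:

package main

/*
   #include <stdlib.h>

   // Same helpers as in the report above.
   void * inferenceWorker1Malloc(long size) {
     return malloc(size);
   }

   void inferenceWorker1Free(void * ptr) {
     free(ptr);
   }
*/
import "C"

import (
	"sync"
	"time"
	"unsafe"
)

const prepareInputCount = 1000
const limitChannels = 10

func inferenceWorker1Fixed() {
	ptrs := make([]unsafe.Pointer, prepareInputCount)
	for {
		time.Sleep(1 * time.Millisecond)
		for i := 0; i < prepareInputCount; i++ {
			// Always the same 8 MiB request, so glibc keeps reusing the same
			// chunks instead of fragmenting. Under ulimit -v a failed malloc
			// simply returns NULL, and free(NULL) below is a no-op.
			ptrs[i] = C.inferenceWorker1Malloc(16 * (1 << 19))
		}
		time.Sleep(1 * time.Millisecond)
		for i := 0; i < prepareInputCount; i++ {
			C.inferenceWorker1Free(ptrs[i])
		}
	}
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < limitChannels; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			inferenceWorker1Fixed()
		}()
	}
	wg.Wait()
}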

@ianlancetaylor ianlancetaylor added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Jun 18, 2022
@gopherbot
Contributor

Timed out in state WaitingForInfo. Closing.

(I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)

@hwiorn
Author

hwiorn commented Jul 19, 2022

Sorry for the late reply.

That program allocates a lot of memory. Each iteration of the loop in inferenceWorker1 allocates 838860000 bytes. You are running 10 of those loops in parallel, so 8388600000 bytes, which is 0x1f3ffe0c0, or about 7.8G. The allocations are different sizes, so there is going to be fragmentation in the C heap, driving up memory use. And, of course, the Go heap takes up memory too. So the memory usage you see reported, around 30G, doesn't seem entirely out of line with what the program is doing.

It was intentional, to see the memory usage, because the deployment server has 256G of memory.
I had tested the C implementation by itself and there was no issue.
The phenomenon seemed to occur only in Go.

What is your ulimit -v value?

It is unlimited.

If I change the program to always allocate 16 * (1 << 19) bytes to avoid fragmentation and use ulimit -v 20971520, the program seems to run forever, using about 20G of virtual memory space.

In both cases (the example and my decode server), I got a pthread_create failed: Resource temporarily unavailable error after reaching the memory limit.

For the record, I solved this memory issue by applying two things.

  • Removing C memory allocation in cgo as much as possible.
    • Actually, this doesn't help with the memory issue, but it reduces the high CPU usage.
    • C memory allocation in cgo causes high CPU utilization and memory usage.
  • Changing the allocator from glibc (GNU C allocator) to jemalloc (one way to do this swap is sketched below).
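
The thread doesn't show how the allocator swap was done. A minimal sketch of one common way, assuming libjemalloc-dev is installed, is to link jemalloc from the cgo preamble so that the malloc/free calls made from the C side resolve to jemalloc:

package main

/*
   // Assumes libjemalloc-dev (Debian/Ubuntu) is installed. Linking jemalloc
   // makes its malloc/free override glibc's for the whole process.
   #cgo LDFLAGS: -ljemalloc
   #include <stdlib.h>
*/
import "C"

func main() {
	// Any C.malloc/C.free in the program now goes through jemalloc.
	p := C.malloc(1 << 20)
	C.free(p)
}

Alternatively, the allocator can be swapped at run time without rebuilding, e.g. LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 ./server (the library path varies by distribution); the same preload approach applies to the tcmalloc change mentioned in the next comment.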

@Ricardo-wd

I changed glibc (GNU C allocator) to tcmalloc, which solved a similar problem.

@golang golang locked and limited conversation to collaborators Dec 19, 2023