Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: sweep increased allocation count #31006

Open
chathaway-codes opened this issue Mar 22, 2019 · 5 comments
Open

runtime: sweep increased allocation count #31006

chathaway-codes opened this issue Mar 22, 2019 · 5 comments
Milestone

Comments

@chathaway-codes
Copy link

@chathaway-codes chathaway-codes commented Mar 22, 2019

What version of Go are you using (go version)?

We've encountered this issue with go versions:

  • go version go1.10.4 linux/amd64
  • go version go1.11.5 linux/amd64
  • go version go1.12 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

We've seen it on a number of boxes running x86_64 processors and various operating systems. To help debug this, we've reproduced it in a Docker container. Information below is from the Docker container.

go env Output
GOARCH="amd64"
GOBIN=""
GOCACHE="/root/.cache/go-build"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/root/go"
GORACE=""
GOROOT="/usr/lib/go-1.10"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go-1.10/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build162370947=/tmp/go-build -gno-record-gcc-switches"

What did you do?

We (YottaDB) have written a Go wrapper for our NoSQL database. The codebase for the database is used in production in many places, the Go wrapper is currently declared "field test grade". We have a test which exercises various methods in our wrapper, and it seems to consistently produce an error message if GOGC is set low enough.

What did you expect to see?

A successful run

What did you see instead?

A crash. The crash was rare, until we found some other bug reports which suggested setting GOGC low to reproduce it.

root@4c74966bf3f4:~/go/src/random_walk# ./random_walk -test.bench=. &> out.txt
root@4c74966bf3f4:~/go/src/random_walk# head out.txt
goos: linux
goarch: amd64
pkg: random_walk
BenchmarkRW1threads10depth5tp-4       	   20000	     89962 ns/op
BenchmarkRW10threads10depth5tp-4      	   20000	    100239 ns/op
runtime: nelems=8 nalloc=4 previous allocCount=3 nfreed=65535
fatal error: sweep increased allocation count

runtime stack:
runtime.throw(0x543fe8, 0x20)

We've also seen errors like this:

goos: linux
goarch: amd64
pkg: random_walk
BenchmarkRW1threads10depth5tp-4       	   20000	     77309 ns/op
BenchmarkRW10threads10depth5tp-4      	   20000	     68258 ns/op
BenchmarkRW100threads10depth5tp-4     	runtime: pointer 0xc42023316c to unallocated span idx=0x119 span.base()=0xc420232000 span.limit=0xc420233f80 span.state=3
fatal error: found bad pointer in Go heap (incorrect use of unsafe or cgo?)

runtime stack:
runtime.throw(0x547467, 0x3e)

We've stashed the test environment in a Docker container which can be pulled with:

docker pull charlescodes/bug-containers:golang_alloc_sweep_bug
docker run -ti charlescodes/bug-containers:golang_alloc_sweep_bug bash
root@a677cf6e1ae5:/data# cd
root@a677cf6e1ae5:~# cd go/src/random_walk/
root@a677cf6e1ae5:~/go/src/random_walk# head out.txt
goos: linux
goarch: amd64
pkg: random_walk
BenchmarkRW1threads10depth5tp-4       	runtime: nelems=8 nalloc=8 previous allocCount=7 nfreed=65535
fatal error: sweep increased allocation count

runtime stack:
runtime.throw(0x543fe8, 0x20)
	/usr/lib/go-1.10/src/runtime/panic.go:616 +0x81
runtime.(*mspan).sweep(0x7f26a90ac7d0, 0xc4200ed000, 0x41c400)

To rerun the test in that Docker container:

source $(pkg-config --variable=prefix yottadb)/ydb_env_set
wget https://gitlab.com/YottaDB/DB/YDBTest/raw/master/go/inref/random_walk_test.go
export GOGC=2
go test -bench=.

The source code for the test program which produces this crash is available at https://gitlab.com/YottaDB/DB/YDBTest/blob/master/go/inref/random_walk_test.go. The YDBGo wrapper (https://gitlab.com/YottaDB/Lang/YDBGo) has CI setup which includes running go test with race, and we manually verified that the test script has no race conditions.

With GODEBUG=gccheckmark=1 set, this doesn't seem to reproduce as often. I don't see anything extra in the output of a crash though.

Thanks in advance for the help!

@bradfitz

This comment has been minimized.

Copy link
Member

@bradfitz bradfitz commented Mar 22, 2019

This error is commonly caused by invalid use of unsafe or cgo. Do you use unsafe or cgo?

@bradfitz bradfitz changed the title sweep increased allocation count runtime: sweep increased allocation count Mar 22, 2019
@bradfitz

This comment has been minimized.

Copy link
Member

@bradfitz bradfitz commented Mar 22, 2019

To answer my own question: yes. A fair bit. e.g. https://gitlab.com/YottaDB/Lang/YDBGo/blob/master/buffer_t.go#L51

Have you audited it for correctness?

@bradfitz

This comment has been minimized.

Copy link
Member

@bradfitz bradfitz commented Mar 22, 2019

Please try again with GODEBUG=cgocheck=2 per https://golang.org/cmd/cgo/#hdr-Passing_pointers

@chathaway-codes

This comment has been minimized.

Copy link
Author

@chathaway-codes chathaway-codes commented Mar 22, 2019

Yes, we use cgo and unsafe a fair amount. We've audit'd it for correctness as part of our review process, and have a number of tests which pass without issue. Some of those tests are rather long living, and do a great deal of intensive operations against our API.

With your suggestion to use the more strict cgocheck, we get the same result.

root@5c4b6162eba5:~/go/src/random_walk# export GODEBUG=cgocheck=2
root@5c4b6162eba5:~/go/src/random_walk# go test -bench=. &> out.txt
root@5c4b6162eba5:~/go/src/random_walk# head out.txt
goos: linux
goarch: amd64
pkg: random_walk
BenchmarkRW1threads10depth5tp-4       	runtime: nelems=128 nalloc=26 previous allocCount=25 nfreed=65535
fatal error: sweep increased allocation count

runtime stack:
runtime.throw(0x543fc8, 0x20)
	/usr/lib/go-1.10/src/runtime/panic.go:616 +0x81
runtime.(*mspan).sweep(0x7f61d98aed10, 0xc420000000, 0x547e00)

@bradfitz

This comment has been minimized.

Copy link
Member

@bradfitz bradfitz commented Mar 22, 2019

@bcmills bcmills added this to the Go1.13 milestone May 14, 2019
@andybons andybons modified the milestones: Go1.13, Go1.14 Jul 8, 2019
@rsc rsc modified the milestones: Go1.14, Backlog Oct 9, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.