runtime: panic in select when dereferencing channel's element type #23114

Closed
Stebalien opened this issue Dec 13, 2017 · 7 comments

@Stebalien
commented Dec 13, 2017

What version of Go are you using (go version)?

1.9.2

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/steb/projects/go"
GORACE=""
GOROOT="/usr/lib/go"
GOTOOLDIR="/usr/lib/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/run/user/1000/tmp/go-build277302015=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"

What did you do?

A user ran our program, go-ipfs, and it crashed. Unfortunately, reproducing this bug will be really difficult (unless the reporter somehow has a machine that reproduces this bug frequently).

What did you expect to see?

No crash.

What did you see instead?

A panic: https://gist.github.com/myf/3977d92f4459ac5f2e2f5ffa26495176 (yes, we have a lot of goroutines; we're working on it...).

Relevant Stack:

fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x17 pc=0x55d286447f83]

goroutine 4168 [running]:
runtime.throw(0x55d286ca8b9a, 0x2a)
	/usr/lib/go/src/runtime/panic.go:605 +0x97 fp=0xc4215af510 sp=0xc4215af4f0 pc=0x55d286462287
runtime.sigpanic()
	/usr/lib/go/src/runtime/signal_unix.go:351 +0x2bc fp=0xc4215af560 sp=0xc4215af510 pc=0x55d28647a30c
runtime.typedmemmove(0x0, 0xc4215af850, 0x1)
	/usr/lib/go/src/runtime/mbarrier.go:242 +0x13 fp=0xc4215af598 sp=0xc4215af560 pc=0x55d286447f83
runtime.selectgo(0xc4215af8c0, 0xc421569980)
	/usr/lib/go/src/runtime/select.go:557 +0x76c fp=0xc4215af810 sp=0xc4215af598 pc=0x55d2864749bc
gx/ipfs/QmVxf27kucSvCLiCq6dAXjDU2WG3xZN9ae7Ny6osroP28u/yamux.(*Session).waitForSendErr(0xc4200a5550, 0xc42188c880, 0xc, 0xc, 0x55d2879175e0, 0xc421654d50, 0xc42188c880, 0x0, 0x0)
	/build/go-ipfs/src/.gopath/src/gx/ipfs/QmVxf27kucSvCLiCq6dAXjDU2WG3xZN9ae7Ny6osroP28u/yamux/session.go:338 +0x3cf fp=0xc4215afa40 sp=0xc4215af810 pc=0x55d286863c1f
gx/ipfs/QmVxf27kucSvCLiCq6dAXjDU2WG3xZN9ae7Ny6osroP28u/yamux.(*Stream).write(0xc421890340, 0xc421604810, 0x1, 0x10, 0x13, 0xc420010e00, 0x7fde27ec6d90)
	/build/go-ipfs/src/.gopath/src/gx/ipfs/QmVxf27kucSvCLiCq6dAXjDU2WG3xZN9ae7Ny6osroP28u/yamux/stream.go:189 +0x41e fp=0xc4215afb68 sp=0xc4215afa40 pc=0x55d286866dce
gx/ipfs/QmVxf27kucSvCLiCq6dAXjDU2WG3xZN9ae7Ny6osroP28u/yamux.(*Stream).Write(0xc421890340, 0xc421604810, 0x1, 0x10, 0x0, 0x0, 0x0)
	/build/go-ipfs/src/.gopath/src/gx/ipfs/QmVxf27kucSvCLiCq6dAXjDU2WG3xZN9ae7Ny6osroP28u/yamux/stream.go:145 +0xe7 fp=0xc4215afbc0 sp=0xc4215afb68 pc=0x55d286866917
gx/ipfs/QmXZYkfBN1cABhBZRaEwLzgEB5B3nAGiJYCmhWbDW3cDus/go-peerstream.(*Stream).Write(0xc42188aaa0, 0xc421604810, 0x1, 0x10, 0x0, 0x7fde27e24db0, 0xc420010e00)
	/build/go-ipfs/src/.gopath/src/gx/ipfs/QmXZYkfBN1cABhBZRaEwLzgEB5B3nAGiJYCmhWbDW3cDus/go-peerstream/stream.go:86 +0x53 fp=0xc4215afc08 sp=0xc4215afbc0 pc=0x55d28688e613
gx/ipfs/QmdQFrFnPrKRQtpeHKjZ3cVNwxmGKKS2TvhJTuN9C9yduh/go-libp2p-swarm.(*Stream).Write(0xc42188aaa0, 0xc421604810, 0x1, 0x10, 0xc4215afca0, 0x55d28647b389, 0x10)
	/build/go-ipfs/src/.gopath/src/gx/ipfs/QmdQFrFnPrKRQtpeHKjZ3cVNwxmGKKS2TvhJTuN9C9yduh/go-libp2p-swarm/swarm_stream.go:37 +0x4b fp=0xc4215afc50 sp=0xc4215afc08 pc=0x55d2868a52ab
gx/ipfs/QmQbh3Rb7KM37As3vkHYnEFnzkVXNCP8EYGtHz6g2fXk14/go-libp2p-metrics/stream.(*meteredStream).Write(0xc421835540, 0xc421604810, 0x1, 0x10, 0x1, 0x10, 0xc4215d8f80)
	/build/go-ipfs/src/.gopath/src/gx/ipfs/QmQbh3Rb7KM37As3vkHYnEFnzkVXNCP8EYGtHz6g2fXk14/go-libp2p-metrics/stream/metered.go:46 +0x58 fp=0xc4215afcb0 sp=0xc4215afc50 pc=0x55d286840b28
gx/ipfs/QmTnsezaB1wWNRHeHnYrm8K4d5i9wtyj3GsqjC3Rt5b5v5/go-multistream.writeUvarint(0x7fde27dc3668, 0xc421835540, 0x13, 0x55d287371ba0, 0x55d287407360)
	/build/go-ipfs/src/.gopath/src/gx/ipfs/QmTnsezaB1wWNRHeHnYrm8K4d5i9wtyj3GsqjC3Rt5b5v5/go-multistream/multistream.go:36 +0xb2 fp=0xc4215afd08 sp=0xc4215afcb0 pc=0x55d286827692
gx/ipfs/QmTnsezaB1wWNRHeHnYrm8K4d5i9wtyj3GsqjC3Rt5b5v5/go-multistream.delimWrite(0x7fde27dc3668, 0xc421835540, 0xc4215d8f80, 0x12, 0x20, 0x20, 0x55d286840967)
	/build/go-ipfs/src/.gopath/src/gx/ipfs/QmTnsezaB1wWNRHeHnYrm8K4d5i9wtyj3GsqjC3Rt5b5v5/go-multistream/multistream.go:54 +0x49 fp=0xc4215afd50 sp=0xc4215afd08 pc=0x55d286827929
gx/ipfs/QmTnsezaB1wWNRHeHnYrm8K4d5i9wtyj3GsqjC3Rt5b5v5/go-multistream.handshake(0x7fde27dc3580, 0xc421835540, 0x55d287386120, 0x55d287407360)
	/build/go-ipfs/src/.gopath/src/gx/ipfs/QmTnsezaB1wWNRHeHnYrm8K4d5i9wtyj3GsqjC3Rt5b5v5/go-multistream/client.go:48 +0x1bc fp=0xc4215afdb8 sp=0xc4215afd50 pc=0x55d2868267dc
gx/ipfs/QmTnsezaB1wWNRHeHnYrm8K4d5i9wtyj3GsqjC3Rt5b5v5/go-multistream.SelectProtoOrFail(0x55d286c894d3, 0xe, 0x7fde27dc3580, 0xc421835540, 0xc421835540, 0xc421835540)
	/build/go-ipfs/src/.gopath/src/gx/ipfs/QmTnsezaB1wWNRHeHnYrm8K4d5i9wtyj3GsqjC3Rt5b5v5/go-multistream/client.go:11 +0x3b fp=0xc4215afdf8 sp=0xc4215afdb8 pc=0x55d28682639b
gx/ipfs/QmefgzMbKZYsmHFkLqxgaTBG9ypeEjrdWRD5WXH4j1cWDL/go-libp2p/p2p/protocol/identify.(*IDService).IdentifyConn(0xc420061aa0, 0x55d28792af00, 0xc42178f880)
	/build/go-ipfs/src/.gopath/src/gx/ipfs/QmefgzMbKZYsmHFkLqxgaTBG9ypeEjrdWRD5WXH4j1cWDL/go-libp2p/p2p/protocol/identify/id.go:104 +0x5e3 fp=0xc4215aff30 sp=0xc4215afdf8 pc=0x55d286841e93
gx/ipfs/QmefgzMbKZYsmHFkLqxgaTBG9ypeEjrdWRD5WXH4j1cWDL/go-libp2p/p2p/host/basic.(*BasicHost).newConnHandler(0xc42015e500, 0x55d28792af00, 0xc42178f880)
	/build/go-ipfs/src/.gopath/src/gx/ipfs/QmefgzMbKZYsmHFkLqxgaTBG9ypeEjrdWRD5WXH4j1cWDL/go-libp2p/p2p/host/basic/basic_host.go:232 +0xb1 fp=0xc4215aff90 sp=0xc4215aff30 pc=0x55d28690cfc1
gx/ipfs/QmefgzMbKZYsmHFkLqxgaTBG9ypeEjrdWRD5WXH4j1cWDL/go-libp2p/p2p/host/basic.(*BasicHost).(gx/ipfs/QmefgzMbKZYsmHFkLqxgaTBG9ypeEjrdWRD5WXH4j1cWDL/go-libp2p/p2p/host/basic.newConnHandler)-fm(0x55d28792af00, 0xc42178f880)
	/build/go-ipfs/src/.gopath/src/gx/ipfs/QmefgzMbKZYsmHFkLqxgaTBG9ypeEjrdWRD5WXH4j1cWDL/go-libp2p/p2p/host/basic/basic_host.go:178 +0x40 fp=0xc4215affb8 sp=0xc4215aff90 pc=0x55d2869120e0
gx/ipfs/QmdQFrFnPrKRQtpeHKjZ3cVNwxmGKKS2TvhJTuN9C9yduh/go-libp2p-swarm.(*Network).SetConnHandler.func1(0xc42178f880)
	/build/go-ipfs/src/.gopath/src/gx/ipfs/QmdQFrFnPrKRQtpeHKjZ3cVNwxmGKKS2TvhJTuN9C9yduh/go-libp2p-swarm/swarm_net.go:160 +0x3d fp=0xc4215affd8 sp=0xc4215affb8 pc=0x55d2868a5bed
runtime.goexit()
	/usr/lib/go/src/runtime/asm_amd64.s:2337 +0x1 fp=0xc4215affe0 sp=0xc4215affd8 pc=0x55d2864948a1
created by gx/ipfs/QmdQFrFnPrKRQtpeHKjZ3cVNwxmGKKS2TvhJTuN9C9yduh/go-libp2p-swarm.(*Swarm).SetConnHandler.func2
	/build/go-ipfs/src/.gopath/src/gx/ipfs/QmdQFrFnPrKRQtpeHKjZ3cVNwxmGKKS2TvhJTuN9C9yduh/go-libp2p-swarm/swarm.go:260 +0x69

Original Report: ipfs/go-ipfs#4483

Code that produced the error: https://ipfs.io/ipfs/QmVxf27kucSvCLiCq6dAXjDU2WG3xZN9ae7Ny6osroP28u/yamux/session.go

Project: https://github.com/ipfs/go-ipfs/

@davecheney

Contributor

commented Dec 13, 2017

@ianlancetaylor ianlancetaylor changed the title Panic in select when dereferencing channel's element type runtime: panic in select when dereferencing channel's element type Dec 13, 2017

@ianlancetaylor ianlancetaylor added this to the Go1.10 milestone Dec 13, 2017

@Stebalien

Author

commented Dec 13, 2017

@davecheney Hm. I haven't done that in a while (we can't run our tests with it on due to the number of goroutines we spin up). There are a few. They don't look like they'd be related but... well, that doesn't necessarily mean anything... I'll try to fix them and see if we can repro after.

@randall77

Contributor

commented Dec 13, 2017

The arguments to typedmemmove look very strange.
The element type of the move is nil, which should never happen.
The pointer to the move source is 1, which is not a valid pointer.
Both of those will cause a segfault; it just happens to hit the first one first.

I can't see any way the type could be nil. A channel is initialized with a known non-nil type at construction time and never changed after that. Similarly for the move source, it is just a pointer arithmetic offset from either the channel pointer itself (which is known non-nil), or the result of a successful newarray call.

TL;DR There's some serious corruption here. As Dave suggested, make sure there are no data races. Also audit the use of unsafe in your packages and dependent packages. Beyond that, I'm not sure what we could do on our end to debug without having a way to reproduce the error.

@Stebalien

Author

commented Dec 13, 2017

After looking into the data races, I'm quite sure they're unrelated (nothing that could lead to out-of-bounds writes). If anything, it's some use of unsafe (I suspect either gogo-protobuf or leveldb).

@bradfitz

Member

commented Dec 13, 2017

> After looking into the data races, I'm quite sure they're unrelated

The Go runtime makes no formal promises about the behavior of the runtime in the presence of data races. That is, there are officially no "harmless" or "unrelated" data races.

At least, we're generally not going to help debug until known data races are removed. Much of the time, fixing the data race fixes the crash.

@Stebalien

Author

commented Dec 14, 2017

Totally reasonable.

@bradfitz bradfitz modified the milestones: Go1.10, Unplanned Dec 14, 2017

@gopherbot


commented Jan 14, 2018

Timed out in state WaitingForInfo. Closing.

(I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)

@gopherbot gopherbot closed this Jan 14, 2018

@golang golang locked and limited conversation to collaborators Jan 14, 2019
