Alpha segfault in worker.(*groupi).members #2322

aphyr opened this issue Apr 10, 2018 · 0 comments · Fixed by #2336

kind/bug Something is broken.


aphyr commented Apr 10, 2018

On version

Dgraph version   : v1.0.4
Commit SHA-1     : 6fb69e2
Commit timestamp : 2018-04-09 21:26:31 +0530
Branch           : jan/node_lockup

With a combination of alpha and zero process crashes, some alpha instances (in this case, n1, with the lowest node index) will repeatedly segfault after startup, throwing:

2018/04/09 14:37:37 groups.go:105: Error while connecting with group zero: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp getsockopt: connection refused"
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x10c3e1d]

goroutine 7498 [running]:
*groupi).members(0xc4203b0180, 0x0, 0x0)
	/home/janardhan/go/src/ +0xdd
*groupi).AnyServer(0xc4203b0180, 0x0, 0xc42cf83b19)
	/home/janardhan/go/src/ +0x4b
*groupi).Tablet(0xc4203b0180, 0xc42cf83b19, 0x3, 0x3)
	/home/janardhan/go/src/ +0xca
*groupi).BelongsTo(0xc4203b0180, 0xc42cf83b19, 0x3, 0xe7ea53)
	/home/janardhan/go/src/ +0xb2
*grpcWorker).ServeTask(0xc425b4a030, 0x1a8e3e0, 0xc42d22de90, 0xc42598d100, 0x0, 0x0, 0x0)
	/home/janardhan/go/src/ +0xa7, 0xc425b4a030, 0x1a8e3e0, 0xc42d22de90, 0xc42d28c000, 0x0, 0x0, 0x0, 0x0, 0x0)
	/home/janardhan/go/src/ +0x276
*Server).processUnaryRPC(0xc4200e2000, 0x1a919e0, 0xc4257c7500, 0xc42d282500, 0xc420216db0, 0x1a77ff8, 0x0, 0x0, 0x0)
	/home/janardhan/go/src/ +0x8f4
*Server).handleStream(0xc4200e2000, 0x1a919e0, 0xc4257c7500, 0xc42d282500, 0x0)
	/home/janardhan/go/src/ +0x1528
*Server).serveStreams.func1.1(0xc4257806e0, 0xc4200e2000, 0x1a919e0, 0xc4257c7500, 0xc42d282500)
	/home/janardhan/go/src/ +0x9f
created by *Server).serveStreams.func1
	/home/janardhan/go/src/ +0xa1

I'm not entirely sure of the timing here, because the panic message isn't timestamped, but judging from normal alpha executions, this appears to happen just a few seconds into startup, after several failed attempts to establish a connection to the local Zero.
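
For illustration, here is a minimal Go sketch of the general failure mode the trace suggests: a lookup that reads through a pointer which is still nil because no state has arrived yet. In the trace, the receiver passed to members (0xc4203b0180) is non-nil while the fault address is 0x10, which is consistent with a field read through a nil pointer held inside the struct. That is an inference, not a confirmed diagnosis. Every name below (group, membershipState, membersUnsafe, membersSafe, didPanic) is hypothetical; this is not the Dgraph code and not the change made in #2336.

```go
package main

import (
	"fmt"
	"sync"
)

// membershipState is a hypothetical stand-in for cluster state that an
// alpha would learn from Zero. It is NOT the real Dgraph type.
type membershipState struct {
	counter uint64
	groups  map[uint32][]string // group ID -> member addresses
}

// group is a hypothetical holder whose state stays nil until the first
// successful update arrives.
type group struct {
	mu    sync.RWMutex
	state *membershipState
}

// membersUnsafe reads through g.state without checking it, so it panics
// with a nil pointer dereference if no state has been received yet.
func (g *group) membersUnsafe(gid uint32) []string {
	g.mu.RLock()
	defer g.mu.RUnlock()
	return g.state.groups[gid]
}

// membersSafe returns nil instead of panicking when state is missing.
func (g *group) membersSafe(gid uint32) []string {
	g.mu.RLock()
	defer g.mu.RUnlock()
	if g.state == nil {
		return nil
	}
	return g.state.groups[gid]
}

// didPanic runs f and reports whether it panicked.
func didPanic(f func()) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	f()
	return false
}

func main() {
	g := &group{} // state never populated, as if Zero was unreachable
	fmt.Println("unsafe lookup panicked:", didPanic(func() { g.membersUnsafe(1) }))
	fmt.Println("safe lookup returned:", g.membersSafe(1))
}
```

Under these assumptions, a request handler that reaches the unguarded lookup before the first membership update would crash the process, while the guarded version lets the caller treat "no members known yet" as an ordinary, retryable condition.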

This occurred in the same scenario as #2321. See node n1's alpha logs here:
