Alpha segfault in worker.(*groupi).members #2322

Closed
aphyr opened this Issue Apr 10, 2018 · 0 comments


aphyr commented Apr 10, 2018

On version

Dgraph version   : v1.0.4
Commit SHA-1     : 6fb69e2
Commit timestamp : 2018-04-09 21:26:31 +0530
Branch           : jan/node_lockup

With a combination of alpha and zero process crashes, some alpha instances (in this case, n1, with the lowest node index) will repeatedly segfault after startup, throwing:

2018/04/09 14:37:37 groups.go:105: Error while connecting with group zero: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp 192.168.122.11:5080: getsockopt: connection refused"
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x10c3e1d]

goroutine 7498 [running]:
github.com/dgraph-io/dgraph/worker.(*groupi).members(0xc4203b0180, 0x0, 0x0)
	/home/janardhan/go/src/github.com/dgraph-io/dgraph/worker/groups.go:391 +0xdd
github.com/dgraph-io/dgraph/worker.(*groupi).AnyServer(0xc4203b0180, 0x0, 0xc42cf83b19)
	/home/janardhan/go/src/github.com/dgraph-io/dgraph/worker/groups.go:401 +0x4b
github.com/dgraph-io/dgraph/worker.(*groupi).Tablet(0xc4203b0180, 0xc42cf83b19, 0x3, 0x3)
	/home/janardhan/go/src/github.com/dgraph-io/dgraph/worker/groups.go:333 +0xca
github.com/dgraph-io/dgraph/worker.(*groupi).BelongsTo(0xc4203b0180, 0xc42cf83b19, 0x3, 0xe7ea53)
	/home/janardhan/go/src/github.com/dgraph-io/dgraph/worker/groups.go:306 +0xb2
github.com/dgraph-io/dgraph/worker.(*grpcWorker).ServeTask(0xc425b4a030, 0x1a8e3e0, 0xc42d22de90, 0xc42598d100, 0x0, 0x0, 0x0)
	/home/janardhan/go/src/github.com/dgraph-io/dgraph/worker/task.go:1266 +0xa7
github.com/dgraph-io/dgraph/protos/intern._Worker_ServeTask_Handler(0x1305500, 0xc425b4a030, 0x1a8e3e0, 0xc42d22de90, 0xc42d28c000, 0x0, 0x0, 0x0, 0x0, 0x0)
	/home/janardhan/go/src/github.com/dgraph-io/dgraph/protos/intern/internal.pb.go:2635 +0x276
google.golang.org/grpc.(*Server).processUnaryRPC(0xc4200e2000, 0x1a919e0, 0xc4257c7500, 0xc42d282500, 0xc420216db0, 0x1a77ff8, 0x0, 0x0, 0x0)
	/home/janardhan/go/src/google.golang.org/grpc/server.go:920 +0x8f4
google.golang.org/grpc.(*Server).handleStream(0xc4200e2000, 0x1a919e0, 0xc4257c7500, 0xc42d282500, 0x0)
	/home/janardhan/go/src/google.golang.org/grpc/server.go:1142 +0x1528
google.golang.org/grpc.(*Server).serveStreams.func1.1(0xc4257806e0, 0xc4200e2000, 0x1a919e0, 0xc4257c7500, 0xc42d282500)
	/home/janardhan/go/src/google.golang.org/grpc/server.go:637 +0x9f
created by google.golang.org/grpc.(*Server).serveStreams.func1
	/home/janardhan/go/src/google.golang.org/grpc/server.go:635 +0xa1

I'm not entirely sure of the timing here, because the panic message isn't timestamped, but judging by normal alpha executions, this appears to happen just a few seconds into startup, after several failed attempts to connect to the local Zero.

This occurred in the same scenario as #2321. See node n1's alpha logs here: https://github.com/dgraph-io/dgraph/files/1892343/20180409T163239.000-0500.zip
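
One possible reading of the trace, offered as an assumption rather than a confirmed diagnosis: the fault address `addr=0x10` is consistent with reading a field through a nil pointer, which could happen if the group's membership state has not been populated yet because Zero was unreachable. The sketch below illustrates that failure mode and a nil guard. The types `membershipState` and `group` and the methods `membersUnsafe` and `membersSafe` are hypothetical stand-ins, not Dgraph's actual code.

```go
package main

import "fmt"

// membershipState is a hypothetical stand-in for the cluster state an Alpha
// receives from Zero; it is not Dgraph's actual type.
type membershipState struct {
	groups map[uint32][]string // group ID -> member addresses
}

// group is a hypothetical stand-in for worker.groupi.
type group struct {
	state *membershipState // assumed nil until the first update from Zero arrives
}

// membersUnsafe dereferences state without checking it, so it panics with a
// nil pointer dereference if called before any state has been received.
func (g *group) membersUnsafe(gid uint32) []string {
	return g.state.groups[gid]
}

// membersSafe returns nil when no state is available yet.
func (g *group) membersSafe(gid uint32) []string {
	if g.state == nil {
		return nil
	}
	return g.state.groups[gid]
}

// panics reports whether calling f resulted in a panic.
func panics(f func()) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	f()
	return false
}

func main() {
	g := &group{} // simulates Zero being unreachable: state never populated
	fmt.Println("unsafe panics:", panics(func() { g.membersUnsafe(0) }))
	fmt.Println("safe result is nil:", g.membersSafe(0) == nil)
}
```

Whether a guard like this is the right fix depends on how callers such as `AnyServer` and `Tablet` should behave before the first membership update; returning an error to the RPC caller may be preferable to returning an empty member list.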
