While trying to fix a bug in the tests for #2152, I managed to lock dgraph up in a state where every request to one node (n1) timed out, even though the process was still running on all nodes. Here are the complete logs and data files from all five nodes.
The lockup involves a series of network partitions; it looks as if the final partition healing might have left n1 in a state where it believed the leader was... possibly a node which was not the leader?
2018/03/27 12:22:51 node.go:344: Error while sending message to node with addr: n2:7080, err: rpc error: code = DeadlineExceeded desc = context deadline exceeded
2018/03/27 12:22:51 groups.go:702: Error in oracle delta stream. Error: rpc error: code = Unknown desc = Node is no longer leader.
2018/03/27 12:22:51 Error while retrieving timestamps: rpc error: code = Unknown desc = Assigning IDs is only allowed on leader.. Will retry...
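For anyone digging through the attached logs, a rough sketch like the following might help show when each node started reporting leadership trouble. The choice of substrings is an assumption based only on the three lines quoted above, and `summarize` is a hypothetical helper, not part of dgraph or Jepsen.

```python
import re
from collections import Counter

# Substrings copied from the log lines quoted above. Treating these
# three messages as the interesting ones is an assumption.
LEADER_ERRORS = (
    "Node is no longer leader",
    "Assigning IDs is only allowed on leader",
    "context deadline exceeded",
)

# Matches the "2018/03/27 12:22:51 " timestamp prefix seen in the logs.
TIMESTAMP = re.compile(r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) ")


def summarize(lines):
    """Count each error message and record the timestamp of its first occurrence."""
    counts = Counter()
    first_seen = {}
    for line in lines:
        match = TIMESTAMP.match(line)
        stamp = match.group(1) if match else None
        for needle in LEADER_ERRORS:
            if needle in line:
                counts[needle] += 1
                # setdefault keeps the earliest timestamp only.
                first_seen.setdefault(needle, stamp)
    return counts, first_seen
```

Calling `summarize(open(path, errors="replace"))` on each node's log and comparing the first-seen timestamps across nodes could show whether n1 kept reporting these errors after the last partition healed.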
To reproduce this, run with Jepsen e31b29d1a5302766c2c83454eeed9124ef9820f5:
lein run test --package-url https://transfer.sh/TjHBo/dgraph-linux-amd64.tar.gz --force-download -w set --time-limit 300 --concurrency 2n --nemesis partition-random-halves
This appears to be a semi-rare fault; I've only seen it once so far.
Attached logs and data: dgraph-n1-lockup.zip
This occurs on
Dgraph version : v1.0.4
Commit SHA-1 : 807976c
Commit timestamp : 2018-03-22 14:55:24 +1100
Branch : master