While trying to fix a bug in the tests for #2152, I managed to lock Dgraph up in a state where every request to one node (n1) timed out, even though the process was still running on all nodes. Here are the complete logs and data files from all five nodes:
dgraph-n1-lockup.zip
This occurs on
Dgraph version : v1.0.4
Commit SHA-1 : 807976c
Commit timestamp : 2018-03-22 14:55:24 +1100
Branch : master
and involves a series of network partitions; it looks as if the final partition healing might have left n1 in a state where it believed the leader was... possibly a node which was not the leader?
2018/03/27 12:22:51 node.go:344: Error while sending message to node with addr: n2:7080, err: rpc error: code = DeadlineExceeded desc = context deadline exceeded
2018/03/27 12:22:51 groups.go:702: Error in oracle delta stream. Error: rpc error: code = Unknown desc = Node is no longer leader.
2018/03/27 12:22:51 Error while retrieving timestamps: rpc error: code = Unknown desc = Assigning IDs is only allowed on leader.. Will retry...
To reproduce this, run with Jepsen e31b29d1a5302766c2c83454eeed9124ef9820f5:
lein run test --package-url https://transfer.sh/TjHBo/dgraph-linux-amd64.tar.gz --force-download -w set --time-limit 300 --concurrency 2n --nemesis partition-random-halves
This appears to be a semi-rare fault; I've only seen it once so far.
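Since the fault is rare, it may help to check other runs for the same three error messages that appear in the log excerpt above. The following is only a rough sketch of how one might tally them per log file. The signature names, the `tally_signatures` helper, and the idea of passing log paths on the command line are my own assumptions, not part of Dgraph or Jepsen, and the layout of the attached archive is not assumed here.

```python
import sys
from collections import Counter

# Substrings copied from the log excerpt in this report.
# The short names on the left are hypothetical labels, not Dgraph terms.
SIGNATURES = {
    "send_timeout": "context deadline exceeded",
    "not_leader": "Node is no longer leader",
    "ids_on_non_leader": "Assigning IDs is only allowed on leader",
}


def tally_signatures(lines):
    """Count how many lines contain each error signature."""
    counts = Counter()
    for line in lines:
        for name, needle in SIGNATURES.items():
            if needle in line:
                counts[name] += 1
    return counts


if __name__ == "__main__":
    # Each argument is a path to one node's log file (paths are up to you).
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            counts = tally_signatures(handle)
        summary = ", ".join(f"{name}={counts[name]}" for name in SIGNATURES)
        print(f"{path}: {summary}")
```

A node whose log keeps accumulating `not_leader` and `ids_on_non_leader` lines after the final partition has healed would be a candidate for the same lockup, though that reading is a guess based on this single occurrence.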