Snapshot isolation violation with server-side sequencing #2391

Open
aphyr opened this Issue May 16, 2018 · 5 comments


aphyr commented May 16, 2018

On Dgraph 1.0.5-dev (7796a40), overlapping-ring network partitions may occasionally result in read skew, despite using both server-side sequencing and @upsert directives. This occurred on a five-node cluster in which all five nodes ran both zero and alpha, with a replica count of five. This problem is currently difficult to reproduce, but since it occurred in a single-group cluster where tablet moves are impossible, I believe it's a separate bug from #2321.

I have only two cases of this so far: 20180511T185232.000-0500.zip and 20180516T095744.000-0500.zip.

The first is more dramatic: a single node, n1, has read transactions which observe up to 30% lower totals across all accounts for tens of seconds. 95 out of 1346 total reads violated snapshot isolation. The first invalid read begins six seconds after the resolution of a network partition, and before the next partition began; I suspect there may be some relationship with either a zero or alpha topology change, but without more data it's hard to infer more.

[bank test plot]
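For context, the invariant the bank workload checks is that the sum of all account balances stays constant, so any read that observes a different total (like the ~30% drop above) violates snapshot isolation. A minimal sketch of that check in Python, with a hypothetical layout for the recorded reads:

```python
# Minimal sketch (not part of the Jepsen test itself) of the bank invariant:
# the sum of all account balances must stay constant across every read, so
# any read whose balances sum to a different total violates snapshot isolation.
# The record layout below is hypothetical.
EXPECTED_TOTAL = 100

reads = [
    {"node": "n1", "time": 12.4, "balances": [20, 20, 20, 20, 20]},  # ok: sums to 100
    {"node": "n1", "time": 31.7, "balances": [10, 20, 20, 10, 10]},  # skewed: sums to 70
]

def invalid_reads(history, expected_total=EXPECTED_TOTAL):
    """Return every read whose observed total differs from the invariant."""
    return [r for r in history if sum(r["balances"]) != expected_total]

for r in invalid_reads(reads):
    print(f'{r["node"]} at {r["time"]}s saw total {sum(r["balances"])}, '
          f'expected {EXPECTED_TOTAL}')
```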

The second case is milder: n3 briefly observed a total of 102 instead of 100.

[bank test plot]

The good news is that unlike earlier SI violations, these appear to be limited to single nodes and don't result in values drifting over time, so the bug may affect only read safety, not updates.


aphyr commented May 16, 2018

Snagged a third case; also limited to a single node. Nothing obvious in the logs yet, though. Looks like n3 went wonky somehow. 20180516T135225.000-0500 (1).zip. There's a long period of "ignored a MsgAppResp message with lower term" messages on n3, but that also occurred on n2, so it might be unrelated.

[bank test plot]

manishrjain added the bug label May 16, 2018


manishrjain commented May 16, 2018

I have a hunch for what could be happening here. In particular, the lin read in server-side sequencing should probably occur after waiting for the Zero transaction stream to catch up.

I'd like to try it out, but if this happens very rarely, it would be hard to test.
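For illustration only, here is a rough Python sketch of the ordering that hunch suggests (hypothetical, not Dgraph's actual Go implementation): block the lin read until the locally applied transaction-stream watermark has reached the read timestamp, so the read can't miss commits that Zero has already sequenced but the alpha hasn't applied yet.

```python
# Hypothetical sketch of "lin read waits for the Zero transaction stream to
# catch up". Names and structure are illustrative, not Dgraph internals.
import threading

class TxnStreamState:
    def __init__(self):
        self._applied_ts = 0                 # highest txn timestamp applied locally
        self._cond = threading.Condition()

    def advance(self, ts):
        """Called as entries from Zero's transaction stream are applied locally."""
        with self._cond:
            if ts > self._applied_ts:
                self._applied_ts = ts
                self._cond.notify_all()

    def wait_for(self, read_ts, timeout=None):
        """Block a lin read until the local state has caught up to read_ts."""
        with self._cond:
            return self._cond.wait_for(lambda: self._applied_ts >= read_ts, timeout)

# Usage (hypothetical): before answering a lin read at read_ts, call
# state.wait_for(read_ts), and only then evaluate the query locally.
```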


manishrjain commented May 18, 2018

Running tests to see if this is still a bug.

Set --test-count 20 --time-limit 300 --replicas 5, dgraph unknown bank s=server upsert nemesis=partition-ring.


aphyr commented May 23, 2018

We've added code to disable predicate moves entirely, and found that even in healthy clusters with no predicate moves, Dgraph can occasionally return incorrect balances, or nil values for predicates. This is on v1.0.5-dev, 7796a40: 20180522T232334.000-0500 (1).zip

[bank test plot]

You can reproduce this with Jepsen d87abed0561621e8f94ce1d6659ce7488b3fcd31:

lein run test --package-url https://github.com/dgraph-io/dgraph/releases/download/nightly/dgraph-linux-amd64.tar.gz --force-download --nemesis none --rebalance-interval 10h --sequencing server --upsert-schema --time-limit 600 --concurrency 1n --workload delete --retry-db-setup --test-count 20

mkcp commented Sep 5, 2018

We've been running additional bank tests on healthy clusters (no nemesis-induced network partitions) against v1.0.7, v1.0.8-rc1, and v1.0.8 and have not had any failing test cases in these versions. We still get incorrect balances, resulting in failing tests, when we partition the network. Progress! 🎉
