Possible loss of all inserts after network partition #2152
Comments
@aphyr: I am getting the following error sometimes when running
This issue might be due to readTs not moving forward on leader changes; should be fixed in #2261.
Hi @janardhan1993! Tracked this bug down too; it was another issue in this test that I introduced while working on a different Dgraph test. Fixed now!
I can confirm that v1.0.4 (807976c) from 2018-03-22 14:55:24 +1100 still loses inserted records in this test; here's another example: 20180327T135250.000-0500.zip. I also ran into a new sort of issue where a node got stuck forever during this test: #2273.
It also looks like I can lock the system into some weird state where a key's timestamp is skewed far into the future, and it'll refuse to service reads for hours; you get a slow sequence of failures like:
The read timestamps do advance, but they only increment on reads, so I think it's going to be quite a while before it catches up. It's also odd that this happens with fresh clients each time; I figured those could start anywhere in the timeline...
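The arithmetic behind "quite a while" can be made concrete with a toy model. This is a hypothetical sketch, not Dgraph's actual timestamp logic: it just assumes the serviceable read timestamp advances by a fixed amount per read, and asks how many reads are needed to reach a key whose timestamp was skewed far into the future.

```python
def reads_until_visible(key_ts, read_ts, advance_per_read=1):
    """Toy model (not Dgraph's real implementation): if the read
    timestamp only moves forward by `advance_per_read` on each read,
    count the reads needed before it reaches a key stamped at `key_ts`."""
    if read_ts >= key_ts:
        return 0
    gap = key_ts - read_ts
    # Ceiling division: the last read may overshoot the gap.
    return -(-gap // advance_per_read)

# A key skewed 1,000,000 ticks ahead needs a million failed reads
# to become visible at one tick per read.
n = reads_until_visible(key_ts=1_000_000, read_ts=0)
```

Under those assumptions, a key skewed hours ahead stays unreadable until the cluster has churned through a correspondingly huge number of reads, which matches the observed slow sequence of failures.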
@aphyr I will get those changes in; I haven't merged the fixes yet. Will update once done.
If you are testing, please use https://transfer.sh/10WC2g/dgraph-linux-amd64.tar.gz
Sure thing. :)
Looks like we're still getting lockups à la #2273 with this build. :(
On the plus side, though, I'm not seeing lost inserts any more! That's good!
On the current Dgraph nightly (2018/02/21), it appears that network partitions may cause the loss (or indefinitely-deferred visibility) of all acknowledged inserts after some critical event. This may be related to #2159: mutate transactions may conflict indefinitely after partitions.
We use the following schema:
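A plausible schema for this kind of indexed set test (an assumption for illustration; the exact schema block isn't shown here) would index the integer values so they can be queried back:

```
type:  string @index(exact) .
value: int    @index(int)   .
```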
And perform inserts of single records like
{type: "element", value: 123}
, where values are sequential integers, during simple majority/minority network partitions on a 2-seconds-on/2-seconds-off schedule. We then heal the network and allow ten minutes for the cluster to recover, followed by a query for all elements by each client. These reads can fail to return all acknowledged documents after some critical time:
In this case, 41 acknowledged writes from 0 to 284 were found in the final read, but 127 acknowledged writes from 140 to 2275 were not found. The inserted records were assigned UIDs, so I don't think it's a return-code issue. I think this is, strictly speaking, legal under SI; snapshot isolation allows arbitrarily stale reads, and I think that should apply to index reads as well.
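A Jepsen-style set checker classifies each acknowledged insert against the final read: anything acknowledged but absent is lost. A minimal sketch of that classification (hypothetical helper, not Jepsen's actual implementation):

```python
def check_set(acknowledged, final_read):
    """Classify acknowledged inserts against the final read.

    acknowledged: set of values whose insert was acknowledged.
    final_read:   set of values returned by the final read-back.
    Returns a summary including the lost (acknowledged-but-missing)
    elements, mirroring the shape of a Jepsen set-checker result.
    """
    lost = sorted(acknowledged - final_read)
    found = sorted(acknowledged & final_read)
    return {
        "valid?": not lost,
        "found-count": len(found),
        "lost-count": len(lost),
        "lost": lost,
    }

# Example mirroring the shape of the report above: early acknowledged
# writes survive, but everything after a critical point vanishes.
acked = set(range(0, 10)) | set(range(140, 150))
read = set(range(0, 10))
result = check_set(acked, read)
```

The key property is that only *acknowledged* writes count against the system; inserts that failed or timed out are excluded before this comparison.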
This behavior is accompanied by one node in the cluster jumping to 100% CPU for several minutes, sometimes beginning when the network heals, sometimes not until a read. Possibly associated with a tablet handoff? pprof.dgraph.samples.cpu.001.pb.gz
You can reproduce this behavior with Jepsen 2371d2ac8c0732dfa9368e3f54eba1a2c5cf3895, using the nightlies (or perhaps 1.0.3 or 1.0.4), by running
One UID, no indices
This behavior also manifests without indices. We use this schema:
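For the no-index variant, a plausible schema (again an assumption, not the verbatim original) simply drops the index directive:

```
value: int .
```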
... and insert elements by associating them all with the same UID, e.g.
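For illustration, the mutations might look like the following RDF N-Quads (hypothetical; the predicate `value` and uid `0x01` match the query below):

```
<0x01> <value> "0" .
<0x01> <value> "1" .
<0x01> <value> "2" .
```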
... and so on. We read all inserted values with
{ q(func: uid(0x01)) { uid, value } }
. When partitions occur, the same problem can arise: all successfully inserted tuples after some point are missing from the final read. This is present on v1.0.3-dev (5563bd2), and can be reproduced with Jepsen ce4d1e839f53a1146cb81c3b3d4e247d1e490d7c by running