Yet more missing values with move-tablet #4575
Labels
area/testing/jepsen
kind/bug
status/accepted
aphyr commented Jan 14, 2020
What version of Dgraph are you using?
1.1.1-56-ge18986f1c
Have you tried reproducing the issue with the latest release?
1.1.1-56-ge189861fc is yesterday's dev build; I haven't started with today's yet.
What is the hardware spec (RAM, OS)?
5-node LXC Jepsen cluster, on a 48-way Xeon with 128 GB of RAM.
Steps to reproduce the issue (command/config used to run Dgraph).
This appears to be an infrequent error. Spurious nulls were extremely common in 1.1.1 and subsequent development builds, but reproducing them with ge189 appears more difficult. With Jepsen b7c7bdc5f8476e009d591927060e7e8e786f015a, try:
Expected behaviour and actual result.
In this test run, Dgraph returned a single, transient nil value in response to a read of all accounts during what looks like a tablet move. I'm wondering if there could be a rare race condition here? Something that slips through before the move-ts checks kick in and prevent reads from executing on a group that doesn't host a tablet?
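To make the anomaly concrete, here is a minimal sketch of the kind of check that flags it. This is not Jepsen's actual checker or history format: the `Read` shape (account key mapped to an amount, or `None` for a nil value), the `find_nil_reads` name, and the sample history are all hypothetical and only illustrate what "a single, transient nil in a read of all accounts" looks like.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical shape: one read of all accounts, mapping account key to
# its amount, or None where the database returned nil for that account.
Read = Dict[int, Optional[int]]


def find_nil_reads(reads: List[Read]) -> List[Tuple[int, int]]:
    """Return (read_index, account_key) for every nil value observed."""
    anomalies = []
    for i, read in enumerate(reads):
        for key, amount in sorted(read.items()):
            if amount is None:
                anomalies.append((i, key))
    return anomalies


# Made-up history: the second read sees a nil for account 1, and the
# value is present again in the next read, so the nil is transient.
history = [
    {0: 50, 1: 30, 2: 20},
    {0: 50, 1: None, 2: 20},
    {0: 45, 1: 35, 2: 20},
]

print(find_nil_reads(history))  # [(1, 1)]
```

A nil that shows up in one read and is gone in the next would be consistent with a read being served by a group that briefly did not host the tablet, though the sketch itself says nothing about the cause.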