ZooKeeper Test & 2013 ZooKeeper Analysis #399

Closed
insumity opened this issue Jul 23, 2019 · 3 comments

Comments

@insumity
commented Jul 23, 2019

I had some questions regarding the Jepsen ZooKeeper (ZK) test.

Question 1


The 2013 analysis states:

In addition, the linearizability property means that all clients will see all updates in the same order–although clients may drift behind the primary by an arbitrary duration.

This seems like a strange statement. Linearizability implies that a read observes the most recent completed update. If clients can drift behind by an arbitrary duration, how can this be a linearizability property? Maybe you were referring to the fact that writes are linearizable? But what does it mean for just the writes to be linearizable? A history with only writes is clearly linearizable. Or maybe you were referring to the clients of ZooKeeper, implying that a client that does not contact a ZK server might hold stale data at a specific point in time?
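To make the objection concrete, here is a minimal sketch of a brute-force linearizability check for a single register. It is only an illustration under my own assumptions: the `Op` record and `is_linearizable` function are hypothetical names, not part of Jepsen or Knossos, and the approach (trying every ordering) only works for tiny histories.

```python
from itertools import permutations
from typing import NamedTuple


class Op(NamedTuple):
    kind: str      # "write" or "read"
    value: int     # value written, or value the read returned
    start: float   # invocation time
    end: float     # completion time


def is_linearizable(history, initial=0):
    """Brute-force check for one register; hypothetical helper, tiny histories only."""
    for order in permutations(history):
        # Real-time constraint: if b finished before a started,
        # b may not be placed after a in the candidate order.
        if any(b.end < a.start
               for i, a in enumerate(order)
               for b in order[i + 1:]):
            continue
        state = initial
        ok = True
        for op in order:
            if op.kind == "write":
                state = op.value
            elif op.value != state:  # a read must return the current value
                ok = False
                break
        if ok:
            return True
    return False


# A read that starts after a write has completed, yet returns the old value.
stale_read = [Op("write", 1, 0, 1), Op("read", 0, 2, 3)]

# Two writes and no reads.
writes_only = [Op("write", 1, 0, 1), Op("write", 2, 2, 3)]

# A read that overlaps the write may legally return the old value.
concurrent_read = [Op("write", 1, 0, 3), Op("read", 0, 1, 2)]
```

In this sketch `stale_read` is rejected, while `writes_only` and `concurrent_read` are accepted. That is the distinction I am asking about: a client that lags behind after a write has completed produces a history like `stale_read`.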

Also, it is written in the analysis:

Also keep in mind that linearizable state in Zookeeper [...]

What does it mean for a state to be linearizable? This seems like a strange statement as well, especially with regard to what I state in the following question.

Question 2


It is well known that ZooKeeper does not provide linearizable reads and there is no way for ZooKeeper to provide such reads. Even calling the sync operation before a read does not guarantee a linearizable read, as the following passage from the ZooKeeper book states:

There is a caveat to the use of sync, which is fairly technical and deeply entwined with ZooKeeper internals. (Feel free to skip it.) Because ZooKeeper is supposed to serve reads fast and scale for read-dominated workloads, the implementation of sync has been simplified and it doesn't really traverse the execution pipeline as a regular update operation, like create, setData, or delete. It simply reaches the leader, and the leader queues a response back to the follower that sent it. There is a small chance that the leader thinks that it is the leader l, but doesn't have support from a quorum any longer because the quorum now supports a different leader, lʹ. In this case, the leader l might not have all updates that have been processed, and the sync call might not be able to honor its guarantee.
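The scenario in that passage can be sketched with a toy model. This is not ZooKeeper code: the `Server` class and `sync_then_read` function are hypothetical, and the model assumes only what the passage says, namely that sync is answered by whichever server the follower believes is the leader, without a quorum round trip.

```python
class Server:
    """Toy replica: one value plus the zxid of the last applied update."""

    def __init__(self, name):
        self.name = name
        self.zxid = 0
        self.value = None

    def apply(self, zxid, value):
        self.zxid, self.value = zxid, value


def sync_then_read(follower, believed_leader):
    """Toy sync: catch up to the believed leader, then read locally.

    Assumption for illustration: the believed leader answers from its own
    state and does not confirm that a quorum still supports it.
    """
    if believed_leader.zxid > follower.zxid:
        follower.apply(believed_leader.zxid, believed_leader.value)
    return follower.value


old_leader, new_leader, follower = Server("l"), Server("l'"), Server("f")

# Update 1 commits while l still has quorum support.
for server in (old_leader, new_leader, follower):
    server.apply(1, "a")

# The quorum now supports l'; update 2 commits there and l never sees it.
new_leader.apply(2, "b")

# The follower still believes l is the leader, so its sync misses update 2.
stale = sync_then_read(follower, old_leader)

# A follower attached to l' observes the latest update.
fresh = sync_then_read(Server("g"), new_leader)
```

Here `stale` is `"a"` even though update 2 has already been committed under lʹ, while `fresh` is `"b"`. A sync followed by a read can therefore return stale data in this model.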

So, ZooKeeper is not a linearizable system. Nevertheless, the Jepsen test presented here seems to test for linearizability using the knossos verifier.

Why check for linearizability when ZK is not linearizable? Were you trying to find linearizability issues first and then see whether they pointed to something more serious?

Was this Jepsen test the same one you used for your analysis?
If so, why is this test considered successful? It did not find any linearizability violations, although ZK clearly does not provide linearizability.

P.S. Thanks for all your work all these years on Jepsen and all the lovely analyses.

@aphyr

Collaborator

commented Jul 23, 2019

What does it mean for a state to be linearizable?

This may make more sense given...

It is well known that ZooKeeper does not provide linearizable reads and there is no way for ZooKeeper to provide such reads.

I did not, in fact, know this; at the time of writing (2013), my understanding was that all writes were linearizable, and that reads could be promoted to linearizability via sync. You're the first person I've talked to about ZK to correct my mistake! Thank you!

I based my understanding on discussions with ZK maintainers, and on the ZAB paper, whose argument about the use of sync on page 5 seemed to suggest linearizability. I was evidently mistaken, and in fact, a careful re-reading of the paper seems to leave the possibility of stale reads open.

If ZK isn't linearizable, it's likely the case that the tests I designed for ZK just weren't stressful enough to observe linearizability violations--it could be, for instance, that they're sensitive to the exact timing of leader election and shutdown. This was a nights-and-weekends project with a tight schedule, and I didn't have much time to go in depth.

Was this Jepsen test the same one you used for your analysis?

I think so, though it looks like other people have made some commits since.

@insumity

Author

commented Jul 23, 2019

That was an extremely fast response! Thanks.

I also guess that the exact timing conditions under which a linearizability violation appears are pretty narrow.

@insumity insumity closed this Jul 23, 2019

@aphyr

Collaborator

commented Jul 23, 2019

Yeah! I mean, I find a lot of linearizability errors in various databases, but this was also my very first time doing this kind of test, and it varies from system to system. Could have easily slipped through the cracks.
