Lost inserts in CQL sets and maps with chaotic timestamps #7082
Hi @aphyr, thanks for creating the issue. I can see that the final select is done with CL=ALL, but what's the CL of the updates (writec)?
I think the link to this generator is broken. Did you mean noisy-timestamps?
Just to clarify - is there a single client and a single-node Scylla cluster in this test, or are there more nodes in the cluster?
I can see RF of 3 is used, so I would expect at least 3 Scylla nodes.
@aphyr I have a hypothesis that the test might be wrong if any update ends up with a timestamp smaller than the timestamp of the initial insert. I expect this can happen because of the jitter the timestamp generator adds. See the following example:
Note that the element added by such an update is lost.
This occurs with either CL=ALL or CL=ONE.
Ah, yes, thank you. Fixed.
There is a single client, and five nodes.
The code in this commit uses CL=ONE, yes; I've tested with a variety of CLs and all fail.
Yup, five nodes!
I have to admit, I'm somewhat surprised by this, though on further reflection, I think it makes sense. I was expecting (based on the way the test was originally written) that row updates commuted with row creation (which would preserve these updates), or that row updates to a row which doesn't logically exist yet would fail (which would result in client-visible errors). But you're right--the documentation supports this behavior.

It sounds like the right way to preserve set insertions is to ensure that one never inserts a row with a set, and instead performs only updates--relying on the fact that set-element additions commute with one another.

@ManManson--you wrote this test initially--does this sound right to you?
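For what it's worth, here's a toy sketch (in Java, with illustrative names and values, not code from the test) of why an update-only pattern is insensitive to reordering, while an assignment mixed in is not:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch: set-element additions commute with one another,
// so an update-only workload yields the same final set under any ordering.
public class UpdatesCommute {
    public static void main(String[] args) {
        Set<Integer> a = new HashSet<>();
        for (int x : List.of(1, 2, 3)) a.add(x); // one timestamp order

        Set<Integer> b = new HashSet<>();
        for (int x : List.of(3, 1, 2)) b.add(x); // a different order

        System.out.println(a.equals(b)); // prints "true": order doesn't matter
    }
}
```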
Yes. Dropping the insert will work too.

I guess we can close the issue then. What do you think, @aphyr ?
I'm unsure! The documentation for CQL sets steers users towards inserting sets with a CQL INSERT.
It depends on the perspective. If the user understands that insert assigns the value of the set (it's like an assignment to a variable), then it all makes sense. This is how INSERTs in Scylla and Cassandra work for all types, so it's consistent. After all, that's the most natural semantic of an insert I can think of. What was your understanding of what INSERT does?
I don't look at it this way. This is the same as setting the value of the set to some value (for example {1, 2, 3}) and then modifying it with updates. If someone understands how mutations of data work in Scylla/Cassandra, it's not a surprise at all. Data will be lost only if you do updates with timestamps smaller than the INSERT, but then the behavior is still correct. It works like this for every type: if you insert value X into an int column and then try to update this column to Y, but it turns out the update has a smaller timestamp than the insert, then the data will be "lost" in exactly the same way as for a set. The first step to understanding the way it works is the fact that an INSERT assigns the whole value, like a variable assignment.

What I'm really trying to say is that this is how timestamp-based eventual consistency works in Scylla. It is consistent and shouldn't be a surprise to anyone who understands it. I agree that the model is not simple and we have to help users understand it correctly, but I don't think there's any issue with sets here. Everything works as expected, in a way consistent with every other type.
I agree that it may be consistent, but I should note that this behavior was... apparently not expected by the Scylla engineer who wrote this test, the Scylla engineer who helped me explore it, and the writers who wrote the CQL collection documentation. Heck, I've been working with AP databases for a little over a decade, and I failed to catch it too! There's sort of a double surprise here--CQL looks like SQL, where insert + update would be safe--and on the commutative side, if you're used to CRDTs and expecting something like a G-set + unsafe deletes built on wide rows, you might expect the insert to commute with updates as well. Maybe this is well-understood in the Cassandra/Scylla community at large! I'd like to figure out how well-understood it is, because if it's surprising, it might warrant some discussion in the Jepsen writeup. |
I don't know what to tell you except that everyone makes mistakes every now and then.
The most important point is to understand that insert is like a variable assignment. Then everything becomes easy to understand. Then what the test tries to do is (in Java):
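A minimal sketch along those lines (illustrative values; the original snippet wasn't preserved here):

```java
import java.util.HashSet;
import java.util.Set;

public class InsertVsUpdate {
    public static void main(String[] args) {
        // Intended order: the INSERT assigns the whole set, then UPDATEs add to it.
        Set<Integer> s = new HashSet<>(Set.of(0)); // INSERT ... VALUES ({0})
        s.add(1);                                  // UPDATE ... SET elements = elements + {1}
        s.add(2);                                  // UPDATE ... SET elements = elements + {2}
        System.out.println(s); // prints "[0, 1, 2]"

        // Reordered by timestamps: one UPDATE's timestamp falls before the
        // INSERT's, so the assignment shadows it and that element is lost.
        Set<Integer> t = new HashSet<>();
        t.add(1);                                  // UPDATE with the smallest timestamp
        t = new HashSet<>(Set.of(0));              // INSERT reassigns the whole set
        t.add(2);                                  // later UPDATE survives
        System.out.println(t); // prints "[0, 2]": element 1 is gone
    }
}
```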
When you think of it like this it becomes apparent that reordering these operations can lead to some of the elements missing in the set.
It's certainly worth mentioning, just for the sake of users getting more educated. But it will be just documentation of the existing, correct behavior. There's no issue in the implementation or in Scylla's behavior itself.
My three cents on this issue:
To summarize, this is not a bug in the code, but it should be considered a bug in the documentation, and fixed in the documentation, so let's not close this issue before we fix the documentation - or at least open a documentation issue. |
Yeah, I do understand the behavior now, and I agree that it is defensible based on the current documentation. I agree it's not a bug! It's more that I made some incorrect assumptions, based on how I expected CQL INSERTs and sets to behave, and based on how the current test suite was designed. It's not something I'm gonna report as a bug in the writeup, but I do think it's interesting to talk about! I'm gonna try and get a sense of how CQL users expect this to behave, and whether the same kind of use pattern that's shown in the documentation is something that users are also doing in the wild.
I would suggest renaming the issue though, to avoid the misconception that something's wrong when it's not. BTW @aphyr could you please elaborate more on the issue with deletes and sets you mentioned in one of the previous comments?
I opened a doc issue on: scylladb/scylla-doc-issues#346

LGTM @nyh
Closing, reporter agrees it is not a bug. |
The following docs are updated by scylladb/scylla-docs#3150: https://docs.scylladb.com/getting-started/dml/
@kostja why did you assign a closed issue to me?

Oh, sorry, I found the right commit in that PR: https://github.com/scylladb/scylla-docs/pull/3150/commits/f5db644ee15df27480cb745867a1b70f06a6b132
In ScyllaDB 4.2, it appears that inserting unique elements into a CQL map or set can, when clocks are not synchronized, result in the loss of acknowledged inserts. See, for example, this Jepsen test run, in which a single client, performing approximately one insert per second to a single CQL set, loses roughly 35% of acknowledged inserts over a 30 second period--as observed by a single read performed after 20 seconds of quiescence.
This behavior appears to depend on timestamps: when we do not set a TimestampGenerator on clients, or provide clients with a TimestampGenerator which yields values directly from System.currentTimeMillis(), the behavior disappears; when we introduce jitter (on the order of 1 second) to those values, write loss appears.
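The jittered generator in question behaves roughly like this sketch (illustrative names; the actual Jepsen generator is Clojure and differs in detail):

```java
import java.util.concurrent.ThreadLocalRandom;

public class JitteredTimestamps {
    // ~1 second of clock uncertainty, matching the jitter described above.
    static final long UNCERTAINTY_MS = 1_000;

    // CQL write timestamps are microseconds; add millisecond-scale jitter.
    static long next() {
        long jitter = ThreadLocalRandom.current()
                .nextLong(-UNCERTAINTY_MS, UNCERTAINTY_MS + 1);
        return (System.currentTimeMillis() + jitter) * 1_000;
    }

    public static void main(String[] args) {
        long wallClock = System.currentTimeMillis() * 1_000;
        long ts = next();
        // Each timestamp stays within ~1s of the wall clock, but successive
        // calls may go backwards relative to one another -- that's the chaos.
        System.out.println(Math.abs(ts - wallClock) <= (UNCERTAINTY_MS + 10) * 1_000);
    }
}
```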
Our schema for this workload is a single table with an integer primary key and a set of integer elements. We insert a single row into this table, and then have clients update that row, each updating the `elements` field to add a single unique integer. After a few seconds of these inserts, we pause for 20 seconds to allow Scylla time to recover, then perform a single read by primary key.

Our randomized timestamps are generated by this generator--tuning `uncertainty-s` to zero appears to resolve the write loss issue.

You can reproduce this using Jepsen's fork of the Scylla Jepsen tests by cloning 604502f and running:
There may be additional options required (e.g. --user, --password, --nodes-file) depending on your Jepsen environment. See https://github.com/jepsen-io/jepsen#setting-up-a-jepsen-environment for guidance.
Installation details
Scylla version (or git commit hash): 4.2
Cluster size: 5 nodes
OS (RHEL/CentOS/Ubuntu/AWS AMI): Debian 10
Hardware details (for performance issues):
Platform (physical/VM/cloud instance type/docker): LXC
Hardware: sockets=2 cores=24 hyperthreading=2 memory=128GB
Disks: (SSD/HDD, count) SSD (shared via LXC)