Allow relaxing isolation guarantees on per-table basis #2379
Comments
Is there even row isolation? If two concurrent updates have the same timestamp, the values picked for the row cells could be a mix from the two updates. |
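To make that concrete, here is a minimal sketch (the keyspace, table, and values are invented for illustration; it uses the DataStax Java driver 3.x). When timestamps collide, the per-cell last-write-wins tie-break compares cell values, so the surviving row can combine cells from both updates:

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class TimestampAliasing {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();
        session.execute("CREATE KEYSPACE IF NOT EXISTS ks WITH replication = "
                + "{'class': 'SimpleStrategy', 'replication_factor': 1}");
        session.execute("CREATE TABLE IF NOT EXISTS ks.t "
                + "(pk int, ck int, a text, b text, PRIMARY KEY (pk, ck))");

        // Two writers that happen to pick the same client-side timestamp.
        session.execute("UPDATE ks.t USING TIMESTAMP 1000 "
                + "SET a = 'x1', b = 'y2' WHERE pk = 0 AND ck = 0");
        session.execute("UPDATE ks.t USING TIMESTAMP 1000 "
                + "SET a = 'x2', b = 'y1' WHERE pk = 0 AND ck = 0");

        // On a timestamp tie the lexically greater cell value wins, per cell,
        // so this prints a = x2, b = y2: a row that neither writer wrote.
        Row row = session.execute(
                "SELECT a, b FROM ks.t WHERE pk = 0 AND ck = 0").one();
        System.out.println("a = " + row.getString("a")
                + ", b = " + row.getString("b"));
        cluster.close();
    }
}
```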
@duarten That's one of the cases which may violate this. There is also the case of repair breaking isolation: http://stackoverflow.com/a/41683109/246755 We inherited this from Cassandra. It's been discussed, and I think the conclusion was that we're not confident enough to give up on this entirely, so we try to keep the isolation for cases when those known issues are not encountered. |
Maybe we should revisit the discussion. I mean, we can't provide row isolation in the presence of concurrent updates, which is the exact scenario where users would potentially rely on the feature. |
Note that the problem of providing read isolation exists even without conflicting concurrent updates: when an update touches multiple rows, e.g. via a batch, the reader could execute partly before the update and partly after it, and could see a partial update if it pauses between the updated rows. As for the problem of timestamp aliasing, not every workload with concurrent updates can trigger it. If the conflicting updates are relayed from a single client using client-side timestamps, the conflict won't happen. There could be a failover to another client after a timeout, but that timeout would push the new client's timestamps significantly far into the future. |
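A sketch of that multi-row case, reusing the session and the hypothetical ks.t table from the example above: a logged batch updates two rows of one partition, and a reader that pauses between them can observe a mix of old and new rows.

```java
// Seed two rows, then update both in a single logged batch. The batch will
// eventually be applied in full, but a reader that fetches the ck = 1 row
// before the batch lands and the ck = 2 row after it sees a partial update.
session.execute("INSERT INTO ks.t (pk, ck, a) VALUES (0, 1, 'old')");
session.execute("INSERT INTO ks.t (pk, ck, a) VALUES (0, 2, 'old')");

session.execute("BEGIN BATCH "
        + "UPDATE ks.t SET a = 'new' WHERE pk = 0 AND ck = 1; "
        + "UPDATE ks.t SET a = 'new' WHERE pk = 0 AND ck = 2; "
        + "APPLY BATCH");
```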
Dirty reads can only be solved if dirty writes are too (if a write operation runs concurrently with a logged batch statement, it may well conflict with a row and cause a dirty write). I also wonder what happens to batches in case of failures. If they are partially applied locally (say, two statements update different tables), then when the node recovers it may allow dirty reads.
I really wonder if we should strive to ensure isolation when it can be broken by something as fundamental as read repair, as you pointed out. The guidance for these scenarios should be to use LWT. Finally, note that you don't even need concurrent updates: a write can complete with a timestamp in the past such that it conflicts with an already completed one. |
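For completeness, the LWT route looks like this, again reusing the hypothetical ks.t table: conditional statements go through Paxos and are serialized against other conditional operations on the same partition, so conflicting writers cannot interleave at the cell level.

```java
// A conditional (LWT) update; the Session comes from the first sketch.
ResultSet rs = session.execute(
        "UPDATE ks.t SET a = 'x3', b = 'y3' WHERE pk = 0 AND ck = 0 IF a = 'x2'");
// wasApplied() reports whether the IF condition held when the write ran.
System.out.println("applied: " + rs.wasApplied());
```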
My lawyers inform me that we can drop the guarantee for any query that has paging enabled. |
@avikivity why is that? isolation doesn't have to hold across pages, but don't we have to ensure that each page sees a snapshot of the partition? |
It's legal for a page to be one row long, regardless of the page size requested. In fact we will return single row pages if the rows are large. |
So your point here is that since the consumer doesn't know how large the page will be, he can't make any sensible use of the guarantee? There may be something to it. |
Right. According to the lawyers, it's still possible that the consumer may be using their own driver, or talking the cql binary protocol directly, and thus treat pages specially. But realistically the consumer has the pages hidden by the driver, it's one long response for them. Each page boundary breaks isolation, but the page breaks themselves are hidden. |
The page breaks are not hidden from the client; the drivers expose them:
http://docs.datastax.com/en/drivers/java/3.0/com/datastax/driver/core/ResultSet.html#getAvailableWithoutFetching
The docs even have samples showing how to do prefetching based on the amount left in the result set.
However, I am not sure applications that are actually aware of that exist.
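For reference, the prefetching pattern those docs describe looks roughly like this (a sketch against the 3.x Java driver; the query and fetch sizes are placeholders):

```java
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

class Pager {
    static void pageThrough(Session session) {
        Statement stmt = new SimpleStatement("SELECT * FROM ks.t WHERE pk = 0");
        stmt.setFetchSize(100); // the server returns pages of up to 100 rows
        ResultSet rs = session.execute(stmt);
        for (Row row : rs) {
            // getAvailableWithoutFetching() exposes the page boundary: once
            // the current page is half drained, prefetch the next page in the
            // background instead of blocking at the boundary.
            if (rs.getAvailableWithoutFetching() == 50 && !rs.isFullyFetched()) {
                rs.fetchMoreResults();
            }
            // ... process row ...
        }
    }
}
```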
|
Another case that breaks the isolation guarantee between multiple rows is read repair in the latest Cassandra (the problem did not exist in Origin). When, during read repair, there is not enough data left after reconciliation, Scylla and old Origin re-request the same data but ask for more rows. The latest Cassandra instead just asks for more rows starting from the last key it already has, so reads are no longer atomic.
See DataResolver.java:ShortReadProtection::moreContents() in Cassandra for where it happens.
|
https://issues.apache.org/jira/browse/CASSANDRA-10701 changes logged batches' guarantees from "atomic" to "eventually". With this, there's no point in providing snapshot isolation. |
But: it still has
|
@avikivity I think this ticket only tries to reduce confusion caused by the docs by avoiding use of the word "atomic", which has an overloaded meaning in the context of "atomic batch", without changing any guarantees. This [1] old blog post already tried to clarify that:
[1] http://www.datastax.com/dev/blog/atomic-batches-in-cassandra-1-2 |
Currently we aim to provide snapshot isolation for partition reads.
For very large partitions, maintaining this is challenging and may impact performance and stability. See for example #1938.
There may be many workloads which have large partitions but do not need per-partition read isolation. Those would benefit from relaxed guarantees.
We could allow relaxing this to row-level isolation using a table property recorded in the schema.
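For example, the relaxation could be expressed as an ordinary table property. The sketch below is purely hypothetical; no such isolation property exists in the schema today:

```java
// Hypothetical syntax: opt a table out of partition-level snapshot isolation
// and settle for row-level isolation. The "isolation" property name and its
// values are invented for illustration.
session.execute("CREATE TABLE ks.events ("
        + "  pk int, ck int, payload text,"
        + "  PRIMARY KEY (pk, ck)"
        + ") WITH isolation = 'row'"); // default would remain 'partition'
```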