Snapshot Isolation may not be as isolated as one would like #336
Comments
The first-committer-wins rule is not honored by Galera certification; at some point it was optimized away for performance reasons. However, it is possible to support SI and make it configurable per session, so we should leave this as a potential task for a future release. Serializable transactions are supported only if all writes are targeted to one node. Replication coming from other nodes will not honor read locks held by a serializable transaction. Achieving true serializability would also require populating read sets in Galera replication. This would have a major impact on performance, but again it could be configurable per session, and it is a potential task for the future as well. Thanks for the test case and the detailed bug report! |
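For reference, the first-committer-wins rule mentioned in this comment can be sketched as a toy in-memory model. This is only an illustration of the rule as it is usually stated for snapshot isolation, not Galera's certification code; `ToyStore`, `WriteConflict`, and the account keys are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


class WriteConflict(Exception):
    """Raised when a transaction loses the first-committer-wins check."""


@dataclass
class ToyStore:
    # Committed values, plus the commit sequence number that last wrote each key.
    data: dict = field(default_factory=dict)
    written_at: dict = field(default_factory=dict)
    commit_seq: int = 0

    def begin(self):
        # A transaction remembers the commit sequence number of its snapshot.
        return {"snapshot_seq": self.commit_seq, "writes": {}}

    def commit(self, txn):
        # First-committer-wins: abort if any key in the write set was
        # committed by another transaction after this snapshot was taken.
        for key in txn["writes"]:
            if self.written_at.get(key, 0) > txn["snapshot_seq"]:
                raise WriteConflict(key)
        self.commit_seq += 1
        for key, value in txn["writes"].items():
            self.data[key] = value
            self.written_at[key] = self.commit_seq


store = ToyStore(data={"a": 10, "b": 10})
t1 = store.begin()
t2 = store.begin()          # same snapshot as t1
t1["writes"] = {"a": 5, "b": 15}
t2["writes"] = {"a": 13, "b": 7}

store.commit(t1)            # the first committer wins
try:
    store.commit(t2)        # overlaps t1's write set, so it must abort
    t2_aborted = False
except WriteConflict:
    t2_aborted = True
```

In this model the second transaction is rejected because both of its keys were rewritten after its snapshot was taken, so the store keeps only the first transaction's writes.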
That's... interesting. So when the docs say
Does that mean Galera should provide snapshot isolation? Without first-committer-wins, you can't really have snapshot isolation, right? Does Galera actually provide some weaker isolation level--e.g. snapshot isolation plus read skew, and, if so, should the docs be changed? |
Adjusting the concurrency and duration of the test, I can now confirm that transfer transactions are also affected: not only can read-only transactions see intermediate results from transfers, but transfers can write back intermediate results successfully, causing the permanent creation or destruction of money in the system. Here are two test cases; in the first, the total rises from $20 to $22, and in the second, it falls to $15. I believe this kind of data corruption is a more serious issue than inconsistent pure-read transactions--it may be a good idea to revise your documentation to reflect the possibility of data corruption for users relying on Snapshot Isolation's invariants. |
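To illustrate how writing back an inconsistent read can change the total, here is a hand-constructed interleaving in plain Python. It is a sketch under assumptions, not a reproduction of the attached test cases: the amounts, the `transfer` helper, and the reading of "remain positive" as "not negative" are all hypothetical, and the resulting total differs from the $22 and $15 reported above.

```python
def transfer(read_src, read_dst, amount):
    """Compute new (src, dst) balances from the values a transaction read.

    Returns None if the transfer would make either balance negative.
    """
    new_src, new_dst = read_src - amount, read_dst + amount
    if new_src < 0 or new_dst < 0:
        return None
    return new_src, new_dst


# Committed state: two accounts with 10 each, total 20.
accounts = {0: 10, 1: 10}

# T2 reads account 0 from the state before T1 commits.
t2_read_0 = accounts[0]

# T1 moves 5 from account 0 to account 1 and commits.
accounts[0], accounts[1] = transfer(accounts[0], accounts[1], 5)

# T2 reads account 1 only now, so its two reads mix two states (read skew).
t2_read_1 = accounts[1]

# T2 moves 3 from account 1 to account 0. With no conflict check at commit
# time, the write is accepted although T1 changed both rows in between.
new_1, new_0 = transfer(t2_read_1, t2_read_0, 3)
accounts[0], accounts[1] = new_0, new_1

total = sum(accounts.values())
```

Here T2 believes the accounts hold 10 and 15, so after its write the committed balances are 13 and 12: five units have been created. A first-committer-wins check at commit time would have rejected T2, because T1 rewrote both rows after T2's first read.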
By trading the "first committer wins rule" for performance, Galera no longer provides SI. We need to change the documentation accordingly. |
@aphyr
so running the test in SERIALIZABLE mode is essentially the same as running the test in REPEATABLE READ mode with the SELECT statements modified to LOCK IN SHARE MODE. After some experimenting I found that setting the transaction isolation mode to REPEATABLE READ made the bank test pass, but explicitly changing any of the SELECT statements to LOCK IN SHARE MODE made the test fail. Also, setting the transaction isolation mode to SERIALIZABLE and doing the SELECTs in FOR UPDATE mode (see https://dev.mysql.com/doc/refman/5.6/en/innodb-locking-reads.html for an explanation) made the test pass. So it seems that something in the galera/mysql-wsrep implementation (probably in prioritized transaction processing) does not honor shared locks properly, or the effect of a brute force abort caused by a prioritized transaction is not propagated properly to a locally executing (so far read-only) transaction. The concurrency level in the tests was 50 concurrent clients, so I'm quite confident that the success result is valid. Edit: My changes to jepsen can be found at: https://github.com/codership/jepsen/tree/galeracluster |
The following simple MTR test case demonstrates that a read-only transaction with SERIALIZABLE isolation level will normally abort if a conflicting high priority (certified) transaction commits before it.
|
How's the progress on this issue? |
Any further updates? I am evaluating MariaDB TX and would love to know if this situation has changed. |
@sjaakola friendly ping as this seems like an important matter |
@LifeIsStrange |
I might be wrong about this, and I'd like some help checking my work--but it looks like MariaDB + Galera Cluster, version 7.4.7 from the MariaDB Debian Jessie repos, may not correctly enforce snapshot isolation between transactions.
This Jepsen test sets up a five node Galera cluster and runs a simple simulation of transfers between two bank accounts, both starting with `balance = 10`.

We then have several clients, connected to all 5 nodes, execute a mix of read transactions and balance-transfer transactions at SERIALIZABLE. Read transactions simply read all balances:
Transfer transactions read the balance of both accounts, move some from one to the other, and, if both remain positive, write back the new balances. This one transfers `0`, and writes `10` and `10` respectively. Note that the write set always covers the read set, and there are only two rows in the table to begin with--I don't think we should be subject to write skew anomalies here, and phantoms shouldn't be an issue.

As I understand the isolation level documentation, Galera Cluster should provide Serializable isolation locally and Snapshot Isolation between nodes. Since SI is higher than RR, I assume you have to set `ISOLATION LEVEL SERIALIZABLE` to actually obtain SI.

As I understand SI, given fixed working sets and with write sets that cover reads, all balances should sum to the same value. As a test invariant, we measure that every read transaction sees a consistent snapshot--in this case, that the balances always sum to 20. This works for low levels of concurrency, but with 50 concurrent clients, we start seeing anomalous reads. For instance:
See it? Process 0 reads the balances `2` and `17`, which only sum to `19`. A correct value, read by process `4`, might have been `[3 17]`, or the subsequent `[7 13]`. There are a lot of transactions in flight at any given point in this log, so it's a little hard to reconstruct the order of events. :(

Transfer transactions see inconsistent reads as well. For instance, this transaction only sees a total balance of 17, not 20, and attempts to write `[9 8]`.

This particular commit conflicts and hits a rollback--and in fact, every transfer transaction I've analyzed so far appears to roll back, which suggests write promotion is doing the right thing here, and only read transactions fail to serialize appropriately. I'm struggling to understand whether this qualifies as an instance of http://www.cs.umb.edu/~poneil/ROAnom.pdf, or if these kinds of reads should be prevented in this case.
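The read invariant described above (all balances sum to 20) amounts to a simple filter over the observed reads. The following is a minimal sketch of that check in Python, not the Jepsen checker itself; the `(process, balances)` pair format and the function name are assumptions made for the example, and the process id attached to `[7 13]` is hypothetical because it is not stated above.

```python
def inconsistent_reads(reads, expected_total=20):
    """Return the reads whose balances do not sum to expected_total.

    `reads` is an iterable of (process, balances) pairs, a hypothetical
    simplification of a test history.
    """
    return [(process, balances)
            for process, balances in reads
            if sum(balances) != expected_total]


# Balances taken from the discussion above: process 0 read [2 17], while
# [3 17] and [7 13] are consistent. The process id for [7 13] is made up.
history = [(4, [3, 17]), (0, [2, 17]), (1, [7, 13])]
bad = inconsistent_reads(history)
```

With this history, only the read by process 0 is flagged. Under snapshot isolation, with write sets that cover the read sets, the returned list should always be empty.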
The complete logfiles from each host, including query logs, and the history of operations from Jepsen, plus analysis showing all inconsistent reads, are available here.
Do you have any suggestions here? Have I misunderstood Galera's guarantees? Or is there additional debugging information I can provide?