Multi-Version Concurrency Control (MVCC)
We use MVCC to isolate transactions. MVCC is a method to allow multiple transactions to work on the same table concurrently while still adhering to the ACID criteria. This means that every transaction gets the impression that it alone works on the table (isolation). Changes made by other transactions are only visible if they have been committed before the transaction was started. Changes that were done in between are not visible, no matter if they are committed or not. As a result, every transaction has a consistent view of the database. If a transaction commits, its changes will be made visible to following transactions all-or-nothing (atomicity).
This is done by storing multiple versions of the same row. Instead of updating a row, it is marked as invalid and the modified data is inserted as a new row that is only visible to new transactions. This means that transactions that were started before the modifying transactions was committed can still access the version that was valid when they started. If two transactions try to modify the same row, this is caught and one of the transactions has to abort.
Our implementation of MVCC
We use two global counters, managed by the
TransactionManager, that are atomically incremented:
_next_transaction_id, which is the transaction id that is given to the next transaction that is started
_last_commit_id, which is the commit id of the last transaction that committed. As every transaction has a transaction id, but not all transactions commit,
_last_commit_idcould be smaller than
Each transaction holds a
TransactionContext which holds the information needed to validate and update MVCC rows. This transaction context is accessible by the read/write operators (insert and delete - update just calls these two) and by the validate operator (which removes all rows from a table that are invisible to the current transaction). On a conceptual level, we store three pieces of information per transaction:
_snapshot_commit_id(the commit id that was the global
_last_commit_idat the time the transaction started)
- If the transaction is in the commiting phase, its commit id. This commit id is already sequenced, meaning that the transaction with the lower commit id finishes first, but only once the commit phase is completed does the commit id of the transaction become the global
Each chunk stores three pieces of information per row:
begin_cids(short for begin commit-ids) store the commit id of the transaction that inserted this row. For a row that was inserted but not committed, this is infinity (well,
INT_MAX). Transactions whose
_snapshot_commit_idis smaller than the begin cid will not see this row.
end_cidsis the same but for the commit that invalidated a row. Transactions whose
_snapshot_commit_idis larger than the begin cid will not see this row.
tids, or Transaction IDs mark modifications by ongoing transaction. A row that is inserted but not committed, as well as a row that is invalidated, but not committed, hold the
_transaction_idof the transaction that modified the row. This negates the logic described above - a row that was not yet finally inserted and committed is visible to the transaction that inserted it, and a row that has not yet been finally deleted is already invisible for the deleting transaction.
Evaluating the visibility
Tables that are loaded into the query plan using the
GetTable operator contain all entries, both valid and invalid. One of the first steps is to remove all invisible rows by adding the
Validate operator to the query plan. It uses the following truth table to determine if a table should be visible or not:
|tids[row] == tx.tid||tx.snapshot_cid >= begin_cids[row]||tx.snapshot_cid >= end_cids[row]||Visible?||Comment|
|yes||yes||yes||No||Own Delete, committed|
|no||yes||yes||No||Past Delete (or deleted by self)|
|yes||yes||no||No||Own Delete, uncommitted|
|no||yes||no||Yes||Past Insert or Future Delete|
|no||no||no||No||Uncommitted Insert or Future Insert|
A row goes through the following life cycle from memory allocation to deletion:
|--||13||Inf||12||Insert commit in prog.|
|6||13||14||13||delete commit in prog.|
If a row gets deleted in the same transaction that inserted it, the tid gets reset to zero, so that future reads will not see it.