Skip to content
mrks edited this page Aug 31, 2018 · 4 revisions

Multi-Version Concurrency Control (MVCC)

We use MVCC to isolate transactions. MVCC is a method to allow multiple transactions to work on the same table concurrently while still adhering to the ACID criteria. This means that every transaction gets the impression that it alone works on the table (isolation). Changes made by other transactions are only visible if they have been committed before the transaction was started. Changes that were done in between are not visible, no matter if they are committed or not. As a result, every transaction has a consistent view of the database. If a transaction commits, its changes will be made visible to following transactions all-or-nothing (atomicity).

This is done by storing multiple versions of the same row. Instead of updating a row, it is marked as invalid and the modified data is inserted as a new row that is only visible to new transactions. This means that transactions that were started before the modifying transactions was committed can still access the version that was valid when they started. If two transactions try to modify the same row, this is caught and one of the transactions has to abort.

Our implementation of MVCC

We use two global counters, managed by the TransactionManager, that are atomically incremented:

  • The _next_transaction_id, which is the transaction id that is given to the next transaction that is started
  • The _last_commit_id, which is the commit id of the last transaction that committed. As every transaction has a transaction id, but not all transactions commit, _last_commit_id could be smaller than _next_transaction_id.

Each transaction holds a TransactionContext which holds the information needed to validate and update MVCC rows. This transaction context is accessible by the read/write operators (insert and delete - update just calls these two) and by the validate operator (which removes all rows from a table that are invisible to the current transaction). On a conceptual level, we store three pieces of information per transaction:

  • The _transaction_id (see above)
  • The _snapshot_commit_id (the commit id that was the global _last_commit_id at the time the transaction started)
  • If the transaction is in the commiting phase, its commit id. This commit id is already sequenced, meaning that the transaction with the lower commit id finishes first, but only once the commit phase is completed does the commit id of the transaction become the global _last_commit_id.

Each chunk stores three pieces of information per row:

  • begin_cids (short for begin commit-ids) store the commit id of the transaction that inserted this row. For a row that was inserted but not committed, this is infinity (well, INT_MAX). Transactions whose _snapshot_commit_id is smaller than the begin cid will not see this row.
  • end_cids is the same but for the commit that invalidated a row. Transactions whose _snapshot_commit_id is larger than the begin cid will not see this row.
  • tids, or Transaction IDs mark modifications by ongoing transaction. A row that is inserted but not committed, as well as a row that is invalidated, but not committed, hold the _transaction_id of the transaction that modified the row. This negates the logic described above - a row that was not yet finally inserted and committed is visible to the transaction that inserted it, and a row that has not yet been finally deleted is already invisible for the deleting transaction.

Evaluating the visibility

Tables that are loaded into the query plan using the GetTable operator contain all entries, both valid and invalid. One of the first steps is to remove all invisible rows by adding the Validate operator to the query plan. It uses the following truth table to determine if a table should be visible or not:

tids[row] == tx.tid tx.snapshot_cid >= begin_cids[row] tx.snapshot_cid >= end_cids[row] Visible? Comment
yes yes yes No Own Delete, committed
no yes yes No Past Delete (or deleted by self)
yes no yes No Impossible
yes yes no No Own Delete, uncommitted
no no yes No Impossible
yes no no Yes Own Insert
no yes no Yes Past Insert or Future Delete
no no no No Uncommitted Insert or Future Insert

Row Lifecycle

A row goes through the following life cycle from memory allocation to deletion:

tids[row] begin_cids[row] end_cids[row] TransactionManager._last_commit_id Status
-- Inf Inf 12 Init
5 Inf Inf 12 Insert uncommitted
-- 13 Inf 12 Insert commit in prog.
-- 13 Inf 13 Insert committed
6 13 Inf 13 delete uncommitted
6 13 14 13 delete commit in prog.
6 13 14 14 delete committed

If a row gets deleted in the same transaction that inserted it, the tid gets reset to zero, so that future reads will not see it.

You can’t perform that action at this time.