## Overview
A transaction is a facility provided by databases to combine multiple reads and writes into one logical unit that executes atomically: either the entire transaction completes (commits) or it fails (aborts, discarding any changes made so far). This simplifies reasoning about application state, since the application never has to deal with a partially applied transaction.

## ACID
The safety guarantees provided by transactions are commonly summarised by the acronym ACID.

### Atomicity
In general programming terminology, atomicity in a multithreaded environment means that an operation executed by one thread cannot be observed in a half-finished state by other threads.

Atomicity in ACID is unrelated to concurrency. It means that when a client makes multiple writes and a fault occurs midway, the writes made so far are discarded: either the entire transaction is completed or none of it is.

Without atomicity, it becomes difficult to determine which changes went through and which didn't. Retrying in such a scenario risks applying some changes twice.
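A minimal sketch of this all-or-nothing behaviour, using Python's built-in `sqlite3` module rather than any particular production database. The `account` table, the `transfer` helper and the simulated fault are assumptions made for illustration.

```python
import sqlite3

def make_db():
    """Create an in-memory database with two accounts holding 500 each."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (acct_num INTEGER PRIMARY KEY, balance REAL NOT NULL)")
    conn.executemany("INSERT INTO account VALUES (?, ?)", [(123, 500.0), (789, 500.0)])
    conn.commit()
    return conn

def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst as one atomic unit; return True on commit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes the block
            conn.execute("UPDATE account SET balance = balance - ? WHERE acct_num = ?", (amount, src))
            if fail_midway:
                raise RuntimeError("simulated fault between the two writes")
            conn.execute("UPDATE account SET balance = balance + ? WHERE acct_num = ?", (amount, dst))
        return True
    except RuntimeError:
        return False

def balances(conn):
    """Return {acct_num: balance} for every account."""
    return dict(conn.execute("SELECT acct_num, balance FROM account ORDER BY acct_num"))
```

When `fail_midway=True`, the debit has already been executed at the moment of the fault, yet the rollback discards it, so a retry starts from the original balances instead of debiting twice.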

### Consistency
Like atomicity, consistency means different things in different contexts. For database transactions, it means that there are provisions to enforce invariants about the stored data, ensuring that the database is valid and rule-compliant both before and after a transaction.

One way to enforce invariants is to declare *constraints* such as foreign key constraints, uniqueness constraints, triggers, etc. However, it is also the application programmer's responsibility to define transactions in a way that preserves consistency.
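As a rough illustration of declared constraints, the sketch below uses Python's built-in `sqlite3` module. The schema (an `email` column, a non-negative `salary`, a `department` lookup table) and the `try_insert` helper are hypothetical and chosen only to exercise one constraint of each kind.

```python
import sqlite3

def make_db():
    """Create a schema with uniqueness, check and foreign key constraints."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
    conn.execute("CREATE TABLE department (name TEXT PRIMARY KEY)")
    conn.execute(
        """CREATE TABLE employee (
               id INTEGER PRIMARY KEY,
               email TEXT NOT NULL UNIQUE,
               salary REAL NOT NULL CHECK (salary >= 0),
               department TEXT NOT NULL REFERENCES department(name)
           )"""
    )
    conn.execute("INSERT INTO department VALUES ('Marketing')")
    conn.commit()
    return conn

def try_insert(conn, email, salary, department):
    """Attempt an insert; report whether the database accepted or rejected it."""
    try:
        with conn:  # roll back automatically if a constraint is violated
            conn.execute(
                "INSERT INTO employee (email, salary, department) VALUES (?, ?, ?)",
                (email, salary, department),
            )
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"
```

The database rejects a duplicate email, a negative salary and an unknown department, but it cannot know about rules that were never declared. Those remain the application's responsibility.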

### Isolation
Isolation accounts for scenarios in which the database is accessed by multiple clients at the same time. When multiple concurrent writes happen, race conditions become possible.

Isolation in the context of transactions means that each transaction is isolated from the others. Concurrent transactions appear to have executed serially, like `synchronized` methods.

Isolation has a performance penalty, so databases provide different levels of isolation. Weaker isolation levels allow concurrent transactions to interfere with each other in limited ways in exchange for better performance.

![Isolation levels](./images/kAIV6an.png)

### Durability
Durability is the promise that once a transaction is committed, any data it has written will be retained even in case of failures such as a database crash. On a single node, this means that a system call like `fsync` is used to ensure that the memory buffer is flushed and its content is written to the disk.

If the database is replicated, durability means that writes are acknowledged by some number of replica nodes.

Perfect durability doesn't exist, though: if the disk is corrupted and the backups are destroyed, there is nothing a database can do.
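The single-node `fsync` step can be sketched as follows. This is a simplified, hypothetical append-only log and not how any specific database lays out its files; `durable_append` and `read_records` are names invented for the example.

```python
import os

def durable_append(path, record):
    """Append one record (bytes) and return only after asking the OS to persist it."""
    with open(path, "ab") as log:
        log.write(record + b"\n")
        log.flush()             # move data from Python's buffer to the operating system
        os.fsync(log.fileno())  # ask the OS to flush its cache to the storage device

def read_records(path):
    """Read back every record, one per line."""
    with open(path, "rb") as log:
        return log.read().splitlines()
```

Without the `os.fsync` call, the write may still sit in the operating system's cache when the function returns, so a crash shortly afterwards could lose a record that was already reported as committed.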

## Isolation Levels
If only read-only statements are involved, there are no concurrency issues. Concurrency issues arise when:
- one transaction reads data written by another
- two transactions modify the same data

## Read Committed
This isolation level promises:
- No **dirty read**: a read sees only data that has been committed. In other words, any write made during a transaction becomes visible to others only when that transaction commits, so data is never read in a partial state. If dirty reads were allowed and the writing transaction was rolled back, other transactions would have read data that was never committed.
- No **dirty write**: a write overwrites only data that has been committed.

![Read Committed](./images/read_committed.png)

### Implementation
To prevent *dirty reads*, one of two approaches is used:
1. **Shared Lock:** to read a row, the database acquires a lock on it and releases the lock as soon as the read finishes, so a reader has to wait for any uncommitted writer of that row. This is the older approach and is used by databases like SQL Server.
2. **MVCC (Multi Version Concurrency Control):** the database maintains multiple versions of a row. Each statement is given a consistent view (a "snapshot") of the data at a specific point in time, determined by a transaction ID or timestamp. Because this is statement-level MVCC, two different `SELECT` statements within the same transaction might get different results. Postgres and Oracle utilise this method.

To prevent *dirty writes*, a transaction must acquire a lock on the affected rows and hold it until it commits or aborts.

### Scenarios
**Scenario 1:** T1 executes:

In [None]:
%%sql

UPDATE employee SET department = 'Marketing' 
WHERE department = 'Sales and Marketing'; # -- Three rows identified

T2 at the same time executes and commits:

In [None]:
%%sql

INSERT INTO employee (name, department)
VALUES ('John Doe', 'Sales and Marketing');

T1's snapshot does not contain the employee row inserted by T2, so T1 modifies only 3 rows. If T1 executes the same `UPDATE` command again after T2 commits, that `UPDATE` will update the 1 row inserted by T2. This happens only once T2 has committed; an `UPDATE` that T1 executes before T2 commits will not see the new row.

**Scenario 2:** Let's say instead of inserting a row, T2 updates rows that had the same department as matched by T1:

In [None]:
%%sql

UPDATE employee SET department = 'Research'
WHERE department = 'Sales and Marketing'
AND join_date < '1994-01-01';  # -- Two rows identified

Whichever `UPDATE` runs second has to wait, because the first `UPDATE` to run acquires locks on the affected rows. The second `UPDATE` proceeds only when the first transaction releases its locks. Once the second transaction acquires the locks, it re-evaluates the condition and updates only the rows that still match. This means that either T1 updates 3 rows (and T2 none), or T2 updates 2 rows (and T1 the remaining 1).

**Scenario 3:** consider two transactions executing the same set of statements:

In [None]:
%%sql

BEGIN TRANSACTION;
UPDATE account SET balance = balance + 100.00 WHERE acct_num = 123; # -- 1
UPDATE account SET balance = balance - 100.00 WHERE acct_num = 789;  # -- 2
COMMIT;

The only two possible execution sequences are `T1(1) -> T1(2) -> T2(1) -> T2(2)` or `T2(1) -> T2(2) -> T1(1) -> T1(2)`. But what if the two transactions executed the two `UPDATE` statements in opposite orders?

In [None]:
%%sql

# -- T1
BEGIN TRANSACTION;
UPDATE account SET balance = balance + 100.00 WHERE acct_num = 123;
UPDATE account SET balance = balance - 100.00 WHERE acct_num = 789;
COMMIT;

# -- T2
BEGIN TRANSACTION;
UPDATE account SET balance = balance - 100.00 WHERE acct_num = 789;
UPDATE account SET balance = balance + 100.00 WHERE acct_num = 123;
COMMIT;

This can result in a deadlock: each transaction holds a row lock that the other needs, so T1 waits for T2 to release its lock while T2 waits for T1 to release its own. Databases like Postgres can detect this deadlock and roll back one of the transactions.
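One way to picture deadlock detection is as a search for a cycle in a "wait-for" graph. The sketch below is a simplified illustration and not Postgres' actual detector; it assumes each blocked transaction waits for exactly one other transaction, and `find_deadlock` is a hypothetical helper.

```python
def find_deadlock(waits_for):
    """Return the transactions forming a cycle in the wait-for graph, or None.

    `waits_for` maps a blocked transaction to the transaction holding the
    lock it needs, e.g. {"T1": "T2"} means T1 is waiting for T2.
    """
    for start in waits_for:
        path = []
        node = start
        # Follow the chain of waits until it ends or revisits a transaction.
        while node in waits_for and node not in path:
            path.append(node)
            node = waits_for[node]
        if node in path:
            return path[path.index(node):]  # the cycle itself
    return None

# T1 waits for T2's lock while T2 waits for T1's lock.
cycle = find_deadlock({"T1": "T2", "T2": "T1"})
```

Once a cycle is found, aborting any one transaction in it breaks the circular wait and lets the others proceed.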

**Scenario 4:** two `UPDATE` statements interleaved with an `INSERT`:

In [None]:
%%sql

# -- T1
BEGIN TRANSACTION;
UPDATE employee SET department = 'Marketing' 
WHERE department = 'Sales and Marketing'; # -- 1
UPDATE employee SET department = 'Marketing' 
WHERE department = 'Sales and Marketing'; # -- 2
COMMIT;

# -- T2
BEGIN TRANSACTION;
INSERT INTO employee (name, department, join_date) 
VALUES ('Morgan', 'Sales and Marketing', '2012-08-18'); # -- 1
COMMIT;

If the execution order is `T1(1) -> T2(1) -> T2(commit) -> T1(2)` then the second `UPDATE` of T1 is aware of the newly inserted row of T2.

### Issues
**Non-repeatable read:** a transaction reads the same row twice and gets different data each time because another transaction has modified (updated or deleted) the data in the meantime.

In [None]:
%%sql

# -- T1
BEGIN TRANSACTION;
UPDATE employee SET salary = salary * 1.15 WHERE id = 5; # -- 1
COMMIT;

# -- T2
START TRANSACTION;
SELECT AVG(salary) FROM employee; # -- 1
SELECT MAX(salary) FROM employee; # -- 2
COMMIT;

If the order of execution is `T2(1) -> T1(1) -> T1(commit) -> T2(2)`, then the average and the maximum are computed from different data and are not in sync with each other.

**Read skew:** related to *non-repeatable read*. Here a transaction running two different queries reads inconsistent data because, between the first and second queries, other transactions insert, update or delete data and commit.

In [None]:
%%sql

# -- Starting state
# --  player | hp
# -- --------+---------
# --   alice | 100
# --     bob | 5000
# --     mal | 10000

# -- T1
BEGIN TRANSACTION;
UPDATE players SET hp = hp - 1000 WHERE player = 'bob'; # -- 1
COMMIT;

# -- T2
BEGIN TRANSACTION;
UPDATE players SET hp = hp * 1.1 WHERE player IN ( # -- 1
    SELECT player FROM players WHERE hp >= 5000
);
COMMIT;

If the two transactions didn't interleave, bob's hp would end up as either 4000 (T1 runs first) or 4500 (T2 runs first). Now consider the case where their executions overlap. At the start, both transactions see bob's hp as 5000. If the order of execution is `T1(1) -> T2(1) -> T1(commit) -> T2(commit)`, the `UPDATE` statement in T2 waits until T1 commits, at which point bob's hp is 4000. T2 then still updates bob's row (since its `SELECT` statement had already identified bob as a candidate), leaving bob's hp at 4400, a value that neither serial order produces.

Another example involving account balance update:

In [None]:
%%sql

# -- Starting balance 500 for both accounts
# -- T1
BEGIN TRANSACTION;
SELECT balance FROM account WHERE acct_num = 123; # -- 1
SELECT balance FROM account WHERE acct_num = 789; # -- 2
# -- sum both balances 
COMMIT;

#-- T2
BEGIN TRANSACTION;
UPDATE account SET balance = balance + 100 WHERE acct_num = 123; # -- 1
UPDATE account SET balance = balance + 100 WHERE acct_num = 789; # -- 2
COMMIT;

The sum of the balances of the two accounts should be either 1000 or 1200. But if the execution order is `T1(1) -> T2(1) -> T2(2) -> T2(commit) -> T1(2)`, T1 reads 500 for the first account and 600 for the second, so the sum is shown as 1100.

**Phantom read:** occurs when a query is run twice with the same criteria, but returns a different set of rows the second time because another transaction has inserted or deleted rows that match the query condition in the meantime.

In [None]:
%%sql

# -- T1
BEGIN TRANSACTION;
SELECT COUNT(*) FROM appointments WHERE doctor_id = 1 AND status = 'FREE'; # -- 1
SELECT COUNT(*) FROM appointments WHERE doctor_id = 1 AND status = 'FREE'; # -- 2
COMMIT;

# -- T2
BEGIN TRANSACTION;
INSERT INTO appointments (appt_id, doctor_id, appt_time, status) 
VALUES (3, 1, '2025-07-01 11:00:00', 'FREE'); # -- 1
COMMIT;

If the order of execution is `T1(1) -> T2(1) -> T2(commit) -> T1(2)` then for the same criteria, we get two different counts.

**Lost Updates:** happen when two or more transactions read the same data and then overwrite each other’s updates, causing one transaction’s changes to be silently lost.

In [None]:
%%sql

# -- T1
BEGIN TRANSACTION;
SELECT value FROM metrics WHERE counter_name = 'visits'; # -- 1
# -- say the value was 100, now we need to increment it to 101
UPDATE metrics SET value = 101 WHERE counter_name = 'visits'; # -- 2
COMMIT;

#-- T2
BEGIN TRANSACTION;
SELECT value FROM metrics WHERE counter_name = 'visits'; # -- 1
# -- say the value was 100, now we need to increment it to 101
UPDATE metrics SET value = 101 WHERE counter_name = 'visits'; # -- 2
COMMIT;

With the order of execution `T1(1) -> T2(1) -> T1(2) -> T1(commit) -> T2(2) -> T2(commit)`, both transactions set the value to 101 instead of it ending up as 102. Essentially, the increment of one transaction is lost.
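The interleaving can be replayed in a small sketch using Python's built-in `sqlite3` module. It runs on a single connection and only mimics the order of reads and writes, so it is not a real concurrency test. The second function shows one common mitigation, which is to let the database compute the increment in a single `UPDATE` rather than in application code; that this avoids the anomaly under read committed relies on the row-locking behaviour described earlier.

```python
import sqlite3

def make_db():
    """Create the metrics table with the 'visits' counter at 100."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE metrics (counter_name TEXT PRIMARY KEY, value INTEGER NOT NULL)")
    conn.execute("INSERT INTO metrics VALUES ('visits', 100)")
    conn.commit()
    return conn

def read_counter(conn):
    return conn.execute("SELECT value FROM metrics WHERE counter_name = 'visits'").fetchone()[0]

def read_modify_write(conn):
    """Replay T1(1) -> T2(1) -> T1(2) -> T2(2): both writes are based on a stale read."""
    t1_seen = read_counter(conn)
    t2_seen = read_counter(conn)
    conn.execute("UPDATE metrics SET value = ? WHERE counter_name = 'visits'", (t1_seen + 1,))
    conn.execute("UPDATE metrics SET value = ? WHERE counter_name = 'visits'", (t2_seen + 1,))
    conn.commit()
    return read_counter(conn)

def atomic_increment(conn):
    """Two increments computed by the database itself, with no application-side read."""
    for _ in range(2):
        conn.execute("UPDATE metrics SET value = value + 1 WHERE counter_name = 'visits'")
    conn.commit()
    return read_counter(conn)
```

Starting from 100, `read_modify_write` ends at 101 while `atomic_increment` ends at 102.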

**Write Skew:** occurs when two transactions read the same data, but then update different rows based on a business rule that involves both. It is a more general form of lost update.

In [None]:
%%sql

# -- T1
BEGIN TRANSACTION;
SELECT COUNT(*) FROM doctors WHERE status = 'ON'; # -- 1
# -- say the count is >= 2, proceed with update
UPDATE doctors SET status = 'OFF' 
WHERE name = 'Alice'; # -- 2
COMMIT;

# -- T2
BEGIN TRANSACTION;
SELECT COUNT(*) FROM doctors WHERE status = 'ON'; # -- 1
# -- say the count is >= 2, proceed with update
UPDATE doctors SET status = 'OFF' 
WHERE name = 'Max'; # -- 2
COMMIT;

Suppose only Alice and Max are on call, so both `SELECT` statements read the count as 2. Both transactions therefore continue with their `UPDATE`. Now no doctor is on call, and the business rule is violated.
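The anomaly can be replayed without a database. The sketch below assumes the business rule is "at least one doctor must remain on call" and models each transaction as deciding on what it read earlier and then writing to the live data; `go_off_call` and both `run_*` helpers are hypothetical.

```python
def go_off_call(snapshot, live, name, minimum_on_call=1):
    """Decide using `snapshot` (what the transaction read), then write to `live`."""
    on_call = sum(1 for status in snapshot.values() if status == "ON")
    if on_call - 1 >= minimum_on_call:
        live[name] = "OFF"
        return True
    return False

def run_interleaved():
    """Both transactions read before either one writes."""
    doctors = {"Alice": "ON", "Max": "ON"}
    t1_snapshot = dict(doctors)
    t2_snapshot = dict(doctors)
    go_off_call(t1_snapshot, doctors, "Alice")
    go_off_call(t2_snapshot, doctors, "Max")
    return doctors

def run_serially():
    """The second transaction reads only after the first has finished."""
    doctors = {"Alice": "ON", "Max": "ON"}
    go_off_call(dict(doctors), doctors, "Alice")
    go_off_call(dict(doctors), doctors, "Max")
    return doctors
```

In the interleaved run both doctors end up `OFF`, whereas in the serial run Max's request is refused. The two transactions write different rows, so row locks alone never make them conflict.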

## Repeatable Read
The ANSI standard defines repeatable read as an isolation level that disallows dirty reads and non-repeatable reads but allows phantom reads. Database implementations differ from this, however, and often offer stricter guarantees. This is mostly because locks were the primary way of implementing isolation levels when the standard was written, whereas modern databases also utilise MVCC.

Databases like Postgres also prevent phantom reads at this level, whereas Oracle doesn't have a repeatable read isolation level at all; its serializable level is equivalent to Postgres' repeatable read. Due to these differences, it is hard to say what exactly the repeatable read level guarantees without consulting the specific database.

### Snapshot Isolation
Each transaction is given a consistent snapshot of the database: everything that was committed before the transaction began. Snapshot isolation is implemented using MVCC (Multi Version Concurrency Control). Each row is tagged with a transaction ID, and there can be multiple versions of the same row, each written by a different transaction.

The repeatable read isolation level uses transaction-level MVCC, as opposed to the statement-level MVCC used by the read committed level. The difference is basically the question - "Do I take one photo at the start of the whole transaction, or a new photo for every single command?".
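The "one photo per transaction or one photo per statement" distinction can be sketched with a toy version store. This is a deliberately simplified model (a single logical clock, only committed versions, no garbage collection), and `VersionedStore` and `Transaction` are hypothetical names rather than any database's API.

```python
from dataclasses import dataclass, field

@dataclass
class VersionedStore:
    """Toy multi-version store: each key maps to a list of (commit_ts, value)."""
    clock: int = 0
    versions: dict = field(default_factory=dict)

    def commit_write(self, key, value):
        """Commit a new version of `key` and return its commit timestamp."""
        self.clock += 1
        self.versions.setdefault(key, []).append((self.clock, value))
        return self.clock

    def read_as_of(self, key, snapshot_ts):
        """Return the newest value committed at or before `snapshot_ts`."""
        visible = [value for ts, value in self.versions.get(key, []) if ts <= snapshot_ts]
        return visible[-1] if visible else None

class Transaction:
    def __init__(self, store, level):
        self.store = store
        self.level = level  # "read committed" or "repeatable read"
        self.snapshot_ts = None

    def select(self, key):
        # Read committed: take a fresh snapshot for every statement.
        # Repeatable read: take one snapshot at the first statement and reuse it.
        if self.level == "read committed" or self.snapshot_ts is None:
            self.snapshot_ts = self.store.clock
        return self.store.read_as_of(key, self.snapshot_ts)
```

If another transaction commits a new version of a key between two `select` calls, the read committed transaction sees the new value on its second call, while the repeatable read transaction keeps returning the value from its original snapshot.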

### Scenarios
**Scenario 1:** the first transaction inserts a new row and commits. At the same time, the second transaction executes an `UPDATE` statement:

In [None]:
%%sql

# -- T1
BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
INSERT INTO employee (name, department)
VALUES ('John Doe', 'Sales and Marketing'); # -- 1
COMMIT;

# -- T2
BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
UPDATE employee SET department = 'Marketing' 
WHERE department = 'Sales and Marketing'; # -- 1
COMMIT;

Given both transactions begin at the same moment and the execution order is `T1(1) -> T1(commit) -> T2(1)`, the second transaction's `UPDATE` will see the newly inserted row. However, if the execution order is `T1(1) -> T2(1) -> T1(commit)`, the new row is not visible to T2. This is because the snapshot is taken when the transaction's first query or data-modification statement is executed (not at `BEGIN`).

If we add a `SELECT * FROM employee` statement before the `UPDATE` in T2 and the order of execution is `T1(1) -> T2(1) -> T1(commit) -> T2(2)`, the newly inserted row is still not visible to T2, because T2's snapshot was taken when its `SELECT` ran.

**Scenario 2:** what happens when two concurrent transactions T1 and T2 update the same set of rows? As with read committed, repeatable read mode also utilises write locks, so one of the transactions has to wait for the other to commit. However, in this case T2's snapshot doesn't reflect the changes just committed by T1. A database like Postgres detects that T2's view of the world is now invalid (because it missed T1's committed changes but touched overlapping data), throws an error ("ERROR: could not serialize access due to concurrent update") and rolls T2 back. The right response is to retry T2, which is why the advice is to retry transactions on failure when repeatable read mode is used. Lost updates are therefore not a problem in repeatable read mode.
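A retry wrapper along these lines is one way to act on that advice. The sketch is driver-agnostic: `SerializationFailure` is a hypothetical stand-in for whatever exception a real driver raises for this error (Postgres reports it with SQLSTATE `40001`), and `make_flaky_transaction` only fakes a transaction body that conflicts a fixed number of times.

```python
import time

class SerializationFailure(Exception):
    """Hypothetical stand-in for a driver's serialization-conflict error."""

def run_with_retry(txn_body, max_attempts=3, backoff_seconds=0.0):
    """Call txn_body() until it succeeds or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_body()
        except SerializationFailure:
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
            time.sleep(backoff_seconds * attempt)  # simple linear backoff

def make_flaky_transaction(failures_before_success):
    """Build a fake transaction body that conflicts a fixed number of times."""
    state = {"calls": 0}

    def body():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise SerializationFailure("could not serialize access due to concurrent update")
        return state["calls"]  # the attempt number that finally succeeded

    return body
```

The whole transaction body, including its reads, has to be re-executed on each attempt so that it works from a fresh snapshot; replaying only the failed `UPDATE` would reuse stale values.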

**Scenario 3:** *lost update* is not an issue in repeatable read mode as conflicting writes result in error as stated above. One transaction will error out in the example below:

In [None]:
%%sql

# -- T1
BEGIN TRANSACTION;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT value FROM metrics WHERE counter_name = 'visits'; # -- 1
# -- say the value was 100, now we need to increment it to 101
UPDATE metrics SET value = 101 WHERE counter_name = 'visits'; # -- 2
COMMIT;

#-- T2
BEGIN TRANSACTION;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT value FROM metrics WHERE counter_name = 'visits'; # -- 1
# -- say the value was 100, now we need to increment it to 101
UPDATE metrics SET value = 101 WHERE counter_name = 'visits'; # -- 2
COMMIT;

**Scenario 4:** *phantom read* is also not an issue. The two `SELECT` statements return the same value even though the `INSERT` statement was committed in between:

In [None]:
%%sql

# -- T1
BEGIN TRANSACTION;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT SUM(balance) FROM accounts; # -- 1
SELECT SUM(balance) FROM accounts; # -- 2
COMMIT;

#-- T2
BEGIN TRANSACTION;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
INSERT INTO accounts (acct_num, balance) VALUES (456, 150.0); # -- 1
COMMIT;

### Issues
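**Write skew:** is still an issue even at the repeatable read isolation level. Two transactions that run the statements below concurrently can both see a count of 0 and both insert a booking for the same room and time slot.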

In [None]:
%%sql

BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT COUNT(*) FROM bookings 
    WHERE room_id = 12 
    AND start_time >= '2025-05-04T12:00' 
    AND end_time <= '2025-05-04T13:00';
# -- If previous query returned 0, then proceed with insert statement below:
INSERT INTO bookings (room_id, start_time, end_time) VALUES (12, '2025-05-04T12:00', '2025-05-04T13:00');
COMMIT;

In the above example, not even the `SELECT ... FOR UPDATE` construct would help, as the query matches 0 rows and so there is nothing for it to lock.

## Serializable
Serializable is the strictest isolation level. It emulates serial execution for all committed transactions, as if they had been executed one after another rather than concurrently. Serializable mode is implemented in the following ways:
1. **Serial execution:** run transactions in a single thread.
2. **Two-phase locking (2PL):** a stricter form of locking in which multiple transactions may read the same object concurrently, but writing requires an exclusive lock. This means that,
    - If transaction A has read an object and transaction B wants to write to the same object, then B must wait till A commits (or aborts)
    - If transaction A has written to an object and transaction B wants to read or write it, B must wait till A commits (or aborts)

   Therefore, writers block other writers and readers, and readers block writers. This is implemented by having locks operate in two modes, shared and exclusive:

   ![Lock Modes](./images/lock_mode.png)
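The blocking rules of two-phase locking can be sketched as a small lock table, assuming the two modes are shared (for reads) and exclusive (for writes). `LockTable` is a hypothetical illustration: where a real database would make the requester wait, it simply returns `False`.

```python
class LockTable:
    """Toy lock manager with shared and exclusive modes."""

    def __init__(self):
        self.holders = {}  # object -> {transaction: mode}

    def can_grant(self, obj, txn, mode):
        """Check `mode` against the locks held by *other* transactions on `obj`."""
        others = {t: m for t, m in self.holders.get(obj, {}).items() if t != txn}
        if mode == "shared":
            return all(m == "shared" for m in others.values())
        return not others  # exclusive requires that nobody else holds any lock

    def acquire(self, obj, txn, mode):
        """Grant the lock and return True, or return False if txn would have to wait."""
        if not self.can_grant(obj, txn, mode):
            return False
        held = self.holders.setdefault(obj, {})
        if held.get(txn) != "exclusive":  # never downgrade an exclusive lock
            held[txn] = mode
        return True

    def release_all(self, txn):
        """Release every lock held by `txn`, as happens at commit or abort."""
        for held in self.holders.values():
            held.pop(txn, None)
```

Several transactions can hold a shared lock on the same object at once, but an exclusive lock is granted only when no other transaction holds any lock on it. Releasing locks only at commit or abort is what makes readers block writers and writers block everyone else for the rest of the transaction.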