# Introduction

- a database must ensure ACID properties despite concurrent execution of transactions and the possibility of failures

- **concurrency control** is the mechanism that provides transaction isolation, whose purpose is to avoid **isolation anomalies**
    - *phantom reads*
    - *non-repeatable reads*
    - *dirty reads*


- serializable isolation prevents all three types of anomalies; SQL-92 also defines weaker isolation levels
    - *repeatable read*
    - *read committed*
    - *read uncommitted*


# Pessimistic vs Optimistic Protocols

- there are two paradigms for designing concurrency control protocols



- **Pessimistic**: check for synchronization conflicts during transaction execution, and if a conflict is detected, then **delay** a data operation or **abort** an entire transaction
    - avoids wasted work when *conflicts occur frequently*
    - example 1: two-phase locking protocol (delay on conflict)
    - example 2: timestamp ordering protocol (abort on conflict)


- **Optimistic**: allow transactions to proceed unchecked and detect synchronization conflicts when a transaction attempts to commit
    - reduces overhead when *conflicts occur infrequently*
    - example: backward validation protocol
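
A minimal sketch of the optimistic idea, assuming the common formulation of backward validation: at commit time, the transaction's read set is compared with the write sets of transactions that committed after it started. The names `Txn`, `Validator`, `begin`, and `try_commit` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Txn:
    name: str
    start_ts: int                                  # logical time at which the transaction began
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)


class Validator:
    """Toy backward validation (assumed formulation, not a full protocol)."""

    def __init__(self):
        self.clock = 0
        self.committed = []                        # (commit_ts, write_set) pairs

    def begin(self, name):
        self.clock += 1
        return Txn(name, self.clock)

    def try_commit(self, txn):
        # Conflicts are checked only here, never during execution.
        for commit_ts, write_set in self.committed:
            if commit_ts > txn.start_ts and write_set & txn.read_set:
                return False                       # txn read something overwritten since it began
        self.clock += 1
        self.committed.append((self.clock, frozenset(txn.write_set)))
        return True


v = Validator()
t1 = v.begin("T1")
t2 = v.begin("T2")
t1.read_set.add("A"); t1.write_set.add("A")
t2.read_set.add("A"); t2.write_set.add("B")
r1 = v.try_commit(t1)   # nothing committed since T1 began, so T1 commits
r2 = v.try_commit(t2)   # T1 wrote A after T2 began and T2 read A, so T2 must abort
```

No work is spent on synchronization while the transactions run; the cost is that `T2`'s work is thrown away when validation fails.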





# Lock-Based Protocols

- a **lock** is a mechanism for restricting concurrent access to a data item (e.g., a table row, a range of rows, or an entire table)


- assuming data items are table rows and can be locked in only two modes
    1. **exclusive (X)** mode
        - transaction that owns the lock can *read or write* the data item
        - X-lock is requested using the **lock-X** instruction
    2. **shared (S)** mode
        - transaction that owns the lock can *only read* the data item
        - S-lock is requested using the **lock-S** instruction


- transactions request locks from a lock manager; the lock manager may force a transaction to wait until a conflicting lock is released, or it may decide to abort the transaction (e.g., to avoid deadlock)


- the **lock-compatibility matrix** for shared and exclusive locks shows which combinations of locks can be granted at the same time and which cannot:

<img src="img/Snip20191123_14.png" width=30%/>

- a transaction may be granted a lock on an item if the requested lock is compatible with all locks already held on that item by other transactions
    - any number of transactions can hold S-locks on an item
    - if any transaction holds an X-lock on the item, no other transaction may hold any lock on the item
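
The granting rule above can be sketched as a small lookup table. The names `COMPATIBLE` and `can_grant` are illustrative and not part of any particular system.

```python
# Compatibility of a requested mode with a mode already held by another transaction.
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}


def can_grant(requested, held_by_others):
    """True if `requested` is compatible with every lock held by other transactions."""
    return all(COMPATIBLE[(requested, held)] for held in held_by_others)


many_readers = can_grant("S", ["S", "S"])   # any number of S-locks can coexist
writer_blocked = can_grant("X", ["S"])      # an X-lock conflicts with any other lock
free_item = can_grant("X", [])              # no locks held, so anything is grantable
```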

<img src="img/Snip20191123_15.png" width=60%/>

- **locking alone does not guarantee serializability**
- if A or B is updated between the read of A and the read of B, then the displayed sum would be incorrect
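
A worked sketch of how the displayed sum can go wrong even though every access is made under a lock. The balances (100 and 200), the transfer amount (50), and the exact interleaving are hypothetical. The locking steps appear only as comments because the steps run one after another here.

```python
def display_sum_with_early_unlock():
    db = {"A": 100, "B": 200}          # hypothetical balances; the true total is 300

    # T2: lock-S(A); read(A); unlock(A)   <- lock on A released before B is read
    a_seen = db["A"]

    # T1 runs in the gap: lock-X(B); lock-X(A); move 50 from B to A; unlock both
    db["B"] -= 50
    db["A"] += 50

    # T2: lock-S(B); read(B); unlock(B); display(A + B)
    b_seen = db["B"]
    return a_seen + b_seen, db["A"] + db["B"]


shown, actual = display_sum_with_early_unlock()   # shown is 250, actual is 300
```

The displayed value mixes the old A with the new B, which no serial order of the two transactions could produce.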

- a **locking protocol** is a set of rules followed by all transactions while requesting and releasing locks; locking protocols restrict the set of possible schedules

## Deadlock

<img src="img/Snip20191123_16.png" width=60%/>

- neither $T_3$ nor $T_4$ can make progress: executing `lock-S(B)` causes $T_4$ to wait for $T_3$ to release its lock on B, while executing `lock-X(A)` causes $T_3$ to wait for $T_4$ to release its lock on A
- to resolve such a deadlock, one of $T_3$ or $T_4$ must be rolled back and its locks released
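
One common way to detect this situation is to look for a cycle in a wait-for graph. The notes do not prescribe a detection method, so this is a sketch under that assumption. The function name `find_cycle` and the dictionary encoding of the graph are illustrative.

```python
def find_cycle(waits_for):
    """Return a list of transactions that form a cycle, or None if there is none.

    `waits_for` maps each transaction to the transactions it is waiting on.
    """
    visiting, done = [], set()

    def dfs(t):
        if t in done:
            return None
        if t in visiting:
            return visiting[visiting.index(t):]    # the part of the path that loops
        visiting.append(t)
        for u in waits_for.get(t, ()):
            cycle = dfs(u)
            if cycle:
                return cycle
        visiting.pop()
        done.add(t)
        return None

    for t in list(waits_for):
        cycle = dfs(t)
        if cycle:
            return cycle
    return None


# T4 waits for T3 (lock on B) and T3 waits for T4 (lock on A).
deadlocked = find_cycle({"T3": ["T4"], "T4": ["T3"]})

# After rolling back T4 and releasing its locks, T4 no longer waits on anything.
resolved = find_cycle({"T3": ["T4"], "T4": []})
```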

## Starvation

- a transaction does not make progress because it cannot obtain a particular lock
    - e.g., a transaction may be waiting for an X-lock on an item, while a sequence of other transactions request and are granted an S-lock on the same item
- in practical terms, starvation means that a transaction appears to hang, just as it does under deadlock
- an ideal lock manager prevents both deadlock and starvation


# The Two-Phase Locking Protocol

- **Two-phase locking (2PL)**: a popular pessimistic protocol that ensures conflict-serializable schedules
- high-level idea
    - **phase 1**: growing phase
        - transaction may acquire locks, but may not release locks
    - **phase 2**: shrinking phase
        - transaction may release locks but not acquire locks
- the sequence of instructions executed by a transaction determines which items must be locked and in what mode (e.g., request X-lock before writing an item)



- to access a data object, a transaction must hold a sufficiently strong lock on that object; an exclusive lock must be held for writing, but a shared lock is sufficient for reading
- the protocol allows **lock conversions**
    - during the growing phase a transaction may **upgrade** a shared lock to an exclusive lock without releasing the lock
    - during the shrinking phase a transaction may **downgrade** an exclusive lock to a shared lock without releasing the lock


- "two-phase" means that after a transaction releases or downgrades any lock, it cannot acquire or upgrade the same lock or any other lock
    - all `lock-S`, `lock-X`, and `upgrade` instructions must precede any `unlock` and `downgrade` instructions within the same transaction


- the protocol does not specify exactly when locks are acquired or in what mode; in principle 2PL allows locks to be acquired long before they are needed, and released long after they are no longer needed
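
The two-phase rule can be checked mechanically on the instruction sequence of a single transaction. A minimal sketch, in which the tuple encoding of instructions and the name `is_two_phase` are assumptions made for illustration:

```python
ACQUIRE = {"lock-S", "lock-X", "upgrade"}     # allowed only in the growing phase
RELEASE = {"unlock", "downgrade"}             # the first of these starts the shrinking phase


def is_two_phase(ops):
    """ops: list of (instruction, item) pairs issued by one transaction, in order."""
    shrinking = False
    for instruction, _item in ops:
        if instruction in RELEASE:
            shrinking = True
        elif instruction in ACQUIRE and shrinking:
            return False                      # acquired or upgraded after releasing
    return True


ok = [("lock-S", "A"), ("lock-X", "B"), ("upgrade", "A"),
      ("unlock", "A"), ("unlock", "B")]
bad = [("lock-S", "A"), ("unlock", "A"), ("lock-S", "B"), ("unlock", "B")]

ok_result = is_two_phase(ok)     # True
bad_result = is_two_phase(bad)   # False: lock-S(B) comes after unlock(A)
```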

## 2PL Properties

- **deadlock is possible under 2PL**
- **cascading rollback is possible under 2PL**


- to avoid cascading rollback, one can follow a modified protocol called **strict two-phase locking (S2PL)**
    - a transaction *holds* all its exclusive locks until it commits or aborts
    - ensures schedules are both cascadeless and recoverable


- many database systems actually use a slightly stronger protocol called **rigorous two-phase locking (R2PL)**
    - a transaction holds all its locks (shared or exclusive) until it commits or aborts
    - compared to S2PL, parallelism is reduced but the implementation is simpler



# Automatic Acquisition of Locks

- using **automatic acquisition of locks**, a transaction $T_i$ issues the usual read/write instructions without explicit locking calls


- the instruction `read(D)` is processed as

<img src="img/Snip20191123_17.png" width=80%/>

- the instruction `write(D)` is processed as

<img src="img/Snip20191123_18.png" width=80%/>

- note: all locks are released after commit or abort (for rigorous 2PL)
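
A single-threaded sketch of automatic lock acquisition under rigorous 2PL. It does not reproduce the figures. It assumes the usual rules: a read needs at least an S-lock, a write needs an X-lock, and an S-lock is upgraded when necessary. The names `AutoLockingDB`, `WouldBlock`, and `finish` are hypothetical. Where a real lock manager would make the transaction wait, this sketch raises an exception instead. Undoing writes on abort is not modeled.

```python
class WouldBlock(Exception):
    """Raised where a real lock manager would make the transaction wait."""


class AutoLockingDB:
    def __init__(self, data):
        self.data = dict(data)
        self.locks = {}                        # item -> {transaction: "S" or "X"}

    def _others(self, txn, item):
        return [m for t, m in self.locks.get(item, {}).items() if t != txn]

    def read(self, txn, item):
        held = self.locks.setdefault(item, {})
        if txn not in held:                    # no lock yet: acquire an S-lock first
            if "X" in self._others(txn, item):
                raise WouldBlock(f"{txn} must wait for the X-lock on {item}")
            held[txn] = "S"
        return self.data[item]

    def write(self, txn, item, value):
        held = self.locks.setdefault(item, {})
        if held.get(txn) != "X":               # need a fresh X-lock or an upgrade from S
            if self._others(txn, item):
                raise WouldBlock(f"{txn} must wait for other locks on {item}")
            held[txn] = "X"
        self.data[item] = value

    def finish(self, txn):
        """Commit or abort: release every lock held by txn in one step."""
        for held in self.locks.values():
            held.pop(txn, None)


db = AutoLockingDB({"D": 1})
db.read("T1", "D")                 # T1 implicitly takes an S-lock on D
db.read("T2", "D")                 # T2 shares it
try:
    db.write("T1", "D", 2)         # upgrade blocked while T2 holds its S-lock
    blocked = False
except WouldBlock:
    blocked = True
db.finish("T2")                    # T2 ends and releases all of its locks
db.write("T1", "D", 2)             # now the upgrade to X succeeds
```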


# Multiple Granularity Locking

- allow data items to be of various sizes and define a hierarchy of data granularities
- **when a transaction locks a node in the tree explicitly, it *implicitly* locks all the node's descendants in the *same* mode**


- **Granularity of locking**: the level in the hierarchy, represented as a tree, where locking is done
    - **fine granularity** (lower in the hierarchy): greater concurrency, higher locking overhead (e.g., locking an individual record)
    - **coarse granularity** (higher in the hierarchy): less concurrency, lower locking overhead (e.g., locking the entire database)

<img src="img/Snip20191123_19.png" width=80%/>

# Intention Lock Modes

- in addition to S and X lock modes, there are three additional lock modes to support multiple granularity
    - **intention-shared** (IS): indicates intent to lock explicitly at a lower level of the tree but only with shared locks
    - **intention-exclusive** (IX): indicates intent to lock explicitly at a lower level of the tree with exclusive or shared locks
    - **shared and intention-exclusive** (SIX): locks the subtree rooted at a given node explicitly in shared mode and indicates intent to lock explicitly at a lower level with exclusive locks

- [***Memorize***] intention locks allow a higher-level node to be locked in S or X mode without having to check all descendant nodes
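
The compatibility rules among the five modes can be sketched as a lookup table. The entries below follow the standard textbook matrix for multiple granularity locking and are an assumption to be checked against the figure in these notes. The names `COMPATIBLE_WITH`, `compatible`, and `can_grant_node` are illustrative.

```python
MODES = ["IS", "IX", "S", "SIX", "X"]

# For each requested mode, the modes other transactions may already hold on the same node.
COMPATIBLE_WITH = {
    "IS":  {"IS", "IX", "S", "SIX"},
    "IX":  {"IS", "IX"},
    "S":   {"IS", "S"},
    "SIX": {"IS"},
    "X":   set(),
}


def compatible(requested, held):
    return held in COMPATIBLE_WITH[requested]


def can_grant_node(requested, held_by_others):
    """True if `requested` is compatible with every mode held by other transactions."""
    return all(compatible(requested, held) for held in held_by_others)


# A table-level S request only needs to inspect locks on the table node itself:
# an IX held there already signals that some descendant is, or will be, X-locked.
s_vs_ix = can_grant_node("S", ["IX"])          # conflict
s_vs_is = can_grant_node("S", ["IS", "S"])     # readers coexist
```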

<img src="img/Snip20191123_20.png" width=50%/>

# Multiple Granularity: Protocol Details

- transaction $T_i$ can lock a node $Q$ in the tree using the following rules
    1. the lock compatibility matrix must be observed
    2. the root of the tree must be locked first, and may be locked in any mode
    3. a node $Q$ can be locked by transaction $T_i$ in S or IS mode only if the parent of $Q$ is currently locked by $T_i$ in either IX or IS mode
    4. a node $Q$ can be locked by transaction $T_i$ in X, IX, or SIX mode only if the parent of $Q$ is currently locked by $T_i$ in either IX or SIX mode
    5. transaction $T_i$ can lock a node only if it has not previously unlocked any node (same as in 2PL)
    6. transaction $T_i$ can unlock a node $Q$ only if none of the children of $Q$ are currently locked by $T_i$
    

- **locks are acquired in root-to-leaf order; whereas they are released in leaf-to-root order**
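
A sketch of how a transaction might plan its requests for one target node: intention locks on every ancestor, then the real lock on the target, with unlocking in the reverse order. The hierarchy names (`database`, `area`, `file`, `record`) and the function names are hypothetical. `PARENT_MODES` encodes rules 3 and 4 above.

```python
def lock_plan(path, mode):
    """path: node names from the root down to the target; mode: 'S' or 'X' on the target.

    Returns the (node, mode) requests in the order they should be issued.
    """
    intention = {"S": "IS", "X": "IX"}[mode]
    plan = [(node, intention) for node in path[:-1]]   # intention locks on ancestors
    plan.append((path[-1], mode))                      # the actual lock on the target
    return plan


def unlock_order(plan):
    """Leaf-to-root release order for a plan produced by lock_plan."""
    return [node for node, _mode in reversed(plan)]


# Modes the parent must be held in before a child can be locked in a given mode.
PARENT_MODES = {
    "S": {"IS", "IX"}, "IS": {"IS", "IX"},
    "X": {"IX", "SIX"}, "IX": {"IX", "SIX"}, "SIX": {"IX", "SIX"},
}


def obeys_parent_rule(plan):
    return all(parent_mode in PARENT_MODES[child_mode]
               for (_p, parent_mode), (_c, child_mode) in zip(plan, plan[1:]))


path = ["database", "area", "file", "record"]
write_plan = lock_plan(path, "X")
read_plan = lock_plan(path, "S")
release = unlock_order(write_plan)
```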

- **lock granularity escalation**: if a transaction holds too many locks at a particular level, it can switch to a single coarser-granularity S or X lock higher in the hierarchy

# Implementation of Locking 

- transactions send lock and unlock requests to a **lock manager**
- the lock manager replies to a lock request with a lock grant message, or a message asking the transaction to roll back instead (to avoid deadlock)
- the requesting transaction waits until its request is answered
- the lock manager maintains a data-structure called a **lock table** to record granted locks and pending requests
- the lock table is usually implemented as an in-memory hash table indexed on the name of the data item being locked
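
A minimal in-memory sketch of such a lock table, assuming S and X modes only and a first-come, first-served queue per item. The class and method names are hypothetical. Lock upgrades, deadlock handling, and real blocking are left out.

```python
from collections import deque

COMPATIBLE = {("S", "S")}          # the only compatible pair among S and X


class LockTable:
    def __init__(self):
        self.table = {}            # item name -> deque of [txn, mode, granted]

    def request(self, txn, item, mode):
        """Queue a request; return True if granted immediately, False if it must wait."""
        queue = self.table.setdefault(item, deque())
        # Grant only if compatible with every earlier request, granted or still waiting,
        # so that later requests cannot overtake a waiting one.
        granted = all((mode, m) in COMPATIBLE for _t, m, _g in queue)
        queue.append([txn, mode, granted])
        return granted

    def release(self, txn, item):
        """Drop txn's entries for item; return the transactions granted as a result."""
        remaining = deque(e for e in self.table[item] if e[0] != txn)
        self.table[item] = remaining
        held, newly_granted = [], []
        for entry in remaining:
            if not entry[2]:
                if not all((entry[1], m) in COMPATIBLE for m in held):
                    break          # keep queue order: stop at the first blocked request
                entry[2] = True
                newly_granted.append(entry[0])
            held.append(entry[1])
        return newly_granted


lt = LockTable()
g1 = lt.request("T1", "A", "S")    # granted
g2 = lt.request("T2", "A", "X")    # waits behind T1
g3 = lt.request("T3", "A", "S")    # compatible with T1 but queued behind T2
after_t1 = lt.release("T1", "A")   # T2 gets its X-lock
after_t2 = lt.release("T2", "A")   # then T3 gets its S-lock
```

Making `T3` wait behind `T2` is one simple way to keep a stream of readers from starving a writer.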

## Lock Table

<img src="img/Snip20191123_21.png"/>