# **DBMS – Backup & Recovery / Recovery (Part 2)**

## **Learning Outcomes**

1. **Understand Transactional Logging with Hot Backup**
2. **Learn about concurrent transactions and understand the recovery algorithms**

---

## **1. Transactional Logging with Hot Backup**

### **1.1 Recap**

* **Backup** is essential to preserve:

  * **Correctness** of data
  * **Consistency** of the database
* Failures can occur due to:

  * System failures
  * Power outages
  * Disk crashes
  * Other unexpected events
* Goal: Maintain **Atomicity**, **Consistency**, and **Durability** (the A, C, and D of ACID).
* Recovery methods use:

  * **Volatile storage** (RAM – lost on crash)
  * **Non-volatile storage** (HDD/SSD – survives crash)
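  * **Stable storage** (an idealized medium that survives all failures, approximated in practice by replicating data across independent non-volatile devices)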

---

### **1.2 Cold Backup vs Hot Backup**

#### **Cold Backup**

* Taken **when database is not processing transactions**.
* Can be:

  * **Full** backup
  * **Differential** backup
  * **Incremental** backup

#### **Hot Backup**

* Taken **while the database is running and processing transactions**.
* Essential for:

  * Banking systems
  * Stock exchanges
  * Asset management
  * Real-time apps
* Typically uses **transaction logs** rather than full data backup (to reduce cost & size).

---

### **1.3 Why Use Transaction Logs for Hot Backup?**

* Direct hot backup of **entire database files (e.g., B+ trees)** is expensive & impractical.
* Instead:

  * Keep **cold backup** for large data files.
  * Keep **hot backup** for **transaction logs**.
* On failure:

  1. Restore from **last cold backup**.
  2. Use hot backup (transaction log) to bring DB to state just before crash.

---

### **1.4 Transaction Log Usage Protocol**

When a write request comes:

1. **First**, write the change to the **transaction log** (stored in hot backup medium like NVRAM).
2. **Then**, apply the change to the database.

#### Why?

* If crash occurs before writing to database, the log still has the change.
* During recovery:

  1. Restore cold backup.
  2. Replay logs to reapply lost changes.
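
The log-first ordering and the two recovery steps above can be sketched as a toy key-value store. This is a minimal illustration under stated assumptions, not a real DBMS: the class `TinyDB`, its `crash_before_apply` flag, and the sample values are all hypothetical, and transactions and commits are ignored for brevity.

```python
class TinyDB:
    """Toy store illustrating log-before-data ordering (hypothetical names)."""

    def __init__(self, cold_backup):
        self.cold_backup = dict(cold_backup)  # snapshot taken while the DB was idle
        self.data = dict(cold_backup)         # live data files
        self.log = []                         # transaction log on the hot-backup medium

    def write(self, key, value, crash_before_apply=False):
        self.log.append((key, value))         # step 1: record the change in the log
        if crash_before_apply:
            raise RuntimeError("simulated crash before the data write")
        self.data[key] = value                # step 2: apply the change to the database

    def recover(self):
        self.data = dict(self.cold_backup)    # restore the cold backup
        for key, value in self.log:           # replay the log in its original order
            self.data[key] = value
        return self.data


db = TinyDB({"A": 100})
db.write("A", 150)
try:
    db.write("B", 20, crash_before_apply=True)  # logged, but never reaches the data
except RuntimeError:
    pass
print(db.recover())  # {'A': 150, 'B': 20}
```

Because the log entry is written before the data, the update to `B` survives even though the simulated crash happened before it reached the database.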

---

### **1.5 Recovery vs Restore**

* **Restore**:

  * Load the latest cold backup (data files + transaction logs) into the system.
* **Recover**:

  * Replay the transaction logs to bring the DB to a consistent state after the restore.

**Example Steps**:

1. Crash occurs mid-backup.
2. **Restore**:

   * Load cold backup from disk/tape.
3. **Recover**:

   * Replay logs:

     * Apply any committed updates missing in data.
     * Ensure changes reflect intended final state.

**Note**:

* Vendors often replay the **entire transaction log** instead of selectively applying changes, since this is simpler and safer.

---

## **2. Recovery in Concurrent Transactions**

### **2.1 From Serial to Concurrent Transactions**

Previously: Recovery was discussed for **serial** transactions.
Now: Extend to **concurrent** transactions.

---

### **2.2 Assumption for Concurrent Recovery**

* If transaction **Ti** modifies item **A**, **no other transaction** can modify **A** until Ti:

  * **Commits** or
  * **Aborts**
* Reason: Allows clean rollback without conflict.

---

### **2.3 Buffer Management Model**

#### **Serial Case**

* Persistent storage (disk) → Main memory buffer.
* Transactions work on **private buffers** separate from main memory buffer.
* **Input** = Read from disk to buffer.
* **Output** = Write from buffer to disk.

#### **Concurrent Case**

* Disk → Shared memory buffer (may contain data from multiple transactions).
* **Each transaction** has:

  * Its own **private work area** in memory.
  * Prevents other transactions from seeing intermediate changes.

---

### **2.4 Logging Format in Normal Operation**

For each transaction **Ti**:

1. **Ti start**
2. For each update:

   * Record: **(Ti, Xj, old\_value, new\_value)**
3. On commit:

   * Record: **Ti commit**
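
As a rough sketch of this record format, the three record types can be modelled as small data classes. The class names, the helper `run_transaction`, and the sample values are illustrative assumptions, not part of any particular DBMS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Start:
    txn: str


@dataclass(frozen=True)
class Update:
    txn: str
    item: str
    old_value: int
    new_value: int


@dataclass(frozen=True)
class Commit:
    txn: str


def run_transaction(log, db, txn, writes):
    """Log start, one update record per write, then commit (hypothetical helper)."""
    log.append(Start(txn))
    for item, new_value in writes:
        log.append(Update(txn, item, db[item], new_value))  # log before data
        db[item] = new_value
    log.append(Commit(txn))


log, db = [], {"A": 500, "B": 2000}
run_transaction(log, db, "T0", [("B", 2050)])
for record in log:
    print(record)
```

Keeping both `old_value` and `new_value` in each update record is what lets the same log serve undo (write the old value back) and redo (write the new value again).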

---

### **2.5 Rollback During Normal Operation**

* If a transaction rolls back (e.g., deadlock):

  1. Scan log **backwards** from last entry.
  2. For each update record:

     * **Undo**: Write old\_value back.
     * Record a **Compensation Log Record (CLR)**:

       * Format: `(Ti, Xj, new_value → old_value, CLR)`
  3. On reaching **Ti start**:

     * Record **Ti abort**.
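
The rollback steps above might look like the following sketch. The tuple layout of the log records (`"start"`, `"update"`, `"clr"`, `"abort"`) is an assumption made for illustration; here a CLR simply stores the value that was written back.

```python
def rollback(log, db, txn):
    """Undo txn's updates newest-first, logging a CLR for each one."""
    for record in reversed(list(log)):     # snapshot: appended CLRs are not rescanned
        if record[1] != txn:
            continue                       # record belongs to another transaction
        if record[0] == "update":
            _, _, item, old_value, _new_value = record
            db[item] = old_value           # undo: write the old value back
            log.append(("clr", txn, item, old_value))
        elif record[0] == "start":
            log.append(("abort", txn))     # reached Ti start: rollback is complete
            break


log = [
    ("start", "T0"),
    ("update", "T0", "B", 2000, 2050),
    ("update", "T0", "A", 500, 400),
]
db = {"A": 400, "B": 2050}
rollback(log, db, "T0")
print(db)
print(log[-3:])
```

The updates are undone in reverse order, so if a transaction wrote the same item twice, the item ends up with the value it had before the transaction started.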

---

## **3. Checkpointing**

* Purpose: Limit the amount of log replay needed during recovery.
* **Checkpoint**: A point at which all log records and all modified data in memory are written to disk, and a checkpoint record listing the transactions active at that moment is added to the log.
* Types of transactions with respect to checkpoint and crash:

  1. **Committed before checkpoint** → No action needed.
  2. **Started before checkpoint, committed after checkpoint** → Redo.
  3. **Started after checkpoint, committed before crash** → Redo.
  4. **Started before crash but not committed** → Undo.
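
The four cases can be sketched as a small classifier over a log. The tuple-based log layout and the function name `classify` are hypothetical, and aborted transactions are left out to keep the sketch short.

```python
def classify(log):
    """Map each transaction to 'none', 'redo', or 'undo' per the four cases."""
    # Position of the last checkpoint record in the log.
    cp = max(i for i, record in enumerate(log) if record[0] == "checkpoint")
    action = {}
    for i, (kind, *rest) in enumerate(log):
        if kind == "start":
            action[rest[0]] = "undo"       # case 4 unless a commit shows up later
        elif kind == "commit":
            # Case 1 if committed before the checkpoint, else cases 2 and 3.
            action[rest[0]] = "none" if i < cp else "redo"
    return action


log = [
    ("start", "T1"), ("commit", "T1"),     # case 1
    ("start", "T2"),
    ("checkpoint",),
    ("commit", "T2"),                      # case 2
    ("start", "T3"), ("commit", "T3"),     # case 3
    ("start", "T4"),                       # case 4: crash happens here
]
print(classify(log))
```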

---

## **4. Redo-Undo Recovery Algorithm**

### **4.1 Why Redo Before Undo?**

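* Redo repeats history: every logged update after the checkpoint is reapplied, including updates of incomplete transactions, so the database reaches its exact state at crash time and those updates can then be **undone** cleanly.
* This prevents inconsistencies from partial writes that were lost in the crash.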

---

### **4.2 Redo Phase**

1. Find last checkpoint.
2. Initialize **Undo List** = Transactions active at checkpoint.
3. Scan forward from checkpoint:

   * If **update log record**: Apply redo.
   * If **start**: Add to Undo List.
   * If **commit/abort**: Remove from Undo List.
4. End result:

   * Database matches state at crash time.
   * Undo List = incomplete transactions.
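
A minimal sketch of the redo phase, assuming a tuple-based log in which the checkpoint record carries the list of active transactions. The record layout, the function name `redo_phase`, and the sample values are hypothetical. Replaying `"clr"` records like ordinary updates is also an assumption, made so that rollbacks completed before the crash are repeated too.

```python
def redo_phase(log, db):
    """Replay everything after the last checkpoint; return the Undo List."""
    cp = max(i for i, record in enumerate(log) if record[0] == "checkpoint")
    undo_list = set(log[cp][1])            # transactions active at the checkpoint
    for record in log[cp + 1:]:            # scan forward from the checkpoint
        kind, txn = record[0], record[1]
        if kind == "update":
            db[record[2]] = record[4]      # redo: write new_value again
        elif kind == "clr":
            db[record[2]] = record[3]      # redo the restored value
        elif kind == "start":
            undo_list.add(txn)
        elif kind in ("commit", "abort"):
            undo_list.discard(txn)
    return undo_list


log = [
    ("start", "T0"),
    ("update", "T0", "B", 2000, 2050),
    ("checkpoint", ["T0"]),
    ("start", "T1"),
    ("update", "T1", "C", 700, 600),
    ("commit", "T1"),
    ("start", "T2"),
    ("update", "T2", "A", 500, 400),
]
disk = {"A": 500, "B": 2050, "C": 700}     # on-disk state found after the crash
undo_list = redo_phase(log, disk)
print(disk, sorted(undo_list))
```

After the scan, `disk` reflects every logged update, and `undo_list` holds the transactions that never reached commit or abort.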

---

### **4.3 Undo Phase**

1. Scan the log backwards from the end.
2. For each update in Undo List transactions:

   * Apply **old\_value**.
   * Write **CLR**.
3. On **Ti start**: Write **Ti abort** and remove from Undo List.
4. Stop when Undo List is empty.
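
The undo phase can be sketched in the same style. The tuple-based record layout and the function name `undo_phase` are assumptions for illustration, and the Undo List is passed in as if a redo pass had already produced it.

```python
def undo_phase(log, db, undo_list):
    """Roll back every transaction in undo_list, scanning the log backwards."""
    undo_list = set(undo_list)
    for record in reversed(list(log)):     # snapshot: new CLRs are not rescanned
        if not undo_list:
            break                          # step 4: stop when the Undo List is empty
        kind = record[0]
        if kind == "update" and record[1] in undo_list:
            _, txn, item, old_value, _new_value = record
            db[item] = old_value           # apply old_value
            log.append(("clr", txn, item, old_value))
        elif kind == "start" and record[1] in undo_list:
            log.append(("abort", record[1]))
            undo_list.discard(record[1])


log = [
    ("start", "T1"),
    ("update", "T1", "A", 10, 20),
    ("commit", "T1"),
    ("start", "T2"),
    ("update", "T2", "A", 20, 30),
    ("update", "T2", "B", 5, 6),
]
db = {"A": 30, "B": 6}                     # state after the redo phase
undo_phase(log, db, {"T2"})
print(db)
print(log[-3:])
```

Only `T2` is rolled back; the committed update by `T1` is left in place, and the scan stops as soon as `T2`'s start record has been processed.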

---

### **4.4 Example**

Given:

* **T0 start**, update B (2000→2050)
* **T1 start**, checkpoint → Undo List = {T0, T1}
* **T1** updates C2 (700→600), commits (remove from Undo List)
* **T2 start**, update A
* **T0 rollback** (normal), write CLR for B, T0 abort
* Crash → Undo List = {T2}

Recovery:

1. **Redo Phase**:

   * Replay every update logged after the checkpoint: T1’s update of C2, T2’s update of A (even though T2 is incomplete), and T0’s CLR for B.
2. **Undo Phase**:

   * Undo T2’s changes, write CLR, write T2 abort.
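
The example can be traced end to end with a short sketch that combines both phases. The tuple-based log layout and the function `recover` are hypothetical. The values for T2's update of A (500 to 400) and the on-disk state at crash time are also assumptions, since the example does not give them.

```python
def recover(log, db):
    """Redo from the last checkpoint, then undo incomplete transactions."""
    cp = max(i for i, record in enumerate(log) if record[0] == "checkpoint")
    undo_list = set(log[cp][1])
    for record in log[cp + 1:]:                    # redo phase
        kind = record[0]
        if kind == "update":
            db[record[2]] = record[4]
        elif kind == "clr":
            db[record[2]] = record[3]
        elif kind == "start":
            undo_list.add(record[1])
        elif kind in ("commit", "abort"):
            undo_list.discard(record[1])
    incomplete = set(undo_list)
    for record in reversed(list(log)):             # undo phase
        if not undo_list:
            break
        kind = record[0]
        if kind == "update" and record[1] in undo_list:
            db[record[2]] = record[3]
            log.append(("clr", record[1], record[2], record[3]))
        elif kind == "start" and record[1] in undo_list:
            log.append(("abort", record[1]))
            undo_list.discard(record[1])
    return incomplete


log = [
    ("start", "T0"),
    ("update", "T0", "B", 2000, 2050),
    ("start", "T1"),
    ("checkpoint", ["T0", "T1"]),
    ("update", "T1", "C2", 700, 600),
    ("commit", "T1"),
    ("start", "T2"),
    ("update", "T2", "A", 500, 400),   # values assumed for illustration
    ("clr", "T0", "B", 2000),          # T0 rolled back during normal operation
    ("abort", "T0"),
]
disk = {"A": 500, "B": 2050, "C2": 700}  # assumed on-disk state at crash time
incomplete = recover(log, disk)
print(incomplete, disk)
print(log[-2:])
```

In this run the Undo List after the redo phase is `{T2}`, matching the example, and the log ends with a CLR for A followed by a T2 abort record.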

---

## **5. Key Points & Best Practices**

* Always **write to log before database** (Write-Ahead Logging – WAL rule).
* Maintain **hot backup** of transaction logs.
* Use **checkpoints** to shorten recovery time.
* In **redo-undo** strategy:

  * Redo the logged updates of **all** transactions (committed, aborted, incomplete) from the last checkpoint to the crash.
  * Undo only **incomplete** transactions.
* After restoring a cold backup, replay the **entire transaction log** for safety.