# **DBMS – Backup & Recovery / 2: Recovery / 1**

## **In-depth Notes**

---

## **1. Introduction**

* In the previous module, **backup requirements** and strategies (cold/hot backups, schedules) were discussed.
* A cold backup is relatively straightforward. A **hot backup** of the transaction log, however, is essential for recovering from a failure **at any point in time** while keeping the **database consistent**.
* This module focuses on:

  1. **Possible sources of failure** for transactions.
  2. **Recovery strategies** to maintain **Atomicity, Consistency, and Durability** (the A, C, and D of ACID).
  3. **Log-based recovery** methods.
  4. The **serial case** only: a single transaction executes at a time.

     * The concurrent-transaction case is covered in the next module.

---

## **2. Failure Classification**

### **ACID Review**

* **Atomicity** → “All or nothing” – transaction should be fully done or not at all.
* **Consistency** → Preserve database integrity rules.
* **Isolation** → Concurrent transactions behave as if they run sequentially.
* **Durability** → Changes survive failures once transaction commits.

**Contribution of components:**

* **Concurrency control** → Ensures isolation → helps maintain consistency.
* **Application program** → Must ensure logical correctness.

  * E.g., faulty logic: Debit ₹50 from Account A but credit only ₹40 to B → No recovery strategy can fix this.
* **Recovery subsystem** → Ensures atomicity & durability → helps consistency.

---

### **Types of Failures**

1. **Transaction Failures**

   * **Logical errors**: E.g., application code error, invalid input.
   * **System errors**: E.g., transaction aborted due to deadlock.

2. **System Crash**

   * Cause: Power failure, OS crash.
   * **Fail-stop assumption**: The contents of non-volatile storage (disk) are assumed to be **not corrupted** by a system crash.
   * Disks are designed with safeguards that make this assumption reasonable.
   * The disk hardware itself can still fail (e.g., a head crash); that case is classified separately as a disk failure.

3. **Disk Failure**

   * **Head crash** or other hardware issue destroying all/part of disk storage.
   * Usually detectable via **checksums** (hash values to verify block integrity).
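
The checksum idea can be sketched in a few lines. This is only an illustration, assuming CRC-32 as the hash and a made-up block payload; the notes do not say which checksum real disks use, and the helper names `write_block` and `is_intact` are hypothetical.

```python
import zlib


def write_block(payload: bytes):
    """Return the block together with a CRC-32 checksum of its contents."""
    return payload, zlib.crc32(payload)


def is_intact(payload: bytes, stored_checksum: int) -> bool:
    """Recompute the checksum on read and compare it with the stored one."""
    return zlib.crc32(payload) == stored_checksum


block, checksum = write_block(b"A=1000;B=2000")
damaged = b"A=1000;B=2900"  # simulate one corrupted byte on disk

print(is_intact(block, checksum))    # the untouched block passes
print(is_intact(damaged, checksum))  # the damaged block is flagged
```

A mismatch only says that the block is damaged; repairing it still needs a good copy from somewhere else.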

---

## **3. Recovery Strategy – Overview**

When a failure occurs:

* **Scenario**: Transaction updates only part of the data before failure.

  * Example: Transfer ₹50 from A to B.

    * If only A is debited before crash → inconsistent state.
* **Commit delay issue**: Physically writing to disk takes time, so a failure may occur after a transaction has logically committed but before its updates actually reach the disk.
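
The partial-update scenario can be made concrete with a little arithmetic. The sketch below assumes starting balances of ₹1000 and ₹2000 (the same values used in the example log later in these notes) and simulates the crash with an exception; `transfer` is a hypothetical helper.

```python
accounts = {"A": 1000, "B": 2000}
total_before = sum(accounts.values())


def transfer(amount: int, crash_after_debit: bool = False) -> None:
    accounts["A"] -= amount            # debit A
    if crash_after_debit:
        raise RuntimeError("simulated crash between debit and credit")
    accounts["B"] += amount            # credit B


try:
    transfer(50, crash_after_debit=True)
except RuntimeError:
    pass

total_after = sum(accounts.values())
print(total_before, total_after)  # the sum A + B is no longer preserved
```

The invariant "A + B stays constant" is broken because only half of the transaction took effect, which is exactly the state that recovery has to repair.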

**Two Phases in Recovery Strategy:**

1. **During normal processing**:

   * Record enough information (in logs) to allow recovery later.
2. **During recovery**:

   * Use recorded info to restore database to a consistent state.

Goal: Maintain **Atomicity, Consistency, Durability**.

---

## **4. Types of Storage in Recovery**

1. **Volatile Storage**

   * Loses content on crash.
   * Examples: Main memory (RAM), CPU cache.

2. **Non-volatile Storage**

   * Survives system crash.
   * Examples: Disk, SSD, magnetic tapes, battery-backed RAM.

3. **Stable Storage** (Idealized concept)

   * Survives **all failures** (system crash, disk crash, disasters).
   * Implemented by **multiple copies** on distinct non-volatile media (possibly at remote sites).
   * **Challenges**:

     * Synchronizing multiple copies.
     * Detecting faulty copies.
     * Higher latency for remote updates.

---

### **Output Operations with Multiple Copies**

* Example: Two copies of each block.
* **Write process**:

  1. Write to first copy.
  2. If successful, write to second copy.
  3. Output completes only after second write succeeds.
* A failure during the output (for example, between the two writes) can leave the copies **inconsistent**.

**Detecting inconsistencies:**

* Naive approach: compare the two copies of every block after a failure (expensive).
* Better: Maintain a record of **in-progress writes** in **non-volatile memory**.

  * Helps identify which blocks need recovery.
  * Common in RAID systems.
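
A minimal sketch of the two-copy write with an in-progress record follows. It assumes in-memory dictionaries stand in for the two disks and that the `in_progress` set lives in non-volatile memory; the class and method names are hypothetical.

```python
class MirroredStore:
    """Two copies of every block plus a record of writes in progress."""

    def __init__(self):
        self.copy1 = {}
        self.copy2 = {}
        self.in_progress = set()  # assumed to survive a crash (non-volatile)

    def output(self, block_no, data, crash_after_first=False):
        self.in_progress.add(block_no)      # note the write before touching any copy
        self.copy1[block_no] = data         # step 1: write the first copy
        if crash_after_first:
            raise RuntimeError("simulated crash between the two writes")
        self.copy2[block_no] = data         # step 2: write the second copy
        self.in_progress.discard(block_no)  # step 3: the output is complete

    def blocks_to_check(self):
        """After a restart, only these blocks can have mismatched copies."""
        return set(self.in_progress)


store = MirroredStore()
store.output(1, b"old")
store.output(2, b"old")
try:
    store.output(2, b"new", crash_after_first=True)
except RuntimeError:
    pass

print(store.blocks_to_check())
```

Only block 2 needs to be compared after the restart, instead of every block on both disks.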

---

### **Stable Storage Recovery**

When inconsistent blocks are detected:

1. If one copy fails checksum → overwrite it from the good copy.
2. If both copies valid but differ → choose one (e.g., overwrite second with first).
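
The two rules can be sketched as a small function. This assumes each copy stores a CRC-32 checksum next to its data; `Copy`, `make_copy`, and `recover_block` are hypothetical names, and the case where both copies are damaged is deliberately left out.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Copy:
    data: bytes
    checksum: int

    def valid(self) -> bool:
        return zlib.crc32(self.data) == self.checksum


def make_copy(data: bytes) -> Copy:
    return Copy(data, zlib.crc32(data))


def recover_block(first: Copy, second: Copy):
    """Bring the two copies of one block back into agreement."""
    if first.valid() and not second.valid():
        return first, Copy(first.data, first.checksum)    # rule 1
    if second.valid() and not first.valid():
        return Copy(second.data, second.checksum), second  # rule 1, mirrored
    if first.valid() and second.valid() and first.data != second.data:
        return first, Copy(first.data, first.checksum)    # rule 2
    return first, second  # already consistent (both-invalid is not handled)


# Torn first write: the data was cut short, so its checksum no longer matches.
torn = Copy(b"ne", zlib.crc32(b"new"))
a1, a2 = recover_block(torn, make_copy(b"old"))

# Crash after the first write finished but before the second one started.
b1, b2 = recover_block(make_copy(b"new"), make_copy(b"old"))
```

In the first case both copies end up holding the old value, in the second both hold the new value; either way the block looks as if the write happened completely or not at all.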

---

## **5. Data Access Protocol**

* **Physical blocks** (on disk) ↔ **System buffer blocks** (in main memory).
* **Transactions** operate on **private local buffers**:

  * **Read X** → Copy from system buffer to transaction’s local buffer.
  * **Write X** → Copy from local buffer to system buffer.
* **Output Bx** → Write system buffer block back to disk.
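
The protocol above can be sketched with plain dictionaries. This is a toy model: the block name `BX`, the `Transaction` class, and the `input_block` helper (the assumed counterpart of Output that loads a block into the buffer) are all illustrative.

```python
disk = {"BX": {"X": 1000}}  # physical blocks
system_buffer = {}          # buffer blocks in main memory


def input_block(block):
    """Bring a physical block into the system buffer."""
    system_buffer[block] = dict(disk[block])


def output_block(block):
    """Output B: write a buffer block back to disk."""
    disk[block] = dict(system_buffer[block])


class Transaction:
    def __init__(self):
        self.local = {}  # private local buffer

    def read(self, block, item):
        if block not in system_buffer:
            input_block(block)
        self.local[item] = system_buffer[block][item]

    def write(self, block, item):
        if block not in system_buffer:
            input_block(block)
        system_buffer[block][item] = self.local[item]


t = Transaction()
t.read("BX", "X")
t.local["X"] -= 50   # work happens on the private copy
t.write("BX", "X")   # now visible in the system buffer, not yet on disk
value_on_disk_before_output = disk["BX"]["X"]
output_block("BX")
```

The point of the sketch is the gap between `write` and `output_block`: until the block is output, the disk still holds the old value.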

---

## **6. Log-Based Recovery**

### **Basic Concept**

* **Log file** records every change to database items.
* Used to **undo** or **redo** actions during recovery.
* **Log record format**:

  ```
  <Transaction ID, Data Item, Old Value, New Value>
  ```
* **Transaction start record**: `<Ti start>`
* **Transaction commit record**: `<Ti commit>`
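
One possible in-memory representation of these three record types is sketched below. The class and field names are assumptions made for illustration; only the printed format follows the notes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Start:
    txn: str

    def __str__(self) -> str:
        return f"<{self.txn} start>"


@dataclass(frozen=True)
class Update:
    txn: str
    item: str
    old_value: int
    new_value: int

    def __str__(self) -> str:
        return f"<{self.txn}, {self.item}, {self.old_value}, {self.new_value}>"


@dataclass(frozen=True)
class Commit:
    txn: str

    def __str__(self) -> str:
        return f"<{self.txn} commit>"


log = [Start("T0"), Update("T0", "A", 1000, 950), Commit("T0")]
rendered = [str(record) for record in log]
print("\n".join(rendered))
```

Keeping both the old and the new value in every update record is what makes it possible to go in either direction later.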

---

### **Immediate vs Deferred Modification**

1. **Immediate Modification**

   * Changes may be written to disk **before** commit.
   * **Requirement**: Log record must be written to stable storage **before** the change is written to disk.

2. **Deferred Modification**

   * No changes are written until commit.
   * Simplifies some aspects of recovery, but each transaction must keep local copies of the items it updates until it commits.

This module focuses on **immediate modification**.
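
The logging requirement for immediate modification can be sketched as follows. This toy class is an assumption-laden illustration: lists stand in for the log buffer and stable storage, and for simplicity `output` flushes the whole log buffer rather than only the records for the item being written.

```python
class ImmediateModificationDB:
    def __init__(self, data):
        self.disk = dict(data)    # database items on disk
        self.buffer = dict(data)  # in-memory copies of the items
        self.log_buffer = []      # log records still in main memory
        self.stable_log = []      # log records on stable storage

    def write(self, txn, item, new_value):
        # Create the log record first, then change the in-memory value.
        self.log_buffer.append((txn, item, self.buffer[item], new_value))
        self.buffer[item] = new_value

    def flush_log(self):
        self.stable_log.extend(self.log_buffer)
        self.log_buffer.clear()

    def output(self, item):
        # The log record must reach stable storage before the data does.
        self.flush_log()
        self.disk[item] = self.buffer[item]


db = ImmediateModificationDB({"A": 1000})
db.write("T0", "A", 950)
db.output("A")  # T0 has not committed, yet the change is already on disk
```

Because the old value 1000 is safely in the stable log before the disk is touched, the early write can still be undone if `T0` never commits.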

---

### **Commit Definition**

* A transaction is committed when:

  1. Its **commit log record** is written to stable storage.
  2. All its previous log records have been output.
* Actual data writes to disk may occur **before or after commit**.
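
A small sketch of this definition, assuming string log records and lists standing in for the log buffer and stable storage (`commit` and `is_committed` are hypothetical helpers):

```python
stable_log = []                                     # stands in for stable storage
log_buffer = ["<T0 start>", "<T0, A, 1000, 950>"]   # still in main memory
disk = {"A": 1000}                                  # data block not yet output


def commit(txn):
    log_buffer.append(f"<{txn} commit>")
    # Output the buffered records in order, so every earlier record of the
    # transaction reaches stable storage before its commit record does.
    stable_log.extend(log_buffer)
    log_buffer.clear()


def is_committed(txn):
    return f"<{txn} commit>" in stable_log


before = is_committed("T0")
commit("T0")
after = is_committed("T0")
print(before, after, disk["A"])
```

After the call, `T0` counts as committed even though the disk still shows the old value of `A`; the stable log is what makes the commit durable.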

---

### **Example Log**

```
<T0 start>
<T0, A, 1000, 950>
<T0, B, 2000, 2050>
<T0 commit>
<T1 start>
<T1, C, 700, 600>
...
```

* Because each update's log record reaches stable storage first, the corresponding data blocks can be output to disk before or after the transaction's commit record is written.

---

## **7. Undo & Redo Operations**

* **Undo**: Restore old value from log (`old_value → X`).
* **Redo**: Reapply new value from log (`new_value → X`).

### **Usage**

* **During rollback**: Only **undo** is needed.
* **During crash recovery**:

  * **Undo** incomplete transactions.
  * **Redo** committed transactions whose updates may not be on disk.
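
Undo and redo can be sketched as two small functions over update records of the form `(txn, item, old, new)`. The tuple layout and the `rollback` helper are assumptions for illustration.

```python
def undo(db, record):
    """Restore the old value recorded in the log."""
    _, item, old_value, _ = record
    db[item] = old_value


def redo(db, record):
    """Reapply the new value recorded in the log."""
    _, item, _, new_value = record
    db[item] = new_value


def rollback(db, log, txn):
    """Undo one transaction's updates, newest first."""
    for record in reversed(log):
        if record[0] == txn:
            undo(db, record)


# T0 updated A twice before being rolled back.
log = [("T0", "A", 1000, 950), ("T0", "A", 950, 900)]
db = {"A": 900}
rollback(db, log, "T0")

# Redo is idempotent: applying it twice gives the same result as once.
other = {"A": 1000}
redo(other, log[0])
redo(other, log[0])
```

Scanning backwards matters: undoing the two updates in forward order would leave `A` at 950 instead of the original 1000.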

---

### **Rules for Failure Recovery**

* If `<Ti start>` exists but **no** `<Ti commit>` / `<Ti abort>` → **Undo**.
* If `<Ti start>` and `<Ti commit>` exist → **Redo**.
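
These rules can be tried on the example log from Section 6, assuming the crash happens right after `<T1, C, 700, 600>`. The sketch below uses a hypothetical tuple encoding of the records, picks one possible ordering (a forward redo pass followed by a backward undo pass), and does not handle logs that contain `<Ti abort>` records.

```python
def recover(log, disk):
    """Return the recovered database plus the redo and undo sets."""
    db = dict(disk)
    started = {rec[1] for rec in log if rec[0] == "start"}
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    redo_set = started & committed
    undo_set = started - committed

    for rec in log:                      # redo pass, oldest record first
        if rec[0] == "update" and rec[1] in redo_set:
            _, _, item, _, new_value = rec
            db[item] = new_value
    for rec in reversed(log):            # undo pass, newest record first
        if rec[0] == "update" and rec[1] in undo_set:
            _, _, item, old_value, _ = rec
            db[item] = old_value
    return db, redo_set, undo_set


log = [
    ("start", "T0"),
    ("update", "T0", "A", 1000, 950),
    ("update", "T0", "B", 2000, 2050),
    ("commit", "T0"),
    ("start", "T1"),
    ("update", "T1", "C", 700, 600),
]
# Disk contents at the crash: T0's write to B never arrived, T1's write to C did.
disk_at_crash = {"A": 950, "B": 2000, "C": 600}
recovered, redo_set, undo_set = recover(log, disk_at_crash)
print(recovered)
```

`T0` is redone, which brings `B` up to 2050, and `T1` is undone, which puts `C` back to 700, regardless of which data blocks happened to reach the disk before the crash.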

---

## **8. Checkpoints**

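* Checkpointing is performed periodically. At each checkpoint the system:

  1. Stops all updates while the checkpoint is in progress.
  2. Flushes the log buffer to stable storage.
  3. Writes all modified buffer blocks to disk.
  4. Writes a `<checkpoint L>` record to stable storage, where `L` is the list of transactions active at that moment.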

**Advantages**:

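* Recovery need not process the entire log: it can start from the most recent checkpoint record, going further back only for transactions that were still active at that checkpoint.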
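
The checkpoint procedure can be sketched as a single method. Everything here is a toy model: the `Engine` class, its fields, and the string form of the checkpoint record are assumptions, and the initial state is hard-coded to show one transaction in flight.

```python
class Engine:
    def __init__(self):
        self.disk = {"A": 1000}
        self.buffer = {"A": 950}  # T0 has already changed A in memory
        self.dirty = {"A"}        # items modified since they were last output
        self.log_buffer = ["<T0 start>", "<T0, A, 1000, 950>"]
        self.stable_log = []
        self.active = ["T0"]      # transactions currently running
        self.updates_allowed = True

    def checkpoint(self):
        self.updates_allowed = False              # 1. stop all updates
        self.stable_log.extend(self.log_buffer)   # 2. flush the log buffer
        self.log_buffer.clear()
        for item in self.dirty:                   # 3. write modified data to disk
            self.disk[item] = self.buffer[item]
        self.dirty.clear()
        record = "<checkpoint " + ", ".join(self.active) + ">"
        self.stable_log.append(record)            # 4. write the checkpoint record
        self.updates_allowed = True


engine = Engine()
engine.checkpoint()
print(engine.stable_log[-1])
```

Once the checkpoint record is on stable storage, everything logged before it is known to be reflected on disk.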

---

### **Checkpoint Example**

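* Suppose a failure occurs some time after the most recent checkpoint.
* **Ignore** transactions that committed before the checkpoint: their updates are already on disk.
* **Redo** transactions that committed after the checkpoint, including any that were active when it was taken.
* **Undo** transactions that were still incomplete at failure time.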
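
A sketch of this classification, using a hypothetical tuple encoding of the log and four made-up transactions `T1` to `T4` run serially. Abort records are not handled.

```python
def classify(log):
    """Split transactions into (ignore, redo, undo) using the last checkpoint."""
    cp = max((i for i, rec in enumerate(log) if rec[0] == "checkpoint"), default=-1)
    # Transactions listed in the checkpoint record are still candidates.
    candidates = set(log[cp][1]) if cp >= 0 else set()
    ignore, redo = set(), set()
    for i, rec in enumerate(log):
        kind, arg = rec[0], rec[1]
        if kind == "start" and i > cp:
            candidates.add(arg)
        elif kind == "commit":
            if i < cp:
                ignore.add(arg)   # committed before the checkpoint
            else:
                redo.add(arg)     # committed after the checkpoint
    undo = candidates - redo      # never committed before the failure
    return ignore, redo, undo


log = [
    ("start", "T1"), ("commit", "T1"),
    ("start", "T2"),
    ("checkpoint", ["T2"]),       # T2 is active when the checkpoint is taken
    ("commit", "T2"),
    ("start", "T3"), ("commit", "T3"),
    ("start", "T4"),              # crash happens here
]
ignore, redo, undo = classify(log)
print(ignore, redo, undo)
```

`T1` is ignored, `T2` and `T3` are redone, and `T4` is undone. Without the checkpoint record, `T1` would have to be redone as well.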

**Trade-off**:

* Frequent checkpoints → faster recovery, but lower throughput.
* Infrequent checkpoints → higher recovery cost, but better runtime performance.

---

## **9. Summary**

* **Sources of failure**: Transaction, System, Disk failures.
* **Storage types**: Volatile, Non-volatile, Stable (via replication).
* **Recovery strategy**: Record during execution, recover after failure.
* **Log-based recovery**: Maintains old/new values for undo/redo.
* **Immediate modification**: Allows early writes with proper logging.
* **Undo/Redo rules**: Based on commit/abort records.
* **Checkpoints**: Limit log scanning during recovery.