# 📘 Lecture 7.4: SQL vs NoSQL — In-Depth Summary

---

## 🧠 Overview

The lecture compares traditional **SQL-based relational databases (RDBMS)** with **NoSQL databases**, covering their data storage models, indexing strategies, and consistency trade-offs, including the transactional guarantees known as **ACID** and the weaker model of **eventual consistency**.

---

## 🔷 1. SQL and RDBMS (Relational Databases)

### ✅ What is SQL?

* **SQL**: Structured Query Language, used to **query structured data**.
* SQL is not tied to a specific storage engine; even **Google Sheets** supports a SQL-like interface.
* **Main use case**: Querying **relational databases (RDBMS)**.

### 🗂️ Structure of RDBMS

* Data is stored in **tables** (rows and columns).
* **Each row** (tuple) must have the **same columns** (uniform schema).
* **Relationships** between entities are expressed using **foreign keys**.
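
As a rough illustration of tables linked by a foreign key, the sketch below uses Python's built-in `sqlite3` module. The `departments` and `students` tables and their contents are made up for this example and are not from the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE students (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        department_id INTEGER NOT NULL REFERENCES departments(id)
    )
""")
conn.execute("INSERT INTO departments (id, name) VALUES (1, 'Physics')")
conn.execute("INSERT INTO students (name, department_id) VALUES ('Asha', 1)")

# A row pointing at a department that does not exist should be rejected.
try:
    conn.execute("INSERT INTO students (name, department_id) VALUES ('Ravi', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# The relationship is followed at query time with a JOIN on the key columns.
rows = conn.execute("""
    SELECT students.name, departments.name
    FROM students JOIN departments ON students.department_id = departments.id
""").fetchall()
print(rejected, rows)
```

Every row in `students` has the same columns, and the database itself refuses a reference to a missing department.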

### 💡 Advantages:

* Easy to **structure** and **index**.
* High performance for structured data with **predefined schemas**.
* **Indexing** improves query speed on specific columns.
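
To see what an index changes, one option is to ask the database for its query plan before and after creating the index. This is a minimal sketch with `sqlite3`; the table, the generated data, and the index name `idx_students_birth_year` are assumptions for illustration, and the exact plan wording depends on the SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, birth_year INTEGER)")
conn.executemany(
    "INSERT INTO students (name, birth_year) VALUES (?, ?)",
    [(f"student{i}", 1985 + i % 20) for i in range(1000)],
)

query = "SELECT name FROM students WHERE birth_year = 1990"

def plan(sql):
    # The last field of each EXPLAIN QUERY PLAN row is a human-readable step description.
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

plan_before = plan(query)  # without an index, SQLite has to scan the table
conn.execute("CREATE INDEX idx_students_birth_year ON students (birth_year)")
plan_after = plan(query)   # with the index, the plan should name it

print(plan_before)
print(plan_after)
```

The query and its result are identical in both cases; only the access path the engine picks is different.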

### ⚠️ Limitations:

* Inflexible when schema varies across entries.
* Example: student database where students are either **hostellers** or **day scholars**:

  * Hostellers need mess info.
  * Day scholars need gate pass info.
  * This leads to many **NULL** values in irrelevant fields.
  * Results in **inefficient storage**.
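
The hosteller/day-scholar case can be sketched as a single table in which each row leaves the other group's column empty. The column names (`mess_plan`, `gate_pass_no`) and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE students (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        kind TEXT NOT NULL,   -- 'hosteller' or 'day_scholar'
        mess_plan TEXT,       -- only meaningful for hostellers
        gate_pass_no TEXT     -- only meaningful for day scholars
    )
""")
conn.executemany(
    "INSERT INTO students (name, kind, mess_plan, gate_pass_no) VALUES (?, ?, ?, ?)",
    [
        ("Asha", "hosteller", "veg", None),
        ("Ravi", "day_scholar", None, "GP-101"),
        ("Meera", "hosteller", "non-veg", None),
    ],
)

# Count the cells that exist only because the schema is uniform.
null_cells = conn.execute(
    "SELECT SUM(mess_plan IS NULL) + SUM(gate_pass_no IS NULL) FROM students"
).fetchone()[0]
print(null_cells)
```

Here every row carries one unused column; with more student categories and more category-specific fields, the share of NULL cells grows.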

---

## 🔶 2. NoSQL: "Not Only SQL"

### 🧾 What is NoSQL?

* A **broad category** of databases that deviate from the strict tabular structure of an RDBMS.
* Typically offers a **flexible schema** and high **scalability**, often in exchange for weaker guarantees such as **eventual consistency**.
* Now interpreted as **Not Only SQL**: some systems support SQL-like queries, but they are not limited to the RDBMS structure.

---

## 📦 3. Types of NoSQL Databases

### 3.1. 📝 Document Databases

* Store data as **documents**, usually in **JSON** or **BSON** format.
* Example document (JSON-like):

  ```json
  {
    "title": "Movie",
    "info": {
      "rating": 8.5,
      "actors": ["Actor A", "Actor B"]
    }
  }
  ```
* Documents in a collection can have **different fields**, unlike rows in SQL tables.
* **Indexing** can be done on nested fields (e.g., `info.rating`).
* ⚙️ Efficient for semi-structured and hierarchical data.
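
A minimal sketch of these ideas in plain Python, with no real document database involved: a collection is modeled as a list of dicts, and the helpers `get_path` and `build_index` are hypothetical names introduced here to show how a nested field such as `info.rating` could be indexed.

```python
from collections import defaultdict

# Documents in the same collection do not have to share fields.
movies = [
    {"title": "Movie", "info": {"rating": 8.5, "actors": ["Actor A", "Actor B"]}},
    {"title": "Short Film", "info": {"rating": 7.0}},      # no actors field
    {"title": "Unrated Draft", "director": "Director C"},  # no info field at all
]

def get_path(doc, dotted_path):
    """Follow a dotted path such as 'info.rating'; return None if any part is missing."""
    current = doc
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current

def build_index(collection, dotted_path):
    """Map each value found at dotted_path to the positions of the documents holding it."""
    index = defaultdict(list)
    for position, doc in enumerate(collection):
        value = get_path(doc, dotted_path)
        if value is not None:
            index[value].append(position)
    return index

rating_index = build_index(movies, "info.rating")
titles_rated_8_5 = [movies[i]["title"] for i in rating_index[8.5]]
print(titles_rated_8_5)
```

Documents without the indexed field are simply absent from the index, so no placeholder NULL values are needed.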

#### ✅ Use Case:

* When data structure can **vary per entity** (e.g., different user profiles).

#### Popular tools:

* **MongoDB**, **Amazon DocumentDB**

---

### 3.2. 🔑 Key-Value Stores

* Each data entry is a **(key, value)** pair.
* Efficient for **exact lookups**, like a Python **dictionary**.
* Limited for complex queries: a hash-table-backed store keeps no key ordering, so a range query has to scan every key.

#### 🔧 Implementation:

* Backed by **Hash Tables** or **Search Trees**.
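
The difference between the two backings can be sketched in Python: a `dict` stands in for the hash table, and a sorted key list searched with `bisect` stands in for a search tree. The session keys and the `range_scan` helper are made up for illustration.

```python
import bisect

# Hash-table backing: average O(1) exact lookups, but no ordering among keys.
sessions = {"user:17": "token-a", "user:42": "token-b", "user:99": "token-c"}
exact = sessions["user:42"]

# Search-tree-like backing: keys kept in sorted order allow range scans.
sorted_keys = sorted(sessions)

def range_scan(low, high):
    """Return (key, value) pairs with low <= key < high, in key order."""
    start = bisect.bisect_left(sorted_keys, low)
    end = bisect.bisect_left(sorted_keys, high)
    return [(k, sessions[k]) for k in sorted_keys[start:end]]

in_range = range_scan("user:20", "user:99")
print(exact, in_range)
```

Keys here compare as strings, so the ordering is lexicographic; a real store would define its own key ordering.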

#### ✅ Use Case:

* Storing session data, cache, quick user profile access.

#### Popular tools:

* **Redis**, **Memcached**, **BerkeleyDB**

---

### 3.3. 📊 Columnar Databases (Column Stores)

* Store data **column-wise** instead of row-wise.
* Traditional RDBMS store full **rows together** (good for row lookups).
* Column stores group **values of a column together** for fast **columnar access**.
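
The two layouts can be contrasted with ordinary Python containers. This is only a conceptual sketch with invented sample data; real column stores add compression and on-disk formats that are not modeled here.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "name": "Asha", "birth_year": 1990},
    {"id": 2, "name": "Ravi", "birth_year": 1988},
    {"id": 3, "name": "Meera", "birth_year": 1990},
]

# Column-oriented layout: one array per column, aligned by position.
columns = {
    "id": [r["id"] for r in rows],
    "name": [r["name"] for r in rows],
    "birth_year": [r["birth_year"] for r in rows],
}

# The filter reads only the birth_year array; other columns are touched
# only at the matching positions.
matches = [i for i, year in enumerate(columns["birth_year"]) if year == 1990]
born_1990 = [columns["name"][i] for i in matches]
print(born_1990)
```

A query that filters or aggregates on one column reads far less data in the column layout, while fetching one complete record is cheaper in the row layout.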

#### ✅ Use Case:

* Analytical queries like:
  *"Get all users born in 1990"*

#### Variants:

* **Wide Column Stores**: each row can hold a different set of columns, grouped into **column families**.
* Examples: **Apache Cassandra**, **HBase**

---

### 3.4. 🕸️ Graph Databases

* Represent data as **nodes** (entities) and **edges** (relationships).
* Best for representing **connected data**.
* Examples: social networks, maps, recommendation systems.

#### ✅ Use Case:

* Find shortest paths, mutual friends, introductions (e.g., LinkedIn).
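
Both queries can be sketched on a small adjacency-set graph in plain Python. The friendship data and the function names are hypothetical; a graph database would express the same traversals in its own query language.

```python
from collections import deque

# Undirected friendship graph: node -> set of neighbours.
friends = {
    "Asha": {"Ravi", "Meera"},
    "Ravi": {"Asha", "Kiran"},
    "Meera": {"Asha", "Kiran"},
    "Kiran": {"Ravi", "Meera", "Dev"},
    "Dev": {"Kiran"},
}

def mutual_friends(a, b):
    """People connected to both a and b."""
    return friends[a] & friends[b]

def shortest_path(start, goal):
    """Breadth-first search; returns one shortest path as a list, or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(friends[node]):  # sorted only for repeatable output
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None

print(mutual_friends("Ravi", "Meera"))
print(shortest_path("Asha", "Dev"))
```

The path from `Asha` to `Dev` is the kind of "introduction chain" a professional network might suggest.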

#### Popular tools:

* **Neo4j**, **ArangoDB**, **OrientDB**

---

### 3.5. ⏱️ Time Series Databases (TSDB)

* Specialized databases for storing **time-indexed data** (e.g., sensor logs).
* Optimized for **time-ordered writes**, **time-range queries**, and **aggregations** over time windows.

#### ✅ Use Case:

* Monitoring metrics, tracking website hits over time, IoT data.

#### Features:

* Efficient **time-based querying**
* Support for **data rollups** (aggregating past data, e.g., daily/monthly summaries)
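
A rollup can be sketched as grouping raw samples into daily buckets and keeping only summary statistics. The sample data (website hits with timestamps) and the helper names `daily_rollup` and `time_range` are invented for this example.

```python
from collections import defaultdict
from datetime import datetime

# Raw samples: (timestamp, hit count) pairs.
samples = [
    (datetime(2024, 1, 1, 9, 0), 120),
    (datetime(2024, 1, 1, 17, 30), 80),
    (datetime(2024, 1, 2, 8, 15), 200),
    (datetime(2024, 1, 2, 22, 45), 100),
    (datetime(2024, 1, 3, 12, 0), 50),
]

def time_range(points, start, end):
    """Time-based query: samples with start <= timestamp < end."""
    return [(ts, value) for ts, value in points if start <= ts < end]

def daily_rollup(points):
    """Replace raw samples with one summary record per calendar day."""
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts.date()].append(value)
    return {
        day: {"count": len(values), "sum": sum(values), "max": max(values)}
        for day, values in sorted(buckets.items())
    }

rollup = daily_rollup(samples)
jan_2 = time_range(samples, datetime(2024, 1, 2), datetime(2024, 1, 3))
print(rollup)
print(jan_2)
```

Once old data has been rolled up, the raw samples can be discarded, which trades detail for bounded storage.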

#### Popular tools:

* **InfluxDB**, **Prometheus**, **RRDTool**, **OpenTSDB**, **Graphite**

---

## ⚖️ 4. SQL vs NoSQL: Core Differences

| Feature         | SQL (RDBMS)               | NoSQL                                  |
| --------------- | ------------------------- | -------------------------------------- |
| Schema          | Fixed, tabular            | Flexible, dynamic                      |
| Query Language  | SQL                       | Varies, some use SQL-like syntax       |
| Data Model      | Relational                | Document, Key-Value, Graph, etc.       |
| Scaling         | Typically vertical        | Typically horizontal                   |
| ACID Compliance | Strong                    | Often relaxed (eventual consistency)   |
| Best For        | Structured, transactional | Big data, unstructured/semi-structured |

---

## ✅ 5. ACID Properties in Databases

| Property        | Description                                       |
| --------------- | ------------------------------------------------- |
| A - Atomicity   | All parts of a transaction succeed or none do.    |
| C - Consistency | Maintains valid state before/after a transaction. |
| I - Isolation   | Transactions appear as if run sequentially.       |
| D - Durability  | Once committed, data remains even after failures. |
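
Atomicity in particular is easy to observe with `sqlite3`. In this sketch a transfer that violates a `CHECK` constraint is rolled back as a whole, including the update that had already succeeded. The `accounts` table and the `transfer` function are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(amount, source, target):
    """Move amount between accounts; both updates commit together or not at all."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes the block
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, target)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, source)
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(30, "alice", "bob")       # valid transfer
failed = transfer(500, "alice", "bob")  # debit would violate the CHECK constraint
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

The credit to `bob` in the failed transfer is applied first and then undone, so no partial transfer is ever visible after the rollback.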

---

### ⚠️ NoSQL & ACID

* Many NoSQL databases **sacrifice strict ACID** for performance/scalability.
* Use **Eventual Consistency** instead of strict consistency:

  * Data will become consistent **over time**, not instantly.
  * E.g., social media friend request updates may be delayed on different devices.
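
The friend-request delay can be imitated with a toy model: writes reach a primary copy immediately, and replicas catch up only when replication runs. The classes and the `sync` step are a simplified assumption for illustration, not how any particular database replicates data.

```python
class Replica:
    """One copy of the data, e.g. the copy a particular device reads from."""
    def __init__(self, name):
        self.name = name
        self.data = {}

class EventuallyConsistentStore:
    """Writes land on the primary at once; replicas catch up when sync() runs."""
    def __init__(self, replica_names):
        self.primary = Replica("primary")
        self.replicas = [Replica(name) for name in replica_names]
        self.pending = []  # replication log not yet applied to the replicas

    def write(self, key, value):
        self.primary.data[key] = value
        self.pending.append((key, value))

    def read(self, key, replica_index):
        return self.replicas[replica_index].data.get(key)

    def sync(self):
        for key, value in self.pending:
            for replica in self.replicas:
                replica.data[key] = value
        self.pending.clear()

store = EventuallyConsistentStore(["phone", "laptop"])
store.write("friend_request:asha->ravi", "accepted")
stale = store.read("friend_request:asha->ravi", replica_index=0)  # before replication
store.sync()
fresh = store.read("friend_request:asha->ravi", replica_index=0)  # after replication
print(stale, fresh)
```

Between the write and the sync, a reader on a replica sees the old state; after the sync, all copies agree.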

---

## 🧩 6. Eventual Consistency vs. Strict Consistency

| Feature               | Eventual Consistency             | Strict (Strong) Consistency     |
| --------------------- | -------------------------------- | ------------------------------- |
| Consistency Guarantee | Achieved **after some time**     | Immediate consistency           |
| Use Case              | Social media, caching, analytics | Banking, financial transactions |
| Performance           | High throughput, scalable        | Higher latency, never stale     |

---

## 🧠 7. Final Thoughts

* **NoSQL ≠ No SQL** — It means **Not Only SQL**.
* Many NoSQL systems still use **SQL-like queries**.
* Choice of DBMS depends on **application needs**:

  * Use **RDBMS** for structured data & transactions.
  * Use **NoSQL** for scalability, semi-structured or dynamic data.

---

## 📝 Summary of Learning Outcomes

| Learning Objective                                                             |
| ------------------------------------------------------------------------------ |
| Understand how SQL relates to RDBMS and the benefits/limitations of RDBMS      |
| Learn alternative data models: document, key-value, column, graph, time-series |
| Understand ACID properties and their trade-offs in scalable systems            |
| Grasp NoSQL's flexibility and use cases compared to traditional RDBMS          |
| Distinguish between **eventual consistency** and **strict consistency**        |

---