## Introduction
- Data refers to raw facts, figures, or symbols that represent information in a form suitable for processing, analysis, or communication. 
- It can be qualitative (descriptive) or quantitative (numeric), and it has no meaning until it is interpreted or processed.
- Data is central to how many of today’s applications and websites function. 
- Comments on a viral video, changing scores in a multiplayer game, and the items you left in a shopping cart on your favorite online store are all bits of information stored somewhere in a database.

### Types of Data
1. Structured Data
    - Organized and stored in a fixed format (e.g., tables in databases).
    - Example: Spreadsheets, SQL databases.

2. Unstructured Data
    - Lacks a pre-defined format or structure.
    - Example: Emails, videos, social media posts.

3. Semi-Structured Data
    - Partially organized with tags or markers.
    - Example: JSON, XML files.
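
To make the three categories concrete, here is a small sketch that holds the same customer information in each shape. The values and variable names are invented for illustration.

```python
import json

# Structured: fixed fields in a fixed order, like one row of a table.
structured_row = ("001", "Alice", 25)

# Semi-structured: self-describing keys, but fields may vary per record.
semi_structured = json.loads('{"id": "001", "name": "Alice", "tags": ["vip"]}')

# Unstructured: free text with no predefined fields.
unstructured = "Hi, this is Alice. My order arrived late, please advise."

print(structured_row[1])         # field located by position
print(semi_structured["name"])   # field located by key
print("Alice" in unstructured)   # only searchable as raw text
```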

### Relational Databases

- A relational database is a structured collection of data that stores information in tables (also called `relations`).
- Each table has rows (`tuples`) and columns (`attributes`), and tables can be related to each other using keys (primary and foreign keys).
- It is based on the relational model, proposed by E. F. Codd in 1970.
- The relational model’s structural elements help keep stored data organized, but storing data is only useful if you can retrieve it.
- To retrieve information from a relational database management system (RDBMS), you issue a `query`, a structured request for a set of information.
- Most relational databases use a language called `Structured Query Language` (SQL) to define, query, and modify data.
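
As a rough illustration of issuing a query, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `employee` table, its `department` column, and the sample rows are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, department TEXT)"
)
conn.executemany(
    "INSERT INTO employee (id, name, department) VALUES (?, ?, ?)",
    [(1, "Alice", "Sales"), (2, "Bob", "IT"), (3, "Charlie", "Sales")],
)

# The query: a structured request for the names of everyone in one department.
rows = conn.execute(
    "SELECT name FROM employee WHERE department = ? ORDER BY name",
    ("Sales",),
).fetchall()
print(rows)
```

Each result row comes back as a tuple, which mirrors the relational term for a row.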

### Core Components of a Relational Database

| Component              | Description                                                                                 |
| ---------------------- | ------------------------------------------------------------------------------------------- |
| **Table (Relation)**   | A collection of data about a particular subject. Each table represents a real-world entity. |
| **Row (Tuple/Record)** | A single, uniquely identifiable item in a table. Each row is a record.                      |
| **Column (Attribute)** | A single field in a table; defines the type of data stored (e.g., name, date).              |
| **Primary Key**        | Uniquely identifies each row in a table. Cannot be NULL.                                    |
| **Foreign Key**        | Points to a primary key in another table to establish relationships.                        |
| **Schema**             | The structure that defines tables, columns, and relationships in a database.                |

### Data Integrity Using Constraints
Relational databases maintain data integrity using constraints:

| Constraint      | Description                                |
| --------------- | ------------------------------------------ |
| **PRIMARY KEY** | Uniquely identifies a row.                 |
| **FOREIGN KEY** | Ensures the value exists in another table. |
| **UNIQUE**      | No duplicate values allowed in the column. |
| **NOT NULL**    | Column must have a value.                  |
| **CHECK**       | Ensures values meet a specific condition.  |
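
The sketch below shows how a database might reject rows that break `UNIQUE`, `NOT NULL`, or `CHECK` constraints. It uses SQLite through Python's `sqlite3` module. The `product` table and the helper `try_insert` are hypothetical names chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE product (
        id    INTEGER PRIMARY KEY,
        sku   TEXT NOT NULL UNIQUE,
        price REAL NOT NULL CHECK (price >= 0)
    )
    """
)
conn.execute("INSERT INTO product (id, sku, price) VALUES (1, 'BOOK-1', 20.0)")


def try_insert(sql):
    """Run an INSERT and report whether a constraint rejected it."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


results = [
    # Duplicate sku violates UNIQUE.
    try_insert("INSERT INTO product (id, sku, price) VALUES (2, 'BOOK-1', 5.0)"),
    # Missing sku violates NOT NULL.
    try_insert("INSERT INTO product (id, sku, price) VALUES (3, NULL, 5.0)"),
    # Negative price violates CHECK.
    try_insert("INSERT INTO product (id, sku, price) VALUES (4, 'PEN-1', -1.0)"),
    # A valid row is accepted.
    try_insert("INSERT INTO product (id, sku, price) VALUES (5, 'PEN-1', 5.0)"),
]
for line in results:
    print(line)
```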


### Popular Relational Database Management Systems (RDBMS)
| System         | Notes                                                 |
| -------------- | ----------------------------------------------------- |
| **MySQL**      | Open source, widely used, good for web apps.          |
| **PostgreSQL** | Advanced features, ACID-compliant, open source.       |
| **SQLite**     | Lightweight, serverless. Ideal for mobile/local apps. |
| **Oracle DB**  | Enterprise-grade, proprietary.                        |
| **SQL Server** | Developed by Microsoft, commonly used in enterprise.  |

### Advantages of Relational Databases
- Data integrity and consistency through constraints.
- Easy to query using SQL (Structured Query Language).
- Normalization reduces redundancy.
- Clear data relationships.

### Disadvantages of Relational Databases
- Complexity with large or unstructured data - data that does not fit neatly into rows and columns is awkward to model.
- Scalability limitations - relational databases have traditionally scaled vertically: you need a more powerful server (CPU, RAM, etc.) rather than simply adding more machines.
- Schema rigidity - the structure (tables, columns, data types) must be defined before inserting data.
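
Schema rigidity can be seen in a short sketch: inserting a value for a column that was never defined fails until the schema is changed. The `customer` table and `nickname` column are invented for illustration, and only the "columns must exist first" part is demonstrated here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")

# The table has no "nickname" column yet, so this insert is refused.
try:
    conn.execute(
        "INSERT INTO customer (id, name, nickname) VALUES (1, 'Alice', 'Al')"
    )
    outcome = "accepted"
except sqlite3.OperationalError:
    outcome = "rejected"

# The schema has to be changed first; only then does the same insert work.
conn.execute("ALTER TABLE customer ADD COLUMN nickname TEXT")
conn.execute("INSERT INTO customer (id, name, nickname) VALUES (1, 'Alice', 'Al')")
row = conn.execute("SELECT id, name, nickname FROM customer").fetchone()
print(outcome, row)
```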

### Non-relational Databases

- A non-relational database is a database that does not use the traditional table-based relational model.
- Instead, it uses different formats like:
    - Documents
    - Key-Value pairs
    - Graphs
    - Wide-columns
- These databases are designed for flexibility, scalability, and handling large volumes of unstructured or semi-structured data.

### Types of Non-Relational Databases
#### 1. Document Stores
- Store data as documents (usually JSON or BSON).
- Documents can contain nested structures.
- Schema-less: Each document can have a different structure.
    - Example: MongoDB, CouchDB
```json
{
  "_id": "001",
  "name": "Alice",
  "email": "alice@mail.com",
  "orders": [
    {"id": 1, "product": "Book", "price": 20},
    {"id": 2, "product": "Pen", "price": 5}
  ]
}
```
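
To illustrate nesting and the schema-less property, the sketch below uses plain Python dictionaries as a stand-in for a document collection. This is not the MongoDB or CouchDB API. The second document and the helper `order_total` are assumptions added for the example.

```python
import json

# The document from the example above, parsed from JSON.
doc_alice = json.loads(
    """
    {
      "_id": "001",
      "name": "Alice",
      "email": "alice@mail.com",
      "orders": [
        {"id": 1, "product": "Book", "price": 20},
        {"id": 2, "product": "Pen", "price": 5}
      ]
    }
    """
)

# A hypothetical second document with a different shape: no email, no orders.
doc_bob = {"_id": "002", "name": "Bob", "phone": "555-0100"}

# A "collection" keyed by _id; documents need not share a structure.
collection = {doc["_id"]: doc for doc in (doc_alice, doc_bob)}


def order_total(doc):
    """Sum the nested order prices, treating a missing 'orders' field as empty."""
    return sum(order["price"] for order in doc.get("orders", []))


totals = {doc_id: order_total(doc) for doc_id, doc in collection.items()}
print(totals)
```

Because fields can be missing, code that reads documents usually has to supply defaults, as `doc.get("orders", [])` does here.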

#### 2. Key-Value Stores
- The simplest NoSQL model.
- Data is stored as a key → value pair.
- Fast and highly scalable.
    - Example: Redis, DynamoDB, Riak
```text
"user:001" → "{name: 'Alice', age: 25}"
```
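
The key → value idea can be sketched with an in-process dictionary. This only imitates the access pattern and is not the Redis, DynamoDB, or Riak API. The `put` and `get` helpers are hypothetical, and values are stored as opaque JSON strings to mirror the example above.

```python
import json

store = {}  # the whole "database": keys map to opaque string values


def put(key, value):
    """Serialize the value and store it under the key."""
    store[key] = json.dumps(value)


def get(key, default=None):
    """Look the key up directly; there is no querying by the value's fields."""
    raw = store.get(key)
    return default if raw is None else json.loads(raw)


put("user:001", {"name": "Alice", "age": 25})
user = get("user:001")
missing = get("user:999")
print(user, missing)
```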

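#### 3. Wide-Column Stores (Column-Family)
- Data is stored in rows identified by a key, with columns grouped into column families.
- Well suited to very large, write-heavy datasets spread across many machines.
- Not to be confused with columnar (column-oriented) storage, which keeps each column's values together to speed up analytical queries.
    - Example: Apache Cassandra, HBase
```text
UserTable
| user_id | name   | email          | age |
|---------|--------|----------------|-----|
| 1       | Alice  | alice@mail.com | 25  |
```
Each row can have a different set of columns (flexible schema).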
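
The point that each row can have different columns can be sketched with a dictionary of dictionaries. This is only a mental model, not how Cassandra or HBase are accessed. The second row and the `read_columns` helper are assumptions made for illustration.

```python
# Row key -> {column name: value}; rows do not need the same columns.
rows = {
    "user:1": {"name": "Alice", "email": "alice@mail.com", "age": 25},
    "user:2": {"name": "Bob", "last_login": "2024-01-01"},  # hypothetical row
}


def read_columns(row_key, columns):
    """Return only the requested columns that this row actually has."""
    row = rows.get(row_key, {})
    return {col: row[col] for col in columns if col in row}


alice = read_columns("user:1", ["name", "email"])
bob = read_columns("user:2", ["name", "email"])
print(alice)
print(bob)
```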

#### 4. Graph Databases
- Use nodes (entities) and edges (relationships).
- Ideal for complex relationships between data.
    - Example: Neo4j, ArangoDB
- Used in: social networks, fraud detection, knowledge graphs.
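
The node-and-edge model can be sketched with an adjacency list and a breadth-first search, here finding everyone within two hops of a person. The names, the friendship edges, and the `within_hops` function are invented for this example. A graph database such as Neo4j would use its own query language instead.

```python
from collections import deque

# Nodes are people; each edge is a friendship (stored in both directions).
edges = {
    "Alice": ["Bob", "Charlie"],
    "Bob": ["Alice", "Dana"],
    "Charlie": ["Alice"],
    "Dana": ["Bob", "Eve"],
    "Eve": ["Dana"],
}


def within_hops(start, max_hops):
    """Breadth-first search: map each node reachable within max_hops to its distance."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in edges.get(node, []):
            if neighbor not in seen:
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    del seen[start]  # the start node is not its own friend
    return seen


friends_of_friends = within_hops("Alice", 2)
print(friends_of_friends)
```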

### Popular Non-Relational Databases
| Database            | Type               | Common Use              |
| ------------------- | ------------------ | ----------------------- |
| **MongoDB**         | Document           | Web apps, CMS           |
| **Redis**           | Key-Value          | Caching, queues         |
| **Cassandra**       | Wide-column        | Time-series data, IoT   |
| **Neo4j**           | Graph              | Relationship modeling   |
| **Amazon DynamoDB** | Key-Value/Document | Serverless applications |


### Table Keys

- A key is a column, or a set of columns, used to uniquely identify a record (row) in a table. Keys are also used to establish and identify relationships between tables.

### Types of Keys

#### EMPLOYEE Table Example

| ID  | Name      | License_Number | Passport_Number |
|-----|-----------|----------------|------------------|
| 1   | Alice     | DL1001         | P987654321       |
| 2   | Bob       | DL1002         | P123456789       |
| 3   | Charlie   | DL1003         | P567891234       |

---

#### 1. Primary Key

- A **Primary Key** uniquely identifies each record in a table.
- It **cannot be NULL** and must be **unique**.

**Example:**  
In the EMPLOYEE table, `ID` can be the primary key since it is unique for each employee.

#### 2. Candidate Keys
- A candidate key is a column, or a minimal set of columns, that could serve as the primary key.
- All candidate keys are unique and non-null.

**Example:**
In the EMPLOYEE table, `ID`, `License_Number`, and `Passport_Number` are all candidate keys, since each is unique for every employee.

#### 3. Alternate Key
- When a candidate key is not chosen as the primary key, it becomes an alternate key.

**Example:**
If we choose `ID` as the primary key, then `Passport_Number` and `License_Number` become alternate keys.
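
One way to declare this choice, sketched with SQLite via Python's `sqlite3` module: `ID` becomes the primary key and the two alternate keys are declared `NOT NULL UNIQUE`. The lower-case column names, the helper `rejected`, and the extra employee "Dana" are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee (
        id              INTEGER PRIMARY KEY,   -- chosen primary key
        name            TEXT NOT NULL,
        license_number  TEXT NOT NULL UNIQUE,  -- alternate key
        passport_number TEXT NOT NULL UNIQUE   -- alternate key
    )
    """
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        (1, "Alice", "DL1001", "P987654321"),
        (2, "Bob", "DL1002", "P123456789"),
        (3, "Charlie", "DL1003", "P567891234"),
    ],
)


def rejected(row):
    """Return True if inserting the row violates a key constraint."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?, ?)", row)
        return False
    except sqlite3.IntegrityError:
        return True


duplicate_id = rejected((1, "Dana", "DL1004", "P000000001"))
duplicate_license = rejected((4, "Dana", "DL1001", "P000000001"))
new_employee = rejected((4, "Dana", "DL1004", "P000000001"))
print(duplicate_id, duplicate_license, new_employee)
```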

#### 4. Composite Key
- A composite key uses more than one column to uniquely identify a row.
- Useful when no single column can serve as a unique identifier.

**Example:**
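If the table had no `ID` column, we could declare `License_Number` + `Passport_Number` together as a composite primary key. In this table each of those columns is already unique on its own, so the pairing is only illustrative. A composite key is really needed when no single column is unique, for example one row per student and course in an enrollment table.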
```sql
PRIMARY KEY (License_Number, Passport_Number)
```
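
A composite key matters most when no single column is unique. The sketch below assumes a hypothetical `enrollment` table, which is not part of the EMPLOYEE example. Neither `student_id` nor `course_id` is unique alone, but the pair is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE enrollment (
        student_id INTEGER NOT NULL,
        course_id  TEXT    NOT NULL,
        grade      TEXT,
        PRIMARY KEY (student_id, course_id)  -- uniqueness applies to the pair
    )
    """
)
conn.execute("INSERT INTO enrollment VALUES (1, 'DB101', 'A')")
conn.execute("INSERT INTO enrollment VALUES (1, 'OS201', 'B')")  # same student
conn.execute("INSERT INTO enrollment VALUES (2, 'DB101', 'B')")  # same course

try:
    # The same (student_id, course_id) pair a second time is refused.
    conn.execute("INSERT INTO enrollment VALUES (1, 'DB101', 'C')")
    duplicate_pair_rejected = False
except sqlite3.IntegrityError:
    duplicate_pair_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM enrollment").fetchone()[0]
print(duplicate_pair_rejected, row_count)
```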

#### 5. Foreign Key
- A foreign key is used to link two tables.
- It refers to the primary key in another table.

**Example:**
Assume we have another table, `Payroll`, with a column `Employee_ID` that references `EMPLOYEE.ID`.
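
The Payroll example might look like the sketch below in SQLite. The `salary` and `payroll_id` columns are assumptions added so the table has something to store. Note that SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE payroll (
        payroll_id  INTEGER PRIMARY KEY,
        employee_id INTEGER NOT NULL REFERENCES employee (id),
        salary      REAL NOT NULL
    )
    """
)
conn.execute("INSERT INTO employee VALUES (1, 'Alice')")
conn.execute("INSERT INTO payroll VALUES (100, 1, 5000.0)")  # employee 1 exists

try:
    # Employee 99 does not exist, so the foreign key rejects this row.
    conn.execute("INSERT INTO payroll VALUES (101, 99, 4000.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# The foreign key is also what lets the two tables be joined.
joined = conn.execute(
    "SELECT e.name, p.salary FROM payroll p JOIN employee e ON e.id = p.employee_id"
).fetchall()
print(orphan_rejected, joined)
```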

#### 6. Super Key
- A super key is any combination of columns that uniquely identifies a row.
- All candidate keys are super keys, but super keys can contain extra attributes.

**Example:**
In the EMPLOYEE table above, consider (`ID`, `Name`): two employees can have the same name, but their `ID` values cannot be the same. Hence, this combination still uniquely identifies each row and is a super key.
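
The relationship between super keys and candidate keys can be checked mechanically on sample rows, as sketched below. A fourth, hypothetical employee who shares the name "Alice" is added so that `Name` alone is visibly not unique. The function names are invented. A check like this only describes the rows at hand, whereas real keys are decided from the rules of the domain.

```python
from itertools import combinations

COLUMNS = ("ID", "Name", "License_Number", "Passport_Number")
ROWS = [
    (1, "Alice", "DL1001", "P987654321"),
    (2, "Bob", "DL1002", "P123456789"),
    (3, "Charlie", "DL1003", "P567891234"),
    (4, "Alice", "DL1004", "P246813579"),  # hypothetical row: same name, new ID
]


def is_super_key(cols):
    """True if the chosen columns take a different combination of values in every row."""
    idx = [COLUMNS.index(col) for col in cols]
    projected = {tuple(row[i] for i in idx) for row in ROWS}
    return len(projected) == len(ROWS)


def candidate_keys():
    """Super keys that contain no smaller super key (checked smallest first)."""
    found = []
    for size in range(1, len(COLUMNS) + 1):
        for cols in combinations(COLUMNS, size):
            is_minimal = not any(set(key) <= set(cols) for key in found)
            if is_super_key(cols) and is_minimal:
                found.append(cols)
    return found


print(is_super_key(("Name",)))       # Name alone repeats, so it is not a key
print(is_super_key(("ID", "Name")))  # a super key, but ID alone already suffices
print(candidate_keys())
```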
