# **Lesson 3 - Relational Data Model**

## KEY TOPICS

- Relational Data Model  
- Properties of a Relation  
- Classification of Attributes  
- Normalization  
- Modification Anomalies  
- Normalization Terminologies  
- Process of Normalization  

---

## RELATIONAL DATABASE

**Relation** – A table used to represent data and their relationships in the relational model.  
**Attribute** – A property that defines an entity and corresponds to a column in a table.  
**Tuple** – A row in a relation that contains a set of related data values representing a real-world entity or relationship.  

---

### Properties of a Relation

- Data is stored in rows (tuples) and columns (attributes).  
- Each column (attribute) has a unique name and atomic values.  
- Each row (tuple) represents a unique record; no duplicate rows.  
- Row and column order does not matter.  
- Each attribute has a defined domain: the set of values it may take (for example, a data type plus any range or format restrictions).  
- `NULL` can mark a value as missing or unknown.  
- Each relation has a **relation schema** that defines its name and attributes.  
- A relation typically includes a **primary key** to uniquely identify records.  
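
The properties above can be sketched with Python's built-in `sqlite3` module. The `student` table, its columns, and the sample values are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical "student" relation: the schema names the relation and its attributes.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,   -- uniquely identifies each tuple
        name       TEXT NOT NULL,         -- atomic, single-valued
        email      TEXT                   -- NULL allowed: value may be missing
    )
    """
)
conn.execute("INSERT INTO student VALUES (1, 'Ana', 'ana@example.com')")
conn.execute("INSERT INTO student VALUES (2, 'Ben', NULL)")

# A second row with the same primary key is rejected, so rows stay unique.
try:
    conn.execute("INSERT INTO student VALUES (1, 'Ana', 'ana@example.com')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```

Here the primary key is what enforces the "no duplicate rows" property; without a key constraint, most SQL systems would accept the duplicate.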

---

## CLASSIFICATION OF ATTRIBUTES

### Based on Their Role as Keys

- **Super Key** – Any set of attributes that uniquely identifies a row.  
- **Candidate Key** – A minimal super key.  
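- **Primary Key** – The candidate key chosen to identify rows; its attributes cannot be `NULL`.  
- **Foreign Key** – An attribute (or set of attributes) whose values must match a primary or candidate key in another table, or in the same table.  
- **Alternate Key** – A candidate key not chosen as the primary key.  
- **Composite Key** – A key made up of two or more attributes.  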
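
The difference between a super key and a candidate key can be sketched in a few lines of Python. The sample rows and the helper names `is_super_key` and `candidate_keys` are hypothetical. Note that keys are really a property of the schema (they must hold for every valid state of the table), so testing one data sample like this only shows which attribute sets are unique in that sample.

```python
from itertools import combinations

# Hypothetical sample relation: each dict is one tuple (row).
rows = [
    {"student_id": 1, "email": "ana@example.com", "name": "Ana", "city": "Cebu"},
    {"student_id": 2, "email": "ben@example.com", "name": "Ben", "city": "Cebu"},
    {"student_id": 3, "email": "cat@example.com", "name": "Ana", "city": "Davao"},
    {"student_id": 4, "email": "dan@example.com", "name": "Ben", "city": "Cebu"},
]

def is_super_key(attrs, rows):
    """True if no two rows agree on every attribute in attrs."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(projected)

def candidate_keys(rows):
    """Return all minimal super keys for this sample, smallest first."""
    attributes = sorted(rows[0])
    found = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            # Skip supersets of a key already found: they are not minimal.
            if any(set(key) <= set(combo) for key in found):
                continue
            if is_super_key(combo, rows):
                found.append(combo)
    return found

keys = candidate_keys(rows)
```

In this sample, `student_id` and `email` are each unique on their own, so they are the candidate keys; `(student_id, name)` is a super key but not minimal. Choosing one of the two as the primary key makes the other an alternate key.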

### Based on Data Type and Constraints (Domain)

- Attributes are classified by the data type of their domain, such as numeric, character, boolean, date, or time.  
- Domain constraints ensure consistent and correct data.  

### Based on Function in Relationships (Referential Integrity)

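- Key attributes link related tables and keep their data consistent.  
- A foreign key value in one table must match an existing primary key value in the referenced table (or be `NULL`, if allowed).  
- This prevents orphaned records: rows that point to a parent row that does not exist.  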
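
As a minimal sketch of referential integrity, the example below uses `sqlite3` with hypothetical `department` and `employee` tables. One SQLite-specific detail: foreign key checks are off by default and must be enabled per connection with `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    """
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT,
        dept_id INTEGER REFERENCES department (dept_id)
    )
    """
)
conn.execute("INSERT INTO department VALUES (10, 'Sales')")
conn.execute("INSERT INTO employee VALUES (1, 'Ana', 10)")  # parent row exists

def violates_integrity(sql):
    """Run one statement; report whether a constraint check rejected it."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# An employee pointing at a missing department would be an orphan.
orphan_insert_blocked = violates_integrity(
    "INSERT INTO employee VALUES (2, 'Ben', 99)"
)
# Deleting a department that is still referenced would orphan employee 1.
parent_delete_blocked = violates_integrity(
    "DELETE FROM department WHERE dept_id = 10"
)
```

Both statements are rejected, so the child table can never reference a department that does not exist.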

---

## DATABASE NORMALIZATION

**Definition**  
The process of organizing attributes in a database to reduce or eliminate data redundancy.  

**Purpose**  
Refines the initial data model so that each fact is stored once and queries return unambiguous, intended results.  
This often involves splitting tables and linking them through keys so that queries can recombine the data.  

---

### WHY NORMALIZE?

Normalization reduces modification anomalies and improves data integrity:

- **Insertion Anomalies** – A fact cannot be recorded unless other, unrelated data is supplied with it.  
- **Deletion Anomalies** – Deleting one fact unintentionally removes another fact stored in the same row.  
- **Update Anomalies** – A value stored in several rows becomes inconsistent when only some copies are updated.  
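
The three anomalies can be illustrated on a small, deliberately unnormalized data set. The `enrollments` rows and all values in them are hypothetical.

```python
# Hypothetical unnormalized enrollment data: course details repeat per student.
enrollments = [
    {"student": "Ana", "course": "DB101", "instructor": "Cruz"},
    {"student": "Ben", "course": "DB101", "instructor": "Cruz"},
    {"student": "Cat", "course": "CS200", "instructor": "Reyes"},
]

# Update anomaly: changing the instructor in only one row leaves DB101
# with two conflicting instructors.
partially_updated = [dict(row) for row in enrollments]
partially_updated[0]["instructor"] = "Lim"
db101_instructors = {
    row["instructor"] for row in partially_updated if row["course"] == "DB101"
}

# Deletion anomaly: removing Cat's only enrollment also erases the fact
# that CS200 is taught by Reyes.
after_delete = [row for row in enrollments if row["student"] != "Cat"]
cs200_still_known = any(row["course"] == "CS200" for row in after_delete)

# Insertion anomaly: a new course with no students cannot be recorded
# without inventing a placeholder student value.
new_course_row = {"student": None, "course": "AI300", "instructor": "Tan"}
needs_placeholder = new_course_row["student"] is None
```

Storing courses and their instructors in a separate table would avoid all three problems, since each course-to-instructor fact would then exist in exactly one row.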

---

## TERMINOLOGIES

### Key Database Concepts

- **Data Redundancy** – Repetition of data across the database.  
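- **Anomalies** – Inconsistencies or unintended data loss that arise during insert, update, or delete operations on poorly structured tables.  
- **Functional Dependency** – The value of one attribute (or set of attributes) uniquely determines the value of another (e.g., ISBN → Title).  
- **Determinant** – The attribute(s) on the left-hand side of a functional dependency (ISBN in ISBN → Title).  
- **Partial Dependency** – A non-key attribute depends on only part of a composite key.  
- **Full Dependency** – A non-key attribute depends on the entire composite key and not on any proper subset of it.  
- **Transitive Dependency** – A non-key attribute depends on the key only through another non-key attribute: if A → B and B → C (and B does not determine A), then C is transitively dependent on A.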
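
A functional dependency can be tested against sample data with a short helper. The function name `holds` and the `order_lines` rows are hypothetical; the check only shows whether a dependency is violated in this sample, not whether it is guaranteed by the schema.

```python
def holds(rows, lhs, rhs):
    """Check whether the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        # The same left-hand value must always map to the same right-hand value.
        if seen.setdefault(left, right) != right:
            return False
    return True

# Hypothetical order-line data with composite key (order_id, isbn).
order_lines = [
    {"order_id": 1, "isbn": "111", "title": "SQL Basics", "qty": 2},
    {"order_id": 1, "isbn": "222", "title": "Data Models", "qty": 1},
    {"order_id": 2, "isbn": "111", "title": "SQL Basics", "qty": 5},
]

# title depends on isbn alone: a partial dependency on the composite key.
isbn_determines_title = holds(order_lines, ["isbn"], ["title"])
# qty needs the whole key: a full dependency.
key_determines_qty = holds(order_lines, ["order_id", "isbn"], ["qty"])
isbn_determines_qty = holds(order_lines, ["isbn"], ["qty"])
```

Here `isbn → title` holds while `isbn → qty` does not, which is what distinguishes the partially dependent attribute (`title`) from the fully dependent one (`qty`).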

---

### Normalization & Normal Forms

- **Unnormalized Form (UNF)** – Contains repeating groups.  
- **First Normal Form (1NF)** – No repeating groups; single-valued attributes.  
- **Second Normal Form (2NF)** – 1NF + no partial dependencies.  
- **Third Normal Form (3NF)** – 2NF + no transitive dependencies.  
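- **Boyce-Codd Normal Form (BCNF)** – For every nontrivial functional dependency, the determinant is a super key.  
- **Fourth Normal Form (4NF)** – BCNF + no nontrivial multivalued dependencies other than those whose determinant is a super key.  
- **Fifth Normal Form (5NF)** – 4NF + every nontrivial join dependency is implied by the candidate keys.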
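
The BCNF condition can be checked mechanically once the functional dependencies are written down. The sketch below uses the standard attribute-closure computation; the `Enrollment` schema and its two dependencies are hypothetical, and the function names are chosen for illustration.

```python
def closure(attrs, fds):
    """Attribute closure: everything functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Return nontrivial FDs whose determinant is not a super key."""
    violations = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        if not trivial and closure(lhs, fds) != set(attributes):
            violations.append((lhs, rhs))
    return violations

# Hypothetical schema: Enrollment(student, course, instructor)
attributes = {"student", "course", "instructor"}
fds = [
    (("student", "course"), ("instructor",)),
    (("instructor",), ("course",)),  # assumption: each instructor teaches one course
]
violations = bcnf_violations(attributes, fds)
```

The dependency `instructor → course` is reported because `instructor` alone does not determine `student`. This table satisfies 3NF (`course` is part of a candidate key) yet still fails BCNF, which is the gap between the two forms.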

---

### Decomposition

- **Decomposition** – Splitting a table into smaller ones.  
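- **Dependency Preserving** – Every original functional dependency can still be enforced on the decomposed tables without joining them.  
- **Lossless Decomposition** – Joining the decomposed tables reconstructs exactly the original data, with no missing or spurious rows.  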
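
Whether a decomposition is lossless can be tested by projecting a table, joining the pieces back, and comparing with the original. The helpers `project`, `natural_join`, and `same_rows` and the sample rows are hypothetical, written only for this illustration.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples."""
    unique = {tuple((a, row[a]) for a in attrs) for row in rows}
    return [dict(items) for items in unique]

def natural_join(left, right):
    """Join two non-empty row lists on all attributes they have in common."""
    common = set(left[0]) & set(right[0])
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[a] == r[a] for a in common)
    ]

def same_rows(a, b):
    """Compare two row lists as sets of tuples, ignoring order."""
    def freeze(rows):
        return {tuple(sorted(r.items())) for r in rows}
    return freeze(a) == freeze(b)

# Hypothetical relation where emp -> dept and dept -> location.
original = [
    {"emp": "Ana", "dept": "Sales", "location": "Cebu"},
    {"emp": "Ben", "dept": "Sales", "location": "Cebu"},
    {"emp": "Cat", "dept": "IT", "location": "Cebu"},
]

# Splitting on dept (the key of the second table) is lossless.
good = natural_join(project(original, ["emp", "dept"]),
                    project(original, ["dept", "location"]))
# Splitting on location is lossy: the join invents spurious tuples.
bad = natural_join(project(original, ["emp", "location"]),
                   project(original, ["dept", "location"]))
```

The first split rebuilds the original three rows. The second produces six rows, including combinations such as Ana in IT that were never in the data, because `location` is not a key of either piece.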

---

## PROCESS OF NORMALIZATION

### UNF → 1NF

- Remove repeating groups.  
- Ensure single values at each row-column intersection.  
- Move the repeating data to a new table with its own primary key and a foreign key back to the original table.
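
A minimal sketch of this step, assuming a hypothetical student record whose `phones` attribute is a repeating group:

```python
# Hypothetical unnormalized rows: "phones" is a repeating group.
unf_students = [
    {"student_id": 1, "name": "Ana", "phones": ["0917-111", "0918-222"]},
    {"student_id": 2, "name": "Ben", "phones": ["0919-333"]},
]

# The base table keeps only the single-valued attributes.
student = [
    {"student_id": s["student_id"], "name": s["name"]} for s in unf_students
]

# The repeating group moves to its own table, one phone per row,
# keyed by (student_id, phone); student_id refers back to student.
student_phone = [
    {"student_id": s["student_id"], "phone": phone}
    for s in unf_students
    for phone in s["phones"]
]
```

Every row-column intersection in `student` and `student_phone` now holds a single value, which is the 1NF requirement.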

### 1NF → 2NF

- Eliminate partial dependencies.  
- Move each partially dependent attribute, together with the part of the key it depends on, to a new table.  
- Ensure every remaining non-key attribute depends on the entire primary key.
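
A minimal sketch of this step, assuming a hypothetical `order_items` table with composite key `(order_id, product_id)`; the `project` helper is written for this example only.

```python
def project(rows, attrs):
    """Project rows onto attrs, keeping one copy of each distinct tuple."""
    seen, result = set(), []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(attrs, values)))
    return result

# Hypothetical 1NF table with composite key (order_id, product_id).
# product_name depends on product_id alone: a partial dependency.
order_items = [
    {"order_id": 1, "product_id": "P1", "product_name": "Pen", "qty": 3},
    {"order_id": 1, "product_id": "P2", "product_name": "Ink", "qty": 1},
    {"order_id": 2, "product_id": "P1", "product_name": "Pen", "qty": 7},
]

# 2NF: the partially dependent attribute moves out with its determinant.
product = project(order_items, ["product_id", "product_name"])
order_item = project(order_items, ["order_id", "product_id", "qty"])
```

After the split, "Pen" is stored once in `product` instead of once per order line, and `qty` remains in `order_item` because it depends on the whole key.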

### 2NF → 3NF

- Eliminate transitive dependencies.  
- Move transitively dependent attributes to a new table with their determinant, and keep the determinant in the original table as a foreign key.
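
A minimal sketch of this step, assuming a hypothetical `employees` table in which `emp_id → dept_id` and `dept_id → dept_name`:

```python
# Hypothetical 2NF table with key emp_id.
# emp_id -> dept_id and dept_id -> dept_name, so dept_name depends on the
# key only transitively.
employees = [
    {"emp_id": 1, "name": "Ana", "dept_id": 10, "dept_name": "Sales"},
    {"emp_id": 2, "name": "Ben", "dept_id": 10, "dept_name": "Sales"},
    {"emp_id": 3, "name": "Cat", "dept_id": 20, "dept_name": "IT"},
]

# 3NF: dept_name moves to a department table keyed by its determinant.
department = {row["dept_id"]: row["dept_name"] for row in employees}
employee = [
    {"emp_id": r["emp_id"], "name": r["name"], "dept_id": r["dept_id"]}
    for r in employees
]

# The original rows can be rebuilt by looking up dept_id.
rebuilt = [{**r, "dept_name": department[r["dept_id"]]} for r in employee]
```

Because `dept_id` stays in `employee` as a foreign key, the lookup recovers the original rows, so nothing is lost by the split.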

### Beyond 3NF

- **BCNF** – All determinants must be super keys.  
- **4NF & 5NF** – Eliminate nontrivial multivalued dependencies (4NF) and join dependencies not implied by the candidate keys (5NF).

---

### Considerations

- **Normalization** improves integrity and reduces redundancy but increases table count and complexity of queries.  
- **Denormalization** may be used to improve performance in some systems by reintroducing controlled redundancy.
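
The trade-off can be sketched with `sqlite3`. The tables and the `employee_report` copy are hypothetical; whether such a copy actually improves performance depends on the system and workload.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT,
        dept_id INTEGER REFERENCES department (dept_id)
    );
    INSERT INTO department VALUES (10, 'Sales'), (20, 'IT');
    INSERT INTO employee VALUES (1, 'Ana', 10), (2, 'Ben', 20);

    -- Denormalized copy for reporting: dept_name is duplicated per employee.
    CREATE TABLE employee_report AS
        SELECT e.emp_id, e.name, d.dept_name
        FROM employee e JOIN department d ON d.dept_id = e.dept_id;
    """
)

# Normalized design: reading the department name requires a join.
normalized = conn.execute(
    """
    SELECT e.name, d.dept_name
    FROM employee e JOIN department d ON d.dept_id = e.dept_id
    ORDER BY e.emp_id
    """
).fetchall()

# Denormalized design: one table, no join, but redundant data to maintain.
denormalized = conn.execute(
    "SELECT name, dept_name FROM employee_report ORDER BY emp_id"
).fetchall()
```

Both queries return the same rows. The denormalized table avoids the join, but a department rename must now be applied to every copied row, which reintroduces the risk of update anomalies.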
