# SQL Keys
## Primary Key
**Definition:**
A primary key is a unique identifier for a record in a database table. It ensures the uniqueness and integrity of the data within the table.

Example:
Consider a `Students` table:

| StudentID (Primary Key) | Name  | Age | Department   |
|-------------------------|-------|-----|--------------|
| 1                       | John  | 20  | Computer Sci |
| 2                       | Alice | 22  | Physics      |
| 3                       | Bob   | 21  | Chemistry    |



Additional Notes:

* Primary key values must be unique and cannot be NULL.
* Each table in the database should have a primary key to uniquely identify records.
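
The uniqueness rule can be tried out directly. The following is a minimal sketch using Python's built-in `sqlite3` module and an in-memory database; the column types and the extra row for "Carol" are assumptions added for illustration, not part of the original table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Students (
        StudentID  INTEGER PRIMARY KEY,
        Name       TEXT NOT NULL,
        Age        INTEGER,
        Department TEXT
    )
""")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?)",
    [(1, "John", 20, "Computer Sci"),
     (2, "Alice", 22, "Physics"),
     (3, "Bob", 21, "Chemistry")],
)

# Hypothetical extra row: it reuses StudentID 1, so the primary key rejects it.
try:
    conn.execute("INSERT INTO Students VALUES (1, 'Carol', 23, 'Biology')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

student_count = conn.execute("SELECT COUNT(*) FROM Students").fetchone()[0]
```

After the failed insert, the table still holds only the three original rows.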

## Foreign Key
**Definition:**
A foreign key is a column (or set of columns) in one table that references the primary key, or another unique key, of another table. It links the data in the two tables and creates a relationship between them.

Example:
Consider a Courses table with a foreign key referencing DepartmentID from the Departments table:


| CourseID (Primary Key) | CourseName | Instructor    | DepartmentID (Foreign Key) |
|------------------------|------------|---------------|----------------------------|
| 101                    | Math 101   | Prof. Johnson | 1                          |
| 102                    | Physics 1  | Prof. Smith   | 3                          |
| 103                    | Chem 101   | Prof. White   | 2                          |

Additional Notes:

* Foreign keys ensure referential integrity between tables.
* They help maintain relationships and enforce constraints between tables in a database.
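
Referential integrity can be sketched with Python's built-in `sqlite3` module. The department rows are taken from the Departments table shown in the Super Key section; the column types and the rejected "Bio 101" row (with a non-existent DepartmentID of 9) are assumptions for illustration. Note that SQLite enforces foreign keys only when the `foreign_keys` pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("""
    CREATE TABLE Departments (
        DepartmentID   INTEGER PRIMARY KEY,
        DepartmentName TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE Courses (
        CourseID     INTEGER PRIMARY KEY,
        CourseName   TEXT NOT NULL,
        Instructor   TEXT,
        DepartmentID INTEGER REFERENCES Departments(DepartmentID)
    )
""")
conn.executemany(
    "INSERT INTO Departments VALUES (?, ?)",
    [(1, "Computer Sci"), (2, "Chemistry"), (3, "Physics")],
)
# DepartmentID 1 exists, so this row is accepted.
conn.execute("INSERT INTO Courses VALUES (101, 'Math 101', 'Prof. Johnson', 1)")

# Hypothetical row: DepartmentID 9 does not exist, so the insert is rejected.
try:
    conn.execute("INSERT INTO Courses VALUES (104, 'Bio 101', 'Prof. Green', 9)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

course_count = conn.execute("SELECT COUNT(*) FROM Courses").fetchone()[0]
```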

## Candidate Key
**Definition:**
A candidate key is a minimal set of one or more fields that uniquely identifies a record in a table. Minimal means that no field can be removed from the set without losing uniqueness.

Example:
Consider an Instructors table:

| InstructorID (Candidate Key) | Name          | Department   |
|------------------------------|---------------|--------------|
| 201                          | Dr. Johnson   | Computer Sci |
| 202                          | Prof. White   | Chemistry    |
| 203                          | Dr. Lee       | Physics      |

Additional Notes:

* Tables may have multiple candidate keys, one of which is chosen as the primary key.
* Candidate keys that are not chosen as the primary key are called alternate keys and are typically enforced with a UNIQUE constraint.
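
As a rough illustration of a table with two candidate keys, the sketch below adds a hypothetical `Email` column to Instructors (it is not in the original table) and declares it `UNIQUE`, while `InstructorID` serves as the primary key. The email addresses are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Instructors (
        InstructorID INTEGER PRIMARY KEY,   -- candidate key chosen as primary key
        Email        TEXT NOT NULL UNIQUE,  -- hypothetical second candidate key
        Name         TEXT NOT NULL,
        Department   TEXT
    )
""")
conn.execute(
    "INSERT INTO Instructors VALUES (201, 'johnson@example.edu', 'Dr. Johnson', 'Computer Sci')"
)

# A different InstructorID with the same Email still violates the UNIQUE constraint.
try:
    conn.execute(
        "INSERT INTO Instructors VALUES (202, 'johnson@example.edu', 'Prof. White', 'Chemistry')"
    )
    duplicate_email_rejected = False
except sqlite3.IntegrityError:
    duplicate_email_rejected = True
```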

## Super Key
**Definition:**
A super key is a set of one or more attributes that can uniquely identify a record in a table. It may contain additional attributes that are not required for uniqueness.

Example:
Consider a Departments table:


| DepartmentID (Super Key) | DepartmentName  | Dean         |
|---------------------------|-----------------|--------------|
| 1                         | Computer Sci    | Dr. Adams    |
| 2                         | Chemistry       | Dr. Brown    |
| 3                         | Physics         | Dr. Clark    |

Additional Notes:

* Super keys are used in the design phase to identify potential candidate keys.
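* Every candidate key is a super key, and a super key may include the primary key along with additional attributes. For example, {DepartmentID, DepartmentName} is a super key of the table above but not a candidate key, because DepartmentID alone is already unique.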
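
The difference between a super key and a candidate key can be checked against the sample rows. This is only a sketch: the helper names `is_super_key` and `is_candidate_key` are made up for this example, and sample data can only show that a column set is unique in these particular rows. Whether it is a key in the real schema is a design decision.

```python
from itertools import combinations

columns = ("DepartmentID", "DepartmentName", "Dean")
rows = [
    (1, "Computer Sci", "Dr. Adams"),
    (2, "Chemistry", "Dr. Brown"),
    (3, "Physics", "Dr. Clark"),
]

def is_super_key(attrs):
    """True if no two sample rows agree on every attribute in attrs."""
    idx = [columns.index(a) for a in attrs]
    projected = [tuple(row[i] for i in idx) for row in rows]
    return len(set(projected)) == len(rows)

def is_candidate_key(attrs):
    """A candidate key is a super key that contains no smaller super key."""
    if not is_super_key(attrs):
        return False
    return not any(
        is_super_key(subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )

id_alone = is_candidate_key(("DepartmentID",))
id_plus_dean_is_super = is_super_key(("DepartmentID", "Dean"))
id_plus_dean_is_candidate = is_candidate_key(("DepartmentID", "Dean"))
```

Here `("DepartmentID", "Dean")` is unique and therefore a super key, but it is not minimal, so it is not a candidate key.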

## Composite Key
**Definition:**
A composite key is a combination of two or more columns used to uniquely identify a record in a table. The combination of these columns must be unique within the table.

Example:
Consider an Enrollments table with a composite key (StudentID and CourseID):


| StudentID (Composite Key) | CourseID (Composite Key) | Grade |
|----------------------------|---------------------------|-------|
| 1                          | 101                       | A     |
| 2                          | 102                       | B     |
| 1                          | 103                       | C     |

Additional Notes:

* Composite keys are used when a single column does not provide uniqueness.
* The combination of columns in a composite key must be unique for each record.
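
A composite key can be declared as a table-level `PRIMARY KEY` over both columns. The sketch below uses Python's built-in `sqlite3` module; the column types and the final rejected row are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Enrollments (
        StudentID INTEGER NOT NULL,
        CourseID  INTEGER NOT NULL,
        Grade     TEXT,
        PRIMARY KEY (StudentID, CourseID)
    )
""")
# StudentID 1 appears twice, which is fine because the CourseID differs.
conn.executemany(
    "INSERT INTO Enrollments VALUES (?, ?, ?)",
    [(1, 101, "A"), (2, 102, "B"), (1, 103, "C")],
)

# Hypothetical row: the pair (1, 101) already exists, so it is rejected.
try:
    conn.execute("INSERT INTO Enrollments VALUES (1, 101, 'B')")
    duplicate_pair_rejected = False
except sqlite3.IntegrityError:
    duplicate_pair_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Enrollments").fetchone()[0]
```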

# Normalization

Normalization is the process of organizing a relational database schema to reduce redundancy and improve data integrity. In this lesson, we will work through the concept using small example tables, starting with an unnormalized table and progressing through the normalization steps.

The process is divided into several normal forms (1NF, 2NF, 3NF, BCNF, 4NF, 5NF), each of which addresses a specific kind of redundancy or dependency problem. The steps below explain each normal form with a definition and an example.

### Step 1: First Normal Form (1NF)

Definition:

A table is in 1NF if it has no repeating groups and all entries in a column are atomic (indivisible).

Atomic Values: In the context of 1NF, atomic means that each cell holds a single, indivisible value rather than a list or set of values. All values in a column should also be of the same type.

Example:

Consider an unnormalized table Students:


| StudentID | Courses                   |
|-----------|---------------------------|
| 1         | Math, Physics             |
| 2         | Chemistry, Biology        |
| 3         | English                   |

To convert it to 1NF, split the repeating groups into separate rows:


| StudentID | Course       |
|-----------|--------------|
| 1         | Math         |
| 1         | Physics      |
| 2         | Chemistry    |
| 2         | Biology      |
| 3         | English      |
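
The split into separate rows can be sketched in a few lines of Python. The function name `to_first_normal_form` is made up for this example, and it assumes the courses are separated by commas exactly as in the table above.

```python
unnormalized = [
    (1, "Math, Physics"),
    (2, "Chemistry, Biology"),
    (3, "English"),
]

def to_first_normal_form(rows):
    """Emit one (StudentID, Course) row per course in the comma-separated list."""
    return [
        (student_id, course.strip())
        for student_id, courses in rows
        for course in courses.split(",")
    ]

normalized = to_first_normal_form(unnormalized)
```

Each resulting row holds exactly one course, matching the five rows of the 1NF table.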

### Step 2: Second Normal Form (2NF)

Definition:

A table is in 2NF if it is in 1NF and all non-key attributes are fully functionally dependent on the primary key.

A non-key attribute is fully functionally dependent on a candidate key if it depends on the entire key and not on any proper subset of it. In simpler terms, in a table with a composite key (more than one column acting together as the primary key), an attribute is fully functionally dependent on the key only if it depends on all the columns in that key, and not just a part of it. A dependency on only part of the key is called a partial dependency.

Example:

Consider a table StudentsCourses:


| StudentID | Course    | Instructor    |
|-----------|-----------|---------------|
| 1         | Math      | Prof. A       |
| 1         | Physics   | Prof. B       |
| 2         | Chemistry | Prof. C       |
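
Assuming the key of this table is the pair (StudentID, Course), the sample rows can be checked for a partial dependency. The helper `determines` is a made-up name for this sketch; it can only show that the data is consistent with a dependency, not prove that the dependency holds in general.

```python
rows = [
    {"StudentID": 1, "Course": "Math", "Instructor": "Prof. A"},
    {"StudentID": 1, "Course": "Physics", "Instructor": "Prof. B"},
    {"StudentID": 2, "Course": "Chemistry", "Instructor": "Prof. C"},
]

def determines(rows, lhs, rhs):
    """True if rows that agree on lhs always agree on rhs in the sample data."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True

course_determines_instructor = determines(rows, ["Course"], ["Instructor"])
student_determines_instructor = determines(rows, ["StudentID"], ["Instructor"])
```

Instructor is consistent with depending on Course alone, which is only part of the assumed key, so the table is not in 2NF.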

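Here the primary key is the combination (StudentID, Course), but Instructor depends only on Course, which is just part of the key. To convert the table to 2NF, split it into two tables: StudentCourses, which records which student takes which course, and Courses, which records each course's instructor once.


| StudentID | Course    |
|-----------|-----------|
| 1         | Math      |
| 1         | Physics   |
| 2         | Chemistry |

| Course    | Instructor |
|-----------|------------|
| Math      | Prof. A    |
| Physics   | Prof. B    |
| Chemistry | Prof. C    |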

### Step 3: Third Normal Form (3NF)

Definition:

A table is in 3NF if it is in 2NF and no transitive dependencies exist.

A transitive dependency occurs when one non-key attribute in a table depends on another non-key attribute, which in turn depends on the primary key. In other words, if A → B (A determines B) and B → C (B determines C), then A → C (A determines C) is a transitive dependency.

Example:

Consider a table StudentsDepartments:


| StudentID | Department  | Location    |
|-----------|-------------|-------------|
| 1         | IT          | Building A  |
| 2         | Physics     | Building B  |

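Here StudentID determines Department, and Department determines Location, so Location depends on StudentID only transitively. To convert the table to 3NF, split it into two tables: Students, which records each student's department, and Departments, which records each department's name and location once.


| StudentID | DepartmentID |
|-----------|--------------|
| 1         | 1            |
| 2         | 2            |

| DepartmentID | Department | Location    |
|--------------|------------|-------------|
| 1            | IT         | Building A  |
| 2            | Physics    | Building B  |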
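
One way to sanity-check a 3NF layout for this data is to rebuild the original rows with a join. The sketch below assumes a layout in which a Students table keeps the student-to-department link and a Departments table stores each department's name and location once; the table layout, column types, and the "Building C" update are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Departments (
        DepartmentID INTEGER PRIMARY KEY,
        Department   TEXT NOT NULL,
        Location     TEXT NOT NULL
    );
    CREATE TABLE Students (
        StudentID    INTEGER PRIMARY KEY,
        DepartmentID INTEGER NOT NULL REFERENCES Departments(DepartmentID)
    );
    INSERT INTO Departments VALUES (1, 'IT', 'Building A'), (2, 'Physics', 'Building B');
    INSERT INTO Students VALUES (1, 1), (2, 2);
""")

# Joining the two tables reproduces the original StudentsDepartments rows.
rebuilt = conn.execute("""
    SELECT s.StudentID, d.Department, d.Location
    FROM Students AS s
    JOIN Departments AS d ON d.DepartmentID = s.DepartmentID
    ORDER BY s.StudentID
""").fetchall()

# Hypothetical move: relocating a department touches exactly one row.
updated = conn.execute(
    "UPDATE Departments SET Location = 'Building C' WHERE Department = 'IT'"
).rowcount
```

Because the location is stored once per department, changing it cannot leave students of the same department with conflicting locations.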

### Step 4: Boyce-Codd Normal Form (BCNF)

Definition:

A table is in BCNF if it is in 3NF and every determinant is a candidate key.

A determinant is an attribute or a set of attributes on the left-hand side of a functional dependency: its value determines the values of other attributes. In simpler terms, if knowing column X always tells you the value of column Y (X → Y), then X is a determinant. BCNF requires every such X to be a candidate key, that is, to uniquely identify a whole row.

Example:

Consider a table EmployeesProjects:


| EmployeeID | ProjectID | ProjectName   |
|------------|-----------|---------------|
| 1          | 101       | Project A     |
| 2          | 102       | Project B     |
| 3          | 101       | Project A     |
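
The BCNF condition can be checked mechanically with an attribute closure. The sketch below assumes a single functional dependency for this table, ProjectID → ProjectName; the function names `closure` and `bcnf_violations` are made up for the example.

```python
attributes = {"EmployeeID", "ProjectID", "ProjectName"}
# Assumed functional dependency: a project ID fixes the project name.
fds = [({"ProjectID"}, {"ProjectName"})]

def closure(attrs, fds):
    """All attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Non-trivial dependencies whose determinant does not determine every attribute."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != attributes
    ]

violations = bcnf_violations(attributes, fds)
key_closure = closure({"EmployeeID", "ProjectID"}, fds)
```

ProjectID determines ProjectName but not EmployeeID, so it is a determinant that is not a key, and the dependency is reported as a violation. The pair (EmployeeID, ProjectID), by contrast, determines every attribute.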

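Here ProjectID determines ProjectName, but ProjectID is not a candidate key, since the same project appears in more than one row. To convert the table to BCNF, split it into EmployeeProjects, which records the assignments, and Projects, which records each project's name once. Employee details such as names belong in a separate Employees table keyed by EmployeeID.


| EmployeeID | ProjectID |
|------------|-----------|
| 1          | 101       |
| 2          | 102       |
| 3          | 101       |

| ProjectID | ProjectName   |
|-----------|---------------|
| 101       | Project A     |
| 102       | Project B     |

| EmployeeID | EmployeeName |
|------------|--------------|
| 1          | John         |
| 2          | Alice        |
| 3          | Bob          |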

### Additional Normal Forms (4NF, 5NF, etc.)

Beyond BCNF, higher normal forms such as 4NF and 5NF deal with more complex scenarios: 4NF addresses multivalued dependencies, and 5NF addresses join dependencies. They are usually applied based on specific requirements and domain knowledge. In general, the normalization process aims to minimize redundancy and ensure data integrity in SQL databases.