## Normalization

### 2NF - Second Normal Form
- The table must already be in `1NF`.
- Check for the three anomalies: `Deletion`, `Insertion`, `Update`
    - `Deletion Anomaly`:
        > Deleting one piece of data unintentionally removes other, unrelated data.
    - `Insertion Anomaly`:
        > Data can't be inserted because other, unrelated information is missing.
    - `Update Anomaly`:
        > The same fact is stored in several rows, so a partial update leaves the database inconsistent.

> 2NF requires each non-key attribute to depend on the entire primary key, not on just part of it.

- From the table
    - If you know the `reader_username`, you can determine the `title`, `first_name`, and `last_name`: in fact, all information about the reader.
    - If you know the `book_isbn`, you can determine all information about that book.

Separating these into their own tables gives:
- `book`: with primary key `book_isbn`
- `reader`: with primary key `reader_username`
- `myread`: with surrogate primary key `id`.
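
The split above can be sketched as follows. This is a minimal illustration, not the definitive schema: it uses Python's built-in `sqlite3` module only so the example is self-contained, the column lists are cut down, and the sample rows (`ada`, `bob`, the ISBN value) are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE book   (book_isbn TEXT PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE reader (reader_username TEXT PRIMARY KEY, first_name TEXT NOT NULL);
CREATE TABLE myread (
    id              INTEGER PRIMARY KEY,  -- surrogate key
    book_isbn       TEXT NOT NULL REFERENCES book(book_isbn),
    reader_username TEXT NOT NULL REFERENCES reader(reader_username)
);
-- Hypothetical sample data: one book read by two readers.
INSERT INTO book   VALUES ('9780000000001', 'Old Title');
INSERT INTO reader VALUES ('ada', 'Ada'), ('bob', 'Bob');
INSERT INTO myread (book_isbn, reader_username) VALUES
    ('9780000000001', 'ada'), ('9780000000001', 'bob');
""")

# The title is stored in exactly one row, so one UPDATE changes it for every reader.
conn.execute("UPDATE book SET title = 'New Title' WHERE book_isbn = '9780000000001'")

titles = conn.execute("""
    SELECT r.reader_username, b.title
    FROM myread AS m
    JOIN book   AS b ON b.book_isbn = m.book_isbn
    JOIN reader AS r ON r.reader_username = m.reader_username
    ORDER BY r.reader_username
""").fetchall()
print(titles)
```

Because book details no longer repeat in every `myread` row, the update anomaly described earlier cannot occur for the title.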

### 3NF - Third Normal Form
> It says a non-key attribute must not depend on another non-key attribute; it must depend on the primary key, the entire primary key, and nothing but the primary key.

This normal form avoids `transitive dependency`, which occurs when a non-key attribute depends on another non-key attribute.

- For example, you have a table `employee_status` with the columns:
    - `emp_id PRIMARY KEY`
    - `skill_level`
    - `seniority`
```
emp_id   skill_level   seniority
1           10           senior
2           10           senior
3           5            mid-level
```

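Here `seniority` is determined by `skill_level`, not directly by `emp_id`, so the table is split in two:

**employee_status**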

|emp_id|skill_level|
|------|-----------|
|1     |  10       |
|2     |  10       |
|3     |  5        |


**skill_seniority**

|skill_level|seniority|
|-----------|---------|
|1,2,3      | Beginner |
|4,5,6      | Mid-level|
|7,8,9,10   | Senior   |
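
The two tables above can be joined to recover the original three-column view. The sketch below is one possible modelling, with two assumptions: it uses Python's built-in `sqlite3` module to stay self-contained, and it expands each grouped row of `skill_seniority` into one row per skill level so that `skill_level` can serve as the primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE skill_seniority (
    skill_level INTEGER PRIMARY KEY,
    seniority   TEXT NOT NULL
);
CREATE TABLE employee_status (
    emp_id      INTEGER PRIMARY KEY,
    skill_level INTEGER NOT NULL REFERENCES skill_seniority(skill_level)
);
""")

# Expand the grouped rows (1-3, 4-6, 7-10) into one row per skill level.
groups = {"Beginner": range(1, 4), "Mid-level": range(4, 7), "Senior": range(7, 11)}
conn.executemany(
    "INSERT INTO skill_seniority VALUES (?, ?)",
    [(level, name) for name, levels in groups.items() for level in levels],
)
conn.executemany("INSERT INTO employee_status VALUES (?, ?)", [(1, 10), (2, 10), (3, 5)])

# Seniority is looked up through skill_level instead of being stored per employee.
rows = conn.execute("""
    SELECT e.emp_id, e.skill_level, s.seniority
    FROM employee_status AS e
    JOIN skill_seniority AS s ON s.skill_level = e.skill_level
    ORDER BY e.emp_id
""").fetchall()
print(rows)
```

Renaming a seniority band now means touching only `skill_seniority`, never the employee rows.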

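- In our `myread` table, we have
    - `read_status`: one of `pending`, `reading`, or `done`.
    - `percentage_read`: the percentage of the book that has been read, e.g. 0%, 10%, 100%.

Since `read_status` is determined by `percentage_read`, the mapping moves to its own table: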

**read_status_percentage**

|percentage_range|read_status|
|----------------|-----------|
|[0,0]           | pending   |
|[1,99]          | reading   |
|[100,100]       | done      |
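
Since the status follows entirely from the percentage, it can be derived rather than stored. A minimal sketch of that mapping in Python, where the function name `read_status` is chosen here for illustration:

```python
def read_status(percentage_read: int) -> str:
    """Derive the status from the percentage, following the ranges in the table."""
    if not 0 <= percentage_read <= 100:
        raise ValueError("percentage_read must be between 0 and 100")
    if percentage_read == 0:
        return "pending"
    if percentage_read == 100:
        return "done"
    return "reading"  # anything from 1 to 99


statuses = [read_status(p) for p in (0, 10, 100)]
print(statuses)
```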

### Ideas for future improvement -> Exercise
- Let the system calculate this percentage automatically from the book's page count and the number of pages the reader has read.
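
One way to approach this exercise is sketched below in Python. The function name, the `pages_read` parameter, and the choice to round down are assumptions made for illustration, not part of the schema in these notes.

```python
def percentage_read(pages_read: int, page_count: int) -> int:
    """Return the share of the book read, as a whole percentage from 0 to 100."""
    if page_count <= 0:
        raise ValueError("page_count must be positive")
    if not 0 <= pages_read <= page_count:
        raise ValueError("pages_read must be between 0 and page_count")
    # Integer division rounds down, so 100 is reached only on the last page.
    return pages_read * 100 // page_count


print(percentage_read(150, 300))
print(percentage_read(299, 300))
print(percentage_read(300, 300))
```

Rounding down also means the first few pages of a long book still report 0. Rounding any non-zero progress up to 1 is an alternative if the status should switch to `reading` immediately.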

## Foreign Key
- We need a column from the book to represent books in the `myread`
- We need a column from the reader to represent readers in the `myread`.

**NB**: The best candidate for a foreign key to reference is the primary key of the referenced table. A column with a `UNIQUE` constraint also works.

- `book` -> `book_isbn`
- `reader` -> `reader_username`
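
To show what these foreign keys buy us, here is a small sketch using Python's built-in `sqlite3` module (chosen only to keep the example self-contained; other engines enforce foreign keys without the `PRAGMA`). The tables are reduced to their key columns and the sample values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE book   (book_isbn TEXT PRIMARY KEY);
CREATE TABLE reader (reader_username TEXT PRIMARY KEY);
CREATE TABLE myread (
    book_isbn       TEXT REFERENCES book(book_isbn),
    reader_username TEXT REFERENCES reader(reader_username)
);
INSERT INTO book   VALUES ('9780000000001');
INSERT INTO reader VALUES ('ada');
""")

# Both referenced rows exist, so this insert is accepted.
conn.execute("INSERT INTO myread VALUES ('9780000000001', 'ada')")

# This ISBN is not in the book table, so the database rejects the row.
try:
    conn.execute("INSERT INTO myread VALUES ('9789999999999', 'ada')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM myread").fetchone()[0]
print(rejected, row_count)
```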

## Define data types and business rules (constraints)

**book**

- `isbn`: CHAR(13) PRIMARY KEY CHECK(isbn ~ '^[0-9]{13}$')
- `title`: VARCHAR(50) NOT NULL
- `edition`: INT
- `description`: TEXT
- `page_count`: INT NOT NULL
- `category`: ENUM('programming', 'art', 'politics', 'others')
- `published_date`: DATE NOT NULL
- `publisher`: VARCHAR(50) NOT NULL
- `authors`: VARCHAR(50) ARRAY
- `lang`: VARCHAR(10) NOT NULL
- `format`: VARCHAR(10) CHECK(format IN ('ebook', 'hardcover'))
- `read_estimated_time_in_minutes`: INT GENERATED ALWAYS AS ((page_count * 120)/60) VIRTUAL | STORED
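
A few of these rules can be tried out with the sketch below. It uses Python's built-in `sqlite3` module so that it runs as-is, which is an assumption: SQLite has no `CHAR(13)`, `ENUM`, or cast syntax like the definitions above, so the "exactly 13 digits" rule is expressed with `length` and `GLOB` instead. Only three columns are included, and the sample values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE book (
    isbn   TEXT PRIMARY KEY
           CHECK (length(isbn) = 13 AND isbn NOT GLOB '*[^0-9]*'),
    title  TEXT NOT NULL,
    format TEXT CHECK (format IN ('ebook', 'hardcover'))
)
""")


def try_insert(isbn, title, fmt):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO book VALUES (?, ?, ?)", (isbn, title, fmt))
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert("9780000000001", "Valid", "ebook"),       # satisfies every rule
    try_insert("97800000000X1", "Bad ISBN", "ebook"),    # contains a non-digit
    try_insert("9780000000002", "Bad format", "audio"),  # format not in the list
    try_insert("9780000000003", None, "ebook"),          # title is NOT NULL
]
print(results)
```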

**reader**

- `username`: VARCHAR(50) PRIMARY KEY
- `title`: ENUM('Mrs', 'Mr', 'Dr', 'Ms', 'Miss')
- `first_name`: VARCHAR(100) NOT NULL
- `last_name`: VARCHAR(100)

**status_percent**

- `read_status`: VARCHAR(10) PRIMARY KEY
- `percentage_read`: INT4RANGE NOT NULL


**myread**

- `book_isbn`: CHAR(13) FOREIGN KEY REFERENCES book(isbn)
- `reader_username`: VARCHAR(50) FOREIGN KEY REFERENCES reader(username)
- `start_read_date`: DATE 
- `end_read_date`: DATE
- `percentage_read`: INT

### Exercise - Constraints for the myread table
- The `end_read_date` should be later than the `start_read_date`
- The `percentage_read` should be between `0` and `100`, inclusive
- `percentage_read` shouldn't be `0` if `start_read_date` is set
- `percentage_read` should be `0` if `start_read_date` is not set
- `percentage_read` should be `100` when `end_read_date` is set.
- `percentage_read` shouldn't be `100` if `end_read_date` is not set.
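
One possible solution sketch, again using Python's built-in `sqlite3` module so it can be run directly (an assumption; in another engine the `CHECK` clauses would look much the same). Dates are stored as ISO `YYYY-MM-DD` text, which compares correctly as strings, and the key columns are left out to focus on the rules.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE myread (
    start_read_date TEXT,
    end_read_date   TEXT,
    percentage_read INTEGER NOT NULL DEFAULT 0
        CHECK (percentage_read BETWEEN 0 AND 100),
    -- end date, when present, must be later than the start date
    CHECK (end_read_date IS NULL OR end_read_date > start_read_date),
    -- percentage is 0 exactly when reading has not started
    CHECK ((start_read_date IS NULL) = (percentage_read = 0)),
    -- percentage is below 100 exactly when reading has not ended
    CHECK ((end_read_date IS NULL) = (percentage_read < 100))
)
""")


def accepted(start, end, percentage):
    """Return True if the row passes every CHECK, False otherwise."""
    try:
        conn.execute("INSERT INTO myread VALUES (?, ?, ?)", (start, end, percentage))
        return True
    except sqlite3.IntegrityError:
        return False


valid = [
    accepted(None, None, 0),                    # not started
    accepted("2024-01-01", None, 40),           # in progress
    accepted("2024-01-01", "2024-01-10", 100),  # finished
]
invalid = [
    accepted("2024-01-01", None, 0),            # started but still 0
    accepted(None, None, 10),                   # progress without a start date
    accepted("2024-01-01", None, 100),          # 100 without an end date
    accepted("2024-01-01", "2024-01-10", 50),   # end date but not 100
    accepted("2024-01-10", "2024-01-01", 100),  # ends before it starts
    accepted("2024-01-01", None, 101),          # out of range
]
print(valid, invalid)
```

Each pair of "should" and "shouldn't" rules collapses into a single equality between two conditions, which keeps the table definition short.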

