In reality, the data we encounter will be complex and often redundant. This is where the study of data modeling techniques and database design comes in.

![](images/Screenshot%202022-10-31%20180909.png)

We can see that Descartes has two rows because he spent his life in both France and the Netherlands.

If at a later point we decide to classify him under a different skill, we would have to update both of his rows.

Wouldn’t it be saner to have a separate table for skills, and let the records that share the same skill refer to that table?

This process of breaking down a raw database into logical tables and removing redundancies is called Normalization.
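The idea of a separate skills table can be sketched as follows. This is only an illustration: it uses Python's built-in `sqlite3` module rather than the PostgreSQL session used in these notes, and the table names (`people`, `skills`), column names, and sample values are hypothetical, since the screenshot's actual schema is not reproduced here.

```python
import sqlite3

# Hypothetical schema and sample values; the screenshot's real columns may differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE skills (
        id    INTEGER PRIMARY KEY,
        skill TEXT NOT NULL
    );
    CREATE TABLE people (
        name     TEXT NOT NULL,
        country  TEXT NOT NULL,
        skill_id INTEGER REFERENCES skills(id)
    );
    INSERT INTO skills VALUES (1, 'Mathematics');
    INSERT INTO people VALUES ('Descartes', 'France', 1);
    INSERT INTO people VALUES ('Descartes', 'Netherlands', 1);
""")

# The skill label lives in exactly one row, so changing it is a single UPDATE.
conn.execute("UPDATE skills SET skill = 'Philosophy' WHERE id = 1")

# Both Descartes rows see the new label through the join.
rows = conn.execute("""
    SELECT p.name, p.country, s.skill
    FROM people AS p
    JOIN skills AS s ON s.id = p.skill_id
    ORDER BY p.country
""").fetchall()
print(rows)
```

Note that the update changes the label for every record that refers to that skill, which is exactly the point of storing it once.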

There are levels of normalization called normal forms that dictate how to achieve the desired design.

Five normal forms are commonly cited, ranging from first normal form (1NF) to fifth normal form (5NF).
These forms are progressive in nature, meaning that a design in 3NF is also 1NF and 2NF compliant.

Working developers usually restrict themselves to 3NF or 4NF.

# Atomicity

Let us take the case of BASIC, which was designed by John Kemeny and Thomas Kurtz.

If both names are stored in a single author field, it becomes difficult to write a query that retrieves this record by author. The value could have been entered as “Kemeny, Kurtz”, “Kurtz, Kemeny”, or even “Kemeny & Kurtz”.

The correct solution is to redesign the table structure to make all field values atomic.

Atomicity of values means that every intersection of a row and column must contain a single, indivisible value.

Whenever a field holds more than one value, start thinking about changing your table structure.
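The lookup problem with a non-atomic author field can be reproduced in a few lines. This sketch uses Python's `sqlite3` module as a stand-in for the PostgreSQL session; the table name `lang_flat` is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flat table: both authors are crammed into one field.
conn.execute("CREATE TABLE lang_flat (language TEXT, author TEXT)")
conn.execute("INSERT INTO lang_flat VALUES ('BASIC', 'Kemeny, Kurtz')")

# An exact match on one author finds nothing, because the stored
# value is the combined string, not the individual name.
exact = conn.execute(
    "SELECT language FROM lang_flat WHERE author = 'Kurtz'"
).fetchall()

# A pattern match works, but it depends on substring matching and
# would also hit unrelated names that merely contain 'Kurtz'.
fuzzy = conn.execute(
    "SELECT language FROM lang_flat WHERE author LIKE '%Kurtz%'"
).fetchall()

print(exact, fuzzy)
```

The pattern query is a workaround rather than a fix: it is fragile with respect to how the names were typed, which is why the field itself should be made atomic.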

# Repeating Groups

Another simple (but ultimately wrong) approach that comes to mind is to split the author field into two parts – author1 and author2. 

![](images/Screenshot%202022-10-31%20182402.png)

## Can you spot the problem that will arise from this design decision?

This imposes an artificial constraint on how many authors a language can have. 

This kind of design is referred to as a repeating group and must be actively avoided.
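A small sketch of why the repeating group hurts, again using Python's `sqlite3` module in place of PostgreSQL. The table name `lang_rg` is hypothetical, and AWK (designed by Aho, Weinberger, and Kernighan) is brought in only as an example of a language with three authors.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with the author1/author2 repeating group.
conn.execute("CREATE TABLE lang_rg (language TEXT, author1 TEXT, author2 TEXT)")
conn.execute("INSERT INTO lang_rg VALUES ('BASIC', 'Kemeny', 'Kurtz')")

# Every lookup by author has to test each column in the group.
found = conn.execute(
    "SELECT language FROM lang_rg WHERE author1 = ? OR author2 = ?",
    ("Kurtz", "Kurtz"),
).fetchall()

# AWK has three authors, so it does not fit a two-author layout:
# the table has no column for the third name and the insert is rejected.
try:
    conn.execute(
        "INSERT INTO lang_rg VALUES ('AWK', 'Aho', 'Weinberger', 'Kernighan')"
    )
    third_author_fits = True
except sqlite3.Error:
    third_author_fits = False

print(found, third_author_fits)
```

Adding an `author3` column would only move the limit, and every author query would grow another `OR` clause.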

# Splitting the Table

The correct design to remove the problems listed above is to split the table into two – one holding the author details and one detailing the language.

![](images/Screenshot%202022-11-01%20085359.png)

![](images/Screenshot%202022-11-01%20085447.png)

Once you have removed non-atomic fields and repeating groups, and assigned unique ids to your tables, your table structure is in first normal form.

The author table’s language_id field, which refers to the id field of the language table, is called a foreign key, and the rule that enforces this link is a foreign key constraint.

```
tesdb=# CREATE TABLE newlang_tbl
        (
            id INTEGER NOT NULL PRIMARY KEY,
            language VARCHAR(20) NOT NULL,
            year INTEGER NOT NULL,
            standard VARCHAR(10) NULL
            );

tesdb=# CREATE TABLE authors_tbl
        (
            author_id INTEGER NOT NULL,
            author VARCHAR(25) NOT NULL,
            language_id INTEGER REFERENCES newlang_tbl(id)
            );
```

A foreign key can only reference a column that is a primary key or has a unique constraint.


Inserting a row into the authors table whose language_id has no matching entry in the language table would also result in an error.

```
tesdb=# INSERT INTO authors_tbl
        (author_id, author, language_id)
        VALUES
        (5, 'Kemeny', 5);
```


```
tesdb=# INSERT INTO newlang_tbl
        (id, language, year, standard)
        VALUES
        (5, 'BASIC', 1964, 'ANSI');

tesdb=# INSERT INTO authors_tbl
        (author_id, author, language_id)
        VALUES
        (5, 'Kemeny', 5);
```
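The two attempts above (author first, then language followed by author) can be replayed end to end. This is a sketch under one stated assumption: it runs against SQLite through Python's `sqlite3` module instead of PostgreSQL, so foreign key enforcement has to be switched on with a `PRAGMA`. The helper `insert_author` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: SQLite stands in for PostgreSQL here. Unlike PostgreSQL,
# SQLite only enforces foreign keys when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE newlang_tbl (
        id       INTEGER NOT NULL PRIMARY KEY,
        language VARCHAR(20) NOT NULL,
        year     INTEGER NOT NULL,
        standard VARCHAR(10) NULL
    );
    CREATE TABLE authors_tbl (
        author_id   INTEGER NOT NULL,
        author      VARCHAR(25) NOT NULL,
        language_id INTEGER REFERENCES newlang_tbl(id)
    );
""")

def insert_author(author_id, author, language_id):
    """Hypothetical helper: return True if the row was accepted."""
    try:
        conn.execute(
            "INSERT INTO authors_tbl (author_id, author, language_id) "
            "VALUES (?, ?, ?)",
            (author_id, author, language_id),
        )
        return True
    except sqlite3.IntegrityError:
        # Raised when language_id has no matching row in newlang_tbl.
        return False

# Language 5 does not exist yet, so the author row is rejected.
before = insert_author(5, "Kemeny", 5)

# Once the language row is present, the same insert is accepted.
conn.execute(
    "INSERT INTO newlang_tbl (id, language, year, standard) "
    "VALUES (5, 'BASIC', 1964, 'ANSI')"
)
after = insert_author(5, "Kemeny", 5)

print(before, after)
```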

Referential integrity is a key benefit of good relational database design. Because it governs related entities, it ensures that their values remain in sync: a row in one table can never refer to a row that does not exist in the other.
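Referential integrity also works in the other direction: a language row cannot be removed while an author row still refers to it. The sketch below shows this with the default foreign key behavior, using Python's `sqlite3` module as a stand-in for PostgreSQL (an assumption, as is the hypothetical helper `delete_language`).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: SQLite stands in for PostgreSQL; it needs this pragma
# before it enforces foreign keys.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE newlang_tbl (
        id       INTEGER NOT NULL PRIMARY KEY,
        language VARCHAR(20) NOT NULL,
        year     INTEGER NOT NULL,
        standard VARCHAR(10) NULL
    );
    CREATE TABLE authors_tbl (
        author_id   INTEGER NOT NULL,
        author      VARCHAR(25) NOT NULL,
        language_id INTEGER REFERENCES newlang_tbl(id)
    );
    INSERT INTO newlang_tbl VALUES (5, 'BASIC', 1964, 'ANSI');
    INSERT INTO authors_tbl VALUES (5, 'Kemeny', 5);
""")

def delete_language(language_id):
    """Hypothetical helper: return True if the language row was deleted."""
    try:
        conn.execute("DELETE FROM newlang_tbl WHERE id = ?", (language_id,))
        return True
    except sqlite3.IntegrityError:
        # An author row still points at this language.
        return False

# Blocked: 'Kemeny' still refers to language 5.
first_try = delete_language(5)

# After the referring author rows are gone, the delete goes through.
conn.execute("DELETE FROM authors_tbl WHERE language_id = 5")
second_try = delete_language(5)

print(first_try, second_try)
```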


```
tesdb=# 

INSERT INTO newlang_tbl
    (id, language, year, standard)
    VALUES
    (1, 'Prolog', 1972, 'ISO');

INSERT INTO newlang_tbl
    (id, language, year)
    VALUES
    (2, 'Perl', 1987);

INSERT INTO newlang_tbl
    (id, language, year, standard)
    VALUES
    (3, 'APL', 1964, 'ANSI');

INSERT INTO newlang_tbl
    (id, language, year)
    VALUES
    (4, 'Tcl', 1988);

INSERT INTO authors_tbl
    (author_id, author, language_id)
    VALUES (6, 'Kurtz', 5);

INSERT INTO authors_tbl
    (author_id, author, language_id)
    VALUES (1, 'Colmerauer', 1);

INSERT INTO authors_tbl
    (author_id, author, language_id)
    VALUES (2, 'Wall', 2);

INSERT INTO authors_tbl
    (author_id, author, language_id)
    VALUES (3, 'Ousterhout', 4);

INSERT INTO authors_tbl
    (author_id, author, language_id)
    VALUES (4, 'Iverson', 3);
```
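With the rows above loaded, the author lookup that was awkward in the non-atomic design becomes a plain join. The sketch below loads the same data into SQLite through Python's `sqlite3` module (an assumption made so the example is self-contained; the SQL in the queries is standard and not specific to SQLite).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE newlang_tbl (
        id       INTEGER NOT NULL PRIMARY KEY,
        language VARCHAR(20) NOT NULL,
        year     INTEGER NOT NULL,
        standard VARCHAR(10) NULL
    );
    CREATE TABLE authors_tbl (
        author_id   INTEGER NOT NULL,
        author      VARCHAR(25) NOT NULL,
        language_id INTEGER REFERENCES newlang_tbl(id)
    );
""")

# Same rows as in the INSERT statements above (None maps to NULL).
conn.executemany(
    "INSERT INTO newlang_tbl (id, language, year, standard) VALUES (?, ?, ?, ?)",
    [
        (1, "Prolog", 1972, "ISO"),
        (2, "Perl", 1987, None),
        (3, "APL", 1964, "ANSI"),
        (4, "Tcl", 1988, None),
        (5, "BASIC", 1964, "ANSI"),
    ],
)
conn.executemany(
    "INSERT INTO authors_tbl (author_id, author, language_id) VALUES (?, ?, ?)",
    [
        (1, "Colmerauer", 1),
        (2, "Wall", 2),
        (3, "Ousterhout", 4),
        (4, "Iverson", 3),
        (5, "Kemeny", 5),
        (6, "Kurtz", 5),
    ],
)

# All authors of a given language: one row per author, no string parsing.
basic_authors = conn.execute("""
    SELECT a.author
    FROM authors_tbl AS a
    JOIN newlang_tbl AS l ON l.id = a.language_id
    WHERE l.language = 'BASIC'
    ORDER BY a.author
""").fetchall()

# The reverse lookup: languages for a given author, by exact match.
kurtz_languages = conn.execute("""
    SELECT l.language
    FROM newlang_tbl AS l
    JOIN authors_tbl AS a ON a.language_id = l.id
    WHERE a.author = 'Kurtz'
""").fetchall()

print(basic_authors, kurtz_languages)
```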

