# Normal Forms

- First Normal Form (1NF): domains of all attributes must be atomic
    - 1NF removes one type of redundancy but leaves other types
- higher normal forms define stricter rules based on 
    - functional dependencies (2NF, 3NF, BCNF) <- ECE356
    - multivalued dependencies (4NF)
    - join dependencies (5NF)
    - join dependencies generalized to temporal data (6NF)

# Second Normal Form (2NF)

- a **non-prime attribute** of a relation schema $R$ is an attribute that is not a part of any candidate key of $R$

- a relation schema $R$ is in **second normal form (2NF)** with respect to a set $F$ of functional dependencies if
    - $R$ is in 1NF, and
    - no non-prime attribute is functionally determined under $F$ by any proper subset of attributes of any candidate key of $R$

**Example 1** Many departments per instructor

$instructor(\underline{\textrm{ID}}, name, salary)$

$department(\underline{dept\_name}, building, budget)$

$inst\_dept(\underline{\textrm{ID}}, \underline{dept\_name})$

$F=\{\textrm{ID} \rightarrow name, salary; dept\_name \rightarrow building, budget\}$

- schemas $instructor$ and $department$ are in 2NF w.r.t. $F$ because they have only one candidate key each, and their candidate keys only have one attribute

- schema $inst\_dept$ is also in 2NF w.r.t. $F$ because it has no non-prime attribute

**Example 2** Many departments per instructor

$instructor(\underline{\textrm{ID}}, name, salary)$

$inst\_dept(\underline{\textrm{ID}}, \underline{dept\_name}, building, budget)$

$F=\{\textrm{ID} \rightarrow name, salary; dept\_name \rightarrow building, budget\}$

- schema $instructor$ remains in 2NF w.r.t. $F$
- $inst\_dept$ is not in 2NF w.r.t. $F$ because $dept\_name \rightarrow building$, where $building$ is a non-prime attribute and $\{dept\_name\}$ is a proper subset of the candidate key $\{\textrm{ID}, dept\_name\}$

**Example 3** One department per instructor

$inst\_dept(\underline{\textrm{ID}}, name, salary, dept\_name, building, budget)$

$F=\{\textrm{ID} \rightarrow name, salary, dept\_name; dept\_name \rightarrow building, budget\}$

- schema is in 2NF w.r.t. $F$ because its only candidate key, $\{\textrm{ID}\}$, has a single attribute; yet it still permits redundancy, since $building$ and $budget$ are repeated for every instructor in the same department
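
The 2NF checks in Examples 1 to 3 can be reproduced mechanically. The following is a minimal brute-force sketch, not an efficient implementation: the helper names `closure`, `candidate_keys`, and `is_2nf` are hypothetical, FDs are assumed to be encoded as `(lhs, rhs)` pairs of Python sets, and 1NF is assumed rather than checked.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def candidate_keys(schema, fds):
    """Minimal subsets of schema whose closure covers schema (brute force)."""
    keys = []
    for size in range(len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            candidate = frozenset(combo)
            # Supersets of keys already found are superkeys but not minimal.
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) >= schema:
                keys.append(candidate)
    return keys

def is_2nf(schema, fds):
    """2NF test taken directly from the definition (1NF is assumed)."""
    keys = candidate_keys(schema, fds)
    non_prime = schema - frozenset().union(*keys)
    for key in keys:
        for size in range(len(key)):  # sizes of proper subsets only
            for part in combinations(sorted(key), size):
                # A proper subset of a key must not determine a non-prime attribute.
                if closure(part, fds) & non_prime:
                    return False
    return True

# FDs and schemas from Examples 1 to 3.
F_MANY = [({"ID"}, {"name", "salary"}), ({"dept_name"}, {"building", "budget"})]
F_ONE = [({"ID"}, {"name", "salary", "dept_name"}),
         ({"dept_name"}, {"building", "budget"})]
EXAMPLE_1 = {"ID", "dept_name"}
EXAMPLE_2 = {"ID", "dept_name", "building", "budget"}
EXAMPLE_3 = {"ID", "name", "salary", "dept_name", "building", "budget"}
```

Under these assumptions, `is_2nf(EXAMPLE_1, F_MANY)` and `is_2nf(EXAMPLE_3, F_ONE)` hold, while `is_2nf(EXAMPLE_2, F_MANY)` fails because `dept_name` alone determines `building`.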



# Third Normal Form (3NF)

- a relation schema $R$ is in **third normal form (3NF)** with respect to a set $F$ of FDs if for every dependency $\alpha \rightarrow \beta$ in $F^+$ such that $\alpha, \beta \subseteq R$, at least one of the following holds
    - $\alpha \rightarrow \beta$ is trivial (i.e., $\beta \subseteq \alpha$)
    - $\alpha$ is a superkey for $R$
    - each attribute $A$ in $\beta - \alpha$ is a prime attribute
        - i.e., $A$ is contained in a candidate key for $R$ (different attributes may be contained in different candidate keys)

- **Theorem:** For any relation $R$ and set of FDs $F$, if $R$ is in 3NF w.r.t. $F$ then $R$ is in 2NF w.r.t. $F$

## Algorithm

- given a relation schema $R$ and set $F$ of FDs
    - compute attribute closures for all subsets of attributes of $R$ w.r.t. $F$
    - identify all candidate keys from the attribute closures
    - examine the attribute closures and look for a dependency $\alpha \rightarrow \beta$ that violates 3NF
    - if no such dependency exists then conclude that $R$ is in 3NF w.r.t. $F$
    
<img src="img/Snip20191007_114.png" width=80%/>
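
The closure-based test above can be sketched in a few lines of Python. This is an illustrative brute-force version that enumerates all $2^{|R|}$ subsets, so it is only practical for small schemas. The names `closure`, `all_closures`, and `find_3nf_violation` are hypothetical, and FDs are assumed to be `(lhs, rhs)` pairs of sets. Checking each $\alpha$ against $\alpha^+ \cap R$ covers every dependency $\alpha \rightarrow \beta$ in $F^+$ with $\alpha, \beta \subseteq R$.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def all_closures(schema, fds):
    """Map every subset of schema to its attribute closure."""
    items = sorted(schema)
    return {frozenset(combo): closure(combo, fds)
            for size in range(len(items) + 1)
            for combo in combinations(items, size)}

def find_3nf_violation(schema, fds):
    """Return (alpha, offending attributes) for a 3NF violation, or None."""
    closures = all_closures(schema, fds)
    superkeys = [a for a, plus in closures.items() if plus >= schema]
    # Candidate keys are the superkeys with no proper subset that is a superkey.
    keys = [k for k in superkeys if not any(s < k for s in superkeys)]
    prime = frozenset().union(*keys)
    for alpha, plus in closures.items():
        if plus >= schema:
            continue  # alpha is a superkey, so no violation is possible
        # Non-trivially determined attributes of R that are not prime.
        offending = (plus & schema) - alpha - prime
        if offending:
            return alpha, offending
    return None

# Schemas and FDs from Examples 4 and 5.
F4 = [({"ID"}, {"name", "salary", "dept_name"}),
      ({"dept_name"}, {"building", "budget"})]
INST_DEPT = {"ID", "name", "salary", "dept_name", "building", "budget"}
F5 = [({"ID"}, {"name", "salary"}), ({"dept_name"}, {"building", "budget"})]
INSTRUCTOR = {"ID", "name", "salary"}
DEPARTMENT = {"dept_name", "building", "budget"}
```

Returning the violating pair rather than a boolean is a design choice for this sketch: the pair identifies the dependency that a decomposition would need to address.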

## Simplified Algorithm

- a simpler 3NF test exists in the *special case* when $F$ is a set of FDs over the attributes of $R$ (and no other attributes)
- given a relation schema $R$ and set $F$ of FDs
    - identify all candidate keys (e.g., using attribute closures)
    - examine each FD $\alpha \rightarrow \beta$ in $F$ and check whether that dependency violates 3NF
    - if no FD in $F$ violates 3NF then conclude that $R$ is in 3NF w.r.t. $F$
    
<img src="img/Snip20191007_115.png" width=80%/>
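
A minimal sketch of the simplified test follows, assuming (as the special case requires) that every FD in $F$ mentions only attributes of $R$. The function names are hypothetical and FDs are again `(lhs, rhs)` pairs of sets. Candidate keys are still found by brute force, but only the FDs listed in $F$ are checked for violations.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def candidate_keys(schema, fds):
    """Minimal subsets of schema whose closure covers schema (brute force)."""
    keys = []
    for size in range(len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            candidate = frozenset(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) >= schema:
                keys.append(candidate)
    return keys

def is_3nf_simplified(schema, fds):
    """Simplified 3NF test; valid only when fds mention attributes of schema only."""
    prime = frozenset().union(*candidate_keys(schema, fds))
    for alpha, beta in fds:
        if beta <= alpha:
            continue  # trivial dependency
        if closure(alpha, fds) >= schema:
            continue  # alpha is a superkey
        if (beta - alpha) <= prime:
            continue  # every attribute in beta - alpha is prime
        return False
    return True

# Example 4: F already mentions only attributes of inst_dept.
INST_DEPT = {"ID", "name", "salary", "dept_name", "building", "budget"}
F4 = [({"ID"}, {"name", "salary", "dept_name"}),
      ({"dept_name"}, {"building", "budget"})]
# Example 5's instructor schema, with F restricted here to the FD over its attributes.
INSTRUCTOR = {"ID", "name", "salary"}
F_INSTRUCTOR = [({"ID"}, {"name", "salary"})]
```

With these inputs, `is_3nf_simplified(INST_DEPT, F4)` is false because of $dept\_name \rightarrow building, budget$, and `is_3nf_simplified(INSTRUCTOR, F_INSTRUCTOR)` is true.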

**Example 4** One department per instructor

$inst\_dept(\underline{\textrm{ID}}, name, salary, dept\_name, building, budget)$

$F=\{\textrm{ID} \rightarrow name, salary, dept\_name; dept\_name \rightarrow building, budget\}$

- $dept\_name \rightarrow building$ and yet $dept\_name$ is not a superkey, and $building$ is not contained in the candidate key $\{\textrm{ID}\}$ 
- not in 3NF

**Example 5** One department per instructor

$instructor(\underline{\textrm{ID}}, name, salary)$

$department(\underline{dept\_name}, building, budget)$

$F=\{\textrm{ID} \rightarrow name, salary; dept\_name \rightarrow building, budget\}$

- both schemas are in 3NF w.r.t. $F$ because the left side of every nontrivial FD that pertains to a schema is a superkey for that schema

# Boyce-Codd Normal Form (BCNF)

- a relation schema $R$ is in BCNF with respect to a set $F$ of FDs if for every dependency $\alpha \rightarrow \beta$ in $F^+$ such that $\alpha,\beta \subseteq R$, at least one of the following holds
    - $\alpha \rightarrow \beta$ is trivial (i.e., $\beta \subseteq \alpha$)
    - $\alpha$ is a superkey for $R$
- BCNF is sometimes called "3.5NF" because it is only slightly stronger than 3NF: it drops the 3NF exception for prime attributes

- **Theorem:** For any relation schema $R$ and set of FDs $F$, if $R$ is in BCNF w.r.t. $F$ then $R$ is in 3NF w.r.t. $F$

## Algorithm

- Given a relation schema $R$ and set $F$ of FDs
    - compute attribute closures for all subsets of attributes of $R$ w.r.t. $F$
    - examine the attribute closures and look for a dependency $\alpha \rightarrow \beta$ that violates BCNF
    - if no such dependency exists then conclude that $R$ is in BCNF w.r.t. $F$

<img src="img/Snip20191008_127.png" width=80%/>
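
The closure-based BCNF test can be sketched the same way as the 3NF test, minus the prime-attribute exception. The code below is an illustrative brute-force version with hypothetical names (`closure`, `subsets`, `find_bcnf_violation`), assuming FDs are `(lhs, rhs)` pairs of sets. Intersecting each closure with $R$ means that FDs mentioning attributes outside $R$ are handled correctly.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def subsets(schema):
    """Yield every subset of schema, smallest first."""
    items = sorted(schema)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def find_bcnf_violation(schema, fds):
    """Return (alpha, determined attributes) for a BCNF violation, or None."""
    for alpha in subsets(schema):
        plus = closure(alpha, fds)
        if plus >= schema:
            continue  # alpha is a superkey
        determined = (plus & schema) - alpha
        if determined:
            # alpha -> determined is nontrivial and alpha is not a superkey.
            return alpha, determined
    return None

# FDs from Examples 1 and 2; they mention attributes outside the schemas below.
F = [({"ID"}, {"name", "salary"}), ({"dept_name"}, {"building", "budget"})]
INST_DEPT_EX1 = {"ID", "dept_name"}
INST_DEPT_EX2 = {"ID", "dept_name", "building", "budget"}
```

Under these assumptions the $inst\_dept$ schema of Example 1 passes, while the one from Example 2 fails on `dept_name`, consistent with it not even being in 2NF.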

## Simplified Algorithm

- a simpler BCNF test exists in the *special case* when $F$ is a set of FDs over the attributes of $R$ (and no other attributes)
- Given a relation schema $R$ and set $F$ of FDs
    - examine each FD $\alpha \rightarrow \beta$ in $F$ and check whether that dependency violates BCNF
    - if no dependency in $F$ violates BCNF then conclude that $R$ is in BCNF w.r.t. $F$

<img src="img/Snip20191008_128.png" width=80%/>
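
In the special case, the BCNF test reduces to one superkey check per FD in $F$, and no candidate keys need to be computed. A minimal sketch, with the hypothetical names `closure` and `is_bcnf_simplified`, and with FDs assumed to be `(lhs, rhs)` pairs of sets over the attributes of the schema only:

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def is_bcnf_simplified(schema, fds):
    """Simplified BCNF test; valid only when fds mention attributes of schema only."""
    for alpha, beta in fds:
        if beta <= alpha:
            continue  # trivial dependency
        if closure(alpha, fds) >= schema:
            continue  # alpha is a superkey
        return False  # nontrivial FD whose left side is not a superkey
    return True

# Example 4's schema, and Example 5's department schema with F restricted
# here to the FD over its own attributes.
INST_DEPT = {"ID", "name", "salary", "dept_name", "building", "budget"}
F4 = [({"ID"}, {"name", "salary", "dept_name"}),
      ({"dept_name"}, {"building", "budget"})]
DEPARTMENT = {"dept_name", "building", "budget"}
F_DEPARTMENT = [({"dept_name"}, {"building", "budget"})]
```

The restriction to the special case matters: if $F$ mentioned attributes outside $R$, a violating dependency could exist in $F^+$ over $R$ without appearing in $F$ itself, and this shortcut would miss it.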

# Summary

- 1NF prohibits non-atomic domains
- 2NF prohibits FDs of non-prime attributes on parts of candidate keys
- 3NF prohibits transitive FDs of non-prime attributes on candidate keys
- BCNF prohibits all FDs that might lead to redundancy
- 4NF and higher normal forms deal with redundancy that cannot be captured at all using FDs