##### BJARKE BRODIN - INDBS 2020

# The Relational Database Model

The relational model, invented by Edgar F. Codd, is a formal data model derived from set theory and first-order predicate logic. The idea of the relational data model is that we organize data in relations, such as <code>Relation(Attribute, Attribute)</code>, and then use set operations to perform queries on them.


## Basics of the Relational Data Model

### Relation

A relation is just a set of tuples of equal size. To give it meaning we name the relation and each index of its tuples, for example <code>Car(Manufacturer, Model, Year)</code>. We then call the named indices of the tuples <em>attribute types</em>.

### Domain

A domain specifies a range of valid values for an attribute type. We can use datatypes and additional constraints to specify domains for attribute types. Formally, a relation over attribute types a1, a2, ..., an is a subset of the Cartesian product domain(a1) x domain(a2) x ... x domain(an).
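
The subset statement above can be made concrete with a small sketch. The domains and the sample tuples for <code>Car</code> below are invented for illustration and are deliberately tiny so that the full Cartesian product can be enumerated.

```python
from itertools import product

# Hypothetical, deliberately tiny domains for Car(Manufacturer, Model, Year).
domains = {
    "Manufacturer": {"Volvo", "Saab"},
    "Model": {"240", "900"},
    "Year": {1985, 1990},
}
attribute_types = ("Manufacturer", "Model", "Year")

# The Cartesian product contains every tuple the domains allow.
all_possible = set(product(*(domains[a] for a in attribute_types)))

# A relation is any subset of that product.
car = {("Volvo", "240", 1985), ("Saab", "900", 1990)}


def respects_domains(relation):
    """Return True if every tuple of the relation lies in the Cartesian product."""
    return relation <= all_possible


print(respects_domains(car))                       # True
print(respects_domains({("Volvo", "240", 2031)}))  # False: 2031 is outside domain(Year)
```

Real domains are usually far too large to enumerate, so a database checks each value against its datatype and constraints instead of materializing the product.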

### Superkey

A superkey is a subset of the attribute types of a relation R for which no two tuples in R may have the same combination of values. Formally, a set of attribute types A is a superkey iff. for all distinct tuples a, b in R: A(a) != A(b), where A(t) denotes the values of tuple t on the attribute types in A.
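
As a minimal sketch, the formal condition can be tested on one concrete instance of a relation. The helper <code>is_superkey</code> and the sample rows are assumptions made for this example; note that a superkey is a constraint on every valid instance, so a check like this can refute a superkey but never prove one.

```python
def is_superkey(relation, attrs):
    """Check that no two rows share the same values on attrs.

    relation: list of dicts, assumed to contain no duplicate rows
    (a relation is a set). attrs: sequence of attribute type names.
    """
    seen = set()
    for row in relation:
        values = tuple(row[a] for a in attrs)
        if values in seen:
            return False  # two distinct tuples agree on attrs
        seen.add(values)
    return True


# Hypothetical instance of Car(Manufacturer, Model, Year).
cars = [
    {"Manufacturer": "Volvo", "Model": "240", "Year": 1985},
    {"Manufacturer": "Volvo", "Model": "240", "Year": 1990},
    {"Manufacturer": "Saab", "Model": "900", "Year": 1990},
]

print(is_superkey(cars, ["Manufacturer", "Model"]))  # False: Volvo 240 occurs twice
print(is_superkey(cars, ["Model", "Year"]))          # True for this instance
```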

### Key

A key of a relation R is a set of attribute types K of R such that K is a superkey and the removal of any attribute type from K would remove the superkey property (a minimal superkey).
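
To illustrate minimality, the sketch below enumerates attribute subsets from smallest to largest and keeps every superkey that does not contain an already found key. Both helper functions and the sample rows are assumptions for this example, and the result only describes this one instance, not the intended constraints of the schema.

```python
from itertools import combinations


def is_superkey(relation, attrs):
    """True if no two rows of this instance agree on all of attrs."""
    projected = [tuple(row[a] for a in attrs) for row in relation]
    return len(projected) == len(set(projected))


def keys(relation, attribute_types):
    """Return the minimal superkeys of this instance, smallest first."""
    found = []
    for size in range(1, len(attribute_types) + 1):
        for candidate in combinations(attribute_types, size):
            # A superkey that contains a smaller key is not minimal.
            contains_key = any(set(k) <= set(candidate) for k in found)
            if not contains_key and is_superkey(relation, candidate):
                found.append(candidate)
    return found


# Hypothetical instance of Car(Manufacturer, Model, Year).
cars = [
    {"Manufacturer": "Volvo", "Model": "240", "Year": 1985},
    {"Manufacturer": "Volvo", "Model": "240", "Year": 1990},
    {"Manufacturer": "Saab", "Model": "900", "Year": 1990},
]

print(keys(cars, ("Manufacturer", "Model", "Year")))
# [('Manufacturer', 'Year'), ('Model', 'Year')]
```

The full attribute set is a superkey here as well, but it is skipped because it contains a smaller key.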

### Primary Key

A relation R may have many keys, each of which is called a candidate key. One of them is designated the primary key, and the remaining keys we call alternative keys.

### Foreign Key

A foreign key of a relation R is a set of attribute types A that refers to the primary key of a relation P (usually a different relation). The domains of the attribute types of A must correspond to the domains of the primary key of P. Lastly, any values of A occurring in R must also occur as primary key values in P, unless the values of A are NULL.

### Domain Constraint

The value of each attribute type A must be single-valued and from the domain domain(A).

### Key Constraint

Every relation has a key that allows unique identification of its tuples.

### Entity Integrity Constraint

The attribute types that make up the primary key should always satisfy a NOT NULL constraint.

### Referential Integrity Constraint

A foreign key has the same domain as the primary key to which it refers, and each of its values either occurs as an existing value of that primary key or is NULL.
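
A minimal sketch of how a database enforces this constraint, using Python's built-in <code>sqlite3</code> module. The tables <code>Manufacturer</code> and <code>Car</code> and their columns are assumptions made for this example. SQLite only enforces foreign keys after <code>PRAGMA foreign_keys = ON</code>.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default

# Hypothetical schema: Car.Manufacturer refers to the primary key of Manufacturer.
conn.execute("CREATE TABLE Manufacturer (Name TEXT PRIMARY KEY NOT NULL)")
conn.execute("""
    CREATE TABLE Car (
        Model TEXT NOT NULL,
        Year INTEGER NOT NULL,
        Manufacturer TEXT REFERENCES Manufacturer(Name),
        PRIMARY KEY (Model, Year)
    )
""")

conn.execute("INSERT INTO Manufacturer VALUES ('Volvo')")
conn.execute("INSERT INTO Car VALUES ('240', 1985, 'Volvo')")  # existing value: accepted
conn.execute("INSERT INTO Car VALUES ('900', 1990, NULL)")     # NULL: accepted

try:
    # 'Toyota' does not occur as a primary key value in Manufacturer.
    conn.execute("INSERT INTO Car VALUES ('Corolla', 1990, 'Toyota')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)  # True
```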

## Normalization

If we aren't careful about the way we design our relations, we might experience anomalous behaviour during some operations.

An <em>insertion anomaly</em> can occur if an insertion can result in wrong data. Consider for example the relation Address(Street,Zip,City): if we aren't careful we might allow the insertion of two different cities for the same value of zip. This is an insertion anomaly.

A <em>deletion anomaly</em> may occur if the deletion of one type of data also removes other data that we wanted to keep. Consider again the above example: if we were to delete the only address with a particular zip code, we would lose all our knowledge of that zip code, even though we only wanted to delete the address.

An <em>update anomaly</em> may occur when we update data that is stored redundantly. Consider again the example of zip and city: if the name of a city were to change, we would need to find all rows with that zip and update the city name in each of them, which is error-prone.
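
One possible way to avoid all three anomalies is to store the zip-to-city fact only once. The sketch below assumes a hypothetical split into <code>ZipCode(Zip, City)</code> and <code>Address(Street, Zip)</code> with invented sample data, and uses Python's built-in <code>sqlite3</code> module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical decomposition of Address(Street, Zip, City).
conn.execute("CREATE TABLE ZipCode (Zip TEXT PRIMARY KEY NOT NULL, City TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE Address (
        Street TEXT NOT NULL,
        Zip TEXT NOT NULL REFERENCES ZipCode(Zip),
        PRIMARY KEY (Street, Zip)
    )
""")
conn.execute("INSERT INTO ZipCode VALUES ('2300', 'Copenhagen S')")
conn.execute("INSERT INTO Address VALUES ('Example Street 7', '2300')")

# Insertion: a second city for the same zip is rejected by the primary key.
try:
    conn.execute("INSERT INTO ZipCode VALUES ('2300', 'Aarhus')")
    second_city_accepted = True
except sqlite3.IntegrityError:
    second_city_accepted = False

# Deletion: removing the only address keeps our knowledge of the zip code.
conn.execute("DELETE FROM Address WHERE Zip = '2300'")
remaining = conn.execute("SELECT City FROM ZipCode WHERE Zip = '2300'").fetchall()

# Update: renaming the city touches exactly one row.
cursor = conn.execute("UPDATE ZipCode SET City = 'Copenhagen South' WHERE Zip = '2300'")
rows_updated = cursor.rowcount

print(second_city_accepted, remaining, rows_updated)
# False [('Copenhagen S',)] 1
```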

### Functional dependency

A functional dependency X -\> Y exists iff. the value of X uniquely determines the value of Y, that is, any two tuples that agree on X also agree on Y.
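
A short sketch of this definition as a check on one instance. The helper <code>holds_fd</code> and the sample addresses are assumptions for illustration; a dependency that holds in one instance is not necessarily a rule of the schema, but a single counterexample is enough to refute it.

```python
def holds_fd(relation, lhs, rhs):
    """True if any two rows that agree on lhs also agree on rhs."""
    seen = {}
    for row in relation:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first Y seen for each X and compare later rows to it.
        if seen.setdefault(x, y) != y:
            return False
    return True


# Hypothetical instance of Address(Street, Zip, City).
addresses = [
    {"Street": "Main St 1", "Zip": "2300", "City": "Copenhagen"},
    {"Street": "Side St 2", "Zip": "2300", "City": "Copenhagen"},
    {"Street": "Main St 1", "Zip": "8000", "City": "Aarhus"},
]

print(holds_fd(addresses, ["Zip"], ["City"]))    # True
print(holds_fd(addresses, ["Zip"], ["Street"]))  # False: zip 2300 has two streets
```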

### Full functional dependency

We say that a functional dependency X -\> Y is full iff. the removal of any attribute type from X invalidates the dependency. If it is not full we say that it is partial.
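
The difference between a full and a partial dependency can be sketched on a small instance. The relation <code>Enrollment(Student, Course, Grade, StudentName)</code>, its rows, and both helper functions are invented for this example.

```python
def holds_fd(relation, lhs, rhs):
    """True if any two rows that agree on lhs also agree on rhs."""
    seen = {}
    for row in relation:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False
    return True


def is_full_fd(relation, lhs, rhs):
    """True if lhs -> rhs holds and breaks when any one attribute is dropped from lhs."""
    if not holds_fd(relation, lhs, rhs):
        return False
    for dropped in lhs:
        smaller = [a for a in lhs if a != dropped]
        if holds_fd(relation, smaller, rhs):
            return False  # a proper subset already determines rhs: partial
    return True


# Hypothetical instance of Enrollment(Student, Course, Grade, StudentName).
enrollment = [
    {"Student": "s1", "Course": "DB", "Grade": 10, "StudentName": "Ann"},
    {"Student": "s1", "Course": "Algo", "Grade": 7, "StudentName": "Ann"},
    {"Student": "s2", "Course": "DB", "Grade": 12, "StudentName": "Bob"},
]

print(is_full_fd(enrollment, ["Student", "Course"], ["Grade"]))        # True
print(is_full_fd(enrollment, ["Student", "Course"], ["StudentName"]))  # False: Student alone suffices
```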

### Transitive functional dependency

We say that a functional dependency X -\> Y is transitive iff. there exists a set of attribute types Z such that Z is neither a key nor a subset of any key, and X -\> Z and Z -\> Y hold.

### Trivial functional dependency

For a functional dependency X -\> Y, we say that the dependency is trivial if Y is a subset of X.

### Multi-valued functional dependency

We say that there is a multi-valued functional dependency X --\> Y iff. each value of X determines exactly one set of Y values, independently of the other attribute types. Consider the relation <code>Course(Name,Instructor,Textbook)</code> and the assumptions that each course may have multiple instructors and multiple textbooks. Additionally, we initially say that the whole relation is the primary key. We insert some data and get the table below.

| Name | Instructor | Textbook |
| --- | --- | --- |
| Algorithms | Husfeldt | Sedgewick |
| Algorithms | Husfeldt | Toolbox |
| Algorithms | Jacob | Sedgewick |
| Algorithms | Jacob | Toolbox |

Notice that each value of Name exactly determines the set {Sedgewick, Toolbox} for values of Textbook, independently of the value of Instructor. Thus there is a multi-valued functional dependency from Name to Textbook.

Notice also that each value of Name exactly determines the set {Husfeldt, Jacob} for values of Instructor, independently of the value of Textbook. Thus there is also a multi-valued functional dependency from Name to Instructor.
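
The independence in the table above can be checked mechanically: for each value of Name, the (Textbook, Instructor) combinations must be exactly the cross product of the textbooks and the instructors. The helper <code>holds_mvd</code> is an assumption for this sketch and tests a single instance only.

```python
from itertools import product


def holds_mvd(relation, attribute_types, lhs, rhs):
    """True if lhs -->> rhs holds in this instance of the relation."""
    rest = [a for a in attribute_types if a not in lhs and a not in rhs]
    groups = {}
    for row in relation:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        z = tuple(row[a] for a in rest)
        groups.setdefault(x, set()).add((y, z))
    for pairs in groups.values():
        ys = {y for y, _ in pairs}
        zs = {z for _, z in pairs}
        # Independence: every Y value is combined with every remaining value.
        if pairs != set(product(ys, zs)):
            return False
    return True


attribute_types = ("Name", "Instructor", "Textbook")
course = [
    {"Name": "Algorithms", "Instructor": "Husfeldt", "Textbook": "Sedgewick"},
    {"Name": "Algorithms", "Instructor": "Husfeldt", "Textbook": "Toolbox"},
    {"Name": "Algorithms", "Instructor": "Jacob", "Textbook": "Sedgewick"},
    {"Name": "Algorithms", "Instructor": "Jacob", "Textbook": "Toolbox"},
]

print(holds_mvd(course, attribute_types, ["Name"], ["Textbook"]))       # True
print(holds_mvd(course[:3], attribute_types, ["Name"], ["Textbook"]))   # False: (Jacob, Toolbox) is missing
```

The second call shows the cost of this design: dropping a single row breaks the dependency, so every instructor and textbook combination has to be stored.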

### Prime attribute type

A prime attribute type is an attribute type that is part of a key.

### 1NF

We say that a relation R is in 1NF iff. there are no multi-valued or composite attribute types.

### 2NF 

We say that a relation R is in 2NF if it is in 1NF and every non-prime attribute type A in R is fully functionally dependent on every key of R.

### 3NF

We say that a relation R is in 3NF if it is in 2NF and no non-prime attribute type of R is transitively dependent on the primary key.

### BCNF

A relation R is in BCNF if it is in 3NF and for each of its nontrivial functional dependencies X -> Y, X is a superkey.
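
The superkey condition can be tested from a list of functional dependencies instead of from data. The sketch below uses attribute closure, a standard technique that is not introduced in these notes: X is a superkey exactly when the closure of X under the dependencies contains every attribute type. The function names and the example dependencies are assumptions, and only the listed dependencies are checked.

```python
def closure(attrs, fds):
    """All attribute types determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attribute_types, fds):
    """Listed nontrivial dependencies whose left-hand side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs)                        # nontrivial
        and closure(lhs, fds) != set(attribute_types)      # lhs is not a superkey
    ]


# Address(Street, Zip, City) with the dependency Zip -> City.
fds = [(("Zip",), ("City",))]
print(bcnf_violations(("Street", "Zip", "City"), fds))
# [(('Zip',), ('City',))]

# A hypothetical relation ZipCode(Zip, City) with the same dependency.
print(bcnf_violations(("Zip", "City"), fds))
# []
```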

### 4NF

A relation R is in 4NF if it is in BCNF and for each of its non-trivial multi-valued dependencies X --> Y, X is a superkey.
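
For the <code>Course</code> example, one way to reach 4NF is to split the relation along its multi-valued dependencies. The sketch below assumes two hypothetical projections, <code>course_instructor</code> and <code>course_textbook</code>, and checks that joining them on Name reproduces the original tuples, so no information is lost.

```python
# The Course(Name, Instructor, Textbook) instance from the table above.
course = {
    ("Algorithms", "Husfeldt", "Sedgewick"),
    ("Algorithms", "Husfeldt", "Toolbox"),
    ("Algorithms", "Jacob", "Sedgewick"),
    ("Algorithms", "Jacob", "Toolbox"),
}

# Project onto (Name, Instructor) and (Name, Textbook).
course_instructor = {(name, instructor) for name, instructor, _ in course}
course_textbook = {(name, textbook) for name, _, textbook in course}

# Natural join of the two projections on Name.
rejoined = {
    (name, instructor, textbook)
    for name, instructor in course_instructor
    for other_name, textbook in course_textbook
    if name == other_name
}

print(len(course_instructor), len(course_textbook))  # 2 2
print(rejoined == course)                            # True
```

Each projection has only two attribute types, so the multi-valued dependency from Name inside it is trivial, and adding an instructor no longer requires one row per textbook.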

## ER to Relational Mapping

| ER model | Relational model |
| --- | --- |
| Entity Type | Relation |
| Entity | Tuple |
| Attribute Type | Column name |
| Attribute | Cell |

### 1:1 or 0..1:1 relationships
Create a table for each entity type. In the case of 1:1, consider whether merging the two tables makes sense (normalization often constrains this). Otherwise, use unique constraints and foreign keys to connect the relations.
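
A minimal sketch of the unique constraint plus foreign key approach for a 0..1:1 relationship, using Python's built-in <code>sqlite3</code> module. The entity types <code>Person</code> and <code>Passport</code> are invented for this example: every passport belongs to exactly one person, and a person has at most one passport.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE Person (PersonId INTEGER PRIMARY KEY, Name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE Passport (
        PassportNo TEXT PRIMARY KEY NOT NULL,
        -- NOT NULL: every passport has an owner. UNIQUE: at most one passport per person.
        PersonId INTEGER NOT NULL UNIQUE REFERENCES Person(PersonId)
    )
""")

conn.execute("INSERT INTO Person VALUES (1, 'Ann')")
conn.execute("INSERT INTO Person VALUES (2, 'Bob')")  # Bob has no passport (the 0 in 0..1)
conn.execute("INSERT INTO Passport VALUES ('P100', 1)")

try:
    conn.execute("INSERT INTO Passport VALUES ('P200', 1)")  # second passport for Ann
    second_passport_accepted = True
except sqlite3.IntegrityError:
    second_passport_accepted = False

print(second_passport_accepted)  # False
```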

### M:N relationships
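Create a join table whose foreign keys refer to the primary keys of both entity types, and consider its key constraints carefully. Typically the combination of the two foreign keys forms the key of the join table.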
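
A sketch of such a join table with Python's built-in <code>sqlite3</code> module. The tables <code>Student</code>, <code>Course</code> and <code>Enrollment</code> and their data are assumptions for illustration. The composite primary key ensures that the same pair cannot be recorded twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE Student (StudentId INTEGER PRIMARY KEY, Name TEXT NOT NULL)")
conn.execute("CREATE TABLE Course (CourseId INTEGER PRIMARY KEY, Title TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE Enrollment (
        StudentId INTEGER NOT NULL REFERENCES Student(StudentId),
        CourseId INTEGER NOT NULL REFERENCES Course(CourseId),
        PRIMARY KEY (StudentId, CourseId)
    )
""")

conn.executemany("INSERT INTO Student VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
conn.executemany("INSERT INTO Course VALUES (?, ?)", [(10, "Databases"), (20, "Algorithms")])
conn.executemany("INSERT INTO Enrollment VALUES (?, ?)", [(1, 10), (1, 20), (2, 10)])

try:
    conn.execute("INSERT INTO Enrollment VALUES (1, 10)")  # Ann is already enrolled
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False

# Students per course, resolved through the join table.
per_course = conn.execute("""
    SELECT Course.Title, COUNT(*)
    FROM Enrollment JOIN Course ON Course.CourseId = Enrollment.CourseId
    GROUP BY Course.Title
    ORDER BY Course.Title
""").fetchall()

print(duplicate_accepted)  # False
print(per_course)          # [('Algorithms', 1), ('Databases', 2)]
```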
