# Relational Model

- DB consists of several **tables** (i.e. **relations**)

```
Example:

Customer:
CustID, Name, Street, City, State

Account:
AccountNum, Balance

Depositor:
CustID, AccountNum
```

- Columns in the table are named by **attributes**

- Each attribute has an associated **domain**\
    (set of allowed values, e.g. Customer.State: {CA, NY, WA, ...})

- Data in a table consists of a set of **rows** (**tuples**)\
    (providing values for the attributes)

### Relational Model Example

![](images/2022-10-06-22-23-14.png)

### Relational Schema

> "think of it as a type declaration - i.e. just the type of the data"

Useful to ENFORCE data structure to keep database consistently structured and "clean"

Consists of:
- Relation name

- Set of attributes

- Domain of each attribute

- Integrity constraints

`e.g. CUSTOMER(CustID, Name, Street, City)`
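As a rough illustration of the "type declaration" idea, a schema can be sketched in Python as a relation name plus a mapping from attribute names to domains. The `RelationSchema` class, the `accepts` method, and the domains chosen for CUSTOMER are all assumptions made for this sketch (the notes do not give CUSTOMER's domains), and integrity constraints are left out.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass
class RelationSchema:
    """Hypothetical container for a relation name plus attribute domains."""
    name: str
    # attribute name -> domain, written as a membership test on values
    domains: Dict[str, Callable[[Any], bool]]

    @property
    def attributes(self) -> Tuple[str, ...]:
        # dicts preserve insertion order, so this is the declared attribute order
        return tuple(self.domains)

    def accepts(self, row: Tuple[Any, ...]) -> bool:
        """True if row has one value per attribute and each lies in its domain."""
        if len(row) != len(self.domains):
            return False
        return all(in_domain(v) for v, in_domain in zip(row, self.domains.values()))


def is_text(v: Any) -> bool:
    return isinstance(v, str)


# Assumed domains: an integer ID and text for everything else.
CUSTOMER = RelationSchema(
    "CUSTOMER",
    {"CustID": lambda v: isinstance(v, int), "Name": is_text,
     "Street": is_text, "City": is_text},
)
```

With this sketch, `CUSTOMER.accepts((4, "Fred Flintstone", "First Av", "SD"))` holds, while a row with a string ID or a missing value is rejected.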

### Attribute Types

Each attribute of a relation has a
- Name
- **Domain**: Set of allowed values

Attribute values are (normally) required to be **atomic**, that is, indivisible (i.e. PRIMITIVE)

- Sometimes, the special value **null** is considered a member of EVERY domain

### Relational Instance

> "current content of the relation"

Consists of 
- set of rows (tuples)
- over the attributes
- holding values from their domains

![](images/2022-10-06-22-27-48.png)

### Relations are Unordered

Tuples are not considered to be ordered, even though they APPEAR SO when displayed in tabular form

> "underlying system has no guarantee. The output is non-deterministic"
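One way to see the "no order" property is to model a relation instance as a Python `set` of tuples. This is only a sketch with made-up rows; a real DBMS stores data differently, but the set view matches the model: listing order is irrelevant, and any ordering has to be requested explicitly.

```python
# Two listings of the same (made-up) rows, written in different orders.
r1 = {(1, "Ann", "CA"), (2, "Bob", "NY")}
r2 = {(2, "Bob", "NY"), (1, "Ann", "CA")}

# As sets of tuples they are the same relation instance.
same_relation = (r1 == r2)

# Adding a tuple that is already present changes nothing.
r1_with_dup = r1 | {(1, "Ann", "CA")}

# An order exists only when we impose one, here by the first attribute.
ordered = sorted(r1, key=lambda t: t[0])
```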


### Tuples: Some Notation

- **Component values/coordinates** of a tuple $t$: $t(A_i)$\
    Value of attribute $A_i$ for tuple $t$

- **Subtuple** of a tuple t: $t(A_i, A_j, ..., A_k)$\
    The subtuple of $t$ containing the values of attributes $A_i, A_j, ..., A_k$

> Attribute / tuple values are generally assumed to be ordered according to the schema. Think of the schema as rigid and fixed, while the data that fills that mold is fluid, dynamic, and unpredictable.

![](images/2022-10-06-22-46-01.png)

```
t = <4, "Fred Flintstone", "First Av", "SD">

t(Name) = "Fred Flintstone"

t(Street) = "First Av"
```
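The notation above can be mimicked with two small helpers. This is a sketch: the names `component` and `subtuple` are made up here, and the attribute list is assumed to be that of the CUSTOMER schema from earlier (CustID, Name, Street, City), since the tuple is positional and needs a schema to give its values names.

```python
# Assumed attribute order, taken from CUSTOMER(CustID, Name, Street, City).
SCHEMA = ("CustID", "Name", "Street", "City")
t = (4, "Fred Flintstone", "First Av", "SD")


def component(schema, tup, attr):
    """t(A): the value of attribute `attr` in tuple `tup`."""
    return tup[schema.index(attr)]


def subtuple(schema, tup, *attrs):
    """t(A_i, ..., A_k): the values of the listed attributes, in the order given."""
    return tuple(component(schema, tup, a) for a in attrs)


name = component(SCHEMA, t, "Name")
name_and_street = subtuple(SCHEMA, t, "Name", "Street")
```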

## Database

Consists of multiple relations

Information about an application is broken up into parts\
Each relation stores one part of the information

e.g. separation of bank app info into relations for:
- Account
- Depositor
- Customer

Q: Why not just a single relation?

A: possible, but not desirable.

Issues
- repetition of info
- need for null values
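Both issues show up quickly in a small sketch. The rows below are made up, and `None` stands in for null. In the single wide relation, a customer with two accounts has their details repeated, and a customer with no account forces nulls. Splitting into Customer, Account, and Depositor removes both.

```python
# Hypothetical single relation: (CustID, Name, City, AccountNum, Balance)
bank = [
    (1, "Ann", "SD", "A-101", 500),
    (1, "Ann", "SD", "A-102", 900),  # Ann's name and city are repeated
    (2, "Bob", "LA", None, None),    # Bob has no account, so nulls are needed
]

# The same information, broken up into three relations.
customer = {(cid, name, city) for cid, name, city, _, _ in bank}
account = {(acc, bal) for _, _, _, acc, bal in bank if acc is not None}
depositor = {(cid, acc) for cid, _, _, acc, _ in bank if acc is not None}
```

After the split, Ann's details are stored once, and Bob simply has no Depositor row instead of a row padded with nulls.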

Evaluating whether a schema is "well designed" or "not well designed", and working out what can be done to turn a poorly designed schema into a good one, takes more detail than is covered here.

There is actually a mature (i.e. established) theory behind all of this, including:
- formal definitions of good/bad design
- algorithms to transform bad designs --> good designs

> "Extremely useful skillset. Also happens to be lucrative. Usually called in as a DB consultant to clean house on someone else's spaghetti code"

## Relational Integrity Constraints 

> "First class citizens!"
 
- **Constraints**: conditions that MUST HOLD on all valid relation instances of a database

Common types
- Key constraints
- Entity integrity constraints
- Referential integrity constraints

### Key Constraints

- **Superkey** of a relation R: A set of attributes $SK \subseteq R$ s.t. NO TWO distinct tuples in ANY VALID RELATION INSTANCE r(R) will have the same value for SK (i.e. $\forall t_1, t_2 \in r(R),\ t_1 \neq t_2 \Rightarrow t_1(SK) \neq t_2(SK)$), so SK uniquely identifies the tuple

- **Key** of a relation R: A "minimal" superkey; that is, a superkey K s.t. removal of ANY attribute from K results in a set of attributes that is NOT a superkey (i.e. no longer uniquely identifies the tuple)

- If a relation has SEVERAL **candidate keys**, one is chosen arbitrarily to be the **primary key** (typically underlined)

![](images/2022-10-07-00-34-55.png)
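The definitions can be tried out on a concrete instance. One caveat for this sketch: a superkey is a property of the schema (it must hold in EVERY valid instance), so a single instance can only refute a candidate, never prove it. The helper names and the rows are made up for illustration.

```python
from itertools import combinations

SCHEMA = ("CustID", "Name", "City")
# Made-up instance, given as a list of distinct tuples.
rows = [(1, "Ann", "SD"), (2, "Ann", "LA"), (3, "Bob", "SD")]


def is_superkey_in(instance, schema, attrs):
    """True if no two tuples of this instance agree on all of `attrs`."""
    idx = [schema.index(a) for a in attrs]
    projected = [tuple(row[i] for i in idx) for row in instance]
    return len(set(projected)) == len(projected)


def minimal_keys_in(instance, schema):
    """Attribute sets that identify tuples in this instance and have no smaller such subset."""
    keys = []
    for size in range(1, len(schema) + 1):
        for attrs in combinations(schema, size):
            # Skip supersets of a key already found: they are not minimal.
            if any(set(k) <= set(attrs) for k in keys):
                continue
            if is_superkey_in(instance, schema, attrs):
                keys.append(attrs)
    return keys


keys = minimal_keys_in(rows, SCHEMA)
```

Here `keys` contains `("CustID",)` and `("Name", "City")`. The second one is a warning sign rather than a result: it is unique in these three rows only by accident, and a second "Ann" in "SD" would break it, which is why keys are declared on the schema and not inferred from data.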

#### A more realistic example of keys

Schema:

![](images/2022-10-07-00-35-25.png)

Instantiation:

![](images/2022-10-07-00-37-26.png)

## Entity Integrity

**Primary key attributes** (PK) of each relation schema R in a database schema S CANNOT have null values in ANY tuple, since PK values are used to identify individual tuples

$t(A) \neq \text{null}$ for any tuple $t$ in a valid instance of R, where $A \in PK$


> NOTE: other attributes of R may be similarly constrained to disallow null values even if they're not members of the primary key
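A check for this rule is short enough to sketch directly. The function name and the rows are hypothetical, `None` stands in for null, and the primary key of Depositor is assumed to be both of its attributes.

```python
DEPOSITOR = ("CustID", "AccountNum")
DEPOSITOR_PK = ("CustID", "AccountNum")  # assumed primary key

# Made-up rows; the last two each have a null in a primary key attribute.
rows = [(1, "A-101"), (None, "A-102"), (2, None)]


def entity_integrity_violations(instance, schema, primary_key):
    """Return the tuples that have null (None) in any primary key attribute."""
    idx = [schema.index(a) for a in primary_key]
    return [row for row in instance if any(row[i] is None for i in idx)]


bad_rows = entity_integrity_violations(rows, DEPOSITOR, DEPOSITOR_PK)
```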

## Referential Integrity

- involves 2 relations of the database (as opposed to the single relation constraints prior)

- used to specify a **relationship** among tuples in 2 relations:
  - **referencing relation**
  - **referenced relation**

- Tuples in the **referencing relation** $R_1$ have attributes FK (**foreign key** attributes)

- FK reference the attributes PK (**primary key** attributes) of the **referenced relation** $R_2$

A tuple $t_1 \in R_1$ is said to **reference** a tuple $t_2 \in R_2$ if $t_1(FK) = t_2(PK)$

- referential integrity constraint can be displayed in a relational db schema as a directed arc from $R_1.FK$ to $R_2.PK$

#### Example

![](images/2022-10-07-00-44-58.png)

### Statement of the constraint

The value in the foreign key column(s) FK of the **referencing relation** $R_1$ can be either
1. a value of a primary key PK in the **referenced relation** $R_2$ 
2. OR null
    - in this case, the FK in $R_1$ should NOT intersect its own primary key (or else, ENTITY integrity is violated)
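Both allowed cases, plus a rejected one, can be seen with Python's built-in `sqlite3` module. This is a sketch under assumptions: the table and column names are made up, and `owner_id` is a hypothetical foreign key that is not part of the referencing table's primary key, so null is permitted. SQLite does not enforce foreign keys unless `PRAGMA foreign_keys = ON` is set on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite

# Referenced relation R2
conn.execute("CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
# Referencing relation R1, with FK owner_id -> customer.cust_id
conn.execute(
    "CREATE TABLE account ("
    " account_num TEXT PRIMARY KEY,"
    " owner_id INTEGER REFERENCES customer(cust_id))"
)

conn.execute("INSERT INTO customer VALUES (1, 'Ann')")
conn.execute("INSERT INTO account VALUES ('A-101', 1)")     # case 1: existing PK value
conn.execute("INSERT INTO account VALUES ('A-102', NULL)")  # case 2: null FK

try:
    # No customer has cust_id 99, so this would be a dangling reference.
    conn.execute("INSERT INTO account VALUES ('A-103', 99)")
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True

account_count = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
```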

## Other types of constraints

These are more sophisticated and depend on the application at hand

- **Semantic Integrity Constraints**: based on application semantics and cannot be expressed by the model per se

#### Example

"the max no of hours per employee for all projects they work on is 56hrs/week"

- a constraint specification language may have to be used to express these

- SQL-99 allows triggers and ASSERTIONS to support some of these
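Because the model itself cannot state a rule like the 56-hour limit, it usually ends up as a check over the data. A minimal sketch in application code, assuming a hypothetical WorksOn relation of (employee, project, hours per week) with made-up rows:

```python
from collections import defaultdict

MAX_WEEKLY_HOURS = 56  # the limit from the example above

# Hypothetical WorksOn instance: (EmpID, ProjectID, HoursPerWeek)
works_on = [("e1", "p1", 30), ("e1", "p2", 30), ("e2", "p1", 40)]


def over_limit(instance, limit=MAX_WEEKLY_HOURS):
    """Return {employee: total hours} for employees whose total exceeds the limit."""
    totals = defaultdict(int)
    for emp, _project, hours in instance:
        totals[emp] += hours
    return {emp: total for emp, total in totals.items() if total > limit}


violations = over_limit(works_on)
```

In a DBMS, the same check would more naturally live in a trigger or assertion so that it is enforced on every update instead of relying on each application to run it.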