# Intro

- **relations**: used to represent objects and relationships between objects
- relations are tables with rows and columns
    - rows may represent entities and relationships
    - columns represent attributes
- a relation is a *mathematical object*, and a table is its *physical embodiment*

# Attribute Types

- the *set of allowed values for each attribute* is called the **domain** of the attribute
- attribute values are usually required to be **atomic**, which means they are indivisible
- the special value **null** is a member of every domain, and is used to represent *missing or unknown data*
    - use null values judiciously because they lead to a number of complications
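
To make the "complications" concrete, here is a minimal sketch using Python's built-in `sqlite3` module. SQL and SQLite are not part of these notes; they are assumed here only as a convenient way to observe null behavior, and the table rows are hypothetical.

```python
import sqlite3

# In-memory database with a hypothetical two-row instructor table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (ID TEXT, name TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?)",
    [("1", "Ann", 90000.0), ("2", "Bob", None)],  # Bob's salary is null
)

# Comparing null with null yields null (unknown), not true.
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

# Bob satisfies neither a condition nor its negation.
high = conn.execute(
    "SELECT name FROM instructor WHERE salary > 50000"
).fetchall()
not_high = conn.execute(
    "SELECT name FROM instructor WHERE NOT (salary > 50000)"
).fetchall()

# Aggregates skip nulls, so row counts and value counts can disagree.
row_count, salary_count, avg_salary = conn.execute(
    "SELECT COUNT(*), COUNT(salary), AVG(salary) FROM instructor"
).fetchone()

print(null_equals_null)   # None
print(high, not_high)     # [('Ann',)] []
print(row_count, salary_count, avg_salary)  # 2 1 90000.0
```

The row with the null salary silently disappears from both filtered results and from the average, which is the kind of surprise that motivates using nulls sparingly.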

# Relation Schema and Instance

- Let $A_1, A_2, ..., A_n$ denote attributes
- Let $D_1, D_2, ..., D_n$ denote domains
- $R(A_1, A_2, ..., A_n)$ denotes a **relation schema** over these attributes
    - e.g., `instructor(ID, name, dept_name, salary)`
    - the order of elements in the tuple does not matter
    - think of it as a class in OOP
- a **relation** `r` conforming to schema `R`, denoted as `r(R)`, is a subset of $D_1 \times D_2 \times ... \times D_n$
    - thus, a relation is a set of n-tuples $(a_1, a_2, ..., a_n)$ where each $a_i \in D_i$
    - think of its tuples as objects (instances of the class) in OOP
- an element `t` of `r` is a **tuple** (n-tuple), and corresponds to a row in a table
- a **relation instance** refers to the concrete values of a relation
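
The definition of a relation as a subset of $D_1 \times D_2 \times ... \times D_n$ can be sketched directly with Python sets. The domains and tuples below are hypothetical and deliberately tiny; real domains are usually far too large to enumerate.

```python
from itertools import product

# Hypothetical, tiny domains D_1, D_2, D_3.
D_ID = {"10101", "12121"}
D_name = {"Srinivasan", "Wu"}
D_dept_name = {"Comp. Sci.", "Finance"}

schema = ("ID", "name", "dept_name")      # relation schema R
domains = (D_ID, D_name, D_dept_name)

# A relation instance r(R): a set of 3-tuples, one per table row.
r = {
    ("10101", "Srinivasan", "Comp. Sci."),
    ("12121", "Wu", "Finance"),
}

# The Cartesian product holds every tuple the domains allow.
cartesian = set(product(*domains))

print(len(cartesian))   # 8
print(r <= cartesian)   # True: r is a subset of D_1 x D_2 x D_3

# Because a relation is a set, adding an existing tuple changes nothing.
r_again = r | {("10101", "Srinivasan", "Comp. Sci.")}
print(r_again == r)     # True
```

Here `schema` plays the role of the class and each tuple in `r` plays the role of an object, matching the OOP analogy above.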

# Database

- a database typically comprises many relations
- normalization theory deals with how to design good relational schemas that satisfy a very precise notion of "goodness"

# Keys

- Let $R$ be a relation schema and let $K \subseteq R$ (K is a subset of R's attributes)
- **Superkey** and **Candidate key**
    - $K$ is a **superkey** of $R$ if values for $K$ are sufficient to uniquely identify a tuple in every possible relation $r(R)$
        - e.g., `{ID}` and `{ID, name}` are both superkeys of instructor
    - Superkey $K$ is a **candidate key** if $K$ is minimal, i.e., no proper subset of $K$ is also a superkey
        - e.g., `{ID}` is a candidate key for instructor
    - one of the candidate keys is selected to be the **primary key**
- **Foreign key** constraint: an attribute value in one relation must appear in another relation
    - **referencing relation** contains a **foreign key**
    - **referenced relation** contains a **referenced key** (usually the primary key)
    - ![Relation Schema Diagram](attachment:Snip20190909_12.png)
    - the figure above is a relation schema diagram, which is different from an entity-relationship diagram
        - each box is a relation
        - primary key is underlined, primary key can contain multiple attributes
    - e.g., attribute *ID* in *takes* (referencing relation) is a *foreign key* that references *ID* in *student* (referenced relation)
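
The key definitions above can be checked against a concrete instance with a few small helpers. This is a sketch under stated assumptions: the function names and sample rows are hypothetical. A superkey must hold for every possible relation $r(R)$, so a check on one instance can refute a key but never prove it.

```python
def project(t, schema, attrs):
    """Return the values of tuple t for the given attribute names."""
    return tuple(t[schema.index(a)] for a in attrs)

def is_superkey_in(relation, schema, K):
    """True if no two tuples of this instance agree on all attributes in K."""
    attrs = sorted(K)
    seen = set()
    for t in relation:
        key = project(t, schema, attrs)
        if key in seen:
            return False
        seen.add(key)
    return True

def is_candidate_key_in(relation, schema, K):
    """True if K is a superkey here and dropping any one attribute breaks it."""
    K = set(K)
    if not is_superkey_in(relation, schema, K):
        return False
    return all(not is_superkey_in(relation, schema, K - {a}) for a in K)

def satisfies_foreign_key(referencing, ref_schema, fk,
                          referenced, refd_schema, key):
    """True if every foreign-key value appears in the referenced relation."""
    allowed = {project(t, refd_schema, key) for t in referenced}
    return all(project(t, ref_schema, fk) in allowed for t in referencing)

# Hypothetical sample data.
instructor_schema = ("ID", "name", "dept_name", "salary")
instructor = {
    ("10101", "Srinivasan", "Comp. Sci.", 65000),
    ("12121", "Wu", "Finance", 90000),
    ("45565", "Katz", "Comp. Sci.", 75000),
}
student_schema = ("ID", "name")
student = {("00128", "Zhang"), ("12345", "Shankar")}
takes_schema = ("ID", "course_id")
takes = {("00128", "CS-101"), ("12345", "CS-190")}

print(is_superkey_in(instructor, instructor_schema, {"ID"}))               # True
print(is_superkey_in(instructor, instructor_schema, {"ID", "name"}))       # True
print(is_superkey_in(instructor, instructor_schema, {"dept_name"}))        # False
print(is_candidate_key_in(instructor, instructor_schema, {"ID"}))          # True
print(is_candidate_key_in(instructor, instructor_schema, {"ID", "name"}))  # False

print(satisfies_foreign_key(takes, takes_schema, ("ID",),
                            student, student_schema, ("ID",)))             # True
bad_takes = takes | {("99999", "CS-101")}  # no such student
print(satisfies_foreign_key(bad_takes, takes_schema, ("ID",),
                            student, student_schema, ("ID",)))             # False
```

Note that `{"name"}` would also pass `is_superkey_in` on this sample because the three names happen to differ. It is still not a superkey of the schema, since two instructors could share a name in another instance.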
