# Tables

The only data structure in relational databases is a table. All data are organized as tables. Tables cannot be nested. Tables have named columns. Each column has a datatype.

## Schema
A schema is a set of logically related tables together with their definitions, including data integrity constraints.

## "Scholarly" Terminology 
* Table = relation
* Column = attribute
* Datatype = domain
* Row = tuple
* Field = attribute value

## Relations
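The relational model builds on set theory, which dates back to the 19th century.
In that theory, a *relation* is a subset of the Cartesian product of several sets.
Operations on relations, such as union, intersection, and projection, produce new relations, which is what makes them useful for expressing queries.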
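
As a minimal sketch of this definition, the snippet below builds the Cartesian product of two small sets and picks a subset of it as a relation. The set contents and the names `students`, `courses`, and `enrolled` are made up for illustration.

```python
from itertools import product

# Two small sets (domains); the values are hypothetical.
students = {"Alice", "Bob"}
courses = {"Math", "Physics"}

# The Cartesian product contains every possible (student, course) pair.
all_pairs = set(product(students, courses))

# A relation is any subset of that product: here, who is enrolled in what.
enrolled = {("Alice", "Math"), ("Alice", "Physics"), ("Bob", "Math")}
assert enrolled <= all_pairs  # every tuple is drawn from the product

# A simple operation on the relation: project onto the student column.
enrolled_students = {student for student, _ in enrolled}

print(len(all_pairs), len(enrolled), sorted(enrolled_students))
```

Each element of `enrolled` plays the role of a row (tuple), and each position in the pair plays the role of a column (attribute).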

## First normal form
https://en.wikipedia.org/wiki/First_normal_form

* All data are in relational tables
* No repeated columns (for example, `phone1`, `phone2`, `phone3`)
* No value in the table can contain another table (tables are not nested)
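
The rules above can be illustrated with a small sketch in plain Python. The records and the names `nested`, `person`, and `phone` are hypothetical; the point is only that a nested value is moved into its own flat table, linked back by an identifier.

```python
# A nested record: the "phones" value is itself a small table,
# which violates first normal form.
nested = [
    {"person_id": 1, "name": "Alice", "phones": ["555-0100", "555-0101"]},
    {"person_id": 2, "name": "Bob", "phones": ["555-0199"]},
]

# Flatten into two tables that hold only atomic values.
# Each tuple is a row; person_id links a phone back to its person.
person = [(r["person_id"], r["name"]) for r in nested]
phone = [(r["person_id"], number) for r in nested for number in r["phones"]]

print(person)
print(phone)
```

After flattening, adding a third phone number for a person means adding a row to `phone`, not adding a column or growing a nested value.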




## Second and Third Normal Form

* https://en.wikipedia.org/wiki/Second_normal_form
* https://en.wikipedia.org/wiki/Third_normal_form
* All secondary (non-key) attributes describe the entity itself, not some other entity (third normal form), and the whole entity, not just part of its key (second normal form)
* "All attributes describe the key, the whole key, and nothing but the key"
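
A hedged sketch of what this means in practice, using plain Python tuples as rows. The table and its contents are invented for illustration: the key is `(student_id, course_id)`, but `course_title` depends only on `course_id`, so it does not describe the whole key.

```python
# Rows: (student_id, course_id, course_title, grade).
# Key: (student_id, course_id). The title depends on course_id alone,
# so this hypothetical table violates second normal form.
enrollment_wide = [
    (1, "CS101", "Intro to Databases", "A"),
    (2, "CS101", "Intro to Databases", "B"),
    (1, "MA201", "Linear Algebra", "B"),
]

# Decompose: the title moves to a table keyed by course_id alone.
course = sorted({(course_id, title) for _, course_id, title, _ in enrollment_wide})

# The grade describes the whole key, so it stays with the enrollment.
enrollment = [(student, course_id, grade)
              for student, course_id, _, grade in enrollment_wide]

print(course)
print(enrollment)
```

In the decomposed form each course title is stored once, so renaming a course is a single-row change.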

## Data Types
https://dev.mysql.com/doc/refman/8.0/en/data-types.html

* `int [unsigned]`, `smallint [unsigned]`, `tinyint`, `bigint`
* `char(n)`, `varchar(n)`
* `decimal(m, n)` (same as `numeric`)
* `enum`
* `float`, `double` (do not use in primary keys, because floating-point values cannot be compared reliably for equality)
* `date`
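
The caution about `float` and `double` in primary keys can be seen in a short Python sketch. Python's `float` is a binary double, and `decimal.Decimal` is used here as a stand-in for an exact `decimal(m, n)` column; the dictionary lookup is only an analogy for a key lookup.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1, 0.2, or 0.3 exactly,
# so a computed value may not equal the value that was stored.
float_match = (0.1 + 0.2 == 0.3)

# Exact decimal arithmetic does not have this problem.
decimal_match = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))

# Analogy to a key lookup: the row stored under 0.3 is not found.
rows_by_key = {0.3: "stored row"}
found = (0.1 + 0.2) in rows_by_key

print(float_match, decimal_match, found)
```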


## Entity Integrity


Entity integrity is a set of database design principles and practices that ensure a 1:1 correspondence between real-world entities and their digital representations in the database. Entity integrity may require a complex set of enterprise rules.

1. Each table represents a well-defined entity type from the real world. The table name reflects the entity class represented by each row.
2. For each entity type, enforce a 1:1 correspondence between real-world entities and their representations in the table. How can you do it?
3. In the real world, we need to permanently associate a persistent identifier with each entity of the class. The database cannot do it by itself.
4. The database can use the permanent identifier to enforce uniqueness in the table using a uniqueness constraint.
5. A **primary key** is a unique, non-nullable index that is designated as the primary way to identify entities in a table. Each table must have a carefully chosen primary key.
6. Secondary unique indexes can be nullable.
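
Points 4 to 6 can be sketched with Python's built-in `sqlite3` module, used here only as a self-contained stand-in for a full database server. The `person` table and its columns are hypothetical. The primary key is declared `NOT NULL` explicitly, because SQLite does not always imply it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("""
    CREATE TABLE person (
        person_id   INTEGER NOT NULL PRIMARY KEY,  -- unique and non-nullable
        name        TEXT NOT NULL,
        passport_no TEXT UNIQUE                    -- secondary unique index, nullable
    )
""")
conn.execute("INSERT INTO person VALUES (1, 'Alice', 'X123')")
conn.execute("INSERT INTO person VALUES (2, 'Bob', NULL)")
conn.execute("INSERT INTO person VALUES (3, 'Carol', NULL)")  # a second NULL is allowed

# Reusing an existing primary key value violates the uniqueness constraint.
try:
    conn.execute("INSERT INTO person VALUES (1, 'Alice again', NULL)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

n_rows = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
print(duplicate_rejected, n_rows)
```

The database enforces uniqueness of `person_id`, but assigning the right identifier to the right real-world person remains a process outside the database.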





In [1]:
import datajoint as dj

In [3]:
# Connect to the schema named 'test' on the database server, creating it if needed
schema = dj.Schema('test')

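In [4]:
# Each car is identified by its 17-character vehicle identification number (VIN).
# Attributes above the `---` divider form the primary key.
@schema
class Car(dj.Manual):
    definition = """
    vin : char(17)
    ---
    make : varchar(16)
    year : year
    """

In [5]:
# A classroom is identified by the combination of building code and room number
# (a composite primary key).
@schema
class Classroom(dj.Manual):
    definition = """
    building_code : char(3)
    room_number : smallint unsigned
    ---
    capacity : smallint unsigned
    """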

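
In a DataJoint definition, the attributes above `---` make up the primary key, so `Classroom` has a composite key of `building_code` and `room_number`. The sketch below mimics that constraint with Python's built-in `sqlite3` module so that it runs without a database server. It is an approximation rather than what DataJoint generates: SQLite has no unsigned integer types, and the sample rows are invented.

```python
import sqlite3

db = sqlite3.connect(":memory:")  # throwaway in-memory database
db.execute("""
    CREATE TABLE classroom (
        building_code CHAR(3) NOT NULL,
        room_number   INTEGER NOT NULL,
        capacity      INTEGER NOT NULL,
        PRIMARY KEY (building_code, room_number)  -- composite primary key
    )
""")
db.execute("INSERT INTO classroom VALUES ('MEB', 101, 40)")
# The same room number in a different building is a different entity.
db.execute("INSERT INTO classroom VALUES ('WEB', 101, 120)")

# The same (building, room) pair a second time is rejected.
try:
    db.execute("INSERT INTO classroom VALUES ('MEB', 101, 35)")
    duplicate_room_rejected = False
except sqlite3.IntegrityError:
    duplicate_room_rejected = True

rooms = db.execute(
    "SELECT building_code, room_number, capacity FROM classroom "
    "ORDER BY building_code, room_number"
).fetchall()
print(duplicate_room_rejected, rooms)
```

Neither attribute is unique on its own; only the pair identifies a classroom.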