# NOTE: UNDER SERIOUS CONSTRUCTION

## Background

### What is a database?

#### What is a relational database?

 A relational database is a database that organizes information into one or more tables. 

#### What is a table? 

A table is a collection of data organized into rows and columns. Tables are sometimes referred to as relations.


### What is a clause?

Clauses perform specific tasks in SQL. By convention, clauses are written in capital letters. Clauses can also be referred to as commands.

### What is a schema?


## SQL


### What is SQL? 


### Building a Database

The structure of SQL statements varies. The number of lines used does not matter. A statement can be written all on one line, or split up across multiple lines if that makes it easier to read. In this course, you will become familiar with the structure of common statements.

``` SQL
CREATE TABLE Members (
    id INTEGER, 
    name VARCHAR, 
    age INTEGER, 
    PRIMARY KEY (id, name)
);
```

``` SQL
ALTER TABLE Members 
ADD grad_year INTEGER;
```


``` SQL
DROP TABLE Members;
```

### Inserting

`INSERT INTO` is a clause that adds the specified row or rows. 

``` SQL
INSERT INTO Members (name, age, grad_year) VALUES ('Lesley Cordero', 23, 2018);
```

``` SQL
INSERT INTO Members (name, age, grad_year) VALUES ('Tabara Nosiba', 19, 2020);
```

### Selecting



``` SQL
SELECT * FROM Members;
```



``` SQL
SELECT * FROM Members 
WHERE grad_year = 2018;
```


``` SQL
UPDATE Members 
SET age = 20
WHERE name = 'Tabara Nosiba';
```

``` SQL
DELETE FROM Members 
WHERE grad_year = 2018;
```



``` SQL
-- Invalid: aggregates such as max() are not allowed directly in a WHERE clause.
SELECT name FROM Members
WHERE grad_year = max(grad_year);
```

``` SQL
SELECT name FROM Members
WHERE grad_year = (SELECT max(grad_year) FROM Members);
```

``` SQL
CREATE TABLE Is_A_Student_At (
    id INTEGER,
    name VARCHAR,
    school VARCHAR,
    major VARCHAR,
    grad_year INTEGER,
    FOREIGN KEY (id, name) REFERENCES Members (id, name),
    PRIMARY KEY(id, school));
```

``` SQL 
INSERT INTO Is_A_Student_At (id, name, school, major, grad_year) VALUES (1, 'Lesley Cordero', 'SEAS', 'Computer Science', 2018);
```

``` SQL
INSERT INTO Is_A_Student_At (id, name, school, major, grad_year) VALUES (2, 'Tabara Nosiba', 'BC', 'Computer Science', 2020);
```

``` SQL
SELECT M.name 
FROM Members M
WHERE 
```

`SELECT` statements are used to fetch data from a database. `SELECT` is a clause that indicates that the statement is a query. You will use `SELECT` every time you query data from a database. `SELECT` statements always return a new table called the result set. We can also rename the columns we want to show with `AS`.

`*` is a special wildcard character that we have been using. It allows you to select every column in a table without having to name each one individually. In `SELECT * FROM Members;`, the result set contains every column in the `Members` table.
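As a rough illustration of the result set and of renaming with `AS`, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The rows mirror the `Members` inserts above; the alias `member_name` is made up for the example.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Members (id INTEGER, name VARCHAR, age INTEGER, grad_year INTEGER)"
)
conn.executemany(
    "INSERT INTO Members (id, name, age, grad_year) VALUES (?, ?, ?, ?)",
    [(1, "Lesley Cordero", 23, 2018), (2, "Tabara Nosiba", 19, 2020)],
)

# `*` returns every column of the table.
cursor = conn.execute("SELECT * FROM Members")
all_columns = [col[0] for col in cursor.description]

# `AS` renames a column in the result set only; the table itself is unchanged.
cursor = conn.execute(
    "SELECT name AS member_name FROM Members WHERE grad_year = 2018"
)
renamed_columns = [col[0] for col in cursor.description]
rows = cursor.fetchall()

print(all_columns)
print(renamed_columns, rows)
```

`cursor.description` lists the column names of the result set, which is a convenient way to see what `*` and `AS` produce.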


### Updating

The `UPDATE` statement edits one or more rows in a table. You can use the `UPDATE` statement when you want to change existing records. `SET` is a clause that indicates the column to edit. `WHERE` is a clause that indicates which row(s) to update with the new column value.
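A minimal sketch of how `SET` and `WHERE` work together, run through Python's `sqlite3` module against an in-memory copy of two `Members` columns (the reduced table layout is an assumption made to keep the example short):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Members (name VARCHAR, age INTEGER)")
conn.executemany(
    "INSERT INTO Members (name, age) VALUES (?, ?)",
    [("Lesley Cordero", 23), ("Tabara Nosiba", 19)],
)

# SET names the column to change; WHERE limits which rows are touched.
cursor = conn.execute("UPDATE Members SET age = 20 WHERE name = 'Tabara Nosiba'")
updated = cursor.rowcount  # number of rows the UPDATE changed

ages = dict(conn.execute("SELECT name, age FROM Members"))
print(updated, ages)
```

Leaving out the `WHERE` clause would apply the new value to every row in the table.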

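The `ALTER TABLE` statement adds a new column to a table. You can use this command when you want to add columns to an existing table. `ADD COLUMN` is the clause that names the new column and its type; the keyword `COLUMN` is optional in many databases, which is why the earlier example writes just `ADD`.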

### Deleting

The `DELETE FROM` statement deletes one or more rows from a table. You can use the statement when you want to delete existing records.
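The following sketch, again using Python's `sqlite3` module and a cut-down `Members` table (an assumption for brevity), shows that only the rows matching the `WHERE` condition are removed:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Members (name VARCHAR, grad_year INTEGER)")
conn.executemany(
    "INSERT INTO Members (name, grad_year) VALUES (?, ?)",
    [("Lesley Cordero", 2018), ("Tabara Nosiba", 2020)],
)

# Only rows matching the WHERE condition are removed.
deleted = conn.execute("DELETE FROM Members WHERE grad_year = 2018").rowcount
remaining = conn.execute("SELECT name FROM Members").fetchall()
print(deleted, remaining)
```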

### Null

`NULL` is a special value in SQL that represents missing or unknown data. For example, when a column is added with `ALTER TABLE`, the rows that existed before the column was added have `NULL` values in the new column.
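To make this concrete, here is a small sketch with Python's `sqlite3` module. It assumes a `Members` table that initially lacks `grad_year`, inserts a row, and only then adds the column. Python shows SQL `NULL` as `None`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Members (name VARCHAR, age INTEGER)")
conn.execute("INSERT INTO Members (name, age) VALUES ('Lesley Cordero', 23)")

# Add a column after the row already exists; the old row gets NULL.
conn.execute("ALTER TABLE Members ADD COLUMN grad_year INTEGER")
conn.execute(
    "INSERT INTO Members (name, age, grad_year) VALUES ('Tabara Nosiba', 19, 2020)"
)

rows = conn.execute("SELECT name, grad_year FROM Members ORDER BY name").fetchall()

# NULL is tested with IS NULL; `= NULL` never matches a row.
missing = conn.execute("SELECT name FROM Members WHERE grad_year IS NULL").fetchall()
equals_null = conn.execute("SELECT name FROM Members WHERE grad_year = NULL").fetchall()
print(rows, missing, equals_null)
```

The empty result for `grad_year = NULL` is the usual surprise: comparing anything with `NULL` is neither true nor false, so the row is filtered out.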

### Aggregations

Aggregations are used to convert many rows into a single row. Aggregations are most often used with the `GROUP BY` clause, which splits the rows otherwise returned by the query into groups. Each group corresponds to a unique value (or combination of values) of the columns specified in the `GROUP BY` clause, and the aggregation returns one row per group.
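A short sketch of grouping, using Python's `sqlite3` module. The third member is a made-up row, added only so that one group contains more than one person.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Members (name VARCHAR, age INTEGER, grad_year INTEGER)")
conn.executemany(
    "INSERT INTO Members VALUES (?, ?, ?)",
    [
        ("Lesley Cordero", 23, 2018),
        ("Tabara Nosiba", 19, 2020),
        ("Sample Member", 21, 2020),  # hypothetical row for illustration
    ],
)

# One output row per distinct grad_year; COUNT and AVG collapse each group.
per_year = conn.execute(
    "SELECT grad_year, COUNT(*), AVG(age) FROM Members "
    "GROUP BY grad_year ORDER BY grad_year"
).fetchall()

# Without GROUP BY, the whole table is treated as a single group.
total = conn.execute("SELECT COUNT(*) FROM Members").fetchone()[0]
print(per_year, total)
```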

### Subqueries 

Subqueries are regular SQL queries that are embedded inside larger queries. There are 3 different types of subqueries, based on what they return:


#### 2D Table 

#### 1D Array

#### Single Values
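The three kinds of subquery named in the headings above can be sketched with Python's `sqlite3` module. The queries are illustrative assumptions about how each kind is typically used: a single value in a comparison, a one-column result with `IN`, and a full table in `FROM`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Members (name VARCHAR, age INTEGER, grad_year INTEGER)")
conn.executemany(
    "INSERT INTO Members VALUES (?, ?, ?)",
    [("Lesley Cordero", 23, 2018), ("Tabara Nosiba", 19, 2020)],
)

# Single value: the subquery returns one row with one column.
latest = conn.execute(
    "SELECT name FROM Members "
    "WHERE grad_year = (SELECT MAX(grad_year) FROM Members)"
).fetchall()

# 1D array: the subquery returns one column, used as a list with IN.
in_list = conn.execute(
    "SELECT name FROM Members "
    "WHERE grad_year IN (SELECT grad_year FROM Members WHERE age < 21)"
).fetchall()

# 2D table: the subquery returns rows and columns, used like a table in FROM.
from_table = conn.execute(
    "SELECT young.name FROM "
    "(SELECT name, age FROM Members WHERE age < 21) AS young"
).fetchall()
print(latest, in_list, from_table)
```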

## Relationships

### One to One

* A row in a table has a relationship with another row in another table. 
* This relationship works both ways between the two tables. Each row can have a connection only with one other row from another table. 

### One to Many

* A row in one table can have a connection with multiple rows in another table
* In reverse, multiple rows in a table can be connected to the same row in another table


### Foreign Keys

Foreign keys are a fundamental component of SQL that allow us to reference one table from another. A foreign key is stored as an additional column, usually in the child table, and references the parent table.
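As a hedged sketch of a parent and child table, the example below uses Python's `sqlite3` module. The `Enrollments` table and its `member_id` column are hypothetical names chosen for the illustration. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Parent table.
conn.execute("CREATE TABLE Members (id INTEGER PRIMARY KEY, name VARCHAR)")
# Child table: member_id is the extra column that points at the parent.
conn.execute(
    "CREATE TABLE Enrollments ("
    "  member_id INTEGER REFERENCES Members (id),"
    "  school VARCHAR)"
)

conn.execute("INSERT INTO Members (id, name) VALUES (1, 'Lesley Cordero')")
conn.execute("INSERT INTO Enrollments VALUES (1, 'SEAS')")  # parent row exists

try:
    conn.execute("INSERT INTO Enrollments VALUES (99, 'BC')")  # no member 99
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

Several `Enrollments` rows may point at the same `Members` row, which is how a one-to-many relationship is usually stored.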


### Constraints

``` SQL
CREATE TABLE Reserves
    ( sname CHAR(10),
    bid INTEGER,
    day DATE,
    PRIMARY KEY(bid, day),
    CONSTRAINT noInterlakeRes
    CHECK ('Interlake' <> 
        (SELECT B.bname
        FROM Boats B
        WHERE B.bid = bid)));
```

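Constraints can be named, as `noInterlakeRes` is above, which makes it possible to identify the constraint in error messages or to drop it later.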
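A simplified sketch of a named constraint, run with Python's `sqlite3` module. SQLite does not allow subqueries inside `CHECK`, so the example swaps in a plain range check. The constraint name `validRating` and the sample sailors are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors ("
    "  sid INTEGER PRIMARY KEY,"
    "  sname CHAR(10),"
    "  rating INTEGER,"
    "  CONSTRAINT validRating CHECK (rating BETWEEN 1 AND 10))"
)

conn.execute("INSERT INTO Sailors VALUES (1, 'Dustin', 7)")  # passes the check

try:
    conn.execute("INSERT INTO Sailors VALUES (2, 'Lubber', 42)")  # violates it
    error_message = None
except sqlite3.IntegrityError as exc:
    error_message = str(exc)
print(error_message)
```

The error text includes the constraint's name, which is one practical reason to name constraints.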

### Constraints Over Multiple Relations

``` SQL
CREATE TABLE Sailors
    ( sid INTEGER,
    sname CHAR(10),
    rating INTEGER,
    age REAL,
    PRIMARY KEY (sid),
    CHECK
    ( (SELECT COUNT (S.sid) FROM Sailors S)
    + (SELECT COUNT (B.bid) FROM Boats B) < 100 ));
```

### Assertions

``` SQL
CREATE ASSERTION smallClub
CHECK
( (SELECT COUNT (S.sid) FROM Sailors S)
+ (SELECT COUNT (B.bid) FROM Boats B) < 100)
```

#### Enforcing Total Participation

``` SQL
CREATE ASSERTION BusySailors
CHECK (
    NOT EXISTS (
        SELECT sid FROM Sailors
        WHERE sid NOT IN (
            SELECT sid FROM Reserves)
       )
)
```
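Many database systems, SQLite included, do not implement `CREATE ASSERTION`, so the sketch below only evaluates the same `NOT EXISTS` condition as an ordinary query through Python's `sqlite3` module. It assumes a `Reserves` table with a `sid` column, as the assertion itself does, and uses made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname CHAR(10))")
conn.execute("CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day DATE)")
conn.executemany("INSERT INTO Sailors VALUES (?, ?)", [(1, "A"), (2, "B")])
conn.execute("INSERT INTO Reserves VALUES (1, 101, '2018-01-01')")

# Same condition as the BusySailors assertion, evaluated as a plain query.
BUSY_SAILORS = (
    "SELECT NOT EXISTS ("
    "  SELECT sid FROM Sailors"
    "  WHERE sid NOT IN (SELECT sid FROM Reserves))"
)

# Sailor 2 has no reservation yet, so the condition is false.
holds_before = bool(conn.execute(BUSY_SAILORS).fetchone()[0])
conn.execute("INSERT INTO Reserves VALUES (2, 102, '2018-01-02')")
holds_after = bool(conn.execute(BUSY_SAILORS).fetchone()[0])
print(holds_before, holds_after)
```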

### Triggers

A trigger is a procedure that starts automatically when a specified change occurs, such as an `INSERT`, `UPDATE`, or `DELETE`.

#### Row Level Triggers

Invoked once for each row affected by the triggering statement. 

#### Statement Level Trigger

Invoked only once regardless of the number of rows affected. 

Triggers may fire before or after a change occurs. They can also be asynchronous or deferred.

If there exists a condition, it needs to be true for the trigger to fire.
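A minimal sketch of a row-level trigger with a condition, using Python's `sqlite3` module. SQLite supports only `FOR EACH ROW` triggers, so a statement-level trigger is not shown. The `AuditLog` table and the trigger name `log_minor` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Members (name VARCHAR, age INTEGER)")
conn.execute("CREATE TABLE AuditLog (message VARCHAR)")

# Fires AFTER each inserted row, but only when the WHEN condition is true.
conn.execute(
    "CREATE TRIGGER log_minor AFTER INSERT ON Members "
    "FOR EACH ROW WHEN NEW.age < 21 "
    "BEGIN "
    "  INSERT INTO AuditLog VALUES ('under 21: ' || NEW.name); "
    "END"
)

conn.executemany(
    "INSERT INTO Members VALUES (?, ?)",
    [("Lesley Cordero", 23), ("Tabara Nosiba", 19)],
)
log = conn.execute("SELECT message FROM AuditLog").fetchall()
print(log)
```

Two rows were inserted, but the trigger body ran only for the row that satisfied the condition.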

## Privileges

Privileges are given to users with `GRANT` and taken away with `REVOKE`.

An authorization graph keeps track of them: nodes are users and arcs are privileges.

## Transactions

A transaction is a series (or list) of actions. The actions can be reads or writes of database objects.

`RT(O)` denotes transaction T reading object O, and `WT(O)` denotes T writing O.

Each transaction must specify whether its final action is a commit or an abort.
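The difference between ending in an abort and ending in a commit can be sketched with Python's `sqlite3` module, where `rollback()` plays the role of abort. The `Accounts` table and its balances are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Accounts (owner VARCHAR, balance INTEGER)")
conn.executemany("INSERT INTO Accounts VALUES (?, ?)", [("A", 100), ("B", 0)])
conn.commit()


def balances():
    """Read the current balance of every account."""
    return dict(conn.execute("SELECT owner, balance FROM Accounts"))


# Transaction 1: two writes, then abort. Neither write survives.
conn.execute("UPDATE Accounts SET balance = balance - 50 WHERE owner = 'A'")
conn.execute("UPDATE Accounts SET balance = balance + 50 WHERE owner = 'B'")
conn.rollback()
after_abort = balances()

# Transaction 2: the same writes, then commit. Both become permanent.
conn.execute("UPDATE Accounts SET balance = balance - 50 WHERE owner = 'A'")
conn.execute("UPDATE Accounts SET balance = balance + 50 WHERE owner = 'B'")
conn.commit()
after_commit = balances()
print(after_abort, after_commit)
```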

## Schedules

A schedule is a list of actions from a set of transactions. It represents an actual or potential execution sequence.
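One way to picture a schedule is as a Python list of actions. The sketch below assumes the usual textbook requirement that a schedule keeps each transaction's own actions in their original order. The transactions `T1` and `T2` and the helper `is_schedule` are hypothetical.

```python
# Each action is (transaction, operation, object); "R" reads, "W" writes.
T1 = [("T1", "R", "A"), ("T1", "W", "A"), ("T1", "Commit", None)]
T2 = [("T2", "R", "A"), ("T2", "W", "B"), ("T2", "Commit", None)]


def is_schedule(schedule, transactions):
    """True if `schedule` contains exactly the actions of `transactions`,
    with each transaction's own actions kept in their original order."""
    for txn in transactions:
        name = txn[0][0]
        if [action for action in schedule if action[0] == name] != txn:
            return False
    return len(schedule) == sum(len(txn) for txn in transactions)


# Actions of T1 and T2 interleaved, each transaction's order preserved.
interleaved = [T1[0], T2[0], T1[1], T2[1], T1[2], T2[2]]
# T1 writes before it reads, so T1's internal order is broken.
reordered = [T1[1], T1[0], T1[2]] + T2

valid = is_schedule(interleaved, [T1, T2])
invalid = is_schedule(reordered, [T1, T2])
print(valid, invalid)
```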

### Relational Algebra

Relational Algebra is what allows us to represent relational database logic without concrete SQL. 

#### Selection 

Selection is indicated with the $\sigma$ symbol. As the name suggests, $\sigma$ indicates that a subset of rows will be selected based on a particular condition, which is written as a subscript.

Recall from an earlier section that in SQL, a `SELECT` statement looks something like this (the condition of $\sigma$ corresponds to a `WHERE` clause, omitted here):

``` sql
SELECT * FROM Table;
```
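A minimal Python sketch of selection, modelling a relation as a list of rows. The helper name `select` and the student rows are assumptions for illustration.

```python
# A relation as a list of rows; each row maps column names to values.
students = [
    {"student_id": 1, "name": "Helen", "major": "Biology"},
    {"student_id": 2, "name": "Menna", "major": "Sociology"},
    {"student_id": 3, "name": "Dan", "major": "Comp Sci"},
    {"student_id": 4, "name": "Will", "major": "Comp Sci"},
]


def select(relation, condition):
    """sigma_condition(relation): keep the rows for which condition is true."""
    return [row for row in relation if condition(row)]


# sigma_{major = 'Comp Sci'}(students), which in SQL would be
# SELECT * FROM students WHERE major = 'Comp Sci';
comp_sci = select(students, lambda row: row["major"] == "Comp Sci")
names = [row["name"] for row in comp_sci]
print(names)
```

Selection never changes the columns; it only decides which rows survive.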

#### Projection 

On an opposite note, we have projection which _deletes_ unwanted columns and is indicated with a $\pi$. It's important to note that the projection operator deletes columns _not_ in the projection list, which will always be indicated as a subscript to the $\pi$ symbol.  

Given the following table, which we'll call S2, what would the following operation output: $\pi_{name,major}(S2)$?


| student_id  | name  | major     | minor    |
| ----------- |:-----:| ---------:|---------:|
| 1           | Helen | Biology   | Dance    |
| 2           | Menna | Sociology | French   |
| 3           | Dan   | Comp Sci  | Music    |
| 4           | Will  | Comp Sci  | Business |

Since the operation only keeps the name and major columns, we end up with:

| name  | major     | 
|:-----:| ---------:|
| Helen | Biology   | 
| Menna | Sociology | 
| Dan   | Comp Sci  | 
| Will  | Comp Sci  |

##### Challenge

What is the output for the following operation: $\pi_{major}(S2)$?

##### Solution & Walkthrough

The answer might not be exactly what you expect. If you expected 4 rows rather than the 3 shown below, what you missed is that relational algebra deals with _sets_. And in sets, there are no duplicates. Therefore, we remove the second Comp Sci value.

| major     | 
|----------:|
| Biology   | 
| Sociology | 
| Comp Sci  | 
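The duplicate removal can be sketched in Python by returning a `set` of tuples. The helper `project` is a made-up name. In SQL, a plain `SELECT major` would keep duplicates, so `SELECT DISTINCT major` is the closer match to $\pi_{major}$.

```python
S2 = [
    (1, "Helen", "Biology", "Dance"),
    (2, "Menna", "Sociology", "French"),
    (3, "Dan", "Comp Sci", "Music"),
    (4, "Will", "Comp Sci", "Business"),
]
COLUMNS = ("student_id", "name", "major", "minor")


def project(relation, columns, keep):
    """pi_keep(relation): keep only the listed columns, as a set of tuples."""
    indexes = [columns.index(column) for column in keep]
    return {tuple(row[i] for i in indexes) for row in relation}


majors = project(S2, COLUMNS, ["major"])
name_major = project(S2, COLUMNS, ["name", "major"])
print(len(majors), len(name_major))
```

Projecting onto `name` and `major` still yields four rows because no two students share both values.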


#### Cross Product

Oftentimes, two relations need to be combined to accomplish a task. Relational algebra provides the cross product, written with the symbol $\times$, for this: $R \times S$ pairs every row of $R$ with every row of $S$.
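As a rough sketch, the cross product of two small relations can be computed in Python with `itertools.product`. The relations `students` and `schools` and the helper `cross` are assumptions for the example.

```python
from itertools import product

students = {("Dan", "Comp Sci"), ("Helen", "Biology")}
schools = {("SEAS",), ("BC",)}


def cross(r, s):
    """r x s: every row of r concatenated with every row of s."""
    return {row_r + row_s for row_r, row_s in product(r, s)}


pairs = cross(students, schools)
print(len(pairs), sorted(pairs))
```

With two rows on each side, the result has 2 × 2 = 4 rows, each carrying the columns of both inputs.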

#### Set-Difference

Recall the difference operation from set theory, which outputs a set containing all elements in the first set but not in the second.

For example, if we have two sets, `A = {1,2,3,4,5}` and `B = {1,3,5,7}`, the operation `A - B` would leave us with `{2,4}`, since 1, 3, and 5 also appear in `B`. Just as in set theory, set difference is represented with a subtraction sign (-) in relational algebra.
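Python's built-in `set` type uses the same `-` operator, so the operation is easy to check. The `enrolled` and `graduated` relations are made-up rows showing the same idea on tables.

```python
A = {1, 2, 3, 4, 5}
B = {1, 3, 5, 7}

# Elements of A that are not in B; 7 has no effect because it is not in A.
difference = A - B
print(difference)

# The same operator on relations stored as sets of rows.
enrolled = {("Helen",), ("Menna",), ("Dan",)}
graduated = {("Menna",)}
still_enrolled = enrolled - graduated
print(sorted(still_enrolled))
```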

### sqlite3 with Python



``` python
import sqlite3

db = sqlite3.connect("./database.db")
cursor = db.cursor()

cursor.execute('''
CREATE TABLE students(
    id INTEGER PRIMARY KEY, 
    name TEXT
);
''')
db.commit()

cursor.execute(''' INSERT INTO students (name) VALUES (?) ''', ("Lesley",))
db.commit()
```
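To read the row back, a `SELECT` can be run through the same cursor. The sketch below repeats the setup against an in-memory database (an assumption, so that it does not touch `./database.db`) and then fetches the result.

```python
import sqlite3

# ":memory:" keeps the sketch from touching ./database.db.
db = sqlite3.connect(":memory:")
cursor = db.cursor()
cursor.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT)")
cursor.execute("INSERT INTO students (name) VALUES (?)", ("Lesley",))
db.commit()

# id was not supplied, so SQLite assigned it automatically.
cursor.execute("SELECT id, name FROM students")
rows = cursor.fetchall()
print(rows)

db.close()
```

The `?` placeholder lets `sqlite3` handle quoting of the value, which is safer than building the SQL string by hand.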

## SQLAlchemy

The engine is SQLAlchemy's common interface to the database.

``` python
from sqlalchemy import create_engine

engine = create_engine('')  # pass a connection string here
connection = engine.connect()
```

### Connection strings

A connection string consists of the database driver followed by the filename, for example `sqlite:///database.db`.

``` python
from sqlalchemy import inspect

inspect(engine).get_table_names()  # returns a list of table names
```

### Reflection 

``` python
from sqlalchemy import MetaData, Table

metadata = MetaData()  # stores DB info

# Reflect the existing 'census' table from the database.
census = Table('census', metadata, autoload_with=engine)

repr(census)  # tells us columns and types
```