# SQL

## goals

-  basic understanding of RDBMS and SQL 
    - RDBMS (Relational Database Management System)
    - SQL (Structured Query Language)
-  general idea of how to construct a database in postgres
-  how to extract data from an RDBMS using SQL

## why learn RDBMS/SQL

-  SQL is everywhere
-  NoSQL systems often rely on concepts/terminology from SQL
-  you will be using SQL in the future
-  efficient storage and retrieval of data is hard

## what are these things anyway

- an RDBMS manages databases
- databases are defined by a schema
- databases are composed of tables
- tables have rows and columns



```sql
CREATE TABLE CUSTOMERS (
    id INTEGER PRIMARY KEY
,   name VARCHAR(50)
,   age INTEGER
,   city VARCHAR(255)
,   state VARCHAR(2));
```


```
 id | name  | age |     city      | state
----+-------+-----+---------------+-------
  1 | john  |  25 | San Francisco | CA
  2 | becky |  30 | NYC           | NY
  3 | sarah |  20 | Denver        | CO
... | 
```
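
The sample rows above could be loaded with an INSERT statement. A minimal sketch, assuming the `customers` table was created as defined above and is still empty:

```sql
-- Load the three sample rows shown above.
INSERT INTO customers (id, name, age, city, state)
VALUES (1, 'john',  25, 'San Francisco', 'CA')
,      (2, 'becky', 30, 'NYC',           'NY')
,      (3, 'sarah', 20, 'Denver',        'CO');

-- The primary key must be unique, so reusing id 1 would be rejected:
-- INSERT INTO customers (id, name) VALUES (1, 'duplicate');
```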

## relationships

- we can model relationships in the data

```sql
CREATE TABLE VISITS (
  id INTEGER PRIMARY KEY
,  created_at TIMESTAMP
,  customer_id INTEGER REFERENCES customers(id) );
```


```
 id |     created_at      | customer_id
----+---------------------+-------------
  1 | 2015-06-20 00:00:00 |           1
  2 | 2015-07-30 00:00:00 |           1
  3 | 2015-06-20 00:00:00 |           3
  4 | 2015-04-09 00:00:00 |           1
  5 | 2015-03-09 00:00:00 |           2
... | 
```
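
The `REFERENCES` clause lets the database enforce the link between the two tables. A sketch, assuming both tables hold the sample rows shown, that visit ids 100 and 101 are unused, and that no customer has id 999:

```sql
-- Accepted: customer 2 exists in customers.
INSERT INTO visits (id, created_at, customer_id)
VALUES (100, '2015-08-01 00:00:00', 2);

-- Rejected with a foreign key violation, because
-- (by assumption) no customer has id 999.
INSERT INTO visits (id, created_at, customer_id)
VALUES (101, '2015-08-01 00:00:00', 999);
```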

## what do you mean relationships

-  by inserting foreign keys into tables you can link information
-  there are 3 categories of relationships
    -  one-to-one
    -  one-to-many
    -  many-to-many

## one-to-one
-  the foreign key can go on either side of the relationship
-  each item only exists once in each table
-  could instead be modeled by adding new columns to the existing table
    -  why might you not want to do this?

```sql
CREATE TABLE LICENSES (
  id INTEGER PRIMARY KEY
, state VARCHAR(2)
, number VARCHAR(20)
, uploaded_at TIMESTAMP
, customer_id INTEGER REFERENCES customers(id)
, UNIQUE(state, number));
```


```sql
SELECT * FROM licenses;
```

```
 id | state |   number   |     uploaded_at     | customer_id
----+-------+------------+---------------------+-------------
  1 | CO    | DL19480284 | 2013-04-18 00:00:00 |           3
  2 | CA    | DL19852984 | 2014-05-12 00:00:00 |           1
```


```sql
SELECT *
FROM licenses
WHERE customer_id=?
LIMIT 1;
```
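
Nothing in the `licenses` definition above stops two licenses from pointing at the same customer. One way to have the database enforce the one-to-one rule is a UNIQUE constraint on the foreign key. A sketch; the constraint name `licenses_customer_id_key` is chosen here for illustration:

```sql
-- Allow at most one license row per customer.
-- This fails if the table already holds two licenses for one customer.
ALTER TABLE licenses
ADD CONSTRAINT licenses_customer_id_key UNIQUE (customer_id);
```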

## one-to-many

-  here the foreign key goes on the many side of the relationship


```sql
CREATE TABLE VISITS (
  id INTEGER PRIMARY KEY
,  created_at TIMESTAMP
,  customer_id INTEGER REFERENCES customers(id) );
```

```sql
SELECT *
FROM visits
WHERE customer_id=?
```

```sql
SELECT *
FROM customers
WHERE id=?
LIMIT 1;
```
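
Because the foreign key sits on the many side, the "many" can be counted per customer directly from `visits`. A sketch, assuming the `visits` table defined above; the alias `visit_count` is made up for this example:

```sql
-- One row per customer that has visits, with how many visits they made.
SELECT customer_id, COUNT(*) AS visit_count
FROM visits
GROUP BY customer_id
ORDER BY visit_count DESC;
```

Counting only the five sample visits shown earlier, customer 1 would have 3 visits and customers 2 and 3 would have 1 each.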

## many-to-many

- create a table which holds two foreign keys, one for each side of the relationship


```sql
CREATE TABLE PRODUCTS (
  id INTEGER PRIMARY KEY
, name VARCHAR(50)
, price FLOAT
  );
```

```sql
CREATE TABLE PURCHASES (
    id INTEGER PRIMARY KEY
,   customer_id INTEGER REFERENCES customers(id)
,   product_id INTEGER REFERENCES products(id)
,   date TIMESTAMP
,   quantity INTEGER );
```
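
Here `purchases` is the linking table: one customer can buy many products and one product can be bought by many customers. A sketch of following the link in one direction, assuming the tables above and using customer id 1 purely as an example:

```sql
-- Products bought by customer 1, reached through the purchases table.
SELECT products.name, purchases.quantity, purchases.date
FROM purchases
JOIN products ON products.id = purchases.product_id
WHERE purchases.customer_id = 1;
```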

## normalization

- we want to minimize redundancy
- doing this requires making lots of tables
- this makes queries harder to write and potentially slower to execute
- can make database structure harder to follow
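
To make the trade-off concrete, here is a sketch of a denormalized alternative to the `customers` and `visits` tables. The table `visits_denormalized` and its column names are hypothetical and not part of the schema above:

```sql
-- Denormalized: customer details are repeated on every visit row,
-- so a customer who moves must be updated in many places.
CREATE TABLE visits_denormalized (
    id INTEGER PRIMARY KEY
,   created_at TIMESTAMP
,   customer_name VARCHAR(50)
,   customer_city VARCHAR(255)
,   customer_state VARCHAR(2) );

-- The upside: no join is needed to list visits with names.
SELECT created_at, customer_name
FROM visits_denormalized;
```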

## queries

- SQL is how you query relational databases
- SQL queries are declarative
    - you tell the computer what you want, not how to do it
    - specify the information you want using clauses
    - clauses are combinations of keywords and identifiers
    

```sql
    SELECT name, age 
    FROM customers;
```
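
Adding a clause narrows the request without saying how the database should carry it out. A small sketch, assuming the `customers` table defined earlier:

```sql
-- Names and ages of customers in California.
-- The query states the condition; the database decides how to find the rows.
SELECT name, age
FROM customers
WHERE state = 'CA';
```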


## keywords


- SELECT
    - FROM
    - WHERE
- JOIN
    - ON
- GROUP BY
    - SUM
    - COUNT
    - HAVING
- ORDER BY
- LIMIT
- the order in which keywords are written is not the order in which they are executed

## order of operations

- FROM and JOIN construct the table that results will be drawn from
- WHERE filters rows in this table
- GROUP BY and aggregations are performed on the filtered table
- HAVING filters groups after aggregations have been performed
- SELECT gets the desired columns from the table
- ORDER BY sorts the final results
- LIMIT keeps only the first rows of the sorted results
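
The logical order can be annotated on a single query. A sketch, assuming the `customers` and `visits` tables defined earlier; the date cutoff and the alias `visit_count` are arbitrary choices for illustration:

```sql
SELECT customers.state, COUNT(*) AS visit_count   -- 5. pick output columns
FROM visits                                       -- 1. build the working table
JOIN customers ON customers.id = visits.customer_id
WHERE visits.created_at >= '2015-01-01'           -- 2. filter rows
GROUP BY customers.state                          -- 3. group and aggregate
HAVING COUNT(*) > 1                               -- 4. filter groups
ORDER BY visit_count DESC                         -- 6. sort
LIMIT 10;                                         -- 7. keep the first rows
```

This is the logical order only; the database is free to execute the query differently as long as the result is the same.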

## joins

- joins combine multiple tables together
    - tables can be joined to themselves
- joins require a matching condition
- the ON keyword specifies that condition
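
A self join is a compact way to see aliases and the ON condition together. A sketch, assuming the `customers` table defined earlier; the aliases `a` and `b` and the output column names are made up for this example:

```sql
-- Self join: pairs of different customers who live in the same state.
-- a and b are aliases for two copies of the same table.
SELECT a.name AS customer, b.name AS neighbor
FROM customers a
JOIN customers b
  ON a.state = b.state   -- the matching condition
 AND a.id < b.id;        -- skip self-pairs and mirrored duplicates
```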
    

## inner join

- returns only the rows where the join condition finds a match in both tables

![inner_join](https://cloud.githubusercontent.com/assets/1425450/9778836/9f669cae-572a-11e5-9c96-98b59a930b7d.png)
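
A sketch of an inner join, assuming the `customers` and `visits` tables defined earlier:

```sql
-- Each visit paired with the customer who made it.
-- Customers with no visits, and visits whose customer_id is NULL,
-- do not appear in the result.
SELECT customers.name, visits.created_at
FROM customers
INNER JOIN visits ON visits.customer_id = customers.id;
```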

## null values

- missing values in a table are represented as NULL
- SQL uses three-valued logic: TRUE, FALSE, and NULL (unknown)
    - there are disputes about this
    - the implementation of this logic is inconsistent between database systems
- primary keys can never be NULL
- other columns can also be declared NOT NULL
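
A few statements illustrate how NULL behaves. A sketch written for PostgreSQL with default settings and assuming the `customers` table defined earlier; the column aliases are made up:

```sql
-- Comparing with NULL yields NULL, not TRUE or FALSE.
SELECT NULL = NULL    AS equals_null      -- NULL
,      NULL IS NULL   AS is_null          -- true
,      NULL AND FALSE AS null_and_false   -- false
,      NULL OR TRUE   AS null_or_true;    -- true

-- Rows with a missing value must be found with IS NULL, not = NULL.
SELECT * FROM customers WHERE city IS NULL;

-- Disallow NULL in a column (fails if existing rows have NULL names).
ALTER TABLE customers ALTER COLUMN name SET NOT NULL;
```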

## left and right joins

- left/right joins keep every row of one table, filling in NULL where the other table has no match

![left_join](https://cloud.githubusercontent.com/assets/1425450/9778839/9f69bbd2-572a-11e5-9b13-7b2c2d7a04fb.png)
![right_join](https://cloud.githubusercontent.com/assets/1425450/9779109/19ace62e-572d-11e5-9868-17a9a7e3440f.png)
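
A sketch of a left join, assuming the `customers` and `licenses` tables defined earlier. A right join behaves the same way with the roles of the two tables swapped:

```sql
-- Every customer, with license details where one exists.
-- Customers without a license still appear, with NULL license columns.
SELECT customers.name, licenses.state, licenses.number
FROM customers
LEFT JOIN licenses ON licenses.customer_id = customers.id;

-- Only the customers that have no license on file.
SELECT customers.name
FROM customers
LEFT JOIN licenses ON licenses.customer_id = customers.id
WHERE licenses.id IS NULL;
```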


## full outer join
![full_outer_join](https://cloud.githubusercontent.com/assets/1425450/9778837/9f66b90a-572a-11e5-9d29-2b6c817cc7ec.png)
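
A full outer join keeps unmatched rows from both tables. A sketch, assuming the `customers` and `licenses` tables defined earlier:

```sql
-- All customers and all licenses. Unmatched rows on either side
-- are kept, with NULL in the other table's columns.
SELECT customers.name, licenses.number
FROM customers
FULL OUTER JOIN licenses ON licenses.customer_id = customers.id;
```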