## Phase 1.06 - 1.07

# SQL and Relational Databases

## What is a Database?
- In general, databases store sets of data that can be queried for use in other applications. 
- A database management system supports the development, administration and use of database platforms.


### What is a Relational Database? 
- A *relational database management system* (**RDBMS**) is a type of DBMS with a row-based table structure that connects related data elements and includes functions that maintain the security, accuracy, integrity and consistency of the data.
- The most basic **RDBMS** functions are related to *create, read, update and delete* operations, collectively known as **CRUD**.
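
As a rough illustration of the four CRUD operations, here is a minimal sketch using Python's built-in `sqlite3` module and an in-memory database. The `trainers` table and its columns are hypothetical, made up only for this example.

```python
import sqlite3

# In-memory database; the `trainers` table is a hypothetical example.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE trainers (id INTEGER PRIMARY KEY, name TEXT, badges INTEGER)")

# Create: insert new rows.
cur.execute("INSERT INTO trainers (name, badges) VALUES (?, ?)", ("ash", 0))
cur.execute("INSERT INTO trainers (name, badges) VALUES (?, ?)", ("misty", 2))

# Read: query the rows back.
rows_after_insert = cur.execute(
    "SELECT name, badges FROM trainers ORDER BY id"
).fetchall()

# Update: change an existing row.
cur.execute("UPDATE trainers SET badges = badges + 1 WHERE name = ?", ("ash",))

# Delete: remove a row.
cur.execute("DELETE FROM trainers WHERE name = ?", ("misty",))

rows_final = cur.execute("SELECT name, badges FROM trainers ORDER BY id").fetchall()
conn.commit()
print(rows_after_insert)
print(rows_final)
```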

### What is SQL?

- **SQL** (usually pronounced like the word “sequel”) stands for Structured Query Language.
- It is a language used to define, query, and modify data stored in an **RDBMS**.
- SQL syntax is similar to the English language, which makes it relatively easy to write, read, and interpret.

### Schema
A relational database schema helps you to organize and understand the structure of a database by showing how all of the tables are related to each other.
<img src='https://github.com/yishuen/studygroups-070620pt/blob/master/mod-1/images/employees-schema.png?raw=1'>

### Relationships

A logical association between entities is called a relationship. Relationships between entities can be mapped in several ways.

**Relationship Mappings**

- one to one
- one to many
- many to many


### One to One Relationship
<img src="https://github.com/yishuen/studygroups-070620pt/blob/master/mod-1/images/one-to-one.png?raw=1" >


### One to Many Relationship

This is the most commonly used type of relationship. Consider an e-commerce website with the following relationships:

- Customers can make many orders.
- Orders can contain many items.
- Items can have descriptions in many languages.

<img src="https://github.com/yishuen/studygroups-070620pt/blob/master/mod-1/images/one-to-many.png?raw=1" >
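
A one-to-many relationship is commonly stored as a foreign key column on the "many" side. The sketch below assumes hypothetical `customers` and `orders` tables (not taken from the diagram) and uses an in-memory SQLite database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total       REAL
    );
    INSERT INTO customers (id, name) VALUES (1, 'ana'), (2, 'ben');
    INSERT INTO orders (customer_id, total) VALUES (1, 9.5), (1, 20.0), (2, 5.0);
""")

# One customer, many orders: count the orders that point at each customer.
orders_per_customer = conn.execute("""
    SELECT c.name, COUNT(o.id)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id
    ORDER BY c.id
""").fetchall()
print(orders_per_customer)

# An order that points at a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO orders (customer_id, total) VALUES (99, 1.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
print(orphan_rejected)
```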

### Many to Many Relationship

In some cases, you may need multiple instances on both sides of the relationship. For example, each order can contain multiple items, and each item can appear in multiple orders.
<img src="https://github.com/yishuen/studygroups-070620pt/blob/master/mod-1/images/many-to-many.png?raw=1" >

***For these relationships, we need to create an extra table to track the relationships:***

<img src="https://github.com/yishuen/studygroups-070620pt/blob/master/mod-1/images/many-to-many-junction.png?raw=1" >
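
The extra table can be sketched as follows. The table and column names (`orders`, `items`, `order_items`) are assumptions chosen to match the order/item example, not names read from the diagram.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY);
    CREATE TABLE items  (id INTEGER PRIMARY KEY, name TEXT);
    -- Junction table: one row per (order, item) pair.
    CREATE TABLE order_items (
        order_id INTEGER REFERENCES orders(id),
        item_id  INTEGER REFERENCES items(id),
        PRIMARY KEY (order_id, item_id)
    );
    INSERT INTO orders (id) VALUES (1), (2);
    INSERT INTO items (id, name) VALUES (1, 'potion'), (2, 'pokeball');
    INSERT INTO order_items (order_id, item_id) VALUES (1, 1), (1, 2), (2, 2);
""")

# An order has many items: list the items in order 1.
items_in_order_1 = [row[0] for row in conn.execute("""
    SELECT i.name
    FROM items AS i
    JOIN order_items AS oi ON oi.item_id = i.id
    WHERE oi.order_id = 1
    ORDER BY i.id
""")]

# An item appears in many orders: list the orders containing 'pokeball'.
orders_with_pokeball = [row[0] for row in conn.execute("""
    SELECT oi.order_id
    FROM order_items AS oi
    JOIN items AS i ON i.id = oi.item_id
    WHERE i.name = 'pokeball'
    ORDER BY oi.order_id
""")]
print(items_in_order_1, orders_with_pokeball)
```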

### SQL Data Types

SQL data types can be broadly divided into the following categories. The exact type names vary between database systems.

- Numeric data types such as int, tinyint, bigint, float, real etc.
- Date and time data types such as date, time, datetime etc.
- Character and String data types such as char, varchar, text etc.
- Unicode character string data types, for example nchar, nvarchar, ntext etc.
- Binary data types such as binary, varbinary etc.
- Miscellaneous data types – clob, blob, xml, cursor, table etc.

<img src="https://github.com/yishuen/studygroups-070620pt/blob/master/mod-1/images/data-type-mapping.png?raw=1" >

#### SQLite Data Types

Every column in an SQLite table is assigned a type affinity based on its declared data type. The affinity is the preferred storage type for values in that column, not a strict constraint. Here is the list of type affinities in SQLite:

- TEXT
- NUMERIC
- INTEGER
- REAL
- BLOB
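
To see affinity in action, the sketch below inserts deliberately "mismatched" values into a hypothetical `demo` table and asks SQLite which storage class it actually used, via the built-in `typeof()` function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with one column per affinity of interest.
conn.execute(
    "CREATE TABLE demo (as_int INTEGER, as_text TEXT, as_real REAL, as_blob BLOB)"
)

# Insert a string into the INTEGER column, integers into the TEXT and REAL
# columns, and a string into the BLOB column.
conn.execute("INSERT INTO demo VALUES (?, ?, ?, ?)", ("42", 42, 7, "raw"))

storage_classes = conn.execute(
    "SELECT typeof(as_int), typeof(as_text), typeof(as_real), typeof(as_blob) FROM demo"
).fetchone()
print(storage_classes)
```

The first three values are converted to match their column's affinity, while a column with BLOB affinity stores the value as given, so the string stays text.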

## Using SQL in Python


We're going to play around with this Pokemon database!

<img src='https://raw.githubusercontent.com/yishuen/studygroups-070620pt/master/mod-1/images/pokemon_db.png'>

In [1]:
import pandas as pd
import sqlite3

In [2]:
# Connecting to the database.
conn = sqlite3.connect('data/pokemon.db')
conn

<sqlite3.Connection at 0x7fd8b9194c70>

In [3]:
# Create a cursor, the object used to run queries and fetch results.
cur = conn.cursor()
cur

<sqlite3.Cursor at 0x7fd8ba151c00>

In [4]:
# Executing a query.
cur.execute('SELECT * FROM pokemon')

<sqlite3.Cursor at 0x7fd8ba151c00>

In [5]:
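# Show the column descriptions of the last query.
# sqlite3 only fills in the column name; the other six fields are always None.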
cur.description

(('id', None, None, None, None, None, None),
 ('name', None, None, None, None, None, None),
 ('base_experience', None, None, None, None, None, None),
 ('weight', None, None, None, None, None, None),
 ('height', None, None, None, None, None, None))

In [6]:
# Get column names - longhand.
my_lst = []
for x in cur.description:
    my_lst.append(x[0])
my_lst

['id', 'name', 'base_experience', 'weight', 'height']

In [7]:
# Get column names - shorthand.
[x[0] for x in cur.description]

['id', 'name', 'base_experience', 'weight', 'height']

In [8]:
# Fetch all rows returned by the query.
cur.fetchall()

[(1, 'bulbasaur', 64, 69, 7),
 (2, 'ivysaur', 142, 130, 10),
 (3, 'venusaur', 236, 1000, 20),
 (4, 'charmander', 62, 85, 6),
 (5, 'charmeleon', 142, 190, 11),
 (6, 'charizard', 240, 905, 17),
 (7, 'squirtle', 63, 90, 5),
 (8, 'wartortle', 142, 225, 10),
 (9, 'blastoise', 239, 855, 16),
 (10, 'caterpie', 39, 29, 3),
 (11, 'metapod', 72, 99, 7),
 (12, 'butterfree', 178, 320, 11),
 (13, 'weedle', 39, 32, 3),
 (14, 'kakuna', 72, 100, 6),
 (15, 'beedrill', 178, 295, 10),
 (16, 'pidgey', 50, 18, 3),
 (17, 'pidgeotto', 122, 300, 11),
 (18, 'pidgeot', 216, 395, 15),
 (19, 'rattata', 51, 35, 3),
 (20, 'raticate', 145, 185, 7),
 (21, 'spearow', 52, 20, 3),
 (22, 'fearow', 155, 380, 12),
 (23, 'ekans', 58, 69, 20),
 (24, 'arbok', 157, 650, 35),
 (25, 'pikachu', 112, 60, 4),
 (26, 'raichu', 218, 300, 8),
 (27, 'sandshrew', 60, 120, 6),
 (28, 'sandslash', 158, 295, 10),
 (29, 'nidoran-f', 55, 70, 4),
 (30, 'nidorina', 128, 200, 8),
 (31, 'nidoqueen', 227, 600, 13),
 (32, 'nidoran-m', 55, 90, 5),
 ...]

In [9]:
# Try to fetch the results again: the cursor is already exhausted, so nothing is left.
cur.fetchall()

[]
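
The empty list appears because a cursor hands out each result row only once. The sketch below shows the same behavior on a tiny in-memory stand-in for the `pokemon` table (only two columns, three rows), along with `fetchone` and `fetchmany`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Small stand-in table so the example runs without data/pokemon.db.
cur.execute("CREATE TABLE pokemon (id INTEGER PRIMARY KEY, name TEXT)")
cur.executemany(
    "INSERT INTO pokemon VALUES (?, ?)",
    [(1, "bulbasaur"), (2, "ivysaur"), (3, "venusaur")],
)

cur.execute("SELECT name FROM pokemon ORDER BY id")
first = cur.fetchone()   # the next single row
rest = cur.fetchall()    # whatever rows remain
again = cur.fetchall()   # nothing left: the cursor is exhausted

# Executing the query again starts a fresh result set.
cur.execute("SELECT name FROM pokemon ORDER BY id")
fresh = cur.fetchmany(2)  # up to two rows
print(first, rest, again, fresh)
```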

### Parts of a SQL Query
* `SELECT ... FROM ...`: Which columns from which table
* `WHERE`: Conditions to filter your query by
* `JOIN`: Put tables together
* `GROUP BY`: Group and aggregate data
* `HAVING`: Conditions to filter groups by, applied after a `GROUP BY`
* `ORDER BY`: How to sort the results
* `LIMIT`: The maximum number of rows to return
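
The clauses above can be combined in a single query. The sketch below uses hypothetical `customers` and `orders` tables (not the Pokemon database) with made-up rows, chosen so that every clause changes the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'ana'), (2, 'ben'), (3, 'cy'), (4, 'di');
    INSERT INTO orders (customer_id, total) VALUES
        (1, 10.0), (1, 30.0), (2, 5.0), (2, 2.0), (3, 50.0), (3, 0.5), (4, 3.0);
""")

query = """
    SELECT c.name, SUM(o.total) AS spent         -- which columns
    FROM orders AS o                             -- from which table
    JOIN customers AS c ON c.id = o.customer_id  -- put tables together
    WHERE o.total >= 1.0                         -- filter rows before grouping
    GROUP BY c.id                                -- one group per customer
    HAVING SUM(o.total) > 6.0                    -- filter groups after aggregating
    ORDER BY spent DESC                          -- sort the result
    LIMIT 2                                      -- keep at most two rows
"""
top_spenders = conn.execute(query).fetchall()
print(top_spenders)
```

Note that the clauses must be written in this order, even though `WHERE` is applied before grouping and `HAVING` after it.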

#### Q1

In [10]:
# Select all pokemon from the pokemon table.

# Show results in a pandas dataframe.
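
One possible approach to Q1 is `pandas.read_sql_query`, which runs a query on a connection and returns the result as a DataFrame. So that the sketch runs on its own, it builds a three-row in-memory stand-in for the `pokemon` table; the column types are assumptions. In the notebook you would pass the existing `conn` instead.

```python
import sqlite3
import pandas as pd

# Stand-in for data/pokemon.db; column names and rows are copied from the
# output shown earlier, column types are assumed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pokemon (id INTEGER, name TEXT, base_experience INTEGER, "
    "weight INTEGER, height INTEGER)"
)
conn.executemany(
    "INSERT INTO pokemon VALUES (?, ?, ?, ?, ?)",
    [(1, "bulbasaur", 64, 69, 7), (2, "ivysaur", 142, 130, 10), (3, "venusaur", 236, 1000, 20)],
)

# Column names are taken from the query result automatically.
df = pd.read_sql_query("SELECT * FROM pokemon", conn)
print(df.head())
```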


#### Q2

In [11]:
# Select all the rows from pokemon_types where the type_id is 3.
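
A possible shape for Q2, sketched on a stand-in table. Only `type_id` is named in the question; the `pokemon_id` column and the sample rows are assumptions, so check the schema diagram for the real column names.

```python
import sqlite3

# Stand-in for pokemon_types; pokemon_id is an assumed column name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pokemon_types (pokemon_id INTEGER, type_id INTEGER)")
conn.executemany(
    "INSERT INTO pokemon_types VALUES (?, ?)",
    [(1, 12), (1, 4), (6, 10), (6, 3), (16, 1), (16, 3)],
)

# The ? placeholder lets sqlite3 bind the value instead of pasting it into the SQL.
rows = conn.execute(
    "SELECT * FROM pokemon_types WHERE type_id = ? ORDER BY pokemon_id", (3,)
).fetchall()
print(rows)
```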


#### Q3

In [12]:
# Select the rows from pokemon_types where the associated type is "water".
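
Q3 needs the type's name, which presumably lives in a separate table, so a `JOIN` is one way to approach it. The sketch assumes a `types(id, name)` table and a `pokemon_types(pokemon_id, type_id)` table; both layouts and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed schema: types(id, name) and pokemon_types(pokemon_id, type_id).
    CREATE TABLE types (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE pokemon_types (pokemon_id INTEGER, type_id INTEGER);
    INSERT INTO types VALUES (10, 'fire'), (11, 'water');
    INSERT INTO pokemon_types VALUES (4, 10), (7, 11), (8, 11);
""")

# Join each pokemon_types row to its type, then keep only the water rows.
water_rows = conn.execute("""
    SELECT pt.*
    FROM pokemon_types AS pt
    JOIN types AS t ON t.id = pt.type_id
    WHERE t.name = 'water'
    ORDER BY pt.pokemon_id
""").fetchall()
print(water_rows)
```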


#### Q4

In [13]:
# Find the average weight for each type.
# Order the results from highest weight to lowest weight.
# Display the type name next to the average weight.
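
One way to approach Q4 is to join all three tables, group by type, and sort by the aggregate. In this sketch only the `pokemon` columns are confirmed by the output shown earlier; the `types` and `pokemon_types` layouts and the type ids are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed schema for types and pokemon_types; sample rows only.
    CREATE TABLE pokemon (id INTEGER PRIMARY KEY, name TEXT, weight INTEGER);
    CREATE TABLE types (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE pokemon_types (pokemon_id INTEGER, type_id INTEGER);
    INSERT INTO pokemon VALUES (4, 'charmander', 85), (6, 'charizard', 905), (7, 'squirtle', 90);
    INSERT INTO types VALUES (10, 'fire'), (11, 'water');
    INSERT INTO pokemon_types VALUES (4, 10), (6, 10), (7, 11);
""")

# Average weight per type, heaviest type first.
avg_weights = conn.execute("""
    SELECT t.name, AVG(p.weight) AS avg_weight
    FROM pokemon AS p
    JOIN pokemon_types AS pt ON pt.pokemon_id = p.id
    JOIN types AS t ON t.id = pt.type_id
    GROUP BY t.id
    ORDER BY avg_weight DESC
""").fetchall()
print(avg_weights)
```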


#### Q5

In [14]:
# Find the names and ids of the pokemon that have more than 1 type.
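
Q5 filters on a count, which is what `HAVING` is for. The sketch below assumes `pokemon_types(pokemon_id, type_id)` and uses a few sample rows with made-up type ids.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- pokemon_types layout is assumed; rows are illustrative only.
    CREATE TABLE pokemon (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE pokemon_types (pokemon_id INTEGER, type_id INTEGER);
    INSERT INTO pokemon VALUES (1, 'bulbasaur'), (4, 'charmander'), (6, 'charizard');
    INSERT INTO pokemon_types VALUES (1, 12), (1, 4), (4, 10), (6, 10), (6, 3);
""")

# Group the type rows by pokemon and keep the groups with more than one row.
multi_type = conn.execute("""
    SELECT p.id, p.name
    FROM pokemon AS p
    JOIN pokemon_types AS pt ON pt.pokemon_id = p.id
    GROUP BY p.id
    HAVING COUNT(pt.type_id) > 1
    ORDER BY p.id
""").fetchall()
print(multi_type)
```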


#### Q6

In [15]:
# Find the id of the type that has the most pokemon.
# Display type_id next to the number of pokemon having that type.
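
A possible approach to Q6 is to count rows per `type_id`, sort by the count in descending order, and keep the first row. The `pokemon_id` column and the sample rows are assumptions.

```python
import sqlite3

# Stand-in for pokemon_types; pokemon_id is an assumed column name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pokemon_types (pokemon_id INTEGER, type_id INTEGER)")
conn.executemany(
    "INSERT INTO pokemon_types VALUES (?, ?)",
    [(1, 12), (1, 4), (4, 10), (5, 10), (6, 10), (6, 3), (7, 11)],
)

# Count pokemon per type and keep the type with the largest count.
most_common = conn.execute("""
    SELECT type_id, COUNT(pokemon_id) AS num_pokemon
    FROM pokemon_types
    GROUP BY type_id
    ORDER BY num_pokemon DESC
    LIMIT 1
""").fetchone()
print(most_common)
```

If two types tie for first place, `LIMIT 1` returns only one of them, so a tie would need a different query.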
