# Level 1: Introduction to SQL

SQL (`Structured Query Language`) is the most widely used language for accessing information stored in `relational` databases.

As the name implies, this type of database has defined relationships between the tables of data it contains.

The tables inside these databases may look similar to Excel spreadsheets, but are more powerful and present the following advantages:

- Can store much more information.
- Storage can be more secure, since many database systems support encryption and access control.
- Many users can write queries at the same time to access information.

**Tables** are the main building block of databases. They are organized in rows and columns, referred to as *records* and *fields* respectively.

**Table** names should:

- be lowercase
- have no spaces, use underscores instead
- refer to a collective group or be plural

**Field** names should:

- be lowercase
- have no spaces
- be singular
- be different from other field names
- be different from the table name
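
As a rough illustration of the field-name conventions above, the sketch below normalizes raw column titles into lowercase, underscore-separated names. The helper `to_field_name` is hypothetical (it is not part of pandas or sqlite3), and it does not check that names are unique or differ from the table name.

```python
import re


def to_field_name(raw: str) -> str:
    """Hypothetical helper: turn a raw column title into a lowercase field name."""
    # Collapse every run of non-alphanumeric characters into a single underscore.
    cleaned = re.sub(r"[^0-9A-Za-z]+", "_", raw)
    # Drop leading/trailing underscores and lowercase the result.
    return cleaned.strip("_").lower()


raw_titles = ["Type 1", "Sp. Atk", "HP"]
field_names = [to_field_name(title) for title in raw_titles]
print(field_names)  # ['type_1', 'sp_atk', 'hp']
```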

**Comparison operators**:
- `=` equal to
- `>` greater than
- `<` less than
- `>=` greater than or equal to
- `<=` less than or equal to
- `<>` or `!=` not equal to

**Logical operators**:
- AND
- OR
- NOT
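
To see how comparison and logical operators combine, here is a minimal sketch against a small in-memory SQLite table. The table and its lowercase column names are assumptions made for illustration; they follow the naming conventions above rather than any particular dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE pokemon (name TEXT, type_1 TEXT, hp INT)")
conn.executemany(
    "INSERT INTO pokemon VALUES (?, ?, ?)",
    [
        ("Bulbasaur", "Grass", 45),
        ("Ivysaur", "Grass", 60),
        ("Venusaur", "Grass", 80),
        ("Charmander", "Fire", 39),
    ],
)

# Comparison operators (>=, <>) joined with AND: both conditions must hold.
strong_non_fire = conn.execute(
    "SELECT name FROM pokemon WHERE hp >= 60 AND type_1 <> 'Fire' ORDER BY name"
).fetchall()

# NOT negates a condition; OR needs only one side to be true.
low_hp_or_fire = conn.execute(
    "SELECT name FROM pokemon WHERE NOT (hp > 50) OR type_1 = 'Fire' ORDER BY name"
).fetchall()

print(strong_non_fire)  # [('Ivysaur',), ('Venusaur',)]
print(low_hp_or_fire)   # [('Bulbasaur',), ('Charmander',)]
```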


**Single quotes (' ') are usually preferred for strings.**
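
One reason to prefer single quotes: in standard SQL, single quotes delimit string literals, while double quotes delimit identifiers such as column names. The sketch below, using a hypothetical one-column table, shows the difference in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pokemon (name TEXT)")
conn.execute("INSERT INTO pokemon VALUES ('Bulbasaur')")

# 'name' is a string literal; "name" refers to the column called name.
literal_value, column_value = conn.execute(
    """SELECT 'name', "name" FROM pokemon"""
).fetchone()

print(literal_value, column_value)  # name Bulbasaur
```

SQLite will treat a double-quoted token as a string when no matching column exists, so double-quoted strings often appear to work, but single quotes avoid the ambiguity.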




### Import libraries


In [51]:
import numpy as np
import pandas as pd
import sqlite3


### Convert an existing CSV to SQL

In [52]:
# Read the CSV file into a pandas DataFrame
df = pd.read_csv("./data/pokemon.csv")
df.head()

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,False
1,2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,False
2,3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,False
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False
4,4,Charmander,Fire,,309,39,52,43,60,50,65,1,False


### Rename the columns whose titles contain spaces

In [53]:
df = df.rename(columns={'Type 1': 'Type_1', 'Type 2': 'Type_2', 'Sp. Atk': 'Sp_Atk','Sp. Def': 'Sp_Def'})


### Two ways we can store our database with sqlite3:

### Locally

In [54]:
# # Establish a connection to a SQLite database and create a cursor object
# conn = sqlite3.connect('mydatabase.db')
# cursor = conn.cursor()

# # Create a table in the database ( we will add IF NOT EXISTS)
# create_table_query = """
# CREATE TABLE IF NOT EXISTS pokemon (
#     `#` INT PRIMARY KEY,
#     Name TEXT,
#     Type_1 TEXT,
#     Type_2 TEXT,
#     Total INT,
#     HP INT,
#     Attack INT,
#     Defense INT,
#     Sp_Atk INT,
#     Sp_Def INT,
#     Speed INT,
#     Generation INT,
#     Legendary BOOL
#     )
# """
# cursor.execute(create_table_query)


# # Append the existing DataFrame to the table
# df.to_sql('pokemon', conn, if_exists='append', index=False)

In [55]:
cnx = sqlite3.connect('./data/poke_database.db')

# Write the pandas DataFrame to a SQL table.
# Note: if_exists='append' adds the rows again each time this cell is re-run.
df.to_sql('pokemon', con=cnx, if_exists='append', index=False)
# Define a helper function to run queries.
def sql_query(query):
    return pd.read_sql(query, cnx)
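
For comparison, the same CSV-to-table step can be sketched with only the standard library. This is an alternative under stated assumptions, not the notebook's method: the inline CSV holds three rows copied from the preview above, the table name `pokemon_sample` is made up, and the database lives in memory rather than in a file.

```python
import csv
import io
import sqlite3

# Three rows taken from the DataFrame preview; a real run would open the CSV file instead.
sample_csv = io.StringIO(
    "Name,Type_1,Type_2,HP\n"
    "Bulbasaur,Grass,Poison,45\n"
    "Ivysaur,Grass,Poison,60\n"
    "Charmander,Fire,,39\n"
)

rows = [
    # Store an empty Type_2 as NULL (None) rather than as an empty string.
    (r["Name"], r["Type_1"], r["Type_2"] or None, int(r["HP"]))
    for r in csv.DictReader(sample_csv)
]

conn = sqlite3.connect(":memory:")  # in-memory database, discarded on close
conn.execute(
    "CREATE TABLE IF NOT EXISTS pokemon_sample "
    "(Name TEXT, Type_1 TEXT, Type_2 TEXT, HP INT)"
)
conn.executemany("INSERT INTO pokemon_sample VALUES (?, ?, ?, ?)", rows)
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM pokemon_sample").fetchone()[0]
print(row_count)  # 3
```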

### Let's check that we can query our table

In [56]:
query = """
SELECT *
FROM pokemon

LIMIT 5

"""
sql_query(query)

Unnamed: 0,#,Name,Type_1,Type_2,Total,HP,Attack,Defense,Sp_Atk,Sp_Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,0
1,2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,0
2,3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,0
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,0
4,4,Charmander,Fire,,309,39,52,43,60,50,65,1,0


### `SELECT` columns `FROM` table

In [57]:
query = """
SELECT Total, HP, Speed
FROM pokemon
"""
sql_query(query)

Unnamed: 0,Total,HP,Speed
0,318,45,45
1,405,60,60
2,525,80,80
3,625,80,80
4,309,39,65
...,...,...,...
795,600,50,50
796,700,50,110
797,600,80,70
798,680,80,80


### `DISTINCT` returns the unique values in a column

In [58]:
query = """
SELECT DISTINCT Type_2
FROM pokemon

"""
sql_query(query)

Unnamed: 0,Type_2
0,Poison
1,
2,Flying
3,Dragon
4,Ground
5,Fairy
6,Grass
7,Fighting
8,Psychic
9,Steel


### `COUNT` is a function, so its argument must be wrapped in parentheses ( )

In [59]:
query = """
SELECT COUNT( DISTINCT Name)
FROM pokemon

"""
sql_query(query)

Unnamed: 0,COUNT( DISTINCT Name)
0,800
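
What `COUNT` returns depends on its argument. The sketch below, on a made-up four-row table, contrasts `COUNT(*)` (all rows), `COUNT(column)` (rows where the column is not NULL), and `COUNT(DISTINCT column)` (unique non-NULL values).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pokemon_sample (Name TEXT, Type_2 TEXT)")
conn.executemany(
    "INSERT INTO pokemon_sample VALUES (?, ?)",
    [
        ("Bulbasaur", "Poison"),
        ("Ivysaur", "Poison"),
        ("Charmander", None),  # NULL is skipped by COUNT(Type_2)
        ("Charizard", "Flying"),
    ],
)

total_rows, non_null_types, unique_types = conn.execute(
    """
    SELECT COUNT(*), COUNT(Type_2), COUNT(DISTINCT Type_2)
    FROM pokemon_sample
    """
).fetchone()

print(total_rows, non_null_types, unique_types)  # 4 3 2
```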


### The `WHERE` clause lets us specify conditions that rows must meet to be returned

In [60]:
query = """
SELECT Name, Type_2, Attack 
FROM pokemon
WHERE Type_2 = 'Poison' AND HP > 50

"""
sql_query(query)

Unnamed: 0,Name,Type_2,Attack
0,Ivysaur,Poison,62
1,Venusaur,Poison,82
2,VenusaurMega Venusaur,Poison,100
3,Beedrill,Poison,90
4,BeedrillMega Beedrill,Poison,150
5,Gloom,Poison,65
6,Vileplume,Poison,80
7,Venonat,Poison,55
8,Venomoth,Poison,65
9,Weepinbell,Poison,90


### `WHERE`