## Lecture Objectives:

- `SELECT` columns `FROM` table
- `AND`, `OR` and `NOT`
- `DISTINCT`
- `COUNT`
- `WHERE`
- `ORDER BY`
- `LIMIT`
- `BETWEEN`
- `IN`
- `LIKE` and `ILIKE`

#### Import Libraries


In [30]:
import numpy as np
import pandas as pd
import sqlite3


### Convert an existing CSV to SQL

In [31]:
# Read the CSV into a pandas DataFrame
df = pd.read_csv("./data/pokemon.csv")
df.head()

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,False
1,2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,False
2,3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,False
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False
4,4,Charmander,Fire,,309,39,52,43,60,50,65,1,False


### We are going to rename the columns whose names contain spaces

In [32]:
df = df.rename(columns={'Type 1': 'Type_1', 'Type 2': 'Type_2', 'Sp. Atk': 'Sp_Atk','Sp. Def': 'Sp_Def'})


### Two ways we can store our database with sqlite3:

### Locally

In [33]:
# # Establish a connection to a SQLite database and create a cursor object
# conn = sqlite3.connect('mydatabase.db')
# cursor = conn.cursor()

# # Create a table in the database (IF NOT EXISTS avoids an error if it already exists)
# create_table_query = """
# CREATE TABLE IF NOT EXISTS pokemon (
#     `#` INT,  -- not a PRIMARY KEY: Mega forms repeat the same number
#     Name TEXT,
#     Type_1 TEXT,
#     Type_2 TEXT,
#     Total INT,
#     HP INT,
#     Attack INT,
#     Defense INT,
#     Sp_Atk INT,
#     Sp_Def INT,
#     Speed INT,
#     Generation INT,
#     Legendary BOOL
#     )
# """
# cursor.execute(create_table_query)


# # Append the existing DataFrame to the table
# df.to_sql('pokemon', conn, if_exists='append', index=False)

In [34]:
cnx = sqlite3.connect('./data/poke_database.db')

# Write the DataFrame to a 'pokemon' table, replacing the table if it already exists
df.to_sql('pokemon', con=cnx, if_exists='replace', index=False)

# Helper that runs a query and returns the result as a DataFrame
def sql_query(query):
    return pd.read_sql(query, cnx)

### Let's check that we can query our table

In [35]:
query = """
SELECT *
FROM pokemon

LIMIT 5

"""
sql_query(query)

Unnamed: 0,#,Name,Type_1,Type_2,Total,HP,Attack,Defense,Sp_Atk,Sp_Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,0
1,2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,0
2,3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,0
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,0
4,4,Charmander,Fire,,309,39,52,43,60,50,65,1,0


### `SELECT` columns `FROM` table

In [36]:
query = """
SELECT Total, HP, Speed FROM pokemon
"""
sql_query(query)

Unnamed: 0,Total,HP,Speed
0,318,45,45
1,405,60,60
2,525,80,80
3,625,80,80
4,309,39,65
...,...,...,...
795,600,50,50
796,700,50,110
797,600,80,70
798,680,80,80


### `DISTINCT` returns the unique values in a column

In [37]:
query = """
SELECT DISTINCT Type_2 FROM pokemon

"""
sql_query(query)

Unnamed: 0,Type_2
0,Poison
1,
2,Flying
3,Dragon
4,Ground
5,Fairy
6,Grass
7,Fighting
8,Psychic
9,Steel


### `COUNT` is a function, so its argument must be wrapped in parentheses ( )

In [38]:
query = """
SELECT COUNT( DISTINCT Name) FROM pokemon

"""
sql_query(query)

Unnamed: 0,COUNT( DISTINCT Name)
0,800
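
The three common forms of `COUNT` treat missing values differently. The sketch below, again on a tiny in-memory table with illustrative rows, compares `COUNT(*)`, `COUNT(column)` and `COUNT(DISTINCT column)`.

```python
import sqlite3

# Tiny illustrative table; Charmander has no second type (NULL)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pokemon (Name TEXT, Type_1 TEXT, Type_2 TEXT)")
conn.executemany(
    "INSERT INTO pokemon VALUES (?, ?, ?)",
    [
        ("Bulbasaur", "Grass", "Poison"),
        ("Ivysaur", "Grass", "Poison"),
        ("Charmander", "Fire", None),
        ("Butterfree", "Bug", "Flying"),
    ],
)

# COUNT(*) counts rows, COUNT(col) skips NULLs, COUNT(DISTINCT col) counts unique non-NULL values
total_rows = conn.execute("SELECT COUNT(*) FROM pokemon").fetchone()[0]
non_null = conn.execute("SELECT COUNT(Type_2) FROM pokemon").fetchone()[0]
unique_non_null = conn.execute("SELECT COUNT(DISTINCT Type_2) FROM pokemon").fetchone()[0]
print(total_rows, non_null, unique_non_null)  # 4 3 2
```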


### The `WHERE` clause allows us to specify conditions on columns that rows must satisfy to be returned

In [39]:
query = """
SELECT Name, Type_2, Attack FROM pokemon
WHERE Type_2 = 'Poison' AND HP > 50

"""
sql_query(query)

Unnamed: 0,Name,Type_2,Attack
0,Ivysaur,Poison,62
1,Venusaur,Poison,82
2,VenusaurMega Venusaur,Poison,100
3,Beedrill,Poison,90
4,BeedrillMega Beedrill,Poison,150
5,Gloom,Poison,65
6,Vileplume,Poison,80
7,Venonat,Poison,55
8,Venomoth,Poison,65
9,Weepinbell,Poison,90
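
The query above only uses `AND`. When `AND`, `OR` and `NOT` are mixed, `AND` binds more tightly than `OR`, so parentheses matter. This is a minimal sketch on an illustrative in-memory table; the helper `names_where` is a hypothetical convenience function, not part of the notebook.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pokemon (Name TEXT, Type_1 TEXT, HP INT)")
conn.executemany(
    "INSERT INTO pokemon VALUES (?, ?, ?)",
    [
        ("Bulbasaur", "Grass", 45),
        ("Ivysaur", "Grass", 60),
        ("Charmander", "Fire", 39),
        ("Butterfree", "Bug", 60),
        ("Beedrill", "Bug", 65),
    ],
)

def names_where(condition):
    """Hypothetical helper: return the sorted names of rows matching a WHERE condition."""
    rows = conn.execute(f"SELECT Name FROM pokemon WHERE {condition}")
    return sorted(row[0] for row in rows)

# Without parentheses this reads as: Grass OR (Bug AND HP > 50)
no_parens = names_where("Type_1 = 'Grass' OR Type_1 = 'Bug' AND HP > 50")
# With parentheses the HP filter applies to both types
with_parens = names_where("(Type_1 = 'Grass' OR Type_1 = 'Bug') AND HP > 50")
# NOT negates the condition that follows it
not_grass = names_where("NOT Type_1 = 'Grass'")

print(no_parens)    # ['Beedrill', 'Bulbasaur', 'Butterfree', 'Ivysaur']
print(with_parens)  # ['Beedrill', 'Butterfree', 'Ivysaur']
print(not_grass)    # ['Beedrill', 'Butterfree', 'Charmander']
```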


### `ORDER BY` will help you sort rows based on column value, in either ascending or descending order.

- use `ASC` to sort in ascending order
- use `DESC` to sort in descending order
- If you leave it blank, `ORDER BY` uses `ASC` by default.

`ORDER BY` is placed towards the end of the query, after `WHERE`: rows are selected and filtered first, and only then sorted.

In [50]:
query = """
SELECT Name, Type_2, Attack FROM pokemon
WHERE Type_2 = 'Poison' AND HP > 50
ORDER BY Type_2 DESC, Attack ASC

"""
sql_query(query)

Unnamed: 0,Name,Type_2,Attack
0,Dustox,Poison,50
1,Venonat,Poison,55
2,Foongus,Poison,55
3,Ivysaur,Poison,62
4,Gloom,Poison,65
5,Venomoth,Poison,65
6,Gengar,Poison,65
7,GengarMega Gengar,Poison,65
8,Tentacruel,Poison,70
9,Roserade,Poison,70
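
In the query above every returned row has `Type_2 = 'Poison'`, so only `Attack` actually affects the order. To see both sort keys at work, here is a sketch on an illustrative in-memory table that sorts by `Type_1` ascending and breaks ties by `Attack` descending.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pokemon (Name TEXT, Type_1 TEXT, Attack INT)")
conn.executemany(
    "INSERT INTO pokemon VALUES (?, ?, ?)",
    [
        ("Bulbasaur", "Grass", 49),
        ("Ivysaur", "Grass", 62),
        ("Charmander", "Fire", 52),
        ("Butterfree", "Bug", 45),
        ("Beedrill", "Bug", 90),
    ],
)

# First key: Type_1 A-Z; second key (used within each type): highest Attack first
ordered = conn.execute(
    "SELECT Name, Type_1, Attack FROM pokemon ORDER BY Type_1 ASC, Attack DESC"
).fetchall()
for row in ordered:
    print(row)
```

The second key is only consulted when two rows tie on the first one.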


### `LIMIT` allows us to limit the number of rows returned for a query.

In [62]:
query = """
SELECT Name, Type_2, Attack FROM pokemon
WHERE Type_2 = 'Poison' AND HP > 50
ORDER BY Type_2 DESC, Attack ASC

LIMIT 3

"""
sql_query(query)

Unnamed: 0,Name,Type_2,Attack
0,Dustox,Poison,50
1,Venonat,Poison,55
2,Foongus,Poison,55
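
`LIMIT` is applied after sorting, which makes `ORDER BY ... LIMIT n` a simple way to get a "top n". A minimal sketch with illustrative rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pokemon (Name TEXT, Attack INT)")
conn.executemany(
    "INSERT INTO pokemon VALUES (?, ?)",
    [
        ("Bulbasaur", 49),
        ("Ivysaur", 62),
        ("Venusaur", 82),
        ("Charmander", 52),
        ("Beedrill", 90),
    ],
)

# Sort by Attack from highest to lowest, then keep only the first two rows
top_two = conn.execute(
    "SELECT Name, Attack FROM pokemon ORDER BY Attack DESC LIMIT 2"
).fetchall()
print(top_two)  # [('Beedrill', 90), ('Venusaur', 82)]
```

Without an `ORDER BY`, which rows `LIMIT` keeps is not guaranteed.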


### The `BETWEEN` operator can be used to match a value against a range of values:
- `value BETWEEN low AND high`

The range is inclusive on both ends, so it is the same as saying:
- `value >= low AND value <= high`

It can be negated with `NOT BETWEEN`, so these two conditions are equivalent:
- `value NOT BETWEEN low AND high`
- `value < low OR value > high`

It can also be used with dates. Note that dates need to be written in the ISO 8601 standard format, which is YYYY-MM-DD:
- `date BETWEEN '2007-01-01' AND '2007-02-01'`

In [63]:
query = """
SELECT Name, Attack, Defense, Speed FROM pokemon
WHERE Attack BETWEEN 50 AND 80
AND Defense NOT BETWEEN 60 AND 70

"""
sql_query(query)

Unnamed: 0,Name,Attack,Defense,Speed
0,Charmander,52,43,65
1,Charmeleon,64,58,80
2,Wartortle,63,80,58
3,Pidgeotto,60,55,71
4,Pidgeot,80,75,101
...,...,...,...,...
249,Sliggoo,75,53,60
250,Klefki,80,91,75
251,Phantump,70,48,38
252,Bergmite,69,85,28
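
The equivalences described for `BETWEEN` can be checked directly. The sketch below runs the same filter written both ways on a few illustrative rows, and then applies `BETWEEN` to ISO 8601 date strings in a hypothetical `events` table (that table and its `day` column are assumptions made only for this example).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pokemon (Name TEXT, Attack INT, Defense INT)")
conn.executemany(
    "INSERT INTO pokemon VALUES (?, ?, ?)",
    [
        ("Bulbasaur", 49, 49),
        ("Ivysaur", 62, 63),
        ("Charmander", 52, 43),
        ("Charmeleon", 64, 58),
        ("Wartortle", 63, 80),
        ("Pidgeot", 80, 75),
        ("Beedrill", 90, 40),
    ],
)

with_between = conn.execute(
    "SELECT Name FROM pokemon "
    "WHERE Attack BETWEEN 50 AND 80 AND Defense NOT BETWEEN 60 AND 70 "
    "ORDER BY Name"
).fetchall()

# Same filter spelled out with comparison operators
with_comparisons = conn.execute(
    "SELECT Name FROM pokemon "
    "WHERE Attack >= 50 AND Attack <= 80 AND (Defense < 60 OR Defense > 70) "
    "ORDER BY Name"
).fetchall()
print(with_between == with_comparisons)  # True

# Hypothetical table of ISO 8601 date strings; both end dates are included
conn.execute("CREATE TABLE events (day TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?)",
    [("2006-12-31",), ("2007-01-01",), ("2007-01-15",), ("2007-02-01",), ("2007-02-02",)],
)
days_in_range = conn.execute(
    "SELECT COUNT(*) FROM events WHERE day BETWEEN '2007-01-01' AND '2007-02-01'"
).fetchone()[0]
print(days_in_range)  # 3
```

Pidgeot, with an Attack of exactly 80, is kept because the upper bound is inclusive.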


### `IN` checks whether a value matches any value in a list

In [68]:
query = """
SELECT COUNT(*) FROM pokemon
WHERE HP IN (100, 105)

"""
sql_query(query)

Unnamed: 0,COUNT(*)
0,42
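
`IN` is shorthand for a chain of `OR` comparisons, and `NOT IN` excludes the listed values. A minimal sketch on illustrative rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pokemon (Name TEXT, Type_1 TEXT, HP INT)")
conn.executemany(
    "INSERT INTO pokemon VALUES (?, ?, ?)",
    [
        ("Bulbasaur", "Grass", 45),
        ("Ivysaur", "Grass", 60),
        ("Charmander", "Fire", 39),
        ("Butterfree", "Bug", 60),
        ("Beedrill", "Bug", 65),
    ],
)

def count_where(condition):
    """Hypothetical helper: count the rows matching a WHERE condition."""
    return conn.execute(f"SELECT COUNT(*) FROM pokemon WHERE {condition}").fetchone()[0]

in_list = count_where("HP IN (45, 60)")
or_chain = count_where("HP = 45 OR HP = 60")    # same filter written with OR
not_in_list = count_where("HP NOT IN (45, 60)")
grass_or_fire = count_where("Type_1 IN ('Grass', 'Fire')")  # works with text too

print(in_list, or_chain, not_in_list, grass_or_fire)  # 3 3 2 3
```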


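### The `LIKE` and `ILIKE` operators allow us to perform pattern matching against string data with the use of wildcard characters.

In PostgreSQL, `LIKE` is case-sensitive and `ILIKE` is its case-insensitive counterpart. SQLite, which this notebook uses, has no `ILIKE`, and its `LIKE` is case-insensitive for ASCII characters by default.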

- Percent %
    - Matches any sequence of characters
- Underscore _
    - Matches any single character

Examples %:
- All names that begin with an 'A'
    - `WHERE name LIKE 'A%'`
- All names that end with an 'a'
    - `WHERE name LIKE '%a'`

Examples _:
- Using the underscore allows us to replace just a single character.
    - Get all pokemon whose name is 'Char' followed by exactly one character
    - `WHERE name LIKE 'Char_'`

- You can use multiple underscores
- Imagine we had version string codes in the format 'Version#A4', 'Version#B7', etc ...
    - `WHERE value LIKE 'Version#__'`
- We can also combine pattern matching operators to create more complex patterns
    - `WHERE name LIKE '_her%'`
        - Cheryl
        - Theresa
        - Sherri
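
The wildcard rules listed above can be tried out on a few names. This sketch uses an illustrative in-memory table and a hypothetical helper `matching` that returns the sorted names matching a pattern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pokemon (Name TEXT)")
names = ["Charmander", "Charmeleon", "Charizard", "Cherubi", "Cherrim", "Scyther"]
conn.executemany("INSERT INTO pokemon VALUES (?)", [(name,) for name in names])

def matching(pattern):
    """Hypothetical helper: sorted names that match a LIKE pattern."""
    rows = conn.execute("SELECT Name FROM pokemon WHERE Name LIKE ?", (pattern,))
    return sorted(row[0] for row in rows)

starts_with_char = matching("Char%")    # 'Char' followed by anything
one_then_her = matching("_her%")        # exactly one character, then 'her', then anything
contains_her = matching("%her%")        # 'her' anywhere in the name
char_plus_six = matching("Char______")  # 'Char' followed by exactly six characters

print(starts_with_char)  # ['Charizard', 'Charmander', 'Charmeleon']
print(one_then_her)      # ['Cherrim', 'Cherubi']
print(contains_her)      # ['Cherrim', 'Cherubi', 'Scyther']
print(char_plus_six)     # ['Charmander', 'Charmeleon']
```

Charizard is missing from the last result because it has only five characters after 'Char'.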


In [71]:
query = """
SELECT * FROM pokemon
WHERE Name LIKE '_her%'

"""
sql_query(query)

Unnamed: 0,#,Name,Type_1,Type_2,Total,HP,Attack,Defense,Sp_Atk,Sp_Def,Speed,Generation,Legendary
0,420,Cherubi,Grass,,275,45,35,45,62,53,35,4,0
1,421,Cherrim,Grass,,450,70,60,70,87,78,85,4,0


#### Notice that `%` also matches an empty sequence, so nothing has to come before or after `her`

In [75]:
query = """
SELECT * FROM pokemon
WHERE Name LIKE '%her%'

"""
sql_query(query)

Unnamed: 0,#,Name,Type_1,Type_2,Total,HP,Attack,Defense,Sp_Atk,Sp_Def,Speed,Generation,Legendary
0,123,Scyther,Bug,Flying,500,70,110,80,55,80,105,1,0
1,214,Heracross,Bug,Fighting,500,80,125,75,40,95,85,2,0
2,214,HeracrossMega Heracross,Bug,Fighting,600,80,185,115,40,105,75,2,0
3,420,Cherubi,Grass,,275,45,35,45,62,53,35,4,0
4,421,Cherrim,Grass,,450,70,60,70,87,78,85,4,0
5,507,Herdier,Normal,,370,65,80,65,35,65,60,5,0
6,641,TornadusTherian Forme,Flying,,580,79,100,80,110,90,121,5,1
7,642,ThundurusTherian Forme,Electric,Flying,580,79,105,70,145,80,101,5,1
8,645,LandorusTherian Forme,Ground,Flying,600,89,145,90,105,80,91,5,1
9,692,Clauncher,Water,,330,50,53,62,58,63,44,6,0
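
The result above includes Heracross and Herdier, whose "Her" is capitalised even though the pattern is lowercase. That is consistent with SQLite's `LIKE` ignoring case for ASCII letters by default. A quick way to confirm it is to evaluate `LIKE` as an expression, which returns 1 for a match and 0 otherwise:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# LIKE used as an expression returns 1 (match) or 0 (no match)
capital_h = conn.execute("SELECT 'Heracross' LIKE 'her%'").fetchone()[0]
upper_pattern = conn.execute("SELECT 'Scyther' LIKE '%HER'").fetchone()[0]
no_match = conn.execute("SELECT 'Bulbasaur' LIKE '%her%'").fetchone()[0]

print(capital_h, upper_pattern, no_match)  # 1 1 0
```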


#### We can add `NOT` in front of `LIKE` to exclude the matching rows

In [77]:
query = """
SELECT * FROM pokemon
WHERE Name NOT LIKE '%her%'

"""
sql_query(query)

Unnamed: 0,#,Name,Type_1,Type_2,Total,HP,Attack,Defense,Sp_Atk,Sp_Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,0
1,2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,0
2,3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,0
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,0
4,4,Charmander,Fire,,309,39,52,43,60,50,65,1,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...
785,719,Diancie,Rock,Fairy,600,50,100,150,100,150,50,6,1
786,719,DiancieMega Diancie,Rock,Fairy,700,50,160,110,160,110,110,6,1
787,720,HoopaHoopa Confined,Psychic,Ghost,600,80,110,60,150,130,70,6,1
788,720,HoopaHoopa Unbound,Psychic,Dark,680,80,160,60,170,130,80,6,1


In [80]:
query = """
SELECT * FROM pokemon
WHERE Name LIKE 'B%' AND Type_2 NOT LIKE 'N%'
ORDER BY Type_2

"""
sql_query(query)

Unnamed: 0,#,Name,Type_1,Type_2,Total,HP,Attack,Defense,Sp_Atk,Sp_Def,Speed,Generation,Legendary
0,257,Blaziken,Fire,Fighting,530,80,120,70,110,70,80,3,0
1,257,BlazikenMega Blaziken,Fire,Fighting,630,80,160,80,130,80,100,3,0
2,286,Breloom,Grass,Fighting,460,60,130,80,60,60,70,3,0
3,12,Butterfree,Bug,Flying,395,60,45,50,90,80,70,1,0
4,267,Beautifly,Bug,Flying,395,60,70,50,100,50,65,3,0
5,628,Braviary,Normal,Flying,510,100,123,75,57,75,80,5,0
6,339,Barboach,Water,Ground,288,50,48,43,46,41,60,3,0
7,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,0
8,15,Beedrill,Bug,Poison,395,65,90,40,45,80,75,1,0
9,15,BeedrillMega Beedrill,Bug,Poison,495,65,150,40,15,80,145,1,0
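
One thing to watch for in the query above: every returned row has a `Type_2`. A comparison with `NULL` is neither true nor false, so `Type_2 NOT LIKE 'N%'` also drops the rows where `Type_2` is missing. The sketch below shows this on illustrative rows and one possible way to keep those rows, using `IS NULL` (an operator not covered in this lecture).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pokemon (Name TEXT, Type_2 TEXT)")
conn.executemany(
    "INSERT INTO pokemon VALUES (?, ?)",
    [
        ("Bulbasaur", "Poison"),
        ("Blastoise", None),      # no second type, stored as NULL
        ("Helioptile", "Normal"),
        ("Butterfree", "Flying"),
    ],
)

# NULL NOT LIKE 'N%' evaluates to NULL, so Blastoise is filtered out as well
without_null = conn.execute(
    "SELECT Name FROM pokemon WHERE Type_2 NOT LIKE 'N%' ORDER BY Name"
).fetchall()

# Explicitly keeping the rows where Type_2 is missing
with_null = conn.execute(
    "SELECT Name FROM pokemon "
    "WHERE Type_2 NOT LIKE 'N%' OR Type_2 IS NULL ORDER BY Name"
).fetchall()

print([row[0] for row in without_null])  # ['Bulbasaur', 'Butterfree']
print([row[0] for row in with_null])     # ['Blastoise', 'Bulbasaur', 'Butterfree']
```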
