# 1b - DSDJ TWS - Connecting to an AWS RDS MySQL instance with Python

1. We will connect to a MySQL database hosted on AWS RDS
2. We will check our access and read SQL tables
3. We will then use pandas to manipulate the resulting dataframes
4. Finally, we will write a dataframe to a new SQL table

## Prerequisites

We will use the following libraries:
* Install https://pypi.org/project/ipython-sql/
* Install https://pypi.org/project/SQLAlchemy/
* Install https://pypi.org/project/PyMySQL/
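
Before running the notebook, it can help to confirm that the three libraries are importable. The following is a minimal sketch, not part of the original notebook: the helper `find_missing` is a hypothetical name, and the import names (`sql` for ipython-sql, `sqlalchemy`, `pymysql`) differ from the PyPI names.

```python
from importlib.util import find_spec

# Import names (not PyPI names) of the three prerequisites:
# ipython-sql -> sql, SQLAlchemy -> sqlalchemy, PyMySQL -> pymysql.
REQUIRED_MODULES = ["sql", "sqlalchemy", "pymysql"]


def find_missing(modules):
    """Return the module names that cannot be located in this environment."""
    return [name for name in modules if find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All prerequisites are importable.")
```

Anything reported as missing can be installed from the PyPI pages listed above.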

### Resources
    
* https://docs.sqlalchemy.org/en/14/core/engines.html#database-urls
* https://github.com/catherinedevlin/ipython-sql
* https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html

### Import libraries

In [1]:
from sqlalchemy import create_engine
import pandas as pd
import getpass

### Create a SQLAlchemy connection to the database

In [2]:
database_host = "dsdj-mysql-db.clpvihbunw2c.ap-southeast-2.rds.amazonaws.com"
database_name = "innodb"
database_user = "dsdj"

userpass = getpass.getpass("Password :")

Password :········


In [3]:
connection_str = database_user+":"+userpass+"@"+database_host+"/"+database_name

In [4]:
connection_str

'dsdj:postgresDsDJ08U@dsdj-mysql-db.clpvihbunw2c.ap-southeast-2.rds.amazonaws.com/innodb'
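
Two caveats apply to the plain string concatenation above: a password containing characters such as `@` or `/` would break the URL, and echoing the string displays the password in the notebook output. A minimal sketch of one way to handle both, assuming the same `user:password@host/database` layout; the helper names and the example values are hypothetical:

```python
from urllib.parse import quote_plus


def build_connection_str(user, password, host, database):
    """Build 'user:password@host/database' with the credentials URL-escaped."""
    return f"{quote_plus(user)}:{quote_plus(password)}@{host}/{database}"


def masked_connection_str(user, host, database):
    """Same shape as above, but with the password hidden for display."""
    return f"{quote_plus(user)}:***@{host}/{database}"


# Hypothetical values for illustration only.
example = build_connection_str("dsdj", "p@ss/word", "example-host", "innodb")
print(example)
print(masked_connection_str("dsdj", "example-host", "innodb"))
```

In the notebook, `build_connection_str(database_user, userpass, database_host, database_name)` would take the place of the concatenation, and only the masked form would be displayed.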

In [5]:
engine = create_engine("mysql+pymysql://"+connection_str, echo=False)
engine.dialect.identifier_preparer.initial_quote = ''
engine.dialect.identifier_preparer.final_quote = ''
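
`create_engine` does not open a connection by itself, so a wrong host or password only surfaces on first use. A minimal sketch of an explicit connectivity check follows. As an assumption made so the example runs without the RDS instance, it uses an in-memory SQLite engine named `demo_engine`; in the notebook the same `with` block would be applied to `engine`.

```python
from sqlalchemy import create_engine, text

# Stand-in engine so the sketch runs anywhere; in the notebook this would be
# the `engine` created above with the mysql+pymysql URL.
demo_engine = create_engine("sqlite:///:memory:")

# Opening a connection and running a trivial query forces a real round trip.
with demo_engine.connect() as conn:
    result = conn.execute(text("SELECT 1")).scalar()

print("Connection OK:", result == 1)
```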

### Create an ipython-sql connection to the database

In [6]:
%load_ext sql

In [7]:
%sql mysql+pymysql://$connection_str

'Connected: dsdj@innodb'

### We have successfully connected to the AWS MySQL database - let's query it

#### Check the MySQL database tables

In [9]:
%%sql

SHOW TABLES;

 * mysql+pymysql://dsdj:***@dsdj-mysql-db.clpvihbunw2c.ap-southeast-2.rds.amazonaws.com/innodb
0 rows affected.


Tables_in_innodb


#### Create and populate a car table

In [10]:
%%sql

CREATE TABLE `car` (
    `vin` TEXT,
    `brand` TEXT,
    `model` TEXT,
    `price` NUMERIC,
    `production_year` INT
);
INSERT INTO `car` VALUES
    ('LJCPCBLCX14500264','Ford','Focus',8000,2005),
    ('WPOZZZ79ZTS372128','Ford','Fusion',12500,2008),
    ('JF1BR93D7BG498281','Toyota','Avensis',11300,1999),
    ('KLATF08Y1VB363636','Volkswagen','Golf',3270,1992),
    ('1M8GDM9AXKP042788','Volkswagen','Golf',13000,2010),
    ('1HGCM82633A004352','Volkswagen','Jetta',6420,2003),
    ('1G1YZ23J9P5800003','Fiat','Punto',5700,1999),
    ('GS723HDSAK2399002','Opel','Corsa',null,2007);   

 * mysql+pymysql://dsdj:***@dsdj-mysql-db.clpvihbunw2c.ap-southeast-2.rds.amazonaws.com/innodb
0 rows affected.
8 rows affected.


[]
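
When the rows come from Python rather than from a literal SQL statement, a parameterized insert avoids quoting problems and maps `None` to SQL `NULL`. The sketch below is an illustration under stated assumptions: it uses the standard-library `sqlite3` module as a stand-in for the MySQL instance, and `sqlite3` uses `?` placeholders where PyMySQL uses `%s`.

```python
import sqlite3

# The same eight rows as in the INSERT statement above; None stands for NULL.
cars = [
    ("LJCPCBLCX14500264", "Ford", "Focus", 8000, 2005),
    ("WPOZZZ79ZTS372128", "Ford", "Fusion", 12500, 2008),
    ("JF1BR93D7BG498281", "Toyota", "Avensis", 11300, 1999),
    ("KLATF08Y1VB363636", "Volkswagen", "Golf", 3270, 1992),
    ("1M8GDM9AXKP042788", "Volkswagen", "Golf", 13000, 2010),
    ("1HGCM82633A004352", "Volkswagen", "Jetta", 6420, 2003),
    ("1G1YZ23J9P5800003", "Fiat", "Punto", 5700, 1999),
    ("GS723HDSAK2399002", "Opel", "Corsa", None, 2007),
]

# In-memory SQLite database used only so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE car (vin TEXT, brand TEXT, model TEXT, "
    "price NUMERIC, production_year INT)"
)

# One placeholder per column; the driver handles quoting and NULLs.
conn.executemany("INSERT INTO car VALUES (?, ?, ?, ?, ?)", cars)
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM car").fetchone()[0]
missing_price = conn.execute("SELECT vin FROM car WHERE price IS NULL").fetchall()
print(row_count, missing_price)
```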

In [11]:
%%sql

SHOW TABLES;

 * mysql+pymysql://dsdj:***@dsdj-mysql-db.clpvihbunw2c.ap-southeast-2.rds.amazonaws.com/innodb
1 rows affected.


Tables_in_innodb
car


#### Check the CAR table - in two different ways

* First way - in an "interactive" way

In [12]:
%%sql
SELECT * FROM car

 * mysql+pymysql://dsdj:***@dsdj-mysql-db.clpvihbunw2c.ap-southeast-2.rds.amazonaws.com/innodb
8 rows affected.


vin,brand,model,price,production_year
LJCPCBLCX14500264,Ford,Focus,8000.0,2005
WPOZZZ79ZTS372128,Ford,Fusion,12500.0,2008
JF1BR93D7BG498281,Toyota,Avensis,11300.0,1999
KLATF08Y1VB363636,Volkswagen,Golf,3270.0,1992
1M8GDM9AXKP042788,Volkswagen,Golf,13000.0,2010
1HGCM82633A004352,Volkswagen,Jetta,6420.0,2003
1G1YZ23J9P5800003,Fiat,Punto,5700.0,1999
GS723HDSAK2399002,Opel,Corsa,,2007


* Second way - in a scripting way

In [13]:
table_name = "car"
query_str = "SELECT * FROM " + table_name
pd.read_sql_query(query_str, engine)

Unnamed: 0,vin,brand,model,price,production_year
0,LJCPCBLCX14500264,Ford,Focus,8000.0,2005
1,WPOZZZ79ZTS372128,Ford,Fusion,12500.0,2008
2,JF1BR93D7BG498281,Toyota,Avensis,11300.0,1999
3,KLATF08Y1VB363636,Volkswagen,Golf,3270.0,1992
4,1M8GDM9AXKP042788,Volkswagen,Golf,13000.0,2010
5,1HGCM82633A004352,Volkswagen,Jetta,6420.0,2003
6,1G1YZ23J9P5800003,Fiat,Punto,5700.0,1999
7,GS723HDSAK2399002,Opel,Corsa,,2007
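
Because the query string is built by concatenating `table_name`, the name should come from a trusted source. Table names cannot be passed as ordinary query parameters, so one common safeguard is an allow-list. A minimal sketch, where `ALLOWED_TABLES` and `build_select_all` are hypothetical names introduced for illustration:

```python
# Tables this notebook is expected to read; anything else is rejected.
ALLOWED_TABLES = {"car", "vw_cars"}


def build_select_all(table_name, allowed_tables=ALLOWED_TABLES):
    """Return a SELECT * query for table_name, or raise if it is not allowed."""
    if table_name not in allowed_tables:
        raise ValueError(f"Unexpected table name: {table_name!r}")
    return "SELECT * FROM " + table_name


query_str = build_select_all("car")
print(query_str)
```

The resulting `query_str` can be passed to `pd.read_sql_query` exactly as in the cell above.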


#### Save the result in a dataframe - also in two different ways

* **First way** - in an "interactive" way

In [14]:
%%sql res <<
SELECT * FROM car

 * mysql+pymysql://dsdj:***@dsdj-mysql-db.clpvihbunw2c.ap-southeast-2.rds.amazonaws.com/innodb
8 rows affected.
Returning data to local variable res


In [15]:
car_df = res.DataFrame()
car_df.head()

Unnamed: 0,vin,brand,model,price,production_year
0,LJCPCBLCX14500264,Ford,Focus,8000,2005
1,WPOZZZ79ZTS372128,Ford,Fusion,12500,2008
2,JF1BR93D7BG498281,Toyota,Avensis,11300,1999
3,KLATF08Y1VB363636,Volkswagen,Golf,3270,1992
4,1M8GDM9AXKP042788,Volkswagen,Golf,13000,2010


In [16]:
car_df.shape

(8, 5)

* **Second way** - in a scripting way

In [17]:
car2_df = pd.read_sql_query(query_str, engine)
car2_df.head()

Unnamed: 0,vin,brand,model,price,production_year
0,LJCPCBLCX14500264,Ford,Focus,8000.0,2005
1,WPOZZZ79ZTS372128,Ford,Fusion,12500.0,2008
2,JF1BR93D7BG498281,Toyota,Avensis,11300.0,1999
3,KLATF08Y1VB363636,Volkswagen,Golf,3270.0,1992
4,1M8GDM9AXKP042788,Volkswagen,Golf,13000.0,2010


#### Write a dataframe back to the database

In [18]:
# select only VW cars from the car dataframe 
filt = car_df['brand'] == "Volkswagen"
vw_car_df = car_df[filt]
vw_car_df

Unnamed: 0,vin,brand,model,price,production_year
3,KLATF08Y1VB363636,Volkswagen,Golf,3270,1992
4,1M8GDM9AXKP042788,Volkswagen,Golf,13000,2010
5,1HGCM82633A004352,Volkswagen,Jetta,6420,2003


In [19]:
# write it back to the database
vw_car_df.to_sql("vw_cars", con = engine, index = False)
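
By default `to_sql` uses `if_exists="fail"`, so running the cell a second time raises a `ValueError` because `vw_cars` already exists. The sketch below illustrates this and the `if_exists="replace"` alternative. As assumptions for the sake of a self-contained example, it uses an in-memory `sqlite3` connection instead of the MySQL `engine`, and a small hand-built dataframe named `vw_demo_df`.

```python
import sqlite3

import pandas as pd

# Hand-built copy of the three Volkswagen rows, for illustration only.
vw_demo_df = pd.DataFrame(
    {
        "vin": ["KLATF08Y1VB363636", "1M8GDM9AXKP042788", "1HGCM82633A004352"],
        "brand": ["Volkswagen"] * 3,
        "model": ["Golf", "Golf", "Jetta"],
        "price": [3270, 13000, 6420],
        "production_year": [1992, 2010, 2003],
    }
)

conn = sqlite3.connect(":memory:")

# First write creates the table.
vw_demo_df.to_sql("vw_cars", con=conn, index=False)

# Second write with the default if_exists="fail" raises ValueError.
try:
    vw_demo_df.to_sql("vw_cars", con=conn, index=False)
    second_write_failed = False
except ValueError as exc:
    second_write_failed = True
    print("Second write failed:", exc)

# if_exists="replace" drops and recreates the table instead.
vw_demo_df.to_sql("vw_cars", con=conn, index=False, if_exists="replace")
rows_after = int(pd.read_sql_query("SELECT COUNT(*) AS n FROM vw_cars", conn)["n"].iloc[0])
print(rows_after)
```

`if_exists="append"` is the third option and adds rows to the existing table rather than replacing it.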

#### Check if we successfully created a new table

In [20]:
# look at the table list
query_str = '''SHOW TABLES;'''

pd.read_sql_query(query_str, engine)

Unnamed: 0,Tables_in_innodb
0,car
1,vw_cars
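
`SHOW TABLES` is MySQL-specific. SQLAlchemy's inspector offers a dialect-independent way to list tables. A minimal sketch, assuming an in-memory SQLite engine named `demo_engine` as a stand-in for the notebook's `engine`, with two empty tables created only for the demonstration:

```python
from sqlalchemy import create_engine, inspect, text

# Stand-in engine; in the notebook, inspect(engine) would be used instead.
demo_engine = create_engine("sqlite:///:memory:")

# Create two placeholder tables inside a transaction that commits on exit.
with demo_engine.begin() as conn:
    conn.execute(text("CREATE TABLE car (vin TEXT)"))
    conn.execute(text("CREATE TABLE vw_cars (vin TEXT)"))

# The inspector reports table names without any dialect-specific SQL.
table_names = sorted(inspect(demo_engine).get_table_names())
print(table_names)
```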


In [21]:
# query the vw table

In [22]:
%%sql
SELECT * FROM vw_cars

 * mysql+pymysql://dsdj:***@dsdj-mysql-db.clpvihbunw2c.ap-southeast-2.rds.amazonaws.com/innodb
3 rows affected.


vin,brand,model,price,production_year
KLATF08Y1VB363636,Volkswagen,Golf,3270,1992
1M8GDM9AXKP042788,Volkswagen,Golf,13000,2010
1HGCM82633A004352,Volkswagen,Jetta,6420,2003


#### Drop a table

In [23]:
%%sql 

DROP TABLE vw_cars

 * mysql+pymysql://dsdj:***@dsdj-mysql-db.clpvihbunw2c.ap-southeast-2.rds.amazonaws.com/innodb
0 rows affected.


[]
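
A plain `DROP TABLE` fails if the table is already gone, which makes the cell awkward to re-run. `DROP TABLE IF EXISTS` is accepted by MySQL and is safe to repeat. The sketch below shows the behaviour using the standard-library `sqlite3` module as a stand-in for the MySQL instance, which is an assumption made so the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vw_cars (vin TEXT)")

# The first call removes the table; the second is a no-op instead of an error.
conn.execute("DROP TABLE IF EXISTS vw_cars")
conn.execute("DROP TABLE IF EXISTS vw_cars")

# sqlite_master is SQLite's catalog; SHOW TABLES plays this role in MySQL.
remaining = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
print(remaining)
```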

In [24]:
%%sql
SHOW TABLES

 * mysql+pymysql://dsdj:***@dsdj-mysql-db.clpvihbunw2c.ap-southeast-2.rds.amazonaws.com/innodb
1 rows affected.


Tables_in_innodb
car
