# 3 - DSDJ TWS - Connecting to SQL Databases with Python

* We will connect to a local SQLite database
* We will check access and read SQL tables
* Finally, we will read a CSV file and write the resulting DataFrame to a new SQL table

## Prerequisites

We will use the following libraries
* Install https://pypi.org/project/ipython-sql/
* Install https://pypi.org/project/SQLAlchemy/
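* Install https://www.psycopg.org/docs/ (PostgreSQL driver, only needed when connecting to PostgreSQL; the SQLite examples here do not use it)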

### Import libraries

In [1]:
from sqlalchemy import create_engine
import pandas as pd
import getpass

### Create a SQLAlchemy connection to the database

In [2]:
# path to the existing SQLite database
database_filename = "./db/Chinook2.db"

In [3]:
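# create the SQLAlchemy engine / connect to the db
engine = create_engine('sqlite:///' + database_filename)
# blank out the identifier quote characters so that generated SQL
# leaves table and column names unquoted
engine.dialect.identifier_preparer.initial_quote = ''
engine.dialect.identifier_preparer.final_quote = ''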

In [4]:
# check tables in the local SQLite database
pd.read_sql('''SELECT name FROM sqlite_master WHERE type='table' ''', engine)

Unnamed: 0,name
0,Album
1,Artist
2,Customer
3,Employee
4,Genre
5,Invoice
6,InvoiceLine
7,MediaType
8,Playlist
9,PlaylistTrack
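
Querying `sqlite_master` works only for SQLite. As a hedged alternative, SQLAlchemy's `inspect()` returns table names through a dialect-independent interface. The sketch below runs against a throwaway in-memory database with two hypothetical tables named after Chinook ones, so it does not touch `Chinook2.db`.

```python
from sqlalchemy import create_engine, inspect, text

# Assumption: an in-memory SQLite database stands in for Chinook2.db
engine = create_engine("sqlite://")

# Create two small illustrative tables inside one transaction
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE Artist (ArtistId INTEGER PRIMARY KEY, Name TEXT)"))
    conn.execute(text("CREATE TABLE Album (AlbumId INTEGER PRIMARY KEY, Title TEXT, ArtistId INTEGER)"))

# List the tables without writing dialect-specific SQL
table_names = sorted(inspect(engine).get_table_names())
print(table_names)

engine.dispose()
```

The same `inspect(engine)` call should work unchanged on the file-based engine created earlier.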


### Create an ipython-sql connection to the database

In [5]:
%load_ext sql

In [6]:
%sql sqlite:///$database_filename

'Connected: @./db/Chinook2.db'

#### Check Options

In [7]:
%config SqlMagic

SqlMagic(Magics, Configurable) options
------------------------------------
SqlMagic.autocommit=<Bool>
    Set autocommit mode
    Current: True
SqlMagic.autolimit=<Int>
    Automatically limit the size of the returned result sets
    Current: 0
SqlMagic.autopandas=<Bool>
    Return Pandas DataFrames instead of regular result sets
    Current: False
SqlMagic.column_local_vars=<Bool>
    Return data into local variables from column names
    Current: False
SqlMagic.displaylimit=<Int>
    Automatically limit the number of rows displayed (full result set is still
    stored)
    Current: None
SqlMagic.dsn_filename=<Unicode>
    Path to DSN file. When the first argument is of the form [section], a
    sqlalchemy connection string is formed from the matching section in the DSN
    file.
    Current: 'odbc.ini'
SqlMagic.feedback=<Bool>
    Print number of rows affected by DML
    Current: True
SqlMagic.short_errors=<Bool>
    Don't display the full traceback on SQL Programming Error
    Curr

### We have successfully connected to the SQLite database - let's query it

#### Check the Employee table - in two different ways

* **First way** - in an "interactive" way

In [8]:
%%sql
SELECT * FROM Employee LIMIT 5

 * sqlite:///./db/Chinook2.db
Done.


EmployeeId,LastName,FirstName,Title,ReportsTo,BirthDate,HireDate,Address,City,State,Country,PostalCode,Phone,Fax,Email
1,Adams,Andrew,General Manager,,1962-02-18 00:00:00,2002-08-14 00:00:00,11120 Jasper Ave NW,Edmonton,AB,Canada,T5K 2N1,+1 (780) 428-9482,+1 (780) 428-3457,andrew@chinookcorp.com
2,Edwards,Nancy,Sales Manager,1.0,1958-12-08 00:00:00,2002-05-01 00:00:00,825 8 Ave SW,Calgary,AB,Canada,T2P 2T3,+1 (403) 262-3443,+1 (403) 262-3322,nancy@chinookcorp.com
3,Peacock,Jane,Sales Support Agent,2.0,1973-08-29 00:00:00,2002-04-01 00:00:00,1111 6 Ave SW,Calgary,AB,Canada,T2P 5M5,+1 (403) 262-3443,+1 (403) 262-6712,jane@chinookcorp.com
4,Park,Margaret,Sales Support Agent,2.0,1947-09-19 00:00:00,2003-05-03 00:00:00,683 10 Street SW,Calgary,AB,Canada,T2P 5G3,+1 (403) 263-4423,+1 (403) 263-4289,margaret@chinookcorp.com
5,Johnson,Steve,Sales Support Agent,2.0,1965-03-03 00:00:00,2003-10-17 00:00:00,7727B 41 Ave,Calgary,AB,Canada,T3B 1Y7,1 (780) 836-9987,1 (780) 836-9543,steve@chinookcorp.com


* **Second way** - in a scripting way 

In [11]:
# build the query string, then execute the query using SQLAlchemy
table_name = "Employee"
query_str = "SELECT * FROM " + table_name

employee_df = pd.read_sql_query(query_str, engine)
employee_df.head()

Unnamed: 0,EmployeeId,LastName,FirstName,Title,ReportsTo,BirthDate,HireDate,Address,City,State,Country,PostalCode,Phone,Fax,Email
0,1,Adams,Andrew,General Manager,,1962-02-18 00:00:00,2002-08-14 00:00:00,11120 Jasper Ave NW,Edmonton,AB,Canada,T5K 2N1,+1 (780) 428-9482,+1 (780) 428-3457,andrew@chinookcorp.com
1,2,Edwards,Nancy,Sales Manager,1.0,1958-12-08 00:00:00,2002-05-01 00:00:00,825 8 Ave SW,Calgary,AB,Canada,T2P 2T3,+1 (403) 262-3443,+1 (403) 262-3322,nancy@chinookcorp.com
2,3,Peacock,Jane,Sales Support Agent,2.0,1973-08-29 00:00:00,2002-04-01 00:00:00,1111 6 Ave SW,Calgary,AB,Canada,T2P 5M5,+1 (403) 262-3443,+1 (403) 262-6712,jane@chinookcorp.com
3,4,Park,Margaret,Sales Support Agent,2.0,1947-09-19 00:00:00,2003-05-03 00:00:00,683 10 Street SW,Calgary,AB,Canada,T2P 5G3,+1 (403) 263-4423,+1 (403) 263-4289,margaret@chinookcorp.com
4,5,Johnson,Steve,Sales Support Agent,2.0,1965-03-03 00:00:00,2003-10-17 00:00:00,7727B 41 Ave,Calgary,AB,Canada,T3B 1Y7,1 (780) 836-9987,1 (780) 836-9543,steve@chinookcorp.com
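
Concatenating strings into a query is fine for a fixed table name, but values that vary (and especially values from user input) are safer passed as bound parameters. A minimal sketch, assuming an in-memory database with a cut-down, hypothetical `Employee` table holding three of the rows shown above:

```python
import pandas as pd
from sqlalchemy import create_engine, text

# Assumption: an in-memory database with a reduced Employee table
engine = create_engine("sqlite://")
with engine.begin() as conn:
    conn.execute(text(
        "CREATE TABLE Employee (EmployeeId INTEGER PRIMARY KEY, LastName TEXT, City TEXT)"
    ))
    conn.execute(text(
        "INSERT INTO Employee (LastName, City) VALUES "
        "('Adams', 'Edmonton'), ('Edwards', 'Calgary'), ('Peacock', 'Calgary')"
    ))

# :city is a named bind parameter; the value is passed separately from the SQL
query = text(
    "SELECT EmployeeId, LastName FROM Employee WHERE City = :city ORDER BY EmployeeId"
)
with engine.connect() as conn:
    calgary_df = pd.read_sql_query(query, conn, params={"city": "Calgary"})

print(calgary_df)
engine.dispose()
```

Table and column names cannot be bound this way, so a variable table name still has to be validated and formatted into the string.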


### Read a CSV and write it to a new SQLite Table

In [12]:
# csv file path
file_name = "fannie_val.csv"
path_dir = "./data/"

In [13]:
# read in the file to a pandas dataframe
df = pd.read_csv(path_dir+file_name)

In [14]:
# check the size
df.shape

(738072, 27)

In [15]:
df.head()

Unnamed: 0,id,channel,seller,interest_rate,balance,term,origination_date,first_payment_date,ltv,cltv,...,occupancy_status,state,zip,insurance_percentage,product_type,co_borrower_credit_score,mortgage_insurance_type,relocation_indicator,status,num_payments
0,402318892063,R,"WELLS FARGO BANK, N.A.",3.75,98000,360,12/2016,02/2017,85,85,...,P,PA,150,12.0,FRM,,1.0,N,0,21
1,836121250321,R,"WELLS FARGO BANK, N.A.",3.875,120000,180,07/2017,09/2017,62,62,...,P,AZ,851,,FRM,,,N,0,14
2,633126237229,R,OTHER,4.0,290000,360,03/2016,05/2016,95,95,...,P,CA,930,30.0,FRM,786.0,1.0,N,0,17
3,265655136659,C,OTHER,3.0,417000,240,09/2016,11/2016,22,22,...,P,CA,950,,FRM,,,N,0,24
4,418312332837,R,U.S. BANK N.A.,4.25,226000,360,01/2017,03/2017,80,80,...,P,IL,601,,FRM,731.0,,N,0,19
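
The preview suggests two things worth checking on read: `zip` is parsed as an integer, which would drop any leading zeros, and the date columns are `MM/YYYY` strings. A sketch, assuming a tiny inline CSV with made-up rows in place of `fannie_val.csv`, of reading `zip` as text and parsing the dates:

```python
import io

import pandas as pd

# Made-up sample rows standing in for the real file (illustration only)
csv_text = """id,zip,origination_date
1,150,12/2016
2,021,07/2017
"""

# dtype keeps zip as text, so a value such as "021" keeps its leading zero
sample_df = pd.read_csv(io.StringIO(csv_text), dtype={"zip": str})

# Parse MM/YYYY strings into timestamps (the day defaults to the 1st)
sample_df["origination_date"] = pd.to_datetime(
    sample_df["origination_date"], format="%m/%Y"
)

print(sample_df.dtypes)
```

Whether the real file contains zip prefixes with leading zeros is not known from the preview, so treat this as a precaution rather than a required fix.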


In [16]:
# write it to the SQLite database
df.to_sql("fannie", con = engine, index = False)
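
With the default `if_exists="fail"`, rerunning this cell raises a `ValueError` once the table exists. A sketch of a rerunnable write, assuming a small hypothetical DataFrame and table name (`fannie_demo`) in an in-memory database; `chunksize` batches the inserts, which may help with a file of this size:

```python
import pandas as pd
from sqlalchemy import create_engine, text

# Assumption: in-memory database and a three-row stand-in for the loan data
engine = create_engine("sqlite://")
loans_df = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "interest_rate": [3.75, 3.875, 4.0],
        "state": ["PA", "AZ", "CA"],
    }
)

# if_exists="replace" drops and recreates the table, so the call can be repeated;
# chunksize=2 writes the rows in batches of two
for _ in range(2):
    loans_df.to_sql(
        "fannie_demo", con=engine, index=False, if_exists="replace", chunksize=2
    )

with engine.connect() as conn:
    count_df = pd.read_sql_query(text("SELECT COUNT(*) AS n FROM fannie_demo"), conn)
row_count = int(count_df["n"].iloc[0])
print(row_count)

engine.dispose()
```

Use `if_exists="append"` instead when the existing rows should be kept.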

In [18]:
# check that the table was created in the database
pd.read_sql('''SELECT name FROM sqlite_master WHERE type='table' ''',engine)

Unnamed: 0,name
0,Album
1,Artist
2,Customer
3,Employee
4,Genre
5,Invoice
6,InvoiceLine
7,MediaType
8,Playlist
9,PlaylistTrack


In [19]:
%%sql
SELECT * FROM fannie LIMIT 5

 * sqlite:///./db/Chinook2.db
Done.


id,channel,seller,interest_rate,balance,term,origination_date,first_payment_date,ltv,cltv,borrower_count,dti,credit_score,first_time_buyer,loan_purpose,property_type,unit_count,occupancy_status,state,zip,insurance_percentage,product_type,co_borrower_credit_score,mortgage_insurance_type,relocation_indicator,status,num_payments
402318892063,R,"WELLS FARGO BANK, N.A.",3.75,98000,360,12/2016,02/2017,85,85,1,24.0,737.0,N,P,SF,1,P,PA,150,12.0,FRM,,1.0,N,0,21
836121250321,R,"WELLS FARGO BANK, N.A.",3.875,120000,180,07/2017,09/2017,62,62,1,28.0,802.0,N,C,PU,1,P,AZ,851,,FRM,,,N,0,14
633126237229,R,OTHER,4.0,290000,360,03/2016,05/2016,95,95,2,36.0,794.0,N,P,PU,1,P,CA,930,30.0,FRM,786.0,1.0,N,0,17
265655136659,C,OTHER,3.0,417000,240,09/2016,11/2016,22,22,1,21.0,804.0,N,R,SF,1,P,CA,950,,FRM,,,N,0,24
418312332837,R,U.S. BANK N.A.,4.25,226000,360,01/2017,03/2017,80,80,2,25.0,795.0,N,P,SF,1,P,IL,601,,FRM,731.0,,N,0,19


### Clean up and drop the table

In [20]:
%%sql

DROP TABLE fannie

 * sqlite:///./db/Chinook2.db
Done.


[]
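
The same clean-up can be done from a script. A minimal sketch, assuming an in-memory database and a hypothetical `fannie_demo` table; `IF EXISTS` keeps the statement from failing when the table is already gone:

```python
from sqlalchemy import create_engine, inspect, text

# Assumption: in-memory database with a throwaway table to drop
engine = create_engine("sqlite://")
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE fannie_demo (id INTEGER)"))

# engine.begin() commits on success; the second DROP is a harmless no-op
with engine.begin() as conn:
    conn.execute(text("DROP TABLE IF EXISTS fannie_demo"))
    conn.execute(text("DROP TABLE IF EXISTS fannie_demo"))

remaining_tables = inspect(engine).get_table_names()
print(remaining_tables)

engine.dispose()
```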

### Dispose of the SQLAlchemy engine

In [21]:
engine.dispose()