## To connect to a database, we need 5 things 🖐: host, database name, port, username and password

In [1]:
HOST = 'localhost' # remote would be something like cs-dasbhboard.34789fhui3fh8f.amazo....
DB = 'northwind'
PORT = '5432'

## Excursus: Setting an environment variable

You might not want to hardcode your password in your Jupyter Notebook, especially if you upload it to git. Instead, export it to your operating system as an "environment variable", so that it is stored on your computer and not in your Python code.  
- On Mac / Linux: add the line `export PGPASSWORD='mypassword'` to the `.bash_profile`, `.bashrc` or `.zshrc` file in your home directory (`~`), then open a new terminal. The same works for `PGUSER`.  
- On Windows, there's a graphical interface for this: search for `Environment Variables` in the Start menu (or go to: Advanced System Settings).
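Reading the variable back in Python could look roughly like this. The helper name `require_env` and the variable `PGPASSWORD_DEMO` are made up for this sketch and are not part of the notebook. The point is only that `os.getenv` silently returns `None` for a missing variable, so a small wrapper that fails loudly can save some debugging later.

```python
import os


def require_env(name):
    """Return the value of environment variable `name`, or raise a readable error.

    Hypothetical helper, for illustration only.
    """
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Demo only: set a dummy value so the example runs anywhere.
# In practice the variable comes from your shell profile, not from Python.
os.environ.setdefault("PGPASSWORD_DEMO", "not-a-real-password")
demo_password = require_env("PGPASSWORD_DEMO")
```

If the variable is missing, the `RuntimeError` names it directly, which is easier to read than a failed login caused by a password of `None`.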

In [2]:
import os

In [3]:
USERNAME = os.getenv('PGUSER')

In [4]:
PASSWORD = os.getenv('PGPASSWORD')

# Postgres from Python with SQLAlchemy

### Installations and imports

In [5]:
# Install SQLAlchemy
!pip install SQLAlchemy



In [6]:
# you also need to install this postgres-specific "adapter":
!pip install psycopg2-binary

Collecting psycopg2-binary
  Downloading psycopg2_binary-2.8.6-cp38-cp38-macosx_10_9_x86_64.macosx_10_9_intel.macosx_10_10_intel.macosx_10_10_x86_64.whl (1.5 MB)
Installing collected packages: psycopg2-binary
Successfully installed psycopg2-binary-2.8.6


## 1. Connect to postgres

To connect to a database with SQLAlchemy, we need a **connection string** (like a URL).   
Typical form of a database URL: `dialect://username:password@host:port/database`
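As a minimal sketch of how the pieces fit together, the string can be assembled with a small function. `build_conn_string` is a hypothetical helper, and the user name and password below are invented. Two assumptions are worth stating: the dialect is spelled `postgresql` because SQLAlchemy 1.4 and later only accept that spelling, and the credentials are passed through `urllib.parse.quote_plus` so that characters like `@` or `/` in a password are not mistaken for URL delimiters.

```python
from urllib.parse import quote_plus


def build_conn_string(username, password, host, port, database, dialect="postgresql"):
    """Assemble dialect://username:password@host:port/database.

    Hypothetical helper; username and password are URL-escaped.
    """
    return (
        f"{dialect}://{quote_plus(username)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Invented credentials, for illustration only
example = build_conn_string("alice", "p@ss/word", "localhost", "5432", "northwind")
print(example)  # postgresql://alice:p%40ss%2Fword@localhost:5432/northwind
```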

In [7]:
#conn_string = f'postgresql://{USERNAME}:{PASSWORD}@{HOST}:{PORT}/{DB}'  
# the dialect name is 'postgresql'; SQLAlchemy 1.4+ no longer accepts 'postgres'
conn_string_mac = f'postgresql://{HOST}:{PORT}/{DB}'

In [8]:
conn_string_mac

'postgresql://localhost:5432/northwind'

In [9]:
from sqlalchemy import create_engine   

In [10]:
conn = create_engine(conn_string_mac).connect()

Done with the setup! Now we can access the database. 

## 2. Create, insert and read

### Create

In [11]:
query = """CREATE TABLE newproducts (
    newproduct_id SERIAL PRIMARY KEY,
    productname VARCHAR(100),
    price FLOAT);"""

In [12]:
conn.execute(query)

<sqlalchemy.engine.result.ResultProxy at 0x7fb3df0b98b0>

### Insert

In [13]:
query2 = """INSERT INTO newproducts (productname, price) 
    VALUES ('Gouda', 5), ('Gorgonzola', 7), ('Mozzarella', 3), ('Cheddar', 4), ('Camembert', '5');"""

In [14]:
conn.execute(query2)

<sqlalchemy.engine.result.ResultProxy at 0x7fb3dec91640>

### Read

In [15]:
query3 = """SELECT * FROM newproducts WHERE price >= 4;"""

In [16]:
result = conn.execute(query3)

In [17]:
result

<sqlalchemy.engine.result.ResultProxy at 0x7fb3dec65220>

### How to get the data out of the ResultProxy? 
1. Use the methods `.first()`, `.fetchall()`, `.fetchmany(no_records)`, or attributes such as `.rowcount`
2. Loop through the result
3. With pandas
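One detail behind these three options: a result can only be read once, which is why the cells that follow run the query again before each method. The sketch below shows the same behaviour with Python's built-in `sqlite3` module as a stand-in, so it runs without a Postgres server. It is not SQLAlchemy code, and the table contents are a shortened copy of the rows inserted above.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Postgres (assumption for this demo)
conn_demo = sqlite3.connect(":memory:")
conn_demo.execute("CREATE TABLE newproducts (productname TEXT, price REAL)")
conn_demo.executemany(
    "INSERT INTO newproducts (productname, price) VALUES (?, ?)",
    [("Gouda", 5), ("Gorgonzola", 7), ("Mozzarella", 3)],
)

cursor = conn_demo.execute(
    "SELECT productname FROM newproducts WHERE price >= 4 ORDER BY price"
)
first_read = cursor.fetchall()   # consumes all rows
second_read = cursor.fetchall()  # nothing left to fetch

print(first_read)   # [('Gouda',), ('Gorgonzola',)]
print(second_read)  # []
```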

In [18]:
# Method 1 

res1 = result.fetchall()    # list of rowproxies (can be accessed like tuples)

In [19]:
# Method 2 
result = conn.execute(query3)

In [20]:
for row in result: 
    print(row['productname'])

Gouda
Gorgonzola
Cheddar
Camembert


In [21]:
# Method 3 🐼
import pandas as pd
result = conn.execute(query3)

In [22]:
df = pd.DataFrame(result, columns=result.keys())   # result.keys() returns the column names of the result

In [23]:
df  # implicit datatype conversions

|   | newproduct_id | productname | price |
|---|---------------|-------------|-------|
| 0 | 1 | Gouda | 5.0 |
| 1 | 2 | Gorgonzola | 7.0 |
| 2 | 4 | Cheddar | 4.0 |
| 3 | 5 | Camembert | 5.0 |


In [24]:
df.dtypes

newproduct_id      int64
productname       object
price            float64
dtype: object

#### Also possible: loading data into Postgres with pandas: 

In [25]:
df2 = pd.read_csv('../data/northwind_data_clean/products.csv')   # keep the DataFrame so the next cells can use it

FileNotFoundError: [Errno 2] No such file or directory: '../data/northwind_data_clean/products.csv'

In [None]:
df2.head()

In [None]:
from sqlalchemy.types import VARCHAR  #import all datatypes that you might need

In [None]:
df2.to_sql('orders', conn, dtype={'customerID': VARCHAR(50)})   # implicit datatype-conversions

In [None]:
# df2.to_sql('name_of_table', connection, if_exists='append', dtype={'customerID': VARCHAR(50)}) 

Some useful parameters: 
   - `if_exists`: `'fail'` (default), `'replace'` or `'append'`
   - `dtype`: datatypes for columns as a dict (import the types from `sqlalchemy.types`)
   - `chunksize`: number of rows written per batch

## 3. SQL-injection

In [None]:
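# the user controls this string, and here it smuggles in extra SQL
user_input = """'WELLI' OR TRUE"""

In [None]:
# resulting query: SELECT * FROM orders WHERE "customerID" = 'WELLI' OR TRUE
# the condition is always true, so every row of the table is returned
query4 = """SELECT * FROM orders WHERE "customerID" = """  + user_input
conn.execute(query4).fetchall()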

In [None]:
# validate your user input
# parametrized queries in sqlalchemy
from sqlalchemy.sql import text


In [None]:
prep_statement = text("""SELECT * FROM orders WHERE "customerID" ILIKE :x""")   # :x is the bound parameter; ILIKE compares strings case-insensitively


In [None]:
user_input2 = 'WELLI'

In [None]:
conn.execute(prep_statement, {'x': user_input2}).fetchall()   # SELECT * FROM orders WHERE "customerID" ILIKE 'WELLI';
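To see the difference between the two approaches without a running Postgres server, here is a self-contained sketch using Python's built-in `sqlite3` module as a stand-in. It is not the notebook's SQLAlchemy code: SQLite uses `?` placeholders instead of `:x`, the injected condition is written as `OR 1=1` rather than `OR TRUE` so that it also works on older SQLite versions, and the three rows are demo data.

```python
import sqlite3

# In-memory SQLite database with demo rows (stand-in for the Postgres table)
demo = sqlite3.connect(":memory:")
demo.execute('CREATE TABLE orders ("customerID" TEXT, "orderID" INTEGER)')
demo.executemany(
    'INSERT INTO orders ("customerID", "orderID") VALUES (?, ?)',
    [("WELLI", 1), ("VINET", 2), ("TOMSP", 3)],
)

malicious_input = "'WELLI' OR 1=1"

# Unsafe: the input is pasted into the SQL text and becomes part of the query
unsafe_query = 'SELECT * FROM orders WHERE "customerID" = ' + malicious_input
leaked_rows = demo.execute(unsafe_query).fetchall()

# Safe: the input is passed separately and is only ever treated as a value
safe_query = 'SELECT * FROM orders WHERE "customerID" = ?'
safe_rows = demo.execute(safe_query, (malicious_input,)).fetchall()
honest_rows = demo.execute(safe_query, ("WELLI",)).fetchall()

print(len(leaked_rows))  # 3
print(safe_rows)         # []
print(honest_rows)       # [('WELLI', 1)]
```

With string concatenation the attacker gets every row. With a bound parameter the same input is compared literally against `customerID` and matches nothing.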

### Further reading/watching/listening:
https://www.sqlalchemy.org/library.html  
https://talkpython.fm/episodes/show/5/sqlalchemy-and-data-access-in-python