Reference: [Andrei Teleron: PostgreSQL Integration with Jupyter Notebook](https://medium.com/analytics-vidhya/postgresql-integration-with-jupyter-notebook-deb97579a38d)

- "Jupyter Notebook is a great software to catalog steps taken during data visualization and analysis. It supports Python code, markdown, HTML, and thanks to a few libraries, PostgreSQL!"

## 1. Getting Started

### Prerequisites:
- Jupyter Notebook
- PostgreSQL and a database (sample database & how to load)

### Libraries to install:
- ipython-sql
- sqlalchemy
- A Python database API (DBAPI) driver for PostgreSQL, such as psycopg2

In [1]:
# The libraries can be installed with the following pip commands:

# pip install ipython-sql
# pip install sqlalchemy
# pip install psycopg2

# Pandas
import pandas as pd
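
Before loading the extension, it can help to confirm that the libraries listed above are actually importable. The following is a minimal sketch using only the standard library; the helper name `missing_packages` and the mapping `REQUIRED_MODULES` are assumptions made for illustration, not part of any of these libraries.

```python
from importlib.util import find_spec

# Map pip package names to the module names they are imported under.
# The two can differ: ipython-sql is loaded as "sql" (see %load_ext sql).
REQUIRED_MODULES = {
    "ipython-sql": "sql",
    "sqlalchemy": "sqlalchemy",
    "psycopg2": "psycopg2",
    "pandas": "pandas",
}


def missing_packages(required=REQUIRED_MODULES):
    """Return the pip package names whose import module cannot be found."""
    return [pkg for pkg, module in required.items() if find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Missing packages, install with: pip install " + " ".join(missing))
else:
    print("All required packages are installed.")
```

The check only looks the modules up without importing them, so it is cheap to run at the top of a notebook.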

### ipython-SQL

Created by [catherinedevlin](https://github.com/catherinedevlin/ipython-sql) on GitHub, ipython-sql enables the use of SQL-style code in Jupyter Notebook.

### sqlalchemy

SQLAlchemy was originally authored by Michael Bayer. In this notebook it is mainly used to load the results of SQL queries into a pandas DataFrame.

In [2]:
# Load ipython-sql
%load_ext sql

In [3]:
# Import the create_engine() function
from sqlalchemy import create_engine

## 2. Connecting to a PostgreSQL database

The following properties can be found in the `Connection` tab under the `Properties` of the server where the database is located.

- `dialect+driver`: in this case just `postgresql` (SQLAlchemy then uses its default driver, psycopg2)
- `username:password`: substitute your own `username` and `password`
- `hostname`: usually just `localhost`
- `database`: the name of the database to connect to


### To connect ipython-sql to your database, use the following format: 

`%sql dialect+driver://username:password@hostname/database`

In [4]:
# Example
%sql postgresql://postgres:***@localhost/analysis
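
A password that contains characters such as `@` or `/` will break the URL format above unless it is percent-encoded. The sketch below assembles the connection string from its parts using only the standard library. The helper `build_postgres_url`, the environment variable name `PGPASSWORD`, and the fallback password are assumptions made for illustration.

```python
import os
from urllib.parse import quote_plus, urlsplit


def build_postgres_url(username, password, hostname, database, port=None):
    """Assemble postgresql://username:password@hostname[:port]/database."""
    # Percent-encode the credentials so characters such as "@" or "/"
    # in a password do not break the URL structure.
    credentials = f"{quote_plus(username)}:{quote_plus(password)}"
    host = f"{hostname}:{port}" if port is not None else hostname
    return f"postgresql://{credentials}@{host}/{database}"


# Assumption: the password is read from an environment variable named
# PGPASSWORD; the fallback value is a placeholder only.
password = os.environ.get("PGPASSWORD", "p@ss/word")
url = build_postgres_url("postgres", password, "localhost", "analysis")

# Split the URL again to confirm each component landed where expected.
parts = urlsplit(url)
print(parts.scheme, parts.username, parts.hostname, parts.path.lstrip("/"))
```

The resulting string can be passed to `create_engine()` in place of a hand-written URL, which also keeps the password out of the notebook itself.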

### To connect `sqlalchemy` to the database, create a new object called engine using the `create_engine()` function imported earlier:

`engine = create_engine('dialect+driver://username:password@host/database')`

In [5]:
# Example
engine = create_engine('postgresql://postgres:***@localhost/analysis')
print(type(engine))

<class 'sqlalchemy.engine.base.Engine'>


## 3. Writing SQL Commands in Jupyter Notebook

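To run queries and other SQL commands, start a cell with the `%%sql` cell magic. In the example, `table` is a placeholder for your own table name; `TABLE` is a reserved word in PostgreSQL, so the query will not run as written: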

In [6]:
%%sql 

SELECT *
FROM table;

 * postgresql://postgres:***@localhost/analysis
6 rows affected.


| id | first_name | last_name | school | hire_date | salary |
|---|---|---|---|---|---|
| 1 | Janet | Smith | F.D. Roosevelt HS | 2011-10-30 | 36200 |
| 2 | Lee | Reynolds | F.D. Roosevelt HS | 1993-05-22 | 65000 |
| 3 | Samuel | Cole | Myers Middle School | 2005-08-01 | 43500 |
| 4 | Samantha | Bush | Myers Middle School | 2011-10-30 | 36200 |
| 5 | Betty | Diaz | Myers Middle School | 2005-08-30 | 43500 |
| 6 | Kathleen | Roush | F.D. Roosevelt HS | 2010-10-22 | 38500 |


### To store query in a pandas DataFrame

Create a DataFrame with the `pd.read_sql()` function and the `engine` object created earlier with `sqlalchemy`'s `create_engine()` function.

`pd.read_sql()` takes two arguments here:

1. the SQL query as a quoted string
2. the engine object

In [7]:
# Example
dataframe = pd.read_sql('SELECT * FROM table', engine)
print(type(dataframe))
display(dataframe)

<class 'pandas.core.frame.DataFrame'>


|   | id | first_name | last_name | school | hire_date | salary |
|---|---|---|---|---|---|---|
| 0 | 1 | Janet | Smith | F.D. Roosevelt HS | 2011-10-30 | 36200.0 |
| 1 | 2 | Lee | Reynolds | F.D. Roosevelt HS | 1993-05-22 | 65000.0 |
| 2 | 3 | Samuel | Cole | Myers Middle School | 2005-08-01 | 43500.0 |
| 3 | 4 | Samantha | Bush | Myers Middle School | 2011-10-30 | 36200.0 |
| 4 | 5 | Betty | Diaz | Myers Middle School | 2005-08-30 | 43500.0 |
| 5 | 6 | Kathleen | Roush | F.D. Roosevelt HS | 2010-10-22 | 38500.0 |
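
When a query depends on a value, `pd.read_sql()` also accepts a `params` argument so the value is passed separately instead of being pasted into the SQL string. The sketch below uses an in-memory SQLite database from the standard library as a stand-in, so it runs without a PostgreSQL server. The table name `staff` is hypothetical, and the three rows are taken from the output above.

```python
import sqlite3

import pandas as pd

# In-memory SQLite stand-in so the sketch runs without a PostgreSQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (id INTEGER, last_name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [(1, "Smith", 36200), (2, "Reynolds", 65000), (3, "Cole", 43500)],
)

# The threshold is supplied through `params`, not formatted into the SQL text.
query = "SELECT id, last_name, salary FROM staff WHERE salary > ? ORDER BY id"
high_earners = pd.read_sql(query, conn, params=(40000,))

print(high_earners)
conn.close()
```

SQLite uses `?` as its placeholder. With the PostgreSQL `engine` from this notebook the placeholder style depends on the driver (psycopg2 uses `%(name)s` with a dictionary), so treat the query string here as an illustration rather than something to copy unchanged.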


### Once your query results are in a DataFrame, you can continue with your data analysis in Python.
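
As one possible next step, the sketch below groups the rows by school and summarizes the salaries. To stay runnable without a database, it rebuilds the two relevant columns from the output shown earlier; in the notebook itself the `dataframe` returned by `pd.read_sql()` would be used directly.

```python
import pandas as pd

# Columns copied from the query output shown earlier in this notebook.
dataframe = pd.DataFrame(
    {
        "school": [
            "F.D. Roosevelt HS", "F.D. Roosevelt HS", "Myers Middle School",
            "Myers Middle School", "Myers Middle School", "F.D. Roosevelt HS",
        ],
        "salary": [36200, 65000, 43500, 36200, 43500, 38500],
    }
)

# Row count and highest salary per school.
summary = dataframe.groupby("school")["salary"].agg(["count", "max"])
print(summary)
```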