# Connecting to Data from a Python Notebook

## Connect to a PostgreSQL Database

In [1]:
import os

from sqlalchemy import create_engine
import pandas as pd

The `DATABASE_URI` below can be passed to both SQLAlchemy and ipython-sql, so the database can be queried with SQL as well as with Python/pandas.

In [2]:
USER = 'jupyter'
PASSWORD = os.environ['POSTGRES_PASS']
HOST = 'localhost'
PORT = '5432'
DB = 'expunge'

DATABASE_URI = f"postgresql://{USER}:{PASSWORD}@{HOST}:{PORT}/{DB}"
engine = create_engine(DATABASE_URI)
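
If the password contains characters such as `@`, `:` or `/`, interpolating it directly into the URI produces a string that is parsed incorrectly. A minimal sketch of one way around this, assuming the standard library's `urllib.parse.quote_plus` is used to percent-encode the credentials; the helper name `build_database_uri` and the example password are hypothetical, not part of the original notebook:

```python
from urllib.parse import quote_plus


def build_database_uri(user, password, host, port, db):
    """Build a PostgreSQL URI with percent-encoded credentials.

    Hypothetical helper for illustration; only the user name and
    password are encoded, since those are the parts most likely to
    contain reserved characters.
    """
    return (
        f"postgresql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{db}"
    )


# Made-up password containing reserved characters.
example_uri = build_database_uri("jupyter", "p@ss/word", "localhost", "5432", "expunge")
print(example_uri)
```

The resulting string can be passed to `create_engine` in the same way as `DATABASE_URI`.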

To query PostgreSQL with SQL in the cells below, install the `ipython-sql` and `pgspecial` packages (`pgspecial` provides the backslash meta-commands such as `\dt` and `\d`):
```bash
pip install --user ipython-sql pgspecial
```

In [3]:
%load_ext sql
%sql {DATABASE_URI}

List all tables in the `expunge` database

In [4]:
%sql \dt

 * postgresql://jupyter:***@localhost:5432/expunge
8 rows affected.


Schema,Name,Type,Owner
public,data_100k_sample,table,jupyter
public,data_10k_sample,table,jupyter
public,data_1k_sample,table,jupyter
public,expunge,table,jupyter
public,ids_100k_sample,table,jupyter
public,ids_10k_sample,table,jupyter
public,ids_1k_sample,table,jupyter
public,test_table_jupyter_linshavers,table,jupyter


Column names and types for the main `expunge` table

In [5]:
%sql \d expunge

 * postgresql://jupyter:***@localhost:5432/expunge
28 rows affected.


Column,Type,Modifiers
person_id,text,
HearingDate,date,
CodeSection,text,
codesection,text,
ChargeType,text,
chargetype,text,
Class,text,
DispositionCode,text,
disposition,text,
Plea,text,
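
The listing contains column pairs that differ only in case, such as `CodeSection` and `codesection`. PostgreSQL folds unquoted identifiers to lowercase, so a mixed-case column has to be wrapped in double quotes to be selected. The sketch below shows that quoting rule; `quote_identifier` is a hypothetical helper written for illustration, not a function from SQLAlchemy or ipython-sql:

```python
def quote_identifier(name):
    """Quote a PostgreSQL identifier so its case is preserved.

    Hypothetical helper: wraps the name in double quotes and doubles
    any double quote embedded in it, as standard SQL requires.
    """
    return '"' + name.replace('"', '""') + '"'


# Unquoted, ChargeType would be read as chargetype, which is a different column here.
query = f"SELECT {quote_identifier('ChargeType')} FROM expunge LIMIT 3"
print(query)
```

For column names typed by hand, writing the double quotes directly in the SQL text is enough; a helper like this is only useful when names come from a variable.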


Peek at the main `expunge` table

In [6]:
%%sql
SELECT *
FROM expunge
LIMIT 3

 * postgresql://jupyter:***@localhost:5432/expunge
3 rows affected.


person_id,HearingDate,CodeSection,codesection,ChargeType,chargetype,Class,DispositionCode,disposition,Plea,Race,Sex,fips,convictions,arrests,felony10,sevenyear,tenyear,within7,within10,class1_2,class3_4,expungable,old_expungable,expungable_no_lifetimelimit,reason,sameday,lifetime
292030000000115,2016-06-20,A.46.2-862,covered elsewhere,Misdemeanor,Misdemeanor,,Guilty In Absentia,Conviction,Tried In Absentia,White Caucasian(Non-Hispanic),Male,163,True,False,False,False,False,True,True,False,False,Automatic (pending),False,Automatic (pending),"Conviction of misdemeanor charges listed in 19.2-392.6 B with no convictions since the disposition date. However, because the disposition date is within 7 years of the current date, the record is not yet eligible for expungement",False,False
147170000000107,2012-05-23,A.46.2-862,covered elsewhere,Misdemeanor,Misdemeanor,1.0,Guilty,Conviction,Not Guilty,Hispanic,Male,712,True,False,False,False,False,False,True,False,False,Automatic,False,Automatic,Conviction of misdemeanor charges listed in 19.2-392.6 B with no convictions within 7 years from disposition date,False,False
147170000000107,2015-04-22,A.46.2-865,covered elsewhere,Misdemeanor,Misdemeanor,,Dismissed,Dismissed,Not Guilty,Hispanic,Male,712,True,True,False,False,False,True,True,False,False,Petition,True,Petition,"Dismissal of misdemeanor charges, but with arrests or charges in the past 3 years",False,False


Read data into a pandas DataFrame

In [8]:
df = pd.read_sql("""
    SELECT "ChargeType"
    FROM expunge
""", engine)

df['ChargeType'].value_counts()

Misdemeanor    6744268
Felony         2309998
Name: ChargeType, dtype: int64
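
The query above transfers one value per record (roughly nine million) to the notebook just to count them. An alternative is to let the database do the counting and return one row per charge type. The sketch below illustrates the idea under a loud assumption: it uses an in-memory SQLite table with made-up rows as a stand-in for the PostgreSQL `expunge` table, so that it runs without a database server. The name `COUNT_QUERY` is hypothetical.

```python
import sqlite3

# Stand-in for the PostgreSQL `expunge` table: a tiny in-memory SQLite
# table with made-up rows, used only so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE expunge ("ChargeType" TEXT)')
conn.executemany(
    "INSERT INTO expunge VALUES (?)",
    [("Misdemeanor",), ("Felony",), ("Misdemeanor",)],
)

# Aggregate in the database: one row per charge type instead of one
# row per record.
COUNT_QUERY = """
    SELECT "ChargeType", COUNT(*) AS n
    FROM expunge
    GROUP BY "ChargeType"
    ORDER BY n DESC
"""

# Each result row is a (charge type, count) pair.
counts = dict(conn.execute(COUNT_QUERY).fetchall())
print(counts)
```

Against the real database, the same query string could be passed to `pd.read_sql(COUNT_QUERY, engine)` to get the counts as a small DataFrame.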