In this notebook, you'll see how to connect to a Postgres database using the sqlalchemy library.

For this notebook, you'll need both the `sqlalchemy` and `psycopg2` libraries installed.

In [18]:
# Uncomment to install the Postgres driver; the prebuilt psycopg2-binary wheel is usually enough.
#!pip install psycopg2-binary

In [19]:
# Alternative: psycopg2 built from source. Only one of the two driver packages is needed.
#!pip install psycopg2

In [20]:
from sqlalchemy import create_engine, text



First, we need to create a connection string. The format is

 ```<dialect(+driver)>://<username>:<password>@<hostname>:<port>/<database>```

To connect to the prescribers database, you can use the following connection string.

In [22]:
database_name = 'prescribers'    # Fill this in with your prescribers database name

connection_string = f"postgresql://postgres:postgres@localhost:5432/{database_name}"
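
The hard-coded credentials above are fine for a local practice database. As a minimal sketch, assuming your password might contain characters such as `@` or `/` that would otherwise be read as URL delimiters, the same format can be assembled from its parts with the credentials percent-encoded first. The helper name `build_connection_string` and the second password are hypothetical and used only for illustration.

```python
from urllib.parse import quote


def build_connection_string(username, password, hostname, port, database,
                            dialect="postgresql"):
    """Assemble <dialect>://<username>:<password>@<hostname>:<port>/<database>."""
    # Percent-encode the credentials so reserved characters such as '@' or '/'
    # are not mistaken for the separators in the URL.
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return f"{dialect}://{user}:{pwd}@{hostname}:{port}/{database}"


# Same values as the connection string used in this notebook.
print(build_connection_string("postgres", "postgres", "localhost", 5432, "prescribers"))

# Hypothetical password containing reserved characters.
print(build_connection_string("postgres", "p@ss/word", "localhost", 5432, "prescribers"))
```

The resulting string can be passed to `create_engine` in the same way as the f-string version.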

Now, we need to create an engine and use it to connect.

In [24]:
engine = create_engine(connection_string)

SQLAlchemy works well with pandas, which can load query results directly into DataFrames.

In [26]:
import pandas as pd

First, let's write a simple query that selects every column from the `prescriber` table.

In [28]:
query = 'SELECT * FROM prescriber'

Now, bring it all together: open a connection from the engine, wrap the query string in `text()`, and pass both to `pd.read_sql`.

In [30]:
with engine.connect() as connection:
    people = pd.read_sql(text(query), con = connection)

people.head()

Unnamed: 0,npi,nppes_provider_last_org_name,nppes_provider_first_name,nppes_provider_mi,nppes_credentials,nppes_provider_gender,nppes_entity_code,nppes_provider_street1,nppes_provider_street2,nppes_provider_city,nppes_provider_zip5,nppes_provider_zip4,nppes_provider_state,nppes_provider_country,specialty_description,description_flag,medicare_prvdr_enroll_status
0,1003000000.0,BLAKEMORE,ROSIE,K,FNP,F,I,TENNESSEE PRISON FOR WOMEN,3881 STEWARTS LANE,NASHVILLE,37243,1,TN,US,Nurse Practitioner,S,N
1,1003012000.0,CUDZILO,COREY,,M.D.,M,I,2240 SUTHERLAND AVE,SUITE 103,KNOXVILLE,37919,2333,TN,US,Pulmonary Disease,S,E
2,1003013000.0,GRABENSTEIN,WILLIAM,P,M.D.,M,I,1822 MEMORIAL DR,,CLARKSVILLE,37043,4605,TN,US,Family Practice,S,E
3,1003014000.0,OTTO,ROBERT,J,M.D.,M,I,2400 PATTERSON STREET SUITE 100,,NASHVILLE,37203,2786,TN,US,Orthopedic Surgery,S,E
4,1003018000.0,TODD,JOSHUA,W,M.D.,M,I,1819 W CLINCH AVE,SUITE 108,KNOXVILLE,37916,2435,TN,US,Cardiology,S,E


* Which Tennessee counties had a disproportionately high number of opioid prescriptions?

In [42]:
query_1 = '''
-- TN prescribers matched to a county and its population through their ZIP code
WITH tn_prescriber AS (
    SELECT p.npi AS npi,
        fc.*,
        population.population
    FROM prescriber AS p
    LEFT JOIN zip_fips AS zf
        ON p.nppes_provider_zip5 = zf.zip
    LEFT JOIN fips_county AS fc
        ON zf.fipscounty = fc.fipscounty
    LEFT JOIN population
        ON zf.fipscounty = population.fipscounty
    WHERE p.nppes_provider_state LIKE 'TN'
),
-- Prescriptions written by TN prescribers for drugs flagged as opioids
opioid_prescriptions AS (
    SELECT prescriber.nppes_provider_last_org_name AS last_name,
        prescription.*,
        d.*
    FROM prescriber
    LEFT JOIN prescription
        USING (npi)
    LEFT JOIN drug AS d
        USING (drug_name)
    WHERE prescriber.nppes_provider_state LIKE 'TN'
        AND d.opioid_drug_flag LIKE 'Y'
    ORDER BY 2
)
-- Opioid claims per county, divided by the county population
SELECT tn_prescriber.county,
    SUM(opioid_prescriptions.total_claim_count) AS opioid_count,
    tn_prescriber.population,
    ROUND(SUM(opioid_prescriptions.total_claim_count) / population, 4) AS opioid_ratio
FROM tn_prescriber
LEFT JOIN opioid_prescriptions
    ON tn_prescriber.npi = opioid_prescriptions.npi
WHERE tn_prescriber.population IS NOT NULL
GROUP BY 1, 3
ORDER BY 4 DESC;'''

For much more information about SQLAlchemy and to see a more “Pythonic” way to execute queries, see Introduction to Databases in Python: https://www.datacamp.com/courses/introduction-to-relational-databases-in-python

In [44]:
with engine.connect() as connection:
    # Use a descriptive name rather than reusing `people` for county-level results
    opioid_by_county = pd.read_sql(text(query_1), con = connection)

opioid_by_county.head()

Unnamed: 0,county,opioid_count,population,opioid_ratio
0,MOORE,44689.0,6302.0,7.0912
1,JACKSON,60726.0,11573.0,5.2472
2,VAN BUREN,23578.0,5675.0,4.1547
3,TROUSDALE,29060.0,8773.0,3.3124
4,OVERTON,63335.0,21995.0,2.8795
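
The `opioid_ratio` column is the opioid claim count divided by the county population, rounded to four decimal places. As a quick sanity check, the sketch below recomputes that ratio in plain Python from the five rows shown above. The function name `claims_per_resident` is made up for this illustration; the numbers are copied from the output.

```python
# Rows copied from the query output: (county, opioid_count, population, opioid_ratio)
rows = [
    ("MOORE", 44689.0, 6302.0, 7.0912),
    ("JACKSON", 60726.0, 11573.0, 5.2472),
    ("VAN BUREN", 23578.0, 5675.0, 4.1547),
    ("TROUSDALE", 29060.0, 8773.0, 3.3124),
    ("OVERTON", 63335.0, 21995.0, 2.8795),
]


def claims_per_resident(opioid_count, population):
    """Mirror ROUND(SUM(total_claim_count) / population, 4) from the query."""
    return round(opioid_count / population, 4)


for county, opioid_count, population, reported in rows:
    recomputed = claims_per_resident(opioid_count, population)
    print(f"{county:<10} recomputed={recomputed:.4f} reported={reported:.4f}")
```

A ratio above 1 means more opioid claims were counted for the county than it has residents. Because the query assigns claims to a county through the prescriber's ZIP code, the ratio reflects where prescribers practice rather than where patients live.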
