## Extraction.ipynb
This notebook extracts dataframes from three data sources:
1. `states.csv`: a mapping from state names to state abbreviations
2. `crime_data_w_population_and_crime_rate.csv`: county-level crime rates based on crime reports from 2013
3. `result.json`: yearly/monthly data on unemployment by county in the USA

In [1]:
import pandas as pd
import json

### State-to-abbreviation map
This is straightforward: `pd.read_csv` loads the file directly.

In [2]:
state_abb = pd.read_csv("states.csv")
state_abb.head()

Unnamed: 0,State,Abbreviation
0,Alabama,AL
1,Alaska,AK
2,Arizona,AZ
3,Arkansas,AR
4,California,CA
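
If later steps need to look up an abbreviation by state name, one option is to turn the dataframe into a plain dict. This is a minimal sketch, not part of the original notebook: it uses an in-memory sample built from the five rows shown above instead of reading `states.csv`, and it assumes the column names `State` and `Abbreviation` from the output header.

```python
import io
import pandas as pd

# Sample rows copied from the head() output above; the notebook itself reads states.csv.
sample_csv = io.StringIO(
    "State,Abbreviation\n"
    "Alabama,AL\n"
    "Alaska,AK\n"
    "Arizona,AZ\n"
    "Arkansas,AR\n"
    "California,CA\n"
)
state_abb = pd.read_csv(sample_csv)

# Build a name -> abbreviation lookup from the two columns.
name_to_abbr = dict(zip(state_abb["State"], state_abb["Abbreviation"]))

print(name_to_abbr["Alaska"])  # AK
```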


### Crime Rate Dataframe
This is also straightforward to extract with `pd.read_csv`.

In [3]:
crime_rate_df = pd.read_csv("crime_data_w_population_and_crime_rate.csv")
crime_rate_df.head()

Unnamed: 0,county_name,crime_rate_per_100000,index,EDITION,PART,IDNO,CPOPARST,CPOPCRIM,AG_ARRST,AG_OFF,...,RAPE,ROBBERY,AGASSLT,BURGLRY,LARCENY,MVTHEFT,ARSON,population,FIPS_ST,FIPS_CTY
0,"St. Louis city, MO",1791.995377,1,1,4,1612,318667,318667,15,15,...,200,1778,3609,4995,13791,3543,464,318416,29,510
1,"Crittenden County, AR",1754.914968,2,1,4,130,50717,50717,4,4,...,38,165,662,1482,1753,189,28,49746,5,35
2,"Alexander County, IL",1664.700485,3,1,4,604,8040,8040,2,2,...,2,5,119,82,184,12,2,7629,17,3
3,"Kenedy County, TX",1456.31068,4,1,4,2681,444,444,1,1,...,3,1,2,5,4,4,0,412,48,261
4,"De Soto Parish, LA",1447.40243,5,1,4,1137,26971,26971,3,3,...,4,17,368,149,494,60,0,27083,22,31
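
The `county_name` column packs the county and the state abbreviation into one string (for example `"St. Louis city, MO"`). If the two parts are needed separately, a sketch along these lines could work. It is not part of the original notebook: the sample values are copied from the output above, and the new column names `county` and `state_abbr` are hypothetical.

```python
import pandas as pd

# Values copied from the county_name column in the head() output above.
crime_sample = pd.DataFrame({
    "county_name": [
        "St. Louis city, MO",
        "Crittenden County, AR",
        "Alexander County, IL",
    ]
})

# Split on the last ", " so the county and the state abbreviation
# end up in separate columns (column names here are hypothetical).
parts = crime_sample["county_name"].str.rsplit(pat=", ", n=1, expand=True)
crime_sample["county"] = parts[0]
crime_sample["state_abbr"] = parts[1]

print(crime_sample)
```

Splitting from the right with `n=1` keeps any comma inside the county part intact, on the assumption that the abbreviation always follows the final comma.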


### Unemployment JSON
This one is a little trickier: the JSON is nested, so we cannot load it directly into a pandas dataframe with a single library call and have to construct the dataframe by hand. Note that we only extract the annual data for 2013.

In [12]:
with open('result.json') as json_file:
    data = json.load(json_file)

# Collect one record per (state, county) pair, then build the dataframe in one step.
# DataFrame.append was removed in pandas 2.0, and appending row by row is slow.
records = []
for state, v in data['2013']['Annual'].items():
    for county, rate in v['Unemployment Rate'].items():
        records.append({'county': county, 'state': state, 'unemployment_rate': rate})

unemployment_df = pd.DataFrame(records, columns=['county', 'state', 'unemployment_rate'])

unemployment_df.head()

Unnamed: 0,county,state,unemployment_rate
0,Autauga County,Alabama,6.2
1,Baldwin County,Alabama,6.6
2,Barbour County,Alabama,10.3
3,Bibb County,Alabama,7.9
4,Blount County,Alabama,6.3
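
The loop above relies on a particular nesting in `result.json`: year, then period, then state, then the `"Unemployment Rate"` key, then county. To make that shape explicit, here is a small sketch that wraps the same traversal in a helper. The helper name `flatten_unemployment` and the in-memory `sample` dict are hypothetical stand-ins, with rates copied from the output above; only the `'2013'` and `'Annual'` keys are confirmed by the notebook, so any other year or period key is an assumption.

```python
import pandas as pd

# Hypothetical stand-in for result.json, mirroring the nesting the loop relies on:
# year -> period -> state -> "Unemployment Rate" -> county -> rate.
sample = {
    "2013": {
        "Annual": {
            "Alabama": {
                "Unemployment Rate": {
                    "Autauga County": 6.2,
                    "Baldwin County": 6.6,
                }
            }
        }
    }
}


def flatten_unemployment(data, year, period="Annual"):
    """Return one row per (state, county) for the given year and period."""
    records = [
        {"county": county, "state": state, "unemployment_rate": rate}
        for state, metrics in data[year][period].items()
        for county, rate in metrics["Unemployment Rate"].items()
    ]
    return pd.DataFrame(records, columns=["county", "state", "unemployment_rate"])


sample_df = flatten_unemployment(sample, "2013")
print(sample_df)
```

Passing `columns` explicitly keeps the column order fixed even when the selected year or period contains no records.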
