# Group API Project: Crime Rates and Unemployment Rates

## November 10, 2021 -- Jamie Mortensen, Heather Leighton-Dick


What relationship, if any, is there between national property crime rates and unemployment rates in 2010? In 2020?

Is there a noticeable difference between the years? Is there a potential correlation between crime and unemployment rates? Is there a stronger correlation in one subgroup of property crimes than the others?

### Definitions:

property crime: a category of crime, usually involving private property, that includes burglary, larceny, theft, motor vehicle theft, arson, shoplifting, and vandalism

burglary: entry into a building with the intent to commit a crime, most often theft, robbery, or murder

robbery: taking property unlawfully from a person or place by force or threat of force

larceny: the theft of physical items that are personal property, without threat of force (theft includes all variations on stealing property from another person or entity)

### Set-Up and Importing Modules

In [5]:
import json
import time
from pprint import pprint

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import prettytable
import requests
import seaborn as sns
from citipy import citipy

# The API keys are expected in a local API_Keys module
from API_Keys import FBI_api_key, BLS_api_key

ModuleNotFoundError: No module named 'API_Keys'
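
The `ModuleNotFoundError` above indicates that the notebook expects a local `API_Keys` module defining `FBI_api_key` and `BLS_api_key`. As one possible alternative, the keys could be read from environment variables. The sketch below is only an illustration: the helper `load_api_keys` and the variable names `FBI_API_KEY` and `BLS_API_KEY` are hypothetical and not part of the original project.

```python
import os


def load_api_keys(env=None):
    """Return (FBI key, BLS key) read from an environment-style mapping.

    FBI_API_KEY and BLS_API_KEY are hypothetical names chosen for this sketch.
    """
    env = os.environ if env is None else env
    names = ("FBI_API_KEY", "BLS_API_KEY")
    # Collect every missing or empty key so the error names all of them at once
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing API key(s): " + ", ".join(missing))
    return env["FBI_API_KEY"], env["BLS_API_KEY"]


# Example with an explicit mapping instead of the real environment
FBI_api_key, BLS_api_key = load_api_keys(
    {"FBI_API_KEY": "demo-fbi", "BLS_API_KEY": "demo-bls"}
)
```

Calling `load_api_keys()` with no argument would read from the actual process environment.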

## Performing the API Calls and Collecting the Data

## Burglary Data

In [None]:
#API call
# Response fields used below: incident_count, offense_count, data_year
fbi_burglary_ct = requests.get('https://api.usa.gov/crime/fbi/sapi/api/data/nibrs/burglary-breaking-and-entering/offense/national/count/?limit=1&api_key='+FBI_api_key)
fbi_json_bur_ct = json.loads(fbi_burglary_ct.text)
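
The same request pattern is repeated for each offense in this notebook, so it could be wrapped in a small function that also checks the HTTP status before parsing. This is a minimal sketch, not the notebook's own method: `fetch_offense_counts` and its `get` parameter are hypothetical names, while the URL pattern and the `"results"` key are taken from the calls in this notebook.

```python
import requests

# Base path taken from the URLs used in this notebook
FBI_BASE = "https://api.usa.gov/crime/fbi/sapi/api/data/nibrs"


def fetch_offense_counts(offense, api_key, get=requests.get):
    """Request national counts for one offense and return the "results" list.

    `get` defaults to requests.get; it is a parameter only so the function
    can be exercised without network access (an assumption of this sketch).
    """
    url = f"{FBI_BASE}/{offense}/offense/national/count/"
    response = get(url, params={"limit": 1, "api_key": api_key}, timeout=30)
    # Fail on HTTP errors instead of trying to parse an error page
    response.raise_for_status()
    return response.json()["results"]
```

For example, `fetch_offense_counts("burglary-breaking-and-entering", FBI_api_key)` would request the burglary counts.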

In [None]:
#data to json to csv
raw_fbi_bur_ct = []
raw_fbi_bur_ct.append(fbi_json_bur_ct)
fbi_unformed_bur_ct = pd.json_normalize(raw_fbi_bur_ct, record_path = ["results"])
fbi_unformed_bur_ct.to_csv('fbi_unformed_bur_ct_csv.csv')

In [None]:
# Work from the normalized DataFrame (the CSV above is only a saved copy)
burg_df = pd.DataFrame(fbi_unformed_bur_ct)
# Sort df by year
burg_sorted = burg_df.sort_values('data_year')
# Filter for 2011-2020 data
burg_by_yr = burg_sorted[burg_sorted['data_year']>=2011]
# Reset index and drop old one
burg_reset = burg_by_yr.reset_index(drop=True)
pd.DataFrame(burg_reset)
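
The sort, filter, and reset steps above recur for every offense type, so they could be collected into one helper. The following is a sketch under that assumption; the function name `counts_since` and the sample values are hypothetical and only illustrate the shape of the data.

```python
import pandas as pd


def counts_since(df, start_year=2011, year_col="data_year"):
    """Sort by year, keep rows from start_year onward, and reset the index."""
    ordered = df.sort_values(year_col)
    recent = ordered[ordered[year_col] >= start_year]
    return recent.reset_index(drop=True)


# Placeholder rows for illustration only (not real FBI figures)
sample = pd.DataFrame(
    {"data_year": [2012, 2009, 2011], "offense_count": [30, 10, 20]}
)
result = counts_since(sample)
```

With the real data, `counts_since(burg_df)` would stand in for the three separate steps.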

### Burglary Data Line Plot

In [None]:
sns.set_style("darkgrid")
fig,ax=plt.subplots()
ax.plot(burg_reset['data_year'], burg_reset['offense_count'], marker='o', color = 'green')
ax.set_xlabel('Year')
ax.set_ylabel("Number of Offenses")
ax.set_title('FBI Crime Data:  Burglary')
plt.show()

Burglary trend summary:

## Robbery Data

In [None]:
#API call
fbi_robbery_ct = requests.get('https://api.usa.gov/crime/fbi/sapi/api/data/nibrs/robbery/offense/national/count/?limit=1&api_key='+FBI_api_key)

In [None]:
#data to json to csv
fbi_json_robbery_ct = json.loads(fbi_robbery_ct.text)
raw_fbi_robbery_ct = []
raw_fbi_robbery_ct.append(fbi_json_robbery_ct)
fbi_unformed_robbery_ct = pd.json_normalize(raw_fbi_robbery_ct, record_path = ["results"])
fbi_unformed_robbery_ct.to_csv('fbi_unformed_robbery_ct_csv.csv')

In [None]:
# Work from the normalized DataFrame (the CSV above is only a saved copy)
robb_df = pd.DataFrame(fbi_unformed_robbery_ct)

# Sort df by year
robb_sorted = robb_df.sort_values('data_year')

# Filter for 2011-2020 data
robb_by_yr = robb_sorted[robb_sorted['data_year']>=2011]

# Reset index and drop old one
robb_reset = robb_by_yr.reset_index(drop=True)
pd.DataFrame(robb_reset)

### Robbery Data Line Plot

In [None]:
fig,ax=plt.subplots()
ax.plot(robb_reset['data_year'], robb_reset['offense_count'], marker='o', color = 'orange')
ax.set_xlabel('Year')
ax.set_ylabel("Number of Offenses")
ax.set_title('FBI Crime Data:  Robbery')
plt.show()

Robbery trend summary:

## Larceny Data

In [None]:
#API call
fbi_larceny_ct = requests.get('https://api.usa.gov/crime/fbi/sapi/api/data/nibrs/larceny-theft-offenses/offense/national/count/?limit=1&api_key='+FBI_api_key)
fbi_json_larceny_ct = json.loads(fbi_larceny_ct.text)

In [None]:
#data to json to csv
raw_fbi_larceny_ct = []
raw_fbi_larceny_ct.append(fbi_json_larceny_ct)
fbi_unformed_larceny_ct = pd.json_normalize(raw_fbi_larceny_ct, record_path = ["results"])
fbi_unformed_larceny_ct.to_csv('fbi_unformed_larceny_ct_csv.csv')

In [None]:
# Work from the normalized DataFrame (the CSV above is only a saved copy)
larc_df = pd.DataFrame(fbi_unformed_larceny_ct)

# Sort df by year
larc_sorted = larc_df.sort_values('data_year')

# Filter for 2011-2020 data
larc_by_yr = larc_sorted[larc_sorted['data_year']>=2011]

# Sum count types by year
larc_totals = larc_by_yr.groupby('data_year')['offense_count'].sum()

# Reset index and keep data_year as a column
larc_reset = larc_totals.reset_index()

pd.DataFrame(larc_reset)

### Larceny Line Plot

In [None]:
fig,ax=plt.subplots()
ax.plot(larc_reset['data_year'], larc_reset['offense_count'], marker='o', color = 'purple')
ax.set_xlabel('Year')
ax.set_ylabel("Number of Offenses (in millions)")
ax.set_title('FBI Crime Data:  Larceny')
plt.show()

Larceny trend summary:

## Stolen Property Data

In [None]:
#API call
fbi_stolprop_ct = requests.get('https://api.usa.gov/crime/fbi/sapi/api/data/nibrs/stolen-property-offenses/offense/national/count/?limit=1&api_key='+FBI_api_key)
fbi_json_stolprop_ct = json.loads(fbi_stolprop_ct.text)

In [None]:
#data to json to csv
raw_fbi_stolprop_ct = []
raw_fbi_stolprop_ct.append(fbi_json_stolprop_ct)
fbi_unformed_stolprop_ct = pd.json_normalize(raw_fbi_stolprop_ct, record_path = ["results"])
fbi_unformed_stolprop_ct.to_csv('fbi_unformed_stolprop_ct_csv.csv')

In [4]:
# Work from the normalized DataFrame (the CSV above is only a saved copy)
stolprop_df = pd.DataFrame(fbi_unformed_stolprop_ct)

# Sort df by year
stolprop_sorted = stolprop_df.sort_values('data_year')

# Filter for 2011-2020 data
stolprop_by_yr = stolprop_sorted[stolprop_sorted['data_year']>=2011]

# Reset index and drop old one
stolprop_reset = stolprop_by_yr.reset_index(drop=True)

pd.DataFrame(stolprop_reset)

NameError: name 'fbi_unformed_stolprop_ct' is not defined

### Stolen Property Line Plot

In [None]:
fig,ax=plt.subplots()
ax.plot(stolprop_reset['data_year'], stolprop_reset['offense_count'], marker='o', color = 'red')
ax.set_xlabel('Year')
ax.set_ylabel("Number of Offenses")
ax.set_title('FBI Crime Data:  Stolen Property')
plt.show()

## Unemployment Rate

In [None]:
#BLS API call
headers = {'Content-type': 'application/json'}
# Optional parameters and the registration key travel in the JSON payload
data = json.dumps({"seriesid": ['LNS14000000'], "startyear": "2011", "endyear": "2020",
                   "catalog": True, "calculations": True, "annualaverage": True,
                   "aspects": True, "registrationkey": BLS_api_key})
unemp_ct = requests.post('https://api.bls.gov/publicAPI/v2/timeseries/data/', data=data, headers=headers)
bls_json_unemp_ct = json.loads(unemp_ct.text)

# Write the monthly values of each returned series to a text table
for series in bls_json_unemp_ct['Results']['series']:
    x=prettytable.PrettyTable(["series id","year","period","value","footnotes"])
    seriesId = series['seriesID']
    for item in series['data']:
        year = item['year']
        period = item['period']
        value = item['value']
        footnotes=""
        for footnote in item['footnotes']:
            if footnote:
                footnotes = footnotes + footnote['text'] + ','
        # Keep monthly periods only (M13 would be the annual average)
        if 'M01' <= period <= 'M12':
            x.add_row([seriesId,year,period,value,footnotes[0:-1]])
    with open(seriesId + '.txt','w') as output:
        output.write(x.get_string())

In [None]:
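#data to json to csv
# The monthly records sit under Results -> series -> data in the BLS response
bls_unformed_unemp_ct = pd.DataFrame(bls_json_unemp_ct['Results']['series'][0]['data'])
bls_unformed_unemp_ct.to_csv('bls_unformed_unemp_ct_csv.csv')

In [None]:
# Work from the DataFrame built above (the CSV is only a saved copy)
unemp_df = bls_unformed_unemp_ct.copy()

# BLS returns year and value as strings; convert them and match the FBI column name
unemp_df['data_year'] = unemp_df['year'].astype(int)
unemp_df['value'] = unemp_df['value'].astype(float)

# Keep monthly rows (M01-M12) for 2011-2020
unemp_monthly = unemp_df[unemp_df['period'].between('M01', 'M12') & (unemp_df['data_year']>=2011)]

# Average the monthly rates by year and keep data_year as a column
unemp_reset = unemp_monthly.groupby('data_year')['value'].mean().reset_index()

pd.DataFrame(unemp_reset)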

### Unemployment Rate Line Plot
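
No plotting cell appears under this heading. As a sketch of what one might look like, the example below averages monthly rates by year and draws a line plot in the style of the crime plots. The helper `annual_average` is hypothetical, the column names follow the `year`, `period`, and `value` fields read from the BLS response earlier, and the sample values are placeholders rather than real BLS figures.

```python
import matplotlib.pyplot as plt
import pandas as pd


def annual_average(monthly_df):
    """Average monthly unemployment rates (periods M01-M12) for each year."""
    monthly = monthly_df[monthly_df["period"].between("M01", "M12")].copy()
    # BLS delivers year and value as strings, so convert before aggregating
    monthly["data_year"] = monthly["year"].astype(int)
    monthly["value"] = monthly["value"].astype(float)
    return monthly.groupby("data_year", as_index=False)["value"].mean()


# Placeholder rows for illustration only (not real BLS figures)
sample = pd.DataFrame(
    {
        "year": ["2011", "2011", "2012", "2012"],
        "period": ["M01", "M02", "M01", "M02"],
        "value": ["9.0", "8.0", "8.0", "7.0"],
    }
)
unemp_annual = annual_average(sample)

fig, ax = plt.subplots()
ax.plot(unemp_annual["data_year"], unemp_annual["value"], marker="o", color="blue")
ax.set_xlabel("Year")
ax.set_ylabel("Unemployment Rate (%)")
ax.set_title("BLS Data:  Unemployment Rate")
plt.show()
```

With the real response, the monthly BLS records would take the place of `sample`.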

Unemployment trend summary:

### All Thefts 2011–2020

In [None]:
# Keep year and offense count from each table, labeling the counts by crime type
larc_ct = larc_reset[['data_year', 'offense_count']].rename(columns={'offense_count': 'Offense Count(L)'})
robb_ct = robb_reset[['data_year', 'offense_count']].rename(columns={'offense_count': 'Offense Count(R)'})
burg_ct = burg_reset[['data_year', 'offense_count']].rename(columns={'offense_count': 'Offense Count(B)'})

# Inner-join the three tables on year
fbi_twototals_df = pd.merge(left=larc_ct, right=robb_ct, on='data_year')
fbi_totals_df = pd.merge(left=fbi_twototals_df, right=burg_ct, on='data_year')
fbi_totals_df

In [None]:
fig,ax=plt.subplots()
# Stacked bars: larcenies at the base, then robberies, then burglaries
ax.bar(fbi_totals_df["data_year"], fbi_totals_df["Offense Count(L)"], label="Larcenies")
ax.bar(fbi_totals_df["data_year"], fbi_totals_df["Offense Count(R)"], bottom=fbi_totals_df["Offense Count(L)"], label="Robberies")
ax.bar(fbi_totals_df["data_year"], fbi_totals_df["Offense Count(B)"], bottom = fbi_totals_df["Offense Count(L)"]+fbi_totals_df["Offense Count(R)"], label = "Burglaries")

# Fix the tick positions before labeling them
ax.set_xticks(fbi_totals_df["data_year"])
ax.set_xticklabels(fbi_totals_df["data_year"], rotation=90)
ax.set_ylabel("Number of Thefts (in millions)")
ax.legend()

plt.show()

Comparisons, trends, notable features of the data
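
One way to approach the correlation question posed at the top of the notebook is to align the yearly crime counts with the yearly unemployment rate and compute a Pearson coefficient. This is a minimal sketch under assumptions: the numbers are placeholders, not real FBI or BLS figures, and the frame names `crime` and `unemp` are hypothetical stand-ins for the yearly tables.

```python
import pandas as pd

# Placeholder values for illustration only; these are NOT real FBI or BLS figures
crime = pd.DataFrame(
    {"data_year": [2011, 2012, 2013, 2014], "offense_count": [100, 90, 80, 70]}
)
unemp = pd.DataFrame(
    {"data_year": [2011, 2012, 2013, 2014], "value": [9.0, 8.0, 7.5, 6.0]}
)

# Align the two series on year, then compute the Pearson correlation coefficient
combined = pd.merge(crime, unemp, on="data_year")
r = combined["offense_count"].corr(combined["value"])
print(f"Pearson r = {r:.2f}")
```

With only ten yearly points, such a coefficient would be suggestive at best and would not by itself show that one series drives the other.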

## Lessons Learned

1) Be careful of data that isn't labeled clearly; for example, "value" in the unemployment data was extremely vague and required extra research.

2) Following from the first point, information literacy is important. It's not enough to be able to manipulate the data; we have to be able to figure out what data has been included and whether the labels are accurate. In the course of the project, for example, some data were abbreviated in the way that is usual in statistics and economics for large numbers ("normalized x100,000").

3) Sometimes working with the data requires unfamiliar Python modules, like "prettytable," which must be installed in addition to the usual suspects.