## Steps for Notebook
1. Get the datasets: Cleveland, census, and ACS data (https://la.arcgis.com/databrowser/index.html); immigration and emigration data (https://www.census.gov/acs/www/data/data-tables-and-tools/data-profiles/2022/) (**Ethan**)
2. Combine into one dataset via geoenrichment (per block per year; could be per zip code per year) (**Calvin**)
4. Initial visualization: hotspots for crime and socioeconomic factors, blocks/zip codes in Cleveland, and the difference in demographics via a hotspot comparison of the 2010 vs 2020 census (**Calvin** for hotspots, **Ethan** for choropleth maps)
5. Perform correlation analysis with hotspots: how much do socioeconomic factors explain the variability in crime rate? (**Ethan**)
6. kNN, Isolation Forest, One-class SVM, Random Forest: predict the crime rate for a zip code/block from socioeconomic factors. Split into training and test sets (80/20) and see which models have the lowest MSE. (**Ethan** for Random Forest and Isolation Forest, **Calvin** for kNN and One-class SVM)
8. Compare the models and determine the best one. (**Calvin**)
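
The model comparison planned in step 6 could look roughly like the sketch below. It is only an illustration under assumptions: the features and target are synthetic stand-ins for the combined dataset, scikit-learn is assumed to be installed, and the hyperparameters are arbitrary. Isolation Forest and One-class SVM are outlier detectors rather than regressors, so only kNN and Random Forest are scored by MSE here.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in data (assumption): 3 "socioeconomic" features, 1 target
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=200)

# 80/20 train/test split, as in the plan
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Hyperparameters are placeholders, not tuned values
models = {
    "kNN": KNeighborsRegressor(n_neighbors=5),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Fit each model and record its test-set MSE
mse_by_model = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    mse_by_model[name] = mean_squared_error(y_test, model.predict(X_test))

# The model with the lowest test MSE
best_model = min(mse_by_model, key=mse_by_model.get)
print(mse_by_model, best_model)
```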

Finish up to step 4 by Thursday 

In [13]:
# imports
import pandas as pd
import geopandas as gpd
import numpy as np
import os

import censusdata
from census import Census
from us import states

from matplotlib import pyplot as plt
import pygris
import folium

import arcgis
from arcgis.gis import GIS
from arcgis import geometry
from arcgis.geometry import Geometry, SpatialReference
from arcgis.features import FeatureLayer

pd.set_option('display.max_columns', None)

## Part 1: Getting Data

In [28]:
gis = GIS("https://ucsdonline.maps.arcgis.com/home", client_id="bZshlNXFuaR2KHff")

Please sign in to your GIS and paste the code that is obtained below.
If a web browser does not automatically open, please navigate to the URL below yourself instead.
Opening web browser to navigate to: https://ucsdonline.maps.arcgis.com/sharing/rest/oauth2/authorize?response_type=code&client_id=bZshlNXFuaR2KHff&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&state=csImM0f6B52YKBSqPjmETu8URfpz3E&allow_verification=false


Enter code obtained on signing in using SAML:  ········


In [6]:
s = censusdata.search('acs5', 2015, 'label', 'graduate')
s

[('B06009PR_002E',
  'PLACE OF BIRTH BY EDUCATIONAL ATTAINMENT IN PUERTO RICO',
  'Estimate!!Total!!Less than high school graduate'),
 ('B06009PR_003E',
  'PLACE OF BIRTH BY EDUCATIONAL ATTAINMENT IN PUERTO RICO',
  'Estimate!!Total!!High school graduate (includes equivalency)'),
 ('B06009PR_006E',
  'PLACE OF BIRTH BY EDUCATIONAL ATTAINMENT IN PUERTO RICO',
  'Estimate!!Total!!Graduate or professional degree'),
 ('B06009PR_008E',
  'PLACE OF BIRTH BY EDUCATIONAL ATTAINMENT IN PUERTO RICO',
  'Estimate!!Total!!Born in Puerto Rico!!Less than high school graduate'),
 ('B06009PR_009E',
  'PLACE OF BIRTH BY EDUCATIONAL ATTAINMENT IN PUERTO RICO',
  'Estimate!!Total!!Born in Puerto Rico!!High school graduate (includes equivalency)'),
 ('B06009PR_012E',
  'PLACE OF BIRTH BY EDUCATIONAL ATTAINMENT IN PUERTO RICO',
  'Estimate!!Total!!Born in Puerto Rico!!Graduate or professional degree'),
 ('B06009PR_014E',
  'PLACE OF BIRTH BY EDUCATIONAL ATTAINMENT IN PUERTO RICO',
  'Estimate!!Total!!Born 

In [3]:
def download_OH_data(var_map, year_start, year_end):
    """Download ACS 5-year tract-level variables and merge them into one wide DataFrame.

    var_map maps raw Census variable codes to readable aliases; each output
    column is named f"{alias}_{year}".
    """
    # NOTE: county FIPS 039 in Ohio is Defiance County (see the NAME column in the
    # output below); Cleveland is in Cuyahoga County, FIPS 035.
    geo = {'for': 'tract:*', 'in': f'state:{states.OH.fips} county:039'}
    df_final = None

    for yr in range(year_start, year_end + 1):
        c = Census("4977648d549eae5dd6bc0563b7c148db6c44642d", year=yr)

        for raw_var, alias in var_map.items():
            # Data-profile variables (DP prefix) come from the acs5dp endpoint
            endpoint = c.acs5dp if raw_var.startswith('DP') else c.acs5
            data = endpoint.get(('NAME', raw_var), geo)

            df_temp = pd.DataFrame(data)
            df_temp.rename(columns={raw_var: f"{alias}_{yr}"}, inplace=True)

            if df_final is None:
                # First chunk of data: just assign
                df_final = df_temp
            else:
                # Otherwise, merge on the geographic identifier columns
                df_final = pd.merge(
                    df_final, df_temp,
                    on=['NAME', 'state', 'county', 'tract'],
                    how='outer'  # keep tracts that are missing in some years
                )

    return df_final

In [7]:
codes = {
    'B06010_003E': 'income_yes',
    'B06010_004E': 'income_no',
    'B06010_005E': 'income_0_10k',
    'B06010_006E': 'income_10_25k',
    'B06010_007E': 'income_25_35k',
    'B06010_008E': 'income_35_45k',
    'B06010_009E': 'income_45_55k',
    'B06010_010E': 'income_55_65k',
    'B06010_011E': 'income_65_75k',
    'B06010_013E': 'income_over_75k',
    'B06002_001E': 'median_age',
    'B15003_022E': 'bachelors_degree',
    'B27019_004E': 'health_insurance_less_hs'
}
# looks like there is only data up to 2021, so we will limit ourselves to that
# df = download_OH_data(codes, 2015, 2023)

df = download_OH_data(codes, 2015, 2021)

df.head()

Unnamed: 0,NAME,income_yes_2015,state,county,tract,income_no_2015,income_0_10k_2015,income_10_25k_2015,income_25_35k_2015,income_35_45k_2015,income_45_55k_2015,income_55_65k_2015,income_65_75k_2015,income_over_75k_2015,median_age_2015,bachelors_degree_2015,health_insurance_less_hs_2015,income_yes_2016,income_no_2016,income_0_10k_2016,income_10_25k_2016,income_25_35k_2016,income_35_45k_2016,income_45_55k_2016,income_55_65k_2016,income_65_75k_2016,income_over_75k_2016,median_age_2016,bachelors_degree_2016,health_insurance_less_hs_2016,income_yes_2017,income_no_2017,income_0_10k_2017,income_10_25k_2017,income_25_35k_2017,income_35_45k_2017,income_45_55k_2017,income_55_65k_2017,income_65_75k_2017,income_over_75k_2017,median_age_2017,bachelors_degree_2017,health_insurance_less_hs_2017,income_yes_2018,income_no_2018,income_0_10k_2018,income_10_25k_2018,income_25_35k_2018,income_35_45k_2018,income_45_55k_2018,income_55_65k_2018,income_65_75k_2018,income_over_75k_2018,median_age_2018,bachelors_degree_2018,health_insurance_less_hs_2018,income_yes_2019,income_no_2019,income_0_10k_2019,income_10_25k_2019,income_25_35k_2019,income_35_45k_2019,income_45_55k_2019,income_55_65k_2019,income_65_75k_2019,income_over_75k_2019,median_age_2019,bachelors_degree_2019,health_insurance_less_hs_2019,income_yes_2020,income_no_2020,income_0_10k_2020,income_10_25k_2020,income_25_35k_2020,income_35_45k_2020,income_45_55k_2020,income_55_65k_2020,income_65_75k_2020,income_over_75k_2020,median_age_2020,bachelors_degree_2020,health_insurance_less_hs_2020,income_yes_2021,income_no_2021,income_0_10k_2021,income_10_25k_2021,income_25_35k_2021,income_35_45k_2021,income_45_55k_2021,income_55_65k_2021,income_65_75k_2021,income_over_75k_2021,median_age_2021,bachelors_degree_2021,health_insurance_less_hs_2021,income_yes_2022,income_no_2022,income_0_10k_2022,income_10_25k_2022,income_25_35k_2022,income_35_45k_2022,income_45_55k_2022,income_55_65k_2022,income_65_75k_2022,income_over_75k_2022,median_age_2022,bachelors_degree_2022,health_insurance_less_hs_2022,income_yes_2023,income_no_2023,income_0_10k_2023,income_10_25k_2023,income_25_35k_2023,income_35_45k_2023,income_45_55k_2023,income_55_65k_2023,income_65_75k_2023,income_over_75k_2023,median_age_2023,bachelors_degree_2023,health_insurance_less_hs_2023
0,"Census Tract 9587, Defiance County, Ohio",3494.0,39,39,958700,527.0,213.0,649.0,660.0,653.0,297.0,191.0,304.0,209.0,41.6,550.0,143.0,3585.0,555.0,240.0,759.0,621.0,703.0,215.0,155.0,337.0,169.0,42.4,514.0,79.0,3564.0,539.0,282.0,752.0,548.0,673.0,226.0,157.0,387.0,181.0,42.1,591.0,82.0,3601.0,513.0,285.0,703.0,555.0,708.0,310.0,45.0,482.0,202.0,43.3,490.0,107.0,3601.0,519.0,303.0,754.0,557.0,651.0,289.0,74.0,454.0,181.0,43.7,472.0,126.0,3547.0,609.0,348.0,680.0,347.0,676.0,344.0,75.0,468.0,255.0,47.9,453.0,150.0,3802.0,687.0,344.0,608.0,479.0,664.0,371.0,138.0,511.0,247.0,45.7,509.0,182.0,,,,,,,,,,,,,,,,,,,,,,,,,,
1,"Census Tract 9589, Defiance County, Ohio",3853.0,39,39,958900,695.0,500.0,759.0,476.0,658.0,335.0,149.0,281.0,266.0,43.2,250.0,158.0,3829.0,611.0,562.0,854.0,462.0,646.0,322.0,109.0,263.0,244.0,44.9,197.0,124.0,3777.0,583.0,553.0,719.0,488.0,552.0,428.0,155.0,299.0,258.0,44.7,168.0,106.0,3766.0,524.0,388.0,602.0,616.0,732.0,454.0,216.0,234.0,248.0,48.8,184.0,113.0,3667.0,482.0,357.0,576.0,558.0,718.0,454.0,164.0,358.0,245.0,46.0,278.0,66.0,3957.0,463.0,369.0,673.0,495.0,783.0,559.0,112.0,503.0,215.0,51.8,259.0,108.0,3840.0,452.0,293.0,560.0,589.0,707.0,543.0,152.0,544.0,210.0,51.5,322.0,111.0,,,,,,,,,,,,,,,,,,,,,,,,,,
2,"Census Tract 9581, Defiance County, Ohio",3115.0,39,39,958100,544.0,263.0,520.0,423.0,473.0,452.0,145.0,295.0,206.0,39.9,317.0,72.0,3071.0,436.0,267.0,552.0,380.0,559.0,456.0,121.0,300.0,222.0,41.1,303.0,33.0,3062.0,390.0,266.0,641.0,284.0,599.0,421.0,165.0,296.0,208.0,43.8,326.0,35.0,3113.0,389.0,187.0,647.0,330.0,628.0,397.0,159.0,376.0,222.0,42.8,372.0,49.0,3190.0,356.0,205.0,555.0,338.0,608.0,463.0,177.0,488.0,271.0,45.5,390.0,65.0,3262.0,309.0,228.0,471.0,317.0,796.0,334.0,228.0,579.0,297.0,44.1,422.0,52.0,3292.0,306.0,184.0,469.0,405.0,687.0,439.0,178.0,624.0,261.0,43.1,436.0,54.0,,,,,,,,,,,,,,,,,,,,,,,,,,
3,"Census Tract 9583, Defiance County, Ohio",3083.0,39,39,958300,592.0,417.0,658.0,482.0,466.0,284.0,68.0,116.0,129.0,40.5,176.0,207.0,3116.0,486.0,344.0,737.0,607.0,440.0,300.0,78.0,124.0,124.0,40.3,193.0,299.0,3235.0,449.0,339.0,622.0,697.0,462.0,374.0,134.0,158.0,174.0,40.2,274.0,185.0,3209.0,405.0,358.0,552.0,754.0,459.0,442.0,140.0,99.0,163.0,38.9,258.0,61.0,3212.0,422.0,264.0,585.0,696.0,495.0,376.0,185.0,189.0,164.0,38.3,280.0,98.0,2907.0,418.0,164.0,527.0,611.0,445.0,397.0,182.0,163.0,142.0,35.7,293.0,71.0,2894.0,440.0,193.0,485.0,590.0,423.0,461.0,113.0,189.0,147.0,36.0,272.0,85.0,,,,,,,,,,,,,,,,,,,,,,,,,,
4,"Census Tract 9584, Defiance County, Ohio",3029.0,39,39,958400,513.0,287.0,518.0,596.0,547.0,266.0,106.0,196.0,208.0,44.0,233.0,129.0,2976.0,545.0,329.0,432.0,686.0,418.0,235.0,129.0,202.0,219.0,43.9,243.0,135.0,3052.0,470.0,338.0,438.0,697.0,475.0,299.0,135.0,200.0,145.0,43.1,268.0,161.0,3034.0,460.0,291.0,408.0,626.0,485.0,407.0,115.0,242.0,102.0,43.8,295.0,124.0,2993.0,444.0,301.0,398.0,545.0,471.0,419.0,118.0,297.0,106.0,43.3,295.0,120.0,3005.0,433.0,286.0,359.0,510.0,499.0,506.0,72.0,340.0,90.0,37.0,275.0,111.0,2725.0,327.0,197.0,341.0,417.0,505.0,429.0,101.0,408.0,156.0,36.6,210.0,92.0,,,,,,,,,,,,,,,,,,,,,,,,,,


In [8]:
df.shape

(18, 121)

In [23]:
url = 'https://services3.arcgis.com/dty2kHktVXHrqO8i/arcgis/rest/services/Crime_Incidents/FeatureServer/0'
layer = FeatureLayer(url)

In [25]:
layer

<FeatureLayer url:"https://services3.arcgis.com/dty2kHktVXHrqO8i/arcgis/rest/services/Crime_Incidents/FeatureServer/0">

In [49]:
sdf = layer.query(where="OffenseYear >= 2015 AND OffenseYear <= 2021", out_fields="*", as_df=True)
# spatially enabled dataframe with this data. I want to try uploading it to ArcGIS Online, because this operation takes a while, but I can't 
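
Because the query above is slow, one option is to cache its result on disk and reload it in later sessions. The sketch below is a generic helper, not part of the ArcGIS API: `load_or_compute` and the file name are hypothetical, and whether a spatially enabled DataFrame (with its `SHAPE` column) survives pickling unchanged has not been checked here. The demo uses a small plain DataFrame in place of the real query.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_or_compute(cache_path, compute_fn):
    """Return the cached DataFrame if it exists; otherwise compute and cache it."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        return pd.read_pickle(cache_path)
    result = compute_fn()
    result.to_pickle(cache_path)
    return result


# Demo with a stand-in for the slow query (assumption: real use would pass
# something like lambda: layer.query(where=..., out_fields="*", as_df=True))
calls = []


def slow_query():
    calls.append(1)  # track how often the expensive step actually runs
    return pd.DataFrame({"OffenseYear": [2015, 2016]})


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "crime_2015_2021.pkl"
    first = load_or_compute(path, slow_query)   # runs the query, writes the cache
    second = load_or_compute(path, slow_query)  # reads the cache, no query
```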

## Part 2: Preprocessing and Combining Data

In [51]:
df.head()

Unnamed: 0,NAME,income_yes_2015,state,county,tract,income_no_2015,income_0_10k_2015,income_10_25k_2015,income_25_35k_2015,income_35_45k_2015,income_45_55k_2015,income_55_65k_2015,income_65_75k_2015,income_over_75k_2015,median_age_2015,bachelors_degree_2015,health_insurance_less_hs_2015,income_yes_2016,income_no_2016,income_0_10k_2016,income_10_25k_2016,income_25_35k_2016,income_35_45k_2016,income_45_55k_2016,income_55_65k_2016,income_65_75k_2016,income_over_75k_2016,median_age_2016,bachelors_degree_2016,health_insurance_less_hs_2016,income_yes_2017,income_no_2017,income_0_10k_2017,income_10_25k_2017,income_25_35k_2017,income_35_45k_2017,income_45_55k_2017,income_55_65k_2017,income_65_75k_2017,income_over_75k_2017,median_age_2017,bachelors_degree_2017,health_insurance_less_hs_2017,income_yes_2018,income_no_2018,income_0_10k_2018,income_10_25k_2018,income_25_35k_2018,income_35_45k_2018,income_45_55k_2018,income_55_65k_2018,income_65_75k_2018,income_over_75k_2018,median_age_2018,bachelors_degree_2018,health_insurance_less_hs_2018,income_yes_2019,income_no_2019,income_0_10k_2019,income_10_25k_2019,income_25_35k_2019,income_35_45k_2019,income_45_55k_2019,income_55_65k_2019,income_65_75k_2019,income_over_75k_2019,median_age_2019,bachelors_degree_2019,health_insurance_less_hs_2019,income_yes_2020,income_no_2020,income_0_10k_2020,income_10_25k_2020,income_25_35k_2020,income_35_45k_2020,income_45_55k_2020,income_55_65k_2020,income_65_75k_2020,income_over_75k_2020,median_age_2020,bachelors_degree_2020,health_insurance_less_hs_2020,income_yes_2021,income_no_2021,income_0_10k_2021,income_10_25k_2021,income_25_35k_2021,income_35_45k_2021,income_45_55k_2021,income_55_65k_2021,income_65_75k_2021,income_over_75k_2021,median_age_2021,bachelors_degree_2021,health_insurance_less_hs_2021,income_yes_2022,income_no_2022,income_0_10k_2022,income_10_25k_2022,income_25_35k_2022,income_35_45k_2022,income_45_55k_2022,income_55_65k_2022,income_65_75k_2022,income_over_75k_2022,median_age_2022,bachelors_degree_2022,health_insurance_less_hs_2022,income_yes_2023,income_no_2023,income_0_10k_2023,income_10_25k_2023,income_25_35k_2023,income_35_45k_2023,income_45_55k_2023,income_55_65k_2023,income_65_75k_2023,income_over_75k_2023,median_age_2023,bachelors_degree_2023,health_insurance_less_hs_2023
0,"Census Tract 9587, Defiance County, Ohio",3494.0,39,39,958700,527.0,213.0,649.0,660.0,653.0,297.0,191.0,304.0,209.0,41.6,550.0,143.0,3585.0,555.0,240.0,759.0,621.0,703.0,215.0,155.0,337.0,169.0,42.4,514.0,79.0,3564.0,539.0,282.0,752.0,548.0,673.0,226.0,157.0,387.0,181.0,42.1,591.0,82.0,3601.0,513.0,285.0,703.0,555.0,708.0,310.0,45.0,482.0,202.0,43.3,490.0,107.0,3601.0,519.0,303.0,754.0,557.0,651.0,289.0,74.0,454.0,181.0,43.7,472.0,126.0,3547.0,609.0,348.0,680.0,347.0,676.0,344.0,75.0,468.0,255.0,47.9,453.0,150.0,3802.0,687.0,344.0,608.0,479.0,664.0,371.0,138.0,511.0,247.0,45.7,509.0,182.0,,,,,,,,,,,,,,,,,,,,,,,,,,
1,"Census Tract 9589, Defiance County, Ohio",3853.0,39,39,958900,695.0,500.0,759.0,476.0,658.0,335.0,149.0,281.0,266.0,43.2,250.0,158.0,3829.0,611.0,562.0,854.0,462.0,646.0,322.0,109.0,263.0,244.0,44.9,197.0,124.0,3777.0,583.0,553.0,719.0,488.0,552.0,428.0,155.0,299.0,258.0,44.7,168.0,106.0,3766.0,524.0,388.0,602.0,616.0,732.0,454.0,216.0,234.0,248.0,48.8,184.0,113.0,3667.0,482.0,357.0,576.0,558.0,718.0,454.0,164.0,358.0,245.0,46.0,278.0,66.0,3957.0,463.0,369.0,673.0,495.0,783.0,559.0,112.0,503.0,215.0,51.8,259.0,108.0,3840.0,452.0,293.0,560.0,589.0,707.0,543.0,152.0,544.0,210.0,51.5,322.0,111.0,,,,,,,,,,,,,,,,,,,,,,,,,,
2,"Census Tract 9581, Defiance County, Ohio",3115.0,39,39,958100,544.0,263.0,520.0,423.0,473.0,452.0,145.0,295.0,206.0,39.9,317.0,72.0,3071.0,436.0,267.0,552.0,380.0,559.0,456.0,121.0,300.0,222.0,41.1,303.0,33.0,3062.0,390.0,266.0,641.0,284.0,599.0,421.0,165.0,296.0,208.0,43.8,326.0,35.0,3113.0,389.0,187.0,647.0,330.0,628.0,397.0,159.0,376.0,222.0,42.8,372.0,49.0,3190.0,356.0,205.0,555.0,338.0,608.0,463.0,177.0,488.0,271.0,45.5,390.0,65.0,3262.0,309.0,228.0,471.0,317.0,796.0,334.0,228.0,579.0,297.0,44.1,422.0,52.0,3292.0,306.0,184.0,469.0,405.0,687.0,439.0,178.0,624.0,261.0,43.1,436.0,54.0,,,,,,,,,,,,,,,,,,,,,,,,,,
3,"Census Tract 9583, Defiance County, Ohio",3083.0,39,39,958300,592.0,417.0,658.0,482.0,466.0,284.0,68.0,116.0,129.0,40.5,176.0,207.0,3116.0,486.0,344.0,737.0,607.0,440.0,300.0,78.0,124.0,124.0,40.3,193.0,299.0,3235.0,449.0,339.0,622.0,697.0,462.0,374.0,134.0,158.0,174.0,40.2,274.0,185.0,3209.0,405.0,358.0,552.0,754.0,459.0,442.0,140.0,99.0,163.0,38.9,258.0,61.0,3212.0,422.0,264.0,585.0,696.0,495.0,376.0,185.0,189.0,164.0,38.3,280.0,98.0,2907.0,418.0,164.0,527.0,611.0,445.0,397.0,182.0,163.0,142.0,35.7,293.0,71.0,2894.0,440.0,193.0,485.0,590.0,423.0,461.0,113.0,189.0,147.0,36.0,272.0,85.0,,,,,,,,,,,,,,,,,,,,,,,,,,
4,"Census Tract 9584, Defiance County, Ohio",3029.0,39,39,958400,513.0,287.0,518.0,596.0,547.0,266.0,106.0,196.0,208.0,44.0,233.0,129.0,2976.0,545.0,329.0,432.0,686.0,418.0,235.0,129.0,202.0,219.0,43.9,243.0,135.0,3052.0,470.0,338.0,438.0,697.0,475.0,299.0,135.0,200.0,145.0,43.1,268.0,161.0,3034.0,460.0,291.0,408.0,626.0,485.0,407.0,115.0,242.0,102.0,43.8,295.0,124.0,2993.0,444.0,301.0,398.0,545.0,471.0,419.0,118.0,297.0,106.0,43.3,295.0,120.0,3005.0,433.0,286.0,359.0,510.0,499.0,506.0,72.0,340.0,90.0,37.0,275.0,111.0,2725.0,327.0,197.0,341.0,417.0,505.0,429.0,101.0,408.0,156.0,36.6,210.0,92.0,,,,,,,,,,,,,,,,,,,,,,,,,,


In [55]:
year_cols = {}
for col in df.columns:
    parts = col.split('_')
    if len(parts) > 1 and parts[-1].isdigit() and len(parts[-1]) == 4:
        year = parts[-1]
        base_name = '_'.join(parts[:-1])
        if year not in year_cols:
            year_cols[year] = []
        year_cols[year].append((base_name, col))
year_cols  # maps each year to a list of (base_name, original_column_name) tuples

{'2015': [('income_yes', 'income_yes_2015'),
  ('income_no', 'income_no_2015'),
  ('income_0_10k', 'income_0_10k_2015'),
  ('income_10_25k', 'income_10_25k_2015'),
  ('income_25_35k', 'income_25_35k_2015'),
  ('income_35_45k', 'income_35_45k_2015'),
  ('income_45_55k', 'income_45_55k_2015'),
  ('income_55_65k', 'income_55_65k_2015'),
  ('income_65_75k', 'income_65_75k_2015'),
  ('income_over_75k', 'income_over_75k_2015'),
  ('median_age', 'median_age_2015'),
  ('bachelors_degree', 'bachelors_degree_2015'),
  ('health_insurance_less_hs', 'health_insurance_less_hs_2015')],
 '2016': [('income_yes', 'income_yes_2016'),
  ('income_no', 'income_no_2016'),
  ('income_0_10k', 'income_0_10k_2016'),
  ('income_10_25k', 'income_10_25k_2016'),
  ('income_25_35k', 'income_25_35k_2016'),
  ('income_35_45k', 'income_35_45k_2016'),
  ('income_45_55k', 'income_45_55k_2016'),
  ('income_55_65k', 'income_55_65k_2016'),
  ('income_65_75k', 'income_65_75k_2016'),
  ('income_over_75k', 'income_over_75k_2016

In [56]:
dfs = {}
for year, cols in year_cols.items():
    new_df = df[['NAME', 'state', 'county', 'tract']].copy() 
    new_df['year'] = year
    for base_name, col in cols:
        new_df[base_name] = df[col]
    dfs[year] = new_df

In [57]:
final_df = pd.concat(dfs.values(), ignore_index=True)

In [59]:
# Reshaped from wide to long format: one row per (NAME, tract, year)
final_df.head()

Unnamed: 0,NAME,state,county,tract,year,income_yes,income_no,income_0_10k,income_10_25k,income_25_35k,income_35_45k,income_45_55k,income_55_65k,income_65_75k,income_over_75k,median_age,bachelors_degree,health_insurance_less_hs
0,"Census Tract 9587, Defiance County, Ohio",39,39,958700,2015,3494.0,527.0,213.0,649.0,660.0,653.0,297.0,191.0,304.0,209.0,41.6,550.0,143.0
1,"Census Tract 9589, Defiance County, Ohio",39,39,958900,2015,3853.0,695.0,500.0,759.0,476.0,658.0,335.0,149.0,281.0,266.0,43.2,250.0,158.0
2,"Census Tract 9581, Defiance County, Ohio",39,39,958100,2015,3115.0,544.0,263.0,520.0,423.0,473.0,452.0,145.0,295.0,206.0,39.9,317.0,72.0
3,"Census Tract 9583, Defiance County, Ohio",39,39,958300,2015,3083.0,592.0,417.0,658.0,482.0,466.0,284.0,68.0,116.0,129.0,40.5,176.0,207.0
4,"Census Tract 9584, Defiance County, Ohio",39,39,958400,2015,3029.0,513.0,287.0,518.0,596.0,547.0,266.0,106.0,196.0,208.0,44.0,233.0,129.0
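
The wide-to-long reshape above could also be written with `melt` and `unstack`, which avoids building one DataFrame per year. This is a sketch under assumptions: the helper name `wide_to_long_by_year` is hypothetical, every non-identifier column is assumed to end in `_YYYY`, and the toy frame reuses only a few values from the table above. Unlike the cells above, it stores `year` as an integer rather than a string.

```python
import pandas as pd


def wide_to_long_by_year(wide, id_cols):
    """Turn columns named '<variable>_<year>' into one row per id and year."""
    # Stack all value columns into (column name, value) pairs
    long = wide.melt(id_vars=id_cols, var_name="column", value_name="value")
    # Split 'median_age_2015' into 'median_age' and '2015' at the last underscore
    parts = long["column"].str.rsplit("_", n=1, expand=True)
    long["variable"] = parts[0]
    long["year"] = parts[1].astype(int)
    # Spread the variables back out into columns, keyed by id columns and year
    out = (
        long.set_index(id_cols + ["year", "variable"])["value"]
        .unstack("variable")
        .reset_index()
    )
    out.columns.name = None
    return out


toy = pd.DataFrame({
    "tract": ["958700", "958900"],
    "median_age_2015": [41.6, 43.2],
    "median_age_2016": [42.4, 44.9],
    "income_yes_2015": [3494.0, 3853.0],
    "income_yes_2016": [3585.0, 3829.0],
})
long_df = wide_to_long_by_year(toy, ["tract"])
print(long_df)
```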


In [61]:
final_df.tract.unique()

array(['958700', '958900', '958100', '958300', '958400', '958500',
       '958600', '958800', '958200'], dtype=object)

In [60]:
sdf.head()

Unnamed: 0,OBJECTID,PrimaryKey,CaseNumber,District,UCRdesc,OffenseYear,TimeGroup,ReportedDate,OffenseMonth,OffenseDay,TimeBlock,DOWname,DOW,HourofDay,DaysAgo,OffenseDate,Statute,Zip,StatDesc,Address_Public,std_parcelpin,WARD,City,CENSUS_TRACT,CENSUS_TRACT_GEOID,CENSUS_BLOCK_GROUP,CENSUS_BG_GEOID,CENSUS_BLOCK,CENSUS_BLOCK_GEOID,LAT,LON,SHAPE
0,1,201700094950001,2017-00094950,District 4,Fraud,2017,Older,2017-03-30 22:07:00,3,30,Day,Thursday,5,18,2904,2017-03-30 22:07:00,2913.49,44105.0,Identity Theft,37XX E 71ST ST,13321034.0,Ward 12,Cleveland,Census Tract 1158,39035115800,Block Group 4,390351158004,Block 4000,390351158004000,41.456787,-81.638936,"{""x"": -9088004.822074441, ""y"": 5079952.8711490..."
1,2,201700095182001,2017-00095182,District 2,All Other Offenses,2017,Older,2017-03-31 02:18:00,3,30,Early Night,Thursday,5,22,2904,2017-03-31 02:18:00,2921.32,44109.0,Obstructing Justice,42XX W 23RD ST,912083.0,Ward 12,Cleveland,Census Tract 1057,39035105700,Block Group 1,390351057001,Block 1010,390351057001010,41.440289,-81.700059,"{""x"": -9094808.934664156, ""y"": 5077502.7123899..."
2,3,201700253587001,2017-00253587,District 3,Theft,2017,Older,2017-08-07 13:04:00,8,6,Early Night,Sunday,1,22,2775,2017-08-07 02:00:00,625.05-H,44103.0,Petty Theft,66XX WADE PARK AVE,10608059.0,Ward 7,Cleveland,Census Tract 1987,39035198700,Block Group 2,390351987002,Block 2000,390351987002000,41.515435,-81.64455,"{""x"": -9088629.768549955, ""y"": 5088668.0574445..."
3,4,201700095299002,2017-00095299,District 4,All Other Offenses,2016,Older,2017-03-31 05:20:00,1,1,Night,Friday,6,1,3358,2016-01-01 06:20:00,2905.01,44105.0,Kidnapping,101XX PRINCE AVE,13518143.0,Ward 2,Cleveland,Census Tract 1275.01,39035127501,Block Group 1,390351275011,Block 1018,390351275011018,41.45432,-81.614474,"{""x"": -9085281.648472922, ""y"": 5079586.5056212..."
4,5,201700253788003,2017-00253788,District 4,Vandalism,2017,Older,2017-08-07 16:50:00,7,26,Day,Wednesday,4,12,2786,2017-07-26 16:50:00,623.02,,Criminal Damaging Or Endangering,HARVARD AVE,,Not Located,Not Located,Not Located,Not Located,Not Located,Not Located,Not Located,Not Located,,,
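
One way to combine the two tables is to build an 11-digit tract GEOID on the census side (2-digit state, 3-digit county, 6-digit tract) and match it against `CENSUS_TRACT_GEOID` in the crime data. The sketch below uses tiny stand-in frames: the two GEOIDs are copied from the rows above, but the `median_age` values and the `crime_count` column name are made up for illustration. It assumes `CENSUS_TRACT_GEOID` is stored as a string and that `year` has the same integer dtype on both sides. Zero-padding matters because the county code appears as `39` rather than `039` in the table above.

```python
import pandas as pd

# Stand-in for the long-format census table (median_age values are invented)
census_long = pd.DataFrame({
    "state": ["39", "39"],
    "county": ["35", "35"],  # leading zero lost, as in the table above
    "tract": ["115800", "105700"],
    "year": [2017, 2017],
    "median_age": [35.0, 38.0],
})

# Stand-in for the crime incidents
crime = pd.DataFrame({
    "CENSUS_TRACT_GEOID": ["39035115800", "39035105700", "39035115800", "Not Located"],
    "OffenseYear": [2017, 2017, 2017, 2017],
})

# Rebuild the 11-digit GEOID, restoring any stripped leading zeros
census_long["GEOID"] = (
    census_long["state"].astype(str).str.zfill(2)
    + census_long["county"].astype(str).str.zfill(3)
    + census_long["tract"].astype(str).str.zfill(6)
)

# Drop incidents without a usable tract (e.g. "Not Located" or missing)
located = crime[crime["CENSUS_TRACT_GEOID"].str.fullmatch(r"\d{11}", na=False)]

# Count incidents per tract and year
counts = (
    located.groupby(["CENSUS_TRACT_GEOID", "OffenseYear"])
    .size()
    .reset_index(name="crime_count")
    .rename(columns={"CENSUS_TRACT_GEOID": "GEOID", "OffenseYear": "year"})
)

# Left join keeps every census tract-year; tracts with no incidents get 0
combined = census_long.merge(counts, on=["GEOID", "year"], how="left")
combined["crime_count"] = combined["crime_count"].fillna(0).astype(int)
print(combined[["GEOID", "year", "median_age", "crime_count"]])
```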


## Part 3: Visualizations

## Part 4: Correlation Analysis
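
As a starting point for this part, the Pearson correlation between a crime measure and each socioeconomic column can be computed with pandas. The sketch below uses invented toy values and a hypothetical `crime_count` column; for a single predictor, the squared correlation equals the share of variance explained by a simple linear fit.

```python
import pandas as pd

# Toy values for illustration only (not real data)
toy = pd.DataFrame({
    "crime_count": [10, 20, 30, 40, 50],
    "median_age": [50.0, 45.0, 40.0, 35.0, 30.0],
    "bachelors_degree": [5.0, 3.0, 6.0, 2.0, 4.0],
})

# Pearson r between each socioeconomic column and the crime measure
r = toy.drop(columns="crime_count").corrwith(toy["crime_count"])

# Share of variance in crime_count explained by each factor on its own
r_squared = r ** 2
print(pd.DataFrame({"r": r, "r_squared": r_squared}))
```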