## Step 1)

Import the necessary packages.

In [None]:
# The import command allows us to call on various libraries
import folium
import numpy as np
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
from pyproj import CRS
%matplotlib notebook

print('Packages Loaded')

## Step 2)

Read the geocoded data text file and convert it to a spatial data format.  This step is equivalent to "display xy data" in ArcGIS Pro.  Then we can quickly plot the data to make sure it imported properly.

### Question 7)
What do the X & Y axes on the plot represent?

In [None]:
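# Read the geocoded incidents, parsing the 'date' column and using it as the index
BC_Data = pd.read_csv('Data/BC_Geocoded.csv',index_col='date',parse_dates=['date'])
# Build point geometries from the longitude/latitude columns (WGS84 coordinates)
BC_Data_dgf = gpd.GeoDataFrame(BC_Data,
    geometry=gpd.points_from_xy(BC_Data.longitude,
                                BC_Data.latitude,
                                crs=CRS("WGS84")
                               ))
# Quick plot to check that the points imported properly
BC_Data_dgf.plot()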

## Step 3)

Read the Census Subdivision shapefile.  Plot the BC population data to make sure it imported properly.  Zoom and pan to see what regions have the highest populations. "Uncomment" the last two lines in this block and re-run to make sure the point layer lines up with the polygon layer.

### Question 8)
Zoom to a region of interest, take a screenshot showing both the point and polygon layers, and submit it to Canvas.

In [None]:
# the .read_file() function reads shapefiles
file_name='Data/CensusSubdivisions/SimplyAnalytics_Shapefiles_2021-06-04_04_40_18_c765599d6ce1e70cd26026412f68ed7c.shp'
# file_name='Data/CensusDivisions/SimplyAnalytics_Shapefiles_2021-06-04_19_49_41_bee8b66bf9167983003870d045f2acb1.shp'
BC_csd = gpd.read_file(file_name)

# Give the population column a descriptive name
BC_csd = BC_csd.rename(columns={'VALUE0': 'Population, 2016'})
fig,ax=plt.subplots()
BC_csd.plot(ax=ax,column='Population, 2016',cmap = 'Blues',edgecolor='black',legend=True)
# BC_Data_dgf.plot(ax=ax,color='r',legend=True,label='Police Involved Deaths')
# ax.legend()

## Step 4)

Loop through each row in the census subdivision layer.  Do a point in polygon vector overlay using the .within() function to find which incidents are in each polygon.  Add the total number of incidents for each subdivision as an attribute.

### Question 9)
What is the highest number of incidents in a single census subdivision?

In [None]:
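# Start every subdivision at zero incidents
BC_csd['Incidents'] = 0
for i,row in BC_csd.iterrows():
    # Boolean Series: True for each incident point that falls inside this polygon
    pip = BC_Data_dgf.within(row['geometry'])
    # Store the number of points within the polygon as an attribute
    BC_csd.loc[i,'Incidents'] = pip.sum()
BC_csd.plot(column='Incidents',cmap = 'Reds',edgecolor='black',legend=True)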
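
To make the point-in-polygon idea concrete, here is a minimal pure-Python sketch of a ray-casting test applied to toy data. It is only an illustration of the concept and is not how geopandas implements `.within()`. The function name `point_in_polygon`, the two square "subdivisions", and the four "incidents" are all hypothetical.

```python
def point_in_polygon(x, y, ring):
    """Return True if (x, y) lies inside the polygon described by ring,
    a list of (x, y) vertices. Uses the ray-casting rule: a point is
    inside if a ray drawn to the right crosses the boundary an odd
    number of times."""
    inside = False
    n = len(ring)
    for k in range(n):
        x1, y1 = ring[k]
        x2, y2 = ring[(k + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical toy data: two square "subdivisions" and four "incidents"
polygons = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
points = [(0.5, 0.5), (1.5, 1.0), (3.0, 1.0), (5.0, 5.0)]

# Count the points that fall inside each polygon, as the loop in Step 4 does
counts = {
    name: sum(point_in_polygon(x, y, ring) for x, y in points)
    for name, ring in polygons.items()
}
print(counts)  # {'A': 2, 'B': 1}
```

The last point lies outside both squares, so it is not counted anywhere. The same can happen with real data if an incident falls outside every subdivision polygon.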

## Step 5)

Normalize the data to calculate the police involved death rate.  Divide the number of incidents by the total population.  First do this for the whole province to calculate the provincial average.


### Question 10)
What is the provincial police involved death rate?
<!-- 2.88 -->

### Question 11)
What does this number mean?
<!-- For every million residents in BC, 2.88 people die from a police interaction per year -->

In [None]:
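# The record length in years, taken from the date index of the incident data
End_Year = BC_Data_dgf.index.year.max()
Start_Year = BC_Data_dgf.index.year.min()
Duration = End_Year-Start_Year
# Express the rate per one million residents
Unit = 1e6
# Converts incidents per resident (whole record) to incidents per Unit residents per year
Rate_conversion = Unit/Duration

Prov_rate = (BC_csd['Incidents'].sum()/BC_csd['Population, 2016'].sum()*Rate_conversion).round(2)
print('Province-Wide Police Involved Death Rate (June ',Start_Year,' - May ',End_Year,')')
print('per ', Unit,' Residents per year')
print(Prov_rate)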
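
The unit conversion above can be checked by hand with a small worked example. The sketch below repeats the same arithmetic (incidents divided by population, scaled to one million residents, divided by the number of years). The helper name `rate_per_year` and all the input numbers are hypothetical, chosen only to keep the arithmetic easy to follow; they are not the BC values.

```python
def rate_per_year(incidents, population, start_year, end_year, unit=1e6):
    """Incidents per `unit` residents per year over the record period."""
    duration = end_year - start_year          # same role as Duration above
    rate_conversion = unit / duration         # same role as Rate_conversion above
    return incidents / population * rate_conversion


# Hypothetical inputs: 50 incidents among 5 million residents over 10 years
example = rate_per_year(incidents=50, population=5_000_000,
                        start_year=2010, end_year=2020)
print(round(example, 2))  # 1.0
```

Here 50 incidents among 5 million residents is 10 per million over the whole record, or 1 per million residents per year once divided by the 10-year duration.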

## Step 6)

Repeat the normalization process for all census subdivisions.  Then select subdivisions with at least one incident and print the results.

### Question 12)
What general pattern do you notice in the relationship between the rate, the number of incidents, and the total population?  What explains this pattern?

In [None]:
BC_csd['rate']=BC_csd['Incidents']/BC_csd['Population, 2016']*Rate_conversion
BC_csd.loc[BC_csd['rate']>0,['name','Population, 2016','Incidents','rate']].sort_values(by='rate')

## Step 8)

Select all census subdivisions with populations greater than 1000 and rates greater than 0.  Plot them as a choropleth map.

### Question 13)

Change the middle bin value from 10 to 20, then re-run the code block.  Take a screenshot of the map (full scale or zoomed to a specific region) and submit it to Canvas.

In [None]:
# Create a webmap centered on BC at zoom level 5 with the default basemap   
Map = folium.Map(location=[53, -125],zoom_start=5)

BC_csd_select=BC_csd.loc[((BC_csd['rate']>0)&((BC_csd['Population, 2016']>1000)))]
BC_csd_select.to_file("Data/BC_csd_select.json", driver = "GeoJSON")
folium.features.Choropleth('Data/BC_csd_select.json',
                           # It will match the geometry data up with a pandas or geopandas dataframe
                            data=BC_csd_select,
                            columns=['spatial_id','rate'],
                           # The key in the GeoJSON file to match by
                            key_on='feature.properties.spatial_id',
                           # If we define bins, it will split where we tell it to
                            bins = [0,1,5,10,50,100],
                            fill_color='PuRd',
                            fill_opacity = 1,
                            smooth_factor=2,
                           # The legend label
                            legend_name='Police Involved Deaths per Million Residents per Year'
                          ).add_to(Map)
Map
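
To see what changing a bin edge does to the map classes, the sketch below assigns a few rates to classes using the same bin edges as the code above. It is a plain-Python illustration that assumes each class includes its lower edge and excludes its upper edge (except the last class); folium's exact edge handling may differ. The helper `bin_label` and the sample rates are hypothetical.

```python
from bisect import bisect_right


def bin_label(value, bins):
    """Return the 'low-high' class that value falls into, given sorted bin edges."""
    i = bisect_right(bins, value) - 1
    # Clamp so that a value equal to the top edge lands in the last class
    i = min(max(i, 0), len(bins) - 2)
    return f"{bins[i]}-{bins[i + 1]}"


bins_original = [0, 1, 5, 10, 50, 100]
bins_modified = [0, 1, 5, 20, 50, 100]   # middle edge changed from 10 to 20

# Hypothetical rates, not taken from the BC data
rates = [0.8, 7.5, 15.0, 42.0]
for r in rates:
    print(r, bin_label(r, bins_original), bin_label(r, bins_modified))
```

With these sample values, only the rate of 15.0 changes class (from 10-50 to 5-20), which is why moving a single bin edge recolors some polygons and leaves others unchanged.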