# Workflow

## Import statements & function definitions

In addition to geopy and folium, we're going to import a few more packages:
* [geopandas](https://geopandas.org/), a geospatial extension to pandas.
* [branca](https://pypi.org/project/branca/), a module that helps us create simple choropleth maps.
* [ipywidgets](https://ipywidgets.readthedocs.io/en/latest/), which we'll use to create a progress bar.

It is good practice to define functions near the top of the code, so we'll define the plot_point() function here.

In [1]:
import folium
import numpy as np
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
import branca.colormap as cm
from geopy.geocoders import MapBox
from geopy.point import Point
from ipywidgets import FloatProgress

def plot_point(Map,X,Y,Popup_Text,Color='red',Radius=5,Opacity=.75,LineColor='black',LineWidth=.15):
    # Adds one circle marker to Map; X is the latitude and Y is the longitude
    folium.CircleMarker(
        # The coordinates, in [latitude, longitude] order
        location=[X,Y],
        # Text description
        popup=Popup_Text,
        # Sets the fill color for the point
        fill_color=Color,
        # Size of the marker
        radius=Radius,
        # Opacity of the circle
        fill_opacity = Opacity,
        # Sets the line color for the edge
        color=LineColor,
        # Width of the border line (folium names this path option weight)
        weight=LineWidth,
    ).add_to(Map)

## Enter your access token!
First, you must find your [access token](https://account.mapbox.com/access-tokens/). Copy and paste it into the code below.
* If you get an error message here, the most likely cause is that the access token was not pasted in properly.
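
Pasting a token straight into a notebook makes it easy to share by accident. A minimal sketch of one alternative, assuming the token has been exported as an environment variable beforehand. The variable name `MAPBOX_ACCESS_TOKEN` and the helper `load_token` are hypothetical choices for this sketch, not something geopy or Mapbox requires:

```python
import os

def load_token(env=None, var_name="MAPBOX_ACCESS_TOKEN"):
    """Return the token stored in an environment variable, or '' if it is unset.

    `var_name` is a hypothetical name chosen for this sketch.
    """
    # Fall back to the real process environment when no mapping is passed in
    env = os.environ if env is None else env
    # strip() guards against stray whitespace picked up while copying and pasting
    return env.get(var_name, "").strip()

access_token = load_token()
if access_token == "":
    print("Enter your access token to continue")
```

The returned string can then be used wherever the pasted token would otherwise go.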

In [2]:
access_token="pk.eyJ1IjoianVuZXNwYWNlYm9vdHMiLCJhIjoiY2twY3g4aXloMWFlcDJzbXN3aG95aG5uZiJ9.mFiJt0MIfL1MiJ2rB2xhKQ"
if access_token == "":
    print('Enter your access token to continue')
else:
    geolocator = MapBox(api_key=access_token)
    print('Mapbox Geolocator Loaded')

Mapbox Geolocator Loaded


## Importing the text data

We'll use pandas to import a .csv file. This works even if the file is stored in a remote location such as a GitHub repository.
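
To try the same loading pattern without a network connection, here is a small sketch that reads a made-up CSV from an in-memory string. The column values are invented for illustration and are not taken from the dataset:

```python
import io
import pandas as pd

# A tiny made-up CSV standing in for the remote file
csv_text = "date,prov,age\n2016-03-01,BC,40\n2015-07-15,ON,31\n"

# read_csv accepts a URL, a local path, or any file-like object
toy = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"]).set_index("date", drop=True)

# The index is now a DatetimeIndex, so date parts such as .year are available
print(toy.index.year.tolist())
```

Parsing the date column up front is what makes the later year-based filtering possible.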

In [3]:
Dpath = 'https://raw.githubusercontent.com/Police-Involved-Deaths-CA/data/main/MostRecentUpdate/Police_Killings_and_Police_Inolved_Deaths.csv'
PID_Canada = pd.read_csv(Dpath,
                        parse_dates=['date'],
                        ).set_index('date',drop = True)

PID_Canada

| date | INDEX | id_victim | first_name | last_name | middle_name | age | gender | race | prov | department | ... | substance_abuse | charge_type | Comp | id_incident | KCC_posts | ID | Temp_Date | summary | ds_rank | Type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2000-01-05 | 235 | 235_KCC | Paul | Murdock | | 25.0 | | Not Specified | ON | Toronto Police Service | ... | | | | | 56_KCC | 1066_KCC | 2000.001 | | | Police Involved Death |
| 2000-01-19 | 236 | 236_KCC | Lloyd | Dustyhorn | | 53.0 | Male | Indigenous | SK | Saskatoon Police Service | ... | | | | | 50_KCC | 1089_KCC | 2000.001 | | | Police Killing |
| 2000-01-29 | 237 | 237_KCC | Rodney | Naitus | | 25.0 | Male | Indigenous | SK | Saskatoon Police Service | ... | | | https://en.wikipedia.org/wiki/Neil_Stonechild | | 50_KCC | 1090_KCC | 2000.001 | | | Police Killing |
| 2000-01-30 | 238 | 0413_V1 | Stuart | Mitchell | | 49.0 | Male | Not Specified | ON | Toronto Police Service | ... | Yes | | | 413.0 | 56_KCC | 507_KCC | 2000.001 | | | Police Killing |
| 2000-02-03 | 239 | 239_KCC | Lawrence | Wegner | | 30.0 | Male | Indigenous | SK | Saskatoon Police Service | ... | | | https://en.wikipedia.org/wiki/Neil_Stonechild | | 50_KCC | 1091_KCC | 2000.002 | | | Police Killing |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2011-08-15 | 1375 | MS_23 | Lynn | Kalmring | | 55.0 | Female | White | BC | RCMP | ... | | | | | | | 2011.080 | | | Police Killing |
| 2013-02-15 | 1376 | MS_24 | Lena | Anderson | | | Female | Indigenous | ON | Nishnawbe-Aski Police Service | ... | | | | | | | 2013.020 | | | Police Involved Death |
| 2019-12-29 | 1377 | MS_25 | | | | 32.0 | Male | Not Specified | AB | RCMP | ... | | | | | | | 2019.120 | | | Police Killing |
| 2020-10-04 | 1378 | MS_26 | Jean | Belhumeur | | 41.0 | Male | White | QC | Surete du Quebec | ... | | | | | | | 2020.100 | | | Police Killing |


# Select Recent Deaths in BC

In [4]:
PID_BC = PID_Canada.loc[((PID_Canada['prov']=='BC')&
                          (PID_Canada['prov'].index.year>=2016))].copy()

PID_BC.groupby('cause_death').count()['INDEX'].sort_values()

cause_death
Drowning/Hypothermia     1
Hit During Pursuit       1
Medical distress         1
Police Dog               1
Restraint                1
Crash During Pusuit      3
Overdose                 3
Fall                     5
Intermediat weapon       8
Physical force          10
Missing                 12
Gunshot                 24
Name: INDEX, dtype: int64
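
The selection combines two boolean conditions, one on a column and one on the year of the date index, and then tallies a categorical column. A small sketch of the same idea on made-up rows (not taken from the dataset), using `value_counts()` as an alternative way to produce the tally:

```python
import pandas as pd

# Made-up rows for illustration only
toy = pd.DataFrame(
    {"prov": ["BC", "BC", "ON", "BC", "BC"],
     "cause_death": ["Gunshot", "Fall", "Gunshot", "Gunshot", "Fall"]},
    index=pd.to_datetime(
        ["2017-05-01", "2015-02-10", "2018-08-20", "2019-11-03", "2020-01-15"]
    ),
)

# Combine two boolean masks with &; each condition needs its own parentheses
recent_bc = toy.loc[(toy["prov"] == "BC") & (toy.index.year >= 2016)].copy()

# value_counts() tallies each category; sort_values() orders from rarest to most common
counts = recent_bc["cause_death"].value_counts().sort_values()
print(counts)
```

Note that `value_counts()` counts non-missing values of the column itself, whereas grouping and counting another column counts that column's non-missing values within each group.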

## Geocode the Locations

This dataset has postal codes, which are very specific identifiers. It also came with the street address of each incident, but I've removed that information for the sake of privacy. We'll search for each incident using the following query:
* Address (intersection) + City + Province

We have to create some new columns to hold the new data (latitude and longitude). Some of our requests may fail, so we'll create a geocoding_Notes column to denote failures.
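
The failure handling can be tried without calling a real geocoding service. The sketch below uses a stand-in geocoder that knows a single place. `FakeLocation`, `fake_geocode` and `geocode_row` are hypothetical names written for this illustration, and the coordinates are made up. It relies on the convention, which geopy follows, that a geocoder returns None when nothing matches:

```python
from collections import namedtuple

# Stand-in for the location object a geocoder returns (hypothetical, for this sketch)
FakeLocation = namedtuple("FakeLocation", ["latitude", "longitude"])

def fake_geocode(query):
    """Pretend geocoder: knows one place and returns None for everything else."""
    known = {"Main St Vancouver BC": FakeLocation(49.28, -123.12)}
    return known.get(query)

def geocode_row(geocode, parts):
    """Return (latitude, longitude, note, attempt) for one record."""
    # Skip blank pieces so a missing address does not leave stray spaces
    attempt = " ".join(p.strip() for p in parts if p and p.strip())
    try:
        result = geocode(attempt)
    except Exception:
        # Network errors, timeouts and similar problems all count as a failure here
        result = None
    if result is None:
        return (float("nan"), float("nan"), "Failed", attempt)
    return (result.latitude, result.longitude, "", attempt)

print(geocode_row(fake_geocode, ["Main St", "Vancouver", "BC"]))
print(geocode_row(fake_geocode, [" ", "Nowhere", "BC"]))
```

Returning the attempted query alongside the result makes it easier to review which searches failed afterwards.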

In [12]:

PID_BC['address_intersection']=PID_BC['address_intersection'].fillna(' ').str.replace(' of ',' ')
PID_BC['latitude'] = np.nan
PID_BC['longitude'] = np.nan
PID_BC['geocoding_Notes'] = ''
# Records the text that was sent to the geocoder for each row
PID_BC['geocoding_attempt'] = ''



i = 0
prog = FloatProgress(min=0, max=100,description='Progress:')
prog.value=0
display(prog)

# iterrows() allows us to loop through row by row
for index, row in PID_BC.iterrows():
    # Reset the query text so a failure never records the previous row's attempt
    attempt = ''
    # try statements let us attempt something that might fail
    try:
        # ' '.join() concatenates the fields with spaces between them
        attempt = ' '.join([row['address_intersection'],row['city_town'], row['prov']])

        # Query the geocoder. A timeout of 3 seconds gives each request ample time
        g = geolocator.geocode(attempt,timeout=3,country='CA')
        PID_BC.loc[PID_BC['id_victim']==row['id_victim'],
                       ['latitude','longitude','geocoding_attempt']]=g.latitude,g.longitude,attempt

    # If the try fails (for example g is None because nothing matched),
    # we add to the geocoding notes and move on to the next row
    except Exception:
        PID_BC.loc[PID_BC['id_victim']==row['id_victim'],
                    ['geocoding_Notes','geocoding_attempt']]='Failed',attempt
    
    ## Show the progress
    i += 1
    prog.value=i/len(PID_BC)*100
    
print('Geocoding Done.')
print('Number of Failures: ',PID_BC.loc[PID_BC['geocoding_Notes']=='Failed','id_victim'].count())

FloatProgress(value=0.0, description='Progress:')

Geocoding Done.
Number of Failures:  0


In [13]:
BC_coords = geolocator.geocode('BC, Canada')


## We can set the basemap to a basic black and white
BCMap = folium.Map(
    location=[BC_coords.latitude,BC_coords.longitude],
    zoom_start=5,
#     tiles='Stamen Toner'
)

PID_BC[['race','gender','age','city_town','prov','postal_code']]=PID_BC[['race','gender','age','city_town','prov','postal_code']].fillna('')
for index, row in PID_BC.iterrows():
    # if the geocoding didn't fail, we'll plot the point
    if row['geocoding_Notes'] != 'Failed':
        plot_point(Map=BCMap,
                   X=row['latitude'],
                   Y=row['longitude'],
                   Popup_Text=row['geocoding_attempt'],
#                    Color=color_Scheme[row['prov']]
                  )

BCMap

In [None]:
from folium import plugins

Van_coords = geolocator.geocode('Vancouver, BC, Canada')

data = np.array(
    [PID_BC.latitude.values,
     PID_BC.longitude.values
    ]
).T
popups = list(PID_BC.department.fillna('Missing').values)
BCmap2 = folium.Map([Van_coords.latitude,Van_coords.longitude], zoom_start=7)

plugins.MarkerCluster(data, popups=popups,
                     ).add_to(BCmap2)

BCmap2
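
If any rows had failed to geocode, their latitude and longitude would still be NaN when handed to the cluster, which may not plot cleanly. A minimal sketch of dropping such rows first, using made-up coordinates and labels (none of these values come from the dataset):

```python
import numpy as np

# Made-up coordinates; the NaN pair stands for a record that failed to geocode
latitude = np.array([49.28, np.nan, 53.92])
longitude = np.array([-123.12, np.nan, -122.75])
labels = ["A", "B", "C"]

# Stack into an (n, 2) array of [latitude, longitude] pairs
data = np.array([latitude, longitude]).T

# Keep only rows where both coordinates are real numbers
valid = ~np.isnan(data).any(axis=1)
clean_data = data[valid]
clean_labels = [label for label, keep in zip(labels, valid) if keep]

print(clean_data.shape, clean_labels)
```

Filtering the labels with the same mask keeps each popup aligned with its point.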



# Making a Choropleth and Adding Points

In [None]:
import branca.colormap as cm


BC_Final_Map = folium.Map(
    location=[BC_coords.latitude,BC_coords.longitude],
    zoom_start=5,
    tiles='Stamen Toner'
)

# colormap = cm.linear.PuRd_05.scale(BC_Sub_Div['Total Population, 2020'].min(), BC_Sub_Div['Total Population, 2020'].max())
colormap = cm.LinearColormap(['#f5f8fa','#0f91f5'],
                             vmin=0,
                             vmax=17000)
colormap = colormap.to_step(n=10)
colormap.caption = 'Total Population'
colormap.add_to(BC_Final_Map)

folium.GeoJson(
    'data/BC_Sub_Div.json',
    name='Total Population',
    smooth_factor=1.75,
    style_function = lambda x:{'color':'black',
                               "weight": 1,
                               "fillOpacity": 1,
                              'fillColor':colormap(x['properties']['Population'])
                              },
    tooltip=folium.features.GeoJsonTooltip(fields=['Population',
                                                   'Visible Minority',
                                                   'Indigenous Identity',
                                                   'Indigenous_Pct'],
                                           aliases=['Total Population, 2021',
                                                    'Visible Minority Population, 2021',
                                                    'Indigenous Population, 2021',
                                                    'Percent Indigenous, 2021',]
                                          ),
    show = True
).add_to(BC_Final_Map)


group = folium.FeatureGroup(name='Police Involved Deaths since 2016')
for index, row in PID_BC.iterrows():
    # if the geocoding didn't fail, we'll plot the point
    if row['geocoding_Notes'] != 'Failed':
#         print(row['race'], row['gender'],str(row['age']))
        plot_point(Map=group,
                   X=row['latitude'],
                   Y=row['longitude'],
                   Popup_Text=str(row['race'])+' '+ str(row['gender']) + ' '+ str(row['age'])
                     + ' '+ str(row['geocoding_attempt']),
                   Color='red'#color_Scheme[row['prov']]
                  )#.add_to(Toronto_Map)

group.add_to(BC_Final_Map)

folium.LayerControl().add_to(BC_Final_Map)
BC_Final_Map
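
To make the idea of a stepped colour scale concrete, here is a rough pure-Python sketch that bins a value into one of `n` colours between two hex codes. The helpers `hex_to_rgb` and `step_color` are hypothetical and written only for illustration; branca's own `to_step` may choose bin edges and colours differently.

```python
def hex_to_rgb(hex_color):
    """Convert a '#rrggbb' string into a tuple of three integers."""
    hex_color = hex_color.lstrip("#")
    return tuple(int(hex_color[i:i + 2], 16) for i in (0, 2, 4))

def step_color(value, vmin, vmax, start_hex, end_hex, n=10):
    """Pick one of n evenly spaced colours between start_hex and end_hex (n >= 2)."""
    # Clamp so out-of-range values take the first or last colour
    value = min(max(value, vmin), vmax)
    # Which of the n equal-width bins does the value fall in?
    bin_index = min(int((value - vmin) / (vmax - vmin) * n), n - 1)
    # Position of that bin along the gradient, from 0.0 (first) to 1.0 (last)
    t = bin_index / (n - 1)
    start, end = hex_to_rgb(start_hex), hex_to_rgb(end_hex)
    rgb = tuple(round(s + (e - s) * t) for s, e in zip(start, end))
    return "#{:02x}{:02x}{:02x}".format(*rgb)

# Same range and end colours as the colormap in this section
for population in (0, 100, 1600, 9000, 17000, 20000):
    print(population, step_color(population, 0, 17000, "#f5f8fa", "#0f91f5"))
```

With ten steps over 0 to 17000, every region whose value falls in the same 1700-wide band shares one fill colour, which is what makes a stepped legend easier to read than a continuous gradient.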

## Saving the Map data

In [None]:
PID_BC.to_csv('data/PID_BC_Geocoded.csv')
print('Geocoded data Saved')
# BC_Final_Map.save('../BC_Police_Involved_Deaths.html')
# print('Map Saved')
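
When the saved file is read back later, the date index has to be rebuilt explicitly. A short sketch of the round trip, using made-up rows and an in-memory buffer in place of a file on disk (the values are invented for illustration):

```python
import io
import pandas as pd

# Made-up geocoded rows standing in for the saved table
toy = pd.DataFrame(
    {"latitude": [49.28, 53.92], "longitude": [-123.12, -122.75]},
    index=pd.DatetimeIndex(["2017-05-01", "2019-11-03"], name="date"),
)

# Write to an in-memory buffer instead of a file on disk
buffer = io.StringIO()
toy.to_csv(buffer)
buffer.seek(0)

# index_col and parse_dates restore the date index when reading the data back
restored = pd.read_csv(buffer, index_col="date", parse_dates=["date"])
print(restored.index.year.tolist())
```

Without `parse_dates`, the index would come back as plain strings and year-based filtering would no longer work.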