<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Libraries" data-toc-modified-id="Libraries-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Libraries</a></span></li><li><span><a href="#Data" data-toc-modified-id="Data-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Data</a></span><ul class="toc-item"><li><span><a href="#To-perform-this-map-we-need:" data-toc-modified-id="To-perform-this-map-we-need:-2.1"><span class="toc-item-num">2.1&nbsp;&nbsp;</span>To perform this map we need:</a></span></li><li><span><a href="#Merge-the-two-datasets" data-toc-modified-id="Merge-the-two-datasets-2.2"><span class="toc-item-num">2.2&nbsp;&nbsp;</span>Merge the two datasets</a></span></li></ul></li><li><span><a href="#Create-a-map" data-toc-modified-id="Create-a-map-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Create a map</a></span></li></ul></div>

![image](https://miro.medium.com/max/600/1*XLqDpdW6SC_MOOlx007N3w.gif)

# Libraries 

In [1]:
import pandas as pd
import geopandas as gpd
from keplergl import KeplerGl



# Data 

## To perform this map we need:

Your dataset will need to include three variables:
- a `datetime` variable (to enable time series functionality)
- a `latitude` variable (from the polygon centroid for each county)
- a `longitude` variable (from the polygon centroid for each county)

The police interaction dataset I used (see dataset source/validity section) includes the street address, city, state, zip code, and county of each incident. We’ll map the data by county, because the National Weather Service’s GIS shapefiles give each county’s central longitude and latitude, taken from the centroid of that county’s geospatial shape.
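The three required columns listed above can be checked before building the map. The following is a minimal sketch, not part of the original notebook: `missing_columns` and `REQUIRED_COLUMNS` are hypothetical names, and `LAT`/`LON` are assumed to be the coordinate column names because that is what the shapefile used later in this notebook provides.

```python
import pandas as pd

# Assumed column names: "datetime" is created later in this notebook,
# "LAT" and "LON" come from the county shapefile.
REQUIRED_COLUMNS = ["datetime", "LAT", "LON"]


def missing_columns(frame, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from `frame`."""
    return [name for name in required if name not in frame.columns]


# Illustrative frame that lacks a longitude column.
example = pd.DataFrame({"datetime": ["2021-03-31 00:00:00"], "LAT": [31.7093]})
print(missing_columns(example))  # ['LON']
```

An empty list means the frame has everything the time series map needs.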

In [3]:
df = pd.read_csv("../Input/MPV_2.csv", sep = ";")
df.head()

Unnamed: 0,Victim's name,Victim's age,Victim's gender,Victim's race,Date of Incident (month/day/year),Street Address of Incident,City,State,Zipcode,County,Cause of death,Symptoms of mental illness?,Armed/Unarmed Status,Alleged Weapon,Alleged Threat Level,Fleeing,Geography,MPV ID,Encounter Type (DRAFT)
0,Willie Roy Allen,57,Male,,31/03/2021,2626 Lithonia Industrial Blvd.,Lithonia,GA,30058.0,DeKalb,"Gunshot, Taser",No,Allegedly Armed,gun,other,foot,Suburban,8728,Violent crime
1,Jeffrey Ely,40,Male,White,31/03/2021,247 Sullivan St.,Claremont,NH,3743.0,Sullivan,Gunshot,No,Allegedly Armed,gun,attack,not fleeing,Rural,8729,Violent crime
2,Ivan Cuevas,27,Male,Hispanic,31/03/2021,North Conyer Street,Visalia,CA,93291.0,Tulare,Gunshot,Drug or alcohol use,Allegedly Armed,knife,other,not fleeing,Urban,8730,Domestic disturbance (family violence)
3,Anthony Alvarez,22,Male,Hispanic,31/03/2021,W. Eddy St. and N. Laramie Ave.,Chicago,IL,60641.0,Cook,Gunshot,No,Allegedly Armed,gun,,foot,Urban,8731,None/Unknown
4,Aaron Christopher Pouche,35,Male,,31/03/2021,E. 8th St. and S. Carlisle,Independence,MO,64054.0,Jackson,Gunshot,No,Allegedly Armed,gun,attack,not fleeing,Suburban,8732,Violent crime


In [4]:
# Append a midnight time to each date so the column can serve as a timestamp
df['datetime'] = df['Date of Incident (month/day/year)'].astype(str) + ' 0:00'

df.head()

Unnamed: 0,Victim's name,Victim's age,Victim's gender,Victim's race,Date of Incident (month/day/year),Street Address of Incident,City,State,Zipcode,County,Cause of death,Symptoms of mental illness?,Armed/Unarmed Status,Alleged Weapon,Alleged Threat Level,Fleeing,Geography,MPV ID,Encounter Type (DRAFT),datetime
0,Willie Roy Allen,57,Male,,31/03/2021,2626 Lithonia Industrial Blvd.,Lithonia,GA,30058.0,DeKalb,"Gunshot, Taser",No,Allegedly Armed,gun,other,foot,Suburban,8728,Violent crime,31/03/2021 0:00
1,Jeffrey Ely,40,Male,White,31/03/2021,247 Sullivan St.,Claremont,NH,3743.0,Sullivan,Gunshot,No,Allegedly Armed,gun,attack,not fleeing,Rural,8729,Violent crime,31/03/2021 0:00
2,Ivan Cuevas,27,Male,Hispanic,31/03/2021,North Conyer Street,Visalia,CA,93291.0,Tulare,Gunshot,Drug or alcohol use,Allegedly Armed,knife,other,not fleeing,Urban,8730,Domestic disturbance (family violence),31/03/2021 0:00
3,Anthony Alvarez,22,Male,Hispanic,31/03/2021,W. Eddy St. and N. Laramie Ave.,Chicago,IL,60641.0,Cook,Gunshot,No,Allegedly Armed,gun,,foot,Urban,8731,None/Unknown,31/03/2021 0:00
4,Aaron Christopher Pouche,35,Male,,31/03/2021,E. 8th St. and S. Carlisle,Independence,MO,64054.0,Jackson,Gunshot,No,Allegedly Armed,gun,attack,not fleeing,Suburban,8732,Violent crime,31/03/2021 0:00
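Although the source column is labeled month/day/year, the values shown above are in day/month/year order (for example `31/03/2021`). One way to remove the ambiguity is to parse the dates explicitly and write them back in ISO order. This is a sketch under that assumption, using a small stand-in series rather than the full dataset; whether the map needs this step depends on how the date strings are interpreted downstream.

```python
import pandas as pd

# Stand-in for df['Date of Incident (month/day/year)']; values are day/month/year.
dates = pd.Series(["31/03/2021", "01/06/2013"])

# Parse with an explicit day-first format so 01/06/2013 is read as 1 June.
parsed = pd.to_datetime(dates, format="%d/%m/%Y")

# Write the timestamps as unambiguous ISO-style strings.
iso = parsed.dt.strftime("%Y-%m-%d %H:%M:%S")
print(iso.tolist())  # ['2021-03-31 00:00:00', '2013-06-01 00:00:00']
```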


In [5]:
# The dataset has no latitude/longitude information, so we load a shapefile
# that provides the coordinates for each county
shapefile_data = gpd.read_file("../Input/c_10nv20.shp")

In [6]:
shapfile_raw = pd.DataFrame(shapefile_data)
shapfile_raw.head()

Unnamed: 0,STATE,CWA,COUNTYNAME,FIPS,TIME_ZONE,FE_AREA,LON,LAT,geometry
0,ME,CAR,Washington,23029,E,se,-67.6361,45.0363,"MULTIPOLYGON (((-67.93539 44.40382, -67.93643 ..."
1,GA,CHS,McIntosh,13191,E,se,-81.2646,31.5329,"MULTIPOLYGON (((-81.46814 31.33980, -81.46747 ..."
2,GA,CHS,Liberty,13179,E,se,-81.2103,31.7093,"POLYGON ((-81.30807 31.79454, -81.30546 31.791..."
3,AS,PPG,Swains Island,60040,S,,-171.0459,-11.0843,"POLYGON ((-171.04049 -11.08245, -171.03940 -11..."
4,AS,PPG,Manu'a,60020,S,,-169.506,-14.2219,"MULTIPOLYGON (((-169.42766 -14.21181, -169.427..."


## Merge the two datasets 

Once you’ve extracted the longitudes and latitudes, merge the coordinate table with the dataset using the state and county variables (which should be present in both datasets). In this case, the keys are State abbreviations and County names:

In [9]:
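# Merge the two datasets on their shared State column (STATE is renamed to match).
# Note: matching on state alone pairs each incident with every county in that state.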
df_2 = df.merge(shapfile_raw.rename(columns={'STATE':'State'}),how='outer')
df_2.sample(3)

Unnamed: 0,Victim's name,Victim's age,Victim's gender,Victim's race,Date of Incident (month/day/year),Street Address of Incident,City,State,Zipcode,County,...,Encounter Type (DRAFT),datetime,CWA,COUNTYNAME,FIPS,TIME_ZONE,FE_AREA,LON,LAT,geometry
751498,Roy Jacobs Jr.,48,Male,White,01/06/2013,4100 N. McDonald Road,Spokane,WA,99216.0,Spokane,...,,01/06/2013 0:00,SEW,Thurston,53067,P,wc,-122.8314,46.9282,"MULTIPOLYGON (((-122.85231 47.29516, -122.8551..."
299693,Jose Baca-Olivares,58,Male,Hispanic,31/07/2019,2700 of San Luis Street,San Antonio,TX,78207.0,Bexar,...,Domestic disturbance,31/07/2019 0:00,CRP,La Salle,48283,C,sc,-99.0997,28.3451,"POLYGON ((-98.80080 28.64751, -98.80119 28.638..."
378335,John Michael Brisco,52,Male,Black,09/06/2016,3700 Hwy 365,Port Arthur,TX,77642.0,Jefferson,...,,09/06/2016 0:00,FWD,Montague,48337,C,nc,-97.7246,33.6757,"POLYGON ((-97.66800 33.99081, -97.66620 33.990..."
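In the sample above, the incident's `County` and the shapefile's `COUNTYNAME` differ on every row, because the join key is the state only. A merge on both state and county, as the text describes, could look like the following sketch. The incident rows are illustrative, the coordinates are the ones shown in the shapefile preview, and the `drop_duplicates` step is a precaution in case the shapefile lists a county more than once.

```python
import pandas as pd

# Illustrative incidents; only the key columns matter here.
incidents = pd.DataFrame({
    "MPV ID": [1, 2, 3],
    "State": ["GA", "GA", "ME"],
    "County": ["Liberty", "DeKalb", "Washington"],
})

# County coordinates copied from the shapefile preview above.
counties = pd.DataFrame({
    "STATE": ["ME", "GA", "GA"],
    "COUNTYNAME": ["Washington", "McIntosh", "Liberty"],
    "LON": [-67.6361, -81.2646, -81.2103],
    "LAT": [45.0363, 31.5329, 31.7093],
})

# Align the key names and keep one row per (State, County) pair.
lookup = counties.rename(columns={"STATE": "State", "COUNTYNAME": "County"})
lookup = lookup.drop_duplicates(subset=["State", "County"])

# A left join keeps exactly one row per incident; `indicator` flags misses.
merged = incidents.merge(
    lookup,
    on=["State", "County"],
    how="left",
    validate="many_to_one",
    indicator=True,
)

# Incidents whose county was not found in the lookup table.
unmatched = merged[merged["_merge"] == "left_only"]
print(unmatched[["State", "County"]])
```

With this approach the row count stays equal to the number of incidents, and unmatched counties (here `DeKalb`, which is absent from the small lookup table) can be inspected for spelling differences.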


In [10]:
# Remove the columns that are not needed for the analysis
df_2.drop(columns=["Victim's race", "Street Address of Incident", "Date of Incident (month/day/year)",
                   "geometry", "Zipcode", "CWA", "Symptoms of mental illness?",
                   "Armed/Unarmed Status", "FIPS", "TIME_ZONE", "FE_AREA"], inplace=True)

In [11]:
# Drop every row that has a null value in any of the remaining columns
df_2.dropna(how='any', inplace = True)
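Dropping on `how='any'` also removes incidents that are missing a value in a column the map does not use. As a hedged alternative, the sketch below counts nulls per column first and then drops only rows without coordinates. The frame is illustrative, and restricting the check to `LON` and `LAT` is an assumption about which columns matter for the map.

```python
import pandas as pd

# Illustrative rows: one lacks an encounter type, one lacks coordinates.
rows = pd.DataFrame({
    "County": ["Liberty", "DeKalb", "Washington"],
    "Encounter Type (DRAFT)": [None, "Violent crime", "Traffic Stop"],
    "LON": [-81.2103, None, -67.6361],
    "LAT": [31.7093, None, 45.0363],
})

# Inspect how many nulls each column has before deciding what to drop.
print(rows.isna().sum())

# Drop only the rows that cannot be placed on the map.
mappable = rows.dropna(subset=["LON", "LAT"])

# For comparison: dropping on any null keeps fewer rows.
strict = rows.dropna(how="any")
```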

In [12]:
# Remove duplicates, keeping only the first row for each victim name
df_3 = df_2.drop_duplicates(subset=["Victim's name"])
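Deduplicating on the name alone would also collapse two different people who happen to share a name. The sketch below compares that with deduplicating on `MPV ID`, assuming (this is not confirmed by the source) that `MPV ID` is unique per incident. All values are made up for illustration.

```python
import pandas as pd

# Hypothetical rows: two distinct incidents share a victim name,
# and one incident appears twice.
rows = pd.DataFrame({
    "Victim's name": ["John Smith", "John Smith", "John Smith"],
    "MPV ID": [101, 101, 202],
    "County": ["Cook", "Cook", "Tulare"],
})

# Deduplicating by name keeps one row and loses the second incident.
by_name = rows.drop_duplicates(subset=["Victim's name"])

# Deduplicating by the assumed-unique ID keeps one row per incident.
by_id = rows.drop_duplicates(subset=["MPV ID"])

print(len(by_name), len(by_id))  # 1 2
```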

In [13]:
df_3.tail()

Unnamed: 0,Victim's name,Victim's age,Victim's gender,City,State,County,Cause of death,Alleged Weapon,Alleged Threat Level,Fleeing,Geography,MPV ID,Encounter Type (DRAFT),datetime,COUNTYNAME,LON,LAT
788922,Jeremy Potwin,39,Male,Tunbridge,VT,Orange,Gunshot,gun,attack,Not fleeing,Rural,6928.0,Other Offense Type,11/05/2019 0:00,Windsor,-72.5855,43.5798
788950,Benjamin Gregware,42,Male,Bolton,VT,Washington,Gunshot,gun,other,Not fleeing,Rural,5542.0,Traffic Stop,11/02/2018 0:00,Windsor,-72.5855,43.5798
788964,Nathan Giffin,32,Male,Montpelier,VT,Washington,Gunshot,gun,undetermined,Foot,Rural,5453.0,Part 1 Violent Crime,16/01/2018 0:00,Windsor,-72.5855,43.5798
788978,Michael Battles,32,Male,Poultney,VT,Rutland,Gunshot,toy weapon,attack,Not fleeing,Rural,5059.0,Part 1 Violent Crime,01/09/2017 0:00,Windsor,-72.5855,43.5798
789067,Joseph J. Santos,32,Male,Providence,RI,Providence,Gunshot,vehicle,attack,Car,Urban,5258.0,Part 1 Violent Crime,09/11/2017 0:00,Newport,-71.2366,41.556


# Create a map 

In [17]:
# Create the map and load the data
map_3 = KeplerGl(height=800, data={"attacks": df_3})

User Guide: https://docs.kepler.gl/docs/keplergl-jupyter


In [18]:
# View the map
map_3

KeplerGl(data={'attacks':                  Victim's name Victim's age Victim's gender            City  \
0    …

In [16]:
D3_config = map_3.config

In [23]:
# Save the map_3 config to a file
with open('hex_config_3D.py', 'w') as f:
    f.write('config = {}'.format(map_3.config))
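An alternative to writing the configuration as Python source is to store it as JSON, which can be read back without importing or executing a file. This is a sketch that assumes the config is a plain, JSON-serializable dictionary; the `config` value below is a small stand-in rather than a real `map_3.config`, and the file name is hypothetical.

```python
import json
import os
import tempfile

# Stand-in for map_3.config (illustrative values only).
config = {"version": "v1", "config": {"mapState": {"pitch": 50, "zoom": 4}}}

# Hypothetical output path; a temporary directory keeps the example self-contained.
path = os.path.join(tempfile.gettempdir(), "hex_config_3D.json")

# Write the configuration as JSON.
with open(path, "w") as f:
    json.dump(config, f, indent=2)

# Read it back as a dictionary.
with open(path) as f:
    restored = json.load(f)

print(restored == config)  # True
```

The restored dictionary can then be passed wherever the original config was used.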

In [22]:
map_3.save_to_html(data={'3D_temporal': df_3}, config= D3_config, file_name='3d_map.html')

Map saved to 3d_map.html!
