<img src = "https://upload.wikimedia.org/wikipedia/commons/3/32/Earth_rotation.gif" align = right width = 150 height = 450><a id=1></a>
<h2> Mapping in Python</h2>
<h1 align = left> A project in heatmapping NYC arrest data using Python...</h1>
<h3> Amelia Ingram </h3><br>

This project was inspired by a couple of [sources](#7): a GeoPython course by Henrikki Tenkanen and George Pipis' blog post on creating heatmaps.  For a frame of reference, I just finished my first Python course and won't be able to take data visualization until next year, but I wanted to learn some mapping in the meantime using NYC open data.  (This is also my first attempt at going "off road" outside of a guided course.)  The following is the result of my exploration of this technique...

In this project, I used the folium package to create Leaflet visualizations (interactive maps) that come with their own base map tiles.  

In [1]:
#import packages
import streamlit as st
from streamlit_folium import folium_static
import folium as fm


## Part 1:  Generate a base map

In the first part of this project, I wanted to practice generating a base map of New York City using the Folium package. To center the map, I went to OpenStreetMap to look up the starting latitude and longitude.  

In [2]:
# Create a Map instance (adapted from Tenkanen's course)
# folium_static is imported above but not used here
m = fm.Map(location=[40.6973, -74.1515], tiles='Stamen Toner',
           zoom_start=10, control_scale=True)
print(m)

<folium.folium.Map object at 0x7f8a288de4f0>


In [3]:
# Same 'Stamen Toner' tiles, but zoomed in closer and with canvas rendering
# (from Tenkanen's course)
m2 = fm.Map(location=[40.6973, -74.1515], tiles='Stamen Toner',
            zoom_start=12, control_scale=True, prefer_canvas=True)

# Save the map to an HTML file
m2.save("base_map2.html")

I saved this map locally to my repository as an HTML file, "base_map2", that I can repost, modify, and add layers to later.  Now on to the second half of the project: creating a heatmap. 

## Part 2:  Geopandas and layers
Now I want to follow Pipis' project page to create a heatmap.  (Geopandas is imported below, although the code in this notebook only ends up using pandas and Folium's HeatMap plugin.)  On Pipis' page, he creates a heatmap of arrest data from the Baltimore police open dataset.  Instead, I wanted to try my hand at this map using NYC Open Data.  I just needed to find similar columns for arrests, geolocation data, and some demographic identifiers (I kept his use of age and race...for now). 

In [4]:
import geopandas as gp
from folium.plugins import HeatMap
import pandas as pd
import numpy as np

I downloaded the dataset from the NYC Open Data website (data.cityofnewyork.us).  To do that, I set up a free Socrata account that allowed me to view and download datasets.  I selected the "NYPD Arrest Data (Year to Date)" set, which was last updated on October 19, 2021, so it included the most recent data.  It was a sizable file (24.4 MB) but not impossible to use.
(Source:  https://data.cityofnewyork.us/Public-Safety/NYPD-Arrest-Data-Year-to-Date-/uip8-fykc )

In [11]:
# load the data
path = 'https://data.cityofnewyork.us/resource/uip8-fykc.csv'

arrest_table = pd.read_csv(path, header=0)            # read data from online!
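
One thing worth checking when reading straight from the API endpoint like this: as far as I know, Socrata's `resource/...csv` endpoints return only the first 1,000 rows unless a `$limit` parameter is supplied, so the table loaded above may be much smaller than the full 24.4 MB download. Here is a minimal sketch of building a URL with a larger limit. The helper name `soda_url` and the limit of 50,000 are my own illustrative assumptions, not part of the original exercise.

```python
from urllib.parse import urlencode

BASE = "https://data.cityofnewyork.us/resource/uip8-fykc.csv"

def soda_url(base, limit):
    """Return the endpoint URL with a $limit query parameter (hypothetical helper)."""
    # safe="$" keeps the dollar sign readable instead of encoding it as %24
    query = urlencode({"$limit": limit}, safe="$")
    return f"{base}?{query}"

path = soda_url(BASE, 50000)  # 50000 is an arbitrary illustrative value
print(path)
```

The resulting `path` could then be passed to `pd.read_csv` exactly as before.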

With the appropriate packages loaded (including the HeatMap plugin that is part of Folium) and the dataset in hand, I wanted to peek under the hood of the NYC arrest dataset to see what columns are available...

In [12]:
#checking out the variables in the dataset
print(arrest_table.head())

   arrest_key              arrest_date  pd_cd                   pd_desc  \
0   234233843  2021-09-29T00:00:00.000  105.0         STRANGULATION 1ST   
1   234129823  2021-09-27T00:00:00.000  157.0                    RAPE 1   
2   234040747  2021-09-25T00:00:00.000  109.0  ASSAULT 2,1,UNCLASSIFIED   
3   234047720  2021-09-25T00:00:00.000  101.0                 ASSAULT 3   
4   234042526  2021-09-25T00:00:00.000  101.0                 ASSAULT 3   

   ky_cd                     ofns_desc    law_code law_cat_cd arrest_boro  \
0  106.0                FELONY ASSAULT  PL 1211200          F           B   
1  104.0                          RAPE  PL 1303501          F           K   
2  106.0                FELONY ASSAULT  PL 1200501          F           Q   
3  344.0  ASSAULT 3 & RELATED OFFENSES  PL 1200001          M           B   
4  344.0  ASSAULT 3 & RELATED OFFENSES  PL 1200001          M           B   

   arrest_precinct  jurisdiction_code age_group perp_sex perp_race  \
0               

I can now see how the columns are labeled and have a sense of the geographic and demographic data available to me.  Next I need to determine the starting coordinates for the map.  I am strictly following Pipis' code here: he puts the latitude and longitude pairs into a list, checks its length, and picks the eighth point (index 7) as the center.  It's an arbitrary starting point, and probably more than I needed to do, but I'm just following the exercise...

In [14]:
#pull locations from the table to determine the starting geopoint
locations = arrest_table[['latitude', 'longitude']]
locationlist = locations.values.tolist()
len(locationlist)
locationlist[7]

[40.81038342800008, -73.90452841699994]
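
Taking the eighth row is arbitrary, so another option would be to center the map on the average of all the points. Below is a minimal sketch of that idea using only the standard library. The short hand-made list stands in for `locationlist`, and both the sample coordinates and the `map_center` helper are illustrative assumptions. It also skips rows with missing coordinates.

```python
import math
from statistics import mean

def map_center(points):
    """Average the latitudes and longitudes of [lat, lon] pairs, ignoring NaNs."""
    valid = [(lat, lon) for lat, lon in points
             if not (math.isnan(lat) or math.isnan(lon))]
    return [mean(lat for lat, _ in valid), mean(lon for _, lon in valid)]

# Made-up sample standing in for locationlist
sample_points = [[40.80, -73.90], [40.70, -74.00],
                 [float("nan"), -73.95], [40.75, -73.95]]
center = map_center(sample_points)
print([round(c, 2) for c in center])
```

The rounded result could be used as the `location` argument when creating the map.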

In [15]:
# start the base map of New York using the latitude and longitude from locations
map_osm = fm.Map(location=[40.81, -73.90], zoom_start=11)
map_osm

In [25]:
# Work with race, taking a random sample of 1,000 observations
# (column names are lowercase, matching the table loaded above)
arrest_table = arrest_table.sample(n=1000, replace=False, random_state=1)
arrest_table = arrest_table.dropna(subset=['perp_race', 'latitude', 'longitude'])

# Map each race label to a marker color
def race_col(x):
    if x == 'BLACK':
        return 'black'
    if x == 'WHITE':
        return 'blue'
    if x == 'ASIAN':
        return 'yellow'
    if x == 'HISPANIC':
        return 'red'
    return 'green'

arrest_table['color_race'] = arrest_table['perp_race'].apply(race_col)

# Create the heat map from a list of (latitude, longitude) pairs
lat = arrest_table['latitude'].tolist()
lng = arrest_table['longitude'].tolist()
HeatMap(list(zip(lat, lng))).add_to(map_osm)

# Add a circle per arrest, colored by race, with the age group as a popup
arrest_table.apply(
    lambda x: fm.Circle(location=[x['latitude'], x['longitude']], radius=50,
                        fill=True, color=x['color_race'],
                        popup=x['age_group']).add_to(map_osm),
    axis=1)
map_osm
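
One caveat with the color function: it only matches exact labels, so anything it does not list falls through to green. If I recall the NYC dataset correctly, it uses labels such as `WHITE HISPANIC`, `BLACK HISPANIC`, and `ASIAN / PACIFIC ISLANDER`, which would all end up green. That is an assumption worth confirming with `arrest_table['perp_race'].unique()`. Here is a sketch of a more forgiving mapping, where both the labels and the `race_color` helper are hypothetical.

```python
def race_color(label):
    """Map a race label to a marker color, tolerating compound labels."""
    text = str(label).upper()
    if "HISPANIC" in text:          # e.g. 'WHITE HISPANIC', 'BLACK HISPANIC'
        return "red"
    if text.startswith("ASIAN"):    # e.g. 'ASIAN / PACIFIC ISLANDER'
        return "yellow"
    exact = {"BLACK": "black", "WHITE": "blue"}
    return exact.get(text, "green")  # every other label stays green

# Assumed example labels, not taken from the notebook output
for label in ["BLACK", "WHITE HISPANIC", "ASIAN / PACIFIC ISLANDER", "UNKNOWN"]:
    print(label, "->", race_color(label))
```

It would drop in the same way as the original function, via `.apply(race_color)` on the race column.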

In [26]:
#lets save the output into an html page
# Filepath to the output
outfp3 = "amelia-ingram/hello-world/blob/gh-pages/heatmap.html"

# Save the map
map_osm.save(outfp3)
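
Because `outfp3` is a relative path with several folders in it, the save only works if those folders already exist under the working directory. Here is a small sketch of creating them first. The `ensure_parent` helper is hypothetical, and a temporary folder stands in for my repository.

```python
import tempfile
from pathlib import Path

def ensure_parent(path):
    """Create the parent folders of `path` if needed and return it as a Path."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    return target

# Demonstrate inside a throwaway directory rather than a real repository
with tempfile.TemporaryDirectory() as tmp:
    out = ensure_parent(Path(tmp) / "hello-world" / "heatmap.html")
    out.write_text("<html></html>")  # placeholder for map_osm.save(str(out))
    saved = out.exists()
print(saved)
```

In the notebook, the same helper could wrap `outfp3` before calling `map_osm.save`.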

<h3>Success!</h3>  
The result is a heatmap of a 1,000-arrest sample.  The individual points are arrests color-coded by race, and each one pops up with the age group when clicked. 
<li><a href="https://github.com/amelia-ingram/hello-world/blob/gh-pages/heatmap.html">https://github.com/amelia-ingram/hello-world/blob/gh-pages/heatmap.html</a></li>
<p>I still have a lot to learn on this technique (and mapping in general)...but I made some progress and it was a lot of fun!</p>  


<h3>Sources Used in this Project:</h3><a id=7></a>
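NYC Open Data.  <a href="https://opendata.cityofnewyork.us/">https://opendata.cityofnewyork.us/</a><br>
Pipis, George.  2021.  How to Make Interactive Maps with Folium.  Python-bloggers.com.  <a href="https://python-bloggers.com/2021/06/how-to-make-interactive-maps-with-folium/">https://python-bloggers.com/2021/06/how-to-make-interactive-maps-with-folium/</a><br>
Silva, George.  2017.  Mapping Points with Folium.  <a href="https://georgetsilva.github.io/posts/mapping-points-with-folium/">https://georgetsilva.github.io/posts/mapping-points-with-folium/</a><br>
Tenkanen, Henrikki.  2016.  GeoPython and ArcGIS.  Fall 2016 course materials.  University of Helsinki.  <a href="https://automating-gis-processes.github.io/2016/course-info.html#">https://automating-gis-processes.github.io/2016/course-info.html#</a><br>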

[Back to top](#1)