<img src = "https://upload.wikimedia.org/wikipedia/commons/3/32/Earth_rotation.gif" align = right width = 150 height = 450><a id=1></a>
<h2> Mapping in Python</h2>
<h1 align = left> A project in heatmapping NYC arrest data using Python...</h1>
<h3> Amelia Ingram </h3><br>

This project was inspired by a couple of [sources](#7): a GeoPython course by Henrikki Tenkanen and George Pipis' blog post on creating heatmaps.  For a frame of reference, I just finished my first Python course and won't be able to take data visualization until next year, but I wanted to explore map visualizations with some NYC open data.  (By the way, this is also my first attempt at going "off road" outside a guided course...so bear with me...I know it's rough.)  The following is the result of my exploration of these techniques...

In the first portion of this project, I used the Folium package to create Leaflet visualizations (aka interactive maps) that come with their own base map tiles.  The second portion of the project generates a heatmap using NYC Open Data.

In [1]:
# Import packages (Folium builds the Leaflet maps used throughout)
import folium as fm


## Part 1:  Generate a base map

In the first part of this project, I wanted to practice generating a base Leaflet map of New York City using the Folium package. To center the map, I needed latitude and longitude coordinates for a starting point, so I opened the OpenStreetMap site and copied the coordinates of an approximate "center" for my map (map data available for use under the Open Database License © [OpenStreetMap](https://www.openstreetmap.org/copyright) contributors). 

In [12]:
# Create a Map instance (from Tenkanen's course)
# The map style (tile set) "stamenterrain" was one of the styles bundled with Folium when this was run.
# Newer Folium releases may no longer ship the Stamen styles.
m = fm.Map(location=[40.6973,-74.1515], tiles='stamenterrain',
                   zoom_start=10, control_scale=True)
print(m)

<folium.folium.Map object at 0x7fada89da220>


In [14]:
#Save the base map output
outfp = "base_map.html"
m.save(outfp)

from IPython.display import IFrame
IFrame(src='base_map.html', width=600, height=500)


The map above uses one of the default map styles included in the Folium package.  There are several other styles (called "tiles"), as well as other controls that can be adjusted to customize the map.  I'll learn more of these later.  (Some of the other styles require an API key and/or an attribution string in the map code...still learning these...)

Let's change the "base map" style to "openstreetmap", which is another free map style available in Folium.  

In [22]:
# Change the map style (tile) to 'openstreetmap'.
# from Tenkanen's course
# Built-in styles such as 'openstreetmap' supply their own attribution,
# so no attr argument is needed here.

m2 = fm.Map(location=[40.6973, -74.1515], tiles='openstreetmap',
            zoom_start=11)
print(m2)

<folium.folium.Map object at 0x7fada8f784f0>


In [23]:
#Save the base map output
outfp2 = "base_map2.html"
m2.save(outfp2)

IFrame(src='base_map2.html', width=600, height=500)

This saved the map locally to my repository as "base_map2.html", in a format that I can repost, modify, and add layers to later.  

Now on to the second half of the project--creating a heatmap using Geopandas. 

## Part 2:  Geopandas and layers
Now, I want to try to use Geopandas (see Pipis' project page) to create a heatmap.  On Pipis' page, he creates a heatmap of arrest data from the Baltimore police open dataset.  Instead, I wanted to try my hand at this map using NYC Open Data.  I just needed to find similar columns for arrests, geolocation data, and some demographic identifiers (I kept his use of age and race...for now). 

In [18]:
import geopandas as gp
from folium.plugins import HeatMap
import pandas as pd
import numpy as np

I downloaded the dataset from the NYC Open Data website (data.cityofnewyork.us).  To do that, I set up a free Socrata account that allowed me to freely view and download datasets.  I selected the "NYPD Arrest Data (Year to Date)" set, which was last updated on October 19, 2021, so it included the most recent data.  It was a sizable file (24.4 MB) but not impossible to use.
(Source:  https://data.cityofnewyork.us/Public-Safety/NYPD-Arrest-Data-Year-to-Date-/uip8-fykc )

In [19]:
# load the data
path = 'https://data.cityofnewyork.us/resource/uip8-fykc.csv'

arrest_table = pd.read_csv(path, header=0)            # read data from online!
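
One caveat I'm fairly sure of: Socrata's `resource` endpoints return only the first 1,000 rows unless a `$limit` parameter is given, so the call above probably does not read the whole 24.4 MB file. Below is a minimal sketch of building a URL that asks for more rows. `build_soda_url` is a hypothetical helper name, and the limit of 5,000 is just an illustration.

```python
from urllib.parse import urlencode

BASE_URL = "https://data.cityofnewyork.us/resource/uip8-fykc.csv"

def build_soda_url(base_url, limit=5000, offset=0):
    """Append $limit and $offset query parameters to a Socrata resource URL."""
    params = {"$limit": limit, "$offset": offset}
    # safe="$" keeps the dollar signs readable instead of percent-encoding them.
    return f"{base_url}?{urlencode(params, safe='$')}"

paged_url = build_soda_url(BASE_URL, limit=5000)
```

The resulting string could then be passed to `pd.read_csv(paged_url, header=0)` in place of `path`.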

Once the appropriate packages were loaded (including the HeatMap plugin that is part of Folium) and I had the dataset, I wanted to peek under the hood of the NYPD arrest dataset to see what columns are available...

In [20]:
#checking out the variables in the dataset
print(arrest_table.head())

   arrest_key              arrest_date  pd_cd      pd_desc  ky_cd   ofns_desc  \
0   238013474  2021-12-18T00:00:00.000  157.0       RAPE 1  104.0        RAPE   
1   236943583  2021-11-25T00:00:00.000  263.0  ARSON 2,3,4  114.0       ARSON   
2   234938876  2021-10-14T00:00:00.000  594.0  OBSCENITY 1  116.0  SEX CRIMES   
3   234788259  2021-10-11T00:00:00.000  263.0  ARSON 2,3,4  114.0       ARSON   
4   234188790  2021-09-28T00:00:00.000  578.0          NaN    NaN         NaN   

     law_code law_cat_cd arrest_boro  arrest_precinct  jurisdiction_code  \
0  PL 1303501          F           Q              105                 97   
1  PL 1501500          F           K               69                 71   
2  PL 2631100          F           K               61                  0   
3  PL 1501001          F           B               42                 71   
4  PL 2223001          M           B               44                  0   

  age_group perp_sex perp_race  x_coord_cd  y_coord_cd  

I can now see how the columns are labeled and have a sense of the geographic and demographic data available to me.  Now I need to determine the starting coordinates for the map.  I am strictly following Pipis' code here: he converts the latitude and longitude columns to a list, checks its length, and picks the eighth point (index 7) as the map's starting location.  Whatever, it's a starting point...probably didn't need to do all that, but just following the exercise...

In [21]:
#pull locations from the table to determine the starting geopoint
locations = arrest_table[['latitude', 'longitude']]
locationlist = locations.values.tolist()
len(locationlist)
locationlist[7]

[40.816391847000034, -73.89529641399997]
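
Picking the eighth point is arbitrary, so here is a minimal sketch of an alternative: averaging all of the points to get a rough center. It works on a plain list of `[latitude, longitude]` pairs like `locationlist`; `map_center` and the sample points are hypothetical, made up for illustration.

```python
import math
from statistics import fmean

def map_center(points):
    """Return [mean_lat, mean_lon] for (lat, lon) pairs, skipping missing values."""
    valid = [
        (lat, lon) for lat, lon in points
        if not (math.isnan(lat) or math.isnan(lon))
    ]
    if not valid:
        raise ValueError("no valid coordinates to average")
    return [fmean(lat for lat, _ in valid), fmean(lon for _, lon in valid)]

# Made-up sample points, not rows from the arrest data.
sample_points = [
    [40.80, -73.90],
    [40.70, -74.00],
    [float("nan"), float("nan")],  # a row with no recorded location
]
center = map_center(sample_points)
```

Calling `map_center(locationlist)` should give a starting location that sits in the middle of the arrests rather than on one of them.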

In [27]:
# start the base map of New York using the latitude and longitude from locations
map_osm = fm.Map(location=[40.81, -73.90], zoom_start=11)
map_osm

In [28]:
# Take a sample of up to 1K observations, then drop rows missing race or coordinates.
arrest_table = arrest_table.sample(n=min(1000, len(arrest_table)), replace=False, random_state=1)
arrest_table = arrest_table.dropna(subset=['perp_race', 'latitude', 'longitude'])

# Map each race label to a circle color.
# Substring checks are used on the assumption that this dataset has combined labels
# such as 'WHITE HISPANIC' or 'ASIAN / PACIFIC ISLANDER', which exact matches would miss.
def race_col(x):
  if 'HISPANIC' in x:
    return 'red'
  if 'ASIAN' in x:
    return 'yellow'
  if x == 'BLACK':
    return 'black'
  if x == 'WHITE':
    return 'blue'
  return 'green'
arrest_table['color_race'] = arrest_table['perp_race'].apply(race_col)

# Create the heat map from the list of (lat, long) pairs
lat = arrest_table.latitude.tolist()
lng = arrest_table.longitude.tolist()
HeatMap(list(zip(lat, lng))).add_to(map_osm)

# Add a circle per arrest, colored by race, with the age group as a popup
for _, row in arrest_table.iterrows():
  fm.Circle(location=[row['latitude'], row['longitude']], radius=50, fill=True,
            color=row['color_race'], popup=row['age_group']).add_to(map_osm)
map_osm
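
The circles are color coded, but the map itself has no legend saying which color means what. Here is a minimal sketch that builds a small HTML legend from the same colors that `race_col` returns. `RACE_COLORS` and `build_legend_html` are hypothetical names, "OTHER" stands for the green fallback branch, and the styling is just one guess at something readable.

```python
from html import escape

# Label -> circle color, mirroring race_col; "OTHER" is the green fallback.
RACE_COLORS = {
    "BLACK": "black",
    "WHITE": "blue",
    "ASIAN": "yellow",
    "HISPANIC": "red",
    "OTHER": "green",
}

def build_legend_html(colors):
    """Return an HTML snippet with one colored dot and label per entry."""
    rows = "".join(
        f'<li><span style="color:{escape(color)};">&#9679;</span> {escape(label)}</li>'
        for label, color in colors.items()
    )
    return (
        '<div style="position: fixed; bottom: 30px; left: 30px; z-index: 9999; '
        'background: white; padding: 8px; border: 1px solid grey;">'
        f'<b>Race</b><ul style="list-style: none; margin: 0; padding: 0;">{rows}</ul>'
        '</div>'
    )

legend_html = build_legend_html(RACE_COLORS)
```

I believe something like `map_osm.get_root().html.add_child(fm.Element(legend_html))` before saving would attach it to the page, but I have not checked that against every Folium version.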

In [30]:
#lets save the output into an html page
# Filepath to the output
outfp3 = "heatmap.html" 

# Save the map
map_osm.save(outfp3)
IFrame(src='heatmap.html', width=600, height=500)

<h3>Success!</h3>  
The result is a heatmap of a sample of up to 1,000 arrests.  The individual points are arrests that are color coded by race, and each one shows the age group in a popup when clicked. 
<li><a href="https://github.com/amelia-ingram/hello-world/blob/gh-pages/heatmap.html">https://github.com/amelia-ingram/hello-world/blob/gh-pages/heatmap.html</a></li>
<p>I still have a lot to learn on this technique (and mapping in general)...but I made some progress and it was a lot of fun!</p>  


<h3>Sources Used in this Project:</h3><a id=7></a>
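NYC Open Data.  <a href="https://opendata.cityofnewyork.us/">https://opendata.cityofnewyork.us/</a><br>
Pipis, George.  2021.  How to Make Interactive Maps with Folium.  Python-bloggers.com.  <a href="https://python-bloggers.com/2021/06/how-to-make-interactive-maps-with-folium/">https://python-bloggers.com/2021/06/how-to-make-interactive-maps-with-folium/</a><br>
Silva, George.  2017.  Mapping Points with Folium.  <a href="https://georgetsilva.github.io/posts/mapping-points-with-folium/">https://georgetsilva.github.io/posts/mapping-points-with-folium/</a><br>
Tenkanen, Henrikki.  2016.  GeoPython and ArcGIS.  Fall 2016 course materials.  University of Helsinki.  <a href="https://automating-gis-processes.github.io/2016/course-info.html#">https://automating-gis-processes.github.io/2016/course-info.html#</a><br>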

[Back to top](#1)