# San Francisco Crime Choropleth, Marker, and Cluster Maps

In our last post, we created FiveThirtyEight-style visuals and choropleth maps after cleaning and wrangling an official UN data set on U.S. immigration (https://crawstat.com/2020/06/26/fivethirtyeight-style-visuals-and-choropleth-maps-for-u-s-immigration/). Today, we'll use folium to build interactive choropleth, marker, and cluster maps with pop-up labels from a San Francisco data set of more than 150,000 crime records from 2016. Let's dive in.

## Part 1: Choropleth Map

In [50]:
# Import relevant libraries
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

!pip install folium
import folium
from folium import plugins



In [51]:
# Read in data. Use the first column (position 0, "IncidntNum") as the index
sf = pd.read_csv("https://cocl.us/sanfran_crime_dataset", index_col = 0)
sf.head()

IncidntNum,Category,Descript,DayOfWeek,Date,Time,PdDistrict,Resolution,Address,X,Y,Location,PdId
120058272,WEAPON LAWS,POSS OF PROHIBITED WEAPON,Friday,01/29/2016 12:00:00 AM,11:00,SOUTHERN,"ARREST, BOOKED",800 Block of BRYANT ST,-122.403405,37.775421,"(37.775420706711, -122.403404791479)",12005827212120
120058272,WEAPON LAWS,"FIREARM, LOADED, IN VEHICLE, POSSESSION OR USE",Friday,01/29/2016 12:00:00 AM,11:00,SOUTHERN,"ARREST, BOOKED",800 Block of BRYANT ST,-122.403405,37.775421,"(37.775420706711, -122.403404791479)",12005827212168
141059263,WARRANTS,WARRANT ARREST,Monday,04/25/2016 12:00:00 AM,14:59,BAYVIEW,"ARREST, BOOKED",KEITH ST / SHAFTER AV,-122.388856,37.729981,"(37.7299809672996, -122.388856204292)",14105926363010
160013662,NON-CRIMINAL,LOST PROPERTY,Tuesday,01/05/2016 12:00:00 AM,23:50,TENDERLOIN,NONE,JONES ST / OFARRELL ST,-122.412971,37.785788,"(37.7857883766888, -122.412970537591)",16001366271000
160002740,NON-CRIMINAL,LOST PROPERTY,Friday,01/01/2016 12:00:00 AM,00:30,MISSION,NONE,16TH ST / MISSION ST,-122.419672,37.76505,"(37.7650501214668, -122.419671780296)",16000274071000


In [52]:
# Check dimensions of dataframe. We have 150,500 rows (crime records) and 12 variables
sf.shape

(150500, 12)
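
The head of the table above lists IncidntNum 120058272 twice, so a row may correspond to one recorded charge rather than one unique incident. Here is a minimal sketch of how one might compare the two counts. The `sample` frame is a hand-typed stand-in for the first rows of `sf`, used only for illustration.

```python
import pandas as pd

# Hand-typed stand-in for the first five rows of sf (illustrative only)
sample = pd.DataFrame(
    {
        "IncidntNum": [120058272, 120058272, 141059263, 160013662, 160002740],
        "Category": ["WEAPON LAWS", "WEAPON LAWS", "WARRANTS",
                     "NON-CRIMINAL", "NON-CRIMINAL"],
    }
).set_index("IncidntNum")

n_rows = len(sample)                   # number of rows (recorded entries)
n_incidents = sample.index.nunique()   # number of distinct incident numbers
print(n_rows, n_incidents)
```

Running the same two lines on the full `sf` dataframe would show how far the row count and the number of distinct incident numbers diverge.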

In [53]:
# Group data by police district ("PdDistrict", which we treat as the neighborhood) using groupby and count the rows in each group (for our choropleth map)
sf1 = sf.groupby("PdDistrict").count()
sf1

PdDistrict,Category,Descript,DayOfWeek,Date,Time,Resolution,Address,X,Y,Location,PdId
BAYVIEW,14303,14303,14303,14303,14303,14303,14303,14303,14303,14303,14303
CENTRAL,17666,17666,17666,17666,17666,17666,17666,17666,17666,17666,17666
INGLESIDE,11594,11594,11594,11594,11594,11594,11594,11594,11594,11594,11594
MISSION,19503,19503,19503,19503,19503,19503,19503,19503,19503,19503,19503
NORTHERN,20100,20100,20100,20100,20100,20100,20100,20100,20100,20100,20100
PARK,8699,8699,8699,8699,8699,8699,8699,8699,8699,8699,8699
RICHMOND,8922,8922,8922,8922,8922,8922,8922,8922,8922,8922,8922
SOUTHERN,28445,28445,28445,28445,28445,28445,28445,28445,28445,28445,28445
TARAVAL,11325,11325,11325,11325,11325,11325,11325,11325,11325,11325,11325
TENDERLOIN,9942,9942,9942,9942,9942,9942,9942,9942,9942,9942,9942


In [54]:
# We only need the 'Category' column. Let's also rename the column headers
sf1 = pd.DataFrame(sf1,columns=['Category'])  
sf1.reset_index(inplace=True)   # default index, otherwise groupby column becomes index
sf1.rename(columns={'PdDistrict':'Neighborhood','Category':'Count'}, inplace=True)
sf1

,Neighborhood,Count
0,BAYVIEW,14303
1,CENTRAL,17666
2,INGLESIDE,11594
3,MISSION,19503
4,NORTHERN,20100
5,PARK,8699
6,RICHMOND,8922
7,SOUTHERN,28445
8,TARAVAL,11325
9,TENDERLOIN,9942
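
The two-step approach above (count every column, then keep one) can be condensed. As a sketch, assuming we only want the number of rows per district, `groupby(...).size()` produces the same two-column table in one chain. The `sample` frame below is a small hand-typed stand-in for `sf`, not the real data.

```python
import pandas as pd

# Small stand-in for sf (illustrative only)
sample = pd.DataFrame(
    {
        "PdDistrict": ["SOUTHERN", "SOUTHERN", "BAYVIEW", "TENDERLOIN", "MISSION"],
        "Category": ["WEAPON LAWS", "WEAPON LAWS", "WARRANTS",
                     "NON-CRIMINAL", "NON-CRIMINAL"],
    }
)

# size() counts rows per group; reset_index(name=...) names the count column
counts = (
    sample.groupby("PdDistrict")
    .size()
    .reset_index(name="Count")
    .rename(columns={"PdDistrict": "Neighborhood"})
)
print(counts)
```

One difference worth knowing: `count()` skips missing values column by column, while `size()` counts every row in the group.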


In [55]:
# San Francisco latitude and longitude values
lat = 37.77
long = -122.42

In [56]:
# Create map using folium
sf1_map = folium.Map(location=[lat, long], zoom_start=12)

# display the map of San Francisco
sf1_map

In [57]:
# San Francisco geojson file for choropleth map
sf_geo = "https://cocl.us/sanfran_geojson"

In [58]:
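# Plot choropleth map. Neighborhoods map to feature.properties.DISTRICT in the geojson file. We are plotting counts of crimes, not rates
# Note: Map.choropleth() is deprecated and has been removed from recent folium releases; folium.Choropleth(...).add_to(map) is the current equivalent
folium.Choropleth(
       geo_data=sf_geo,
       data=sf1,
       columns=['Neighborhood','Count'],
       key_on='feature.properties.DISTRICT',
       fill_color='YlOrRd',
       fill_opacity=0.7,   # opacities are numbers, not strings
       line_opacity=0.3,
       legend_name='Crime Count in San Francisco, by Neighborhood').add_to(sf1_map)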

# Display choropleth map
sf1_map
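
The choropleth only colors a district when the value in our 'Neighborhood' column matches the DISTRICT property of a feature in the geojson file. To my understanding the match is on the exact string, so a difference in case or spelling leaves a district unshaded. Below is a rough sketch of a sanity check one could run before plotting. The `geo` dictionary is a hypothetical, heavily trimmed stand-in for the real file (geometry omitted), and `data_keys` contains a deliberately mismatched name for illustration.

```python
# Hypothetical, trimmed stand-in for the GeoJSON file (geometry omitted)
geo = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"DISTRICT": "BAYVIEW"}, "geometry": None},
        {"type": "Feature", "properties": {"DISTRICT": "CENTRAL"}, "geometry": None},
        {"type": "Feature", "properties": {"DISTRICT": "SOUTHERN"}, "geometry": None},
    ],
}

# Names as they would appear in the 'Neighborhood' column; "Southern" is a
# deliberate mismatch to show what the check reports
data_keys = {"BAYVIEW", "CENTRAL", "Southern"}

# Collect the values that key_on='feature.properties.DISTRICT' points at
geo_keys = {feature["properties"]["DISTRICT"] for feature in geo["features"]}

missing_in_geo = sorted(data_keys - geo_keys)    # data rows with no polygon
missing_in_data = sorted(geo_keys - data_keys)   # polygons with no data row
print(missing_in_geo, missing_in_data)
```

With the real data, `data_keys` would be `set(sf1['Neighborhood'])` and `geo` would be the parsed contents of the geojson file. Both lists coming back empty means every district has a matching polygon.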

## Part 2: Marker Maps

In [59]:
# Limit our dataframe to the first 500 crimes to make computing more manageable
sf2 = sf.iloc[0:500, :]
sf2.shape

(500, 12)
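
Taking the first 500 rows is the simplest way to shrink the data, but if the file happens to be ordered (by date or district, for example), those rows may not be representative of the whole city. One alternative, sketched here on a small made-up frame, is to draw a random sample with a fixed seed so the result is reproducible. The frame contents and the seed value are assumptions for illustration.

```python
import pandas as pd

# Made-up stand-in for sf: ten rows across two districts (illustrative only)
frame = pd.DataFrame(
    {
        "PdDistrict": ["SOUTHERN"] * 6 + ["PARK"] * 4,
        "Y": [37.77 + i * 0.001 for i in range(10)],     # latitude
        "X": [-122.42 - i * 0.001 for i in range(10)],   # longitude
    }
)

# Draw 5 rows at random without replacement; random_state fixes the draw
subset = frame.sample(n=5, random_state=42)
print(subset.shape)
```

On the real data this would read `sf.sample(n=500, random_state=42)` in place of `sf.iloc[0:500, :]`.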

In [60]:
# Create map using folium
sf2_map = folium.Map(location=[lat, long], zoom_start=12)

# display the map of San Francisco
sf2_map

In [61]:
# Instantiate feature group for crime incidents
incidents = folium.map.FeatureGroup()

# Loop through 500 crimes and add each to the incidents feature group. In our dataframe, 'Y' is the latitude and 'X' is the longitude of the incident
# The loop variables get their own names so the map-center values lat and long are not overwritten
for inc_lat, inc_long in zip(sf2.Y, sf2.X):
    incidents.add_child(
        folium.CircleMarker(
            [inc_lat, inc_long],
            radius=5,
            color='yellow',
            fill=True,
            fill_color='blue',
            fill_opacity=0.7
        )
    )

# Add a marker with pop-up text (the crime category) for each incident on the map
for inc_lat, inc_long, label in zip(sf2.Y, sf2.X, sf2.Category):
    folium.Marker([inc_lat, inc_long], popup=label).add_to(sf2_map)
    
# Add incidents to map
sf2_map.add_child(incidents)

In [62]:
# If the map above is too crowded for you, you can attach the pop-up labels directly to the circles

# Create fresh map again 
sf2_map = folium.Map(location=[lat, long], zoom_start=12)

# Instantiate feature group for crime incidents
incidents = folium.map.FeatureGroup()

# Loop through 500 crimes and add each circle directly to the map. In our dataframe, 'Y' is the latitude and 'X' is the longitude of the incident
for inc_lat, inc_long, label in zip(sf2.Y, sf2.X, sf2.Category):
    folium.CircleMarker(
        [inc_lat, inc_long],
        radius=5, 
        color='yellow',
        fill=True,
        popup=label,
        fill_color='blue',
        fill_opacity=0.7
    ).add_to(sf2_map)
    
sf2_map
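
The pop-ups above show only the crime category. Since folium pop-up text is, as far as I know, inserted into the page as HTML, a label can combine several columns with simple tags. The sketch below builds such labels from 'Category', 'Descript', and 'PdDistrict'. `make_label` is a helper name introduced here, the `sample` frame is a hand-typed stand-in for `sf2`, and its last row is invented to show the escaping.

```python
import html

import pandas as pd

# Hand-typed stand-in for sf2; the last row is invented to show escaping
sample = pd.DataFrame(
    {
        "Category": ["WEAPON LAWS", "WARRANTS", "A & B"],
        "Descript": ["POSS OF PROHIBITED WEAPON", "WARRANT ARREST", "EXAMPLE <TEXT>"],
        "PdDistrict": ["SOUTHERN", "BAYVIEW", "MISSION"],
    }
)


def make_label(row):
    """Build an HTML pop-up label, escaping characters such as & and <."""
    return "<b>{}</b><br>{}<br>{}".format(
        html.escape(row.Category),
        html.escape(row.Descript),
        html.escape(row.PdDistrict),
    )


labels = [make_label(row) for row in sample.itertuples(index=False)]
print(labels[0])
```

Each string in `labels` could then be passed as `popup=...` in place of the plain category.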

## Part 3: Cluster Maps

In [63]:
# Create fresh map again 
sf2_map = folium.Map(location=[lat, long], zoom_start=13)

# Instantiate marker cluster object for crime incidents
incidents = plugins.MarkerCluster().add_to(sf2_map)

# Loop through 500 crimes and add each to the marker cluster object. In our dataframe, 'Y' is the latitude and 'X' is the longitude of the incident
for inc_lat, inc_long, label in zip(sf2.Y, sf2.X, sf2.Category):
    folium.Marker(
        location=[inc_lat, inc_long],
        icon=None,
        popup=label,
    ).add_to(incidents)

# display map
sf2_map