# **Visualising San Francisco Crime Data with Folium**

**Dataset used:**

[San Francisco Police Department Incidents for the year 2021](https://data.sfgov.org/Public-Safety/Map-of-Police-Department-Incident-Reports-2018-to-/jq29-s5wp) - Incidents from the San Francisco public data portal. The full dataset covers incidents from 2018 onwards; only the incidents from 2021 have been kept and visualised here.
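
The step that reduced the full export to 2021 is not shown in this notebook. A minimal sketch of how such a filter might look, assuming the `Incident Year` column listed further down holds the year as an integer (the toy rows and the names `df_all` and `df_2021` are invented for illustration):

```python
import pandas as pd

# Toy stand-in for the full export; only two of the real columns are used.
df_all = pd.DataFrame(
    {
        "Incident ID": [1, 2, 3, 4],
        "Incident Year": [2019, 2021, 2020, 2021],
    }
)

# Keep only the rows whose incident year is 2021.
df_2021 = df_all[df_all["Incident Year"] == 2021].copy()

print(df_2021["Incident ID"].tolist())  # → [2, 4]
```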

In [1]:
# Install folium (pinned to the version used in this notebook), then import the modules
!pip3 install folium==0.5.0
import numpy as np
import pandas as pd
import folium



In [2]:
df_incidents = pd.read_csv(r"C:\Users\karsh\Documents\Python Notebooks\Datasets\Police_Department_Incident_Reports_2021.csv")

In [3]:
df_incidents.head()

| | Incident Datetime | Incident Date | Incident Time | Incident Year | Incident Day of Week | Report Datetime | Row ID | Incident ID | Incident Number | CAD Number | ... | Longitude | Point | Neighborhoods | ESNCAG - Boundary File | Central Market/Tenderloin Boundary Polygon - Updated | Civic Center Harm Reduction Project Boundary | HSOC Zones as of 2018-06-05 | Invest In Neighborhoods (IIN) Areas | Current Supervisor Districts | Current Police Districts |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 07-10-2021 03:51 | 07-10-2021 | 03:51 | 2021 | Thursday | 07-10-2021 03:55 | 108000000000.0 | 1077944 | 210321909 | | ... | | | | | | | | | | |
| 1 | 12-05-2021 14:30 | 12-05-2021 | 14:30 | 2021 | Wednesday | 12-05-2021 16:12 | 103000000000.0 | 1030146 | 216053712 | | ... | | | | | | | | | | |
| 2 | 03-10-2021 21:00 | 03-10-2021 | 21:00 | 2021 | Sunday | 04-10-2021 04:07 | 108000000000.0 | 1078063 | 216142581 | | ... | | | | | | | | | | |
| 3 | 11-05-2021 15:00 | 11-05-2021 | 15:00 | 2021 | Tuesday | 12-05-2021 14:09 | 103000000000.0 | 1030182 | 216054152 | | ... | | | | | | | | | | |
| 4 | 13-09-2021 08:00 | 13-09-2021 | 08:00 | 2021 | Monday | 07-10-2021 13:44 | 108000000000.0 | 1078220 | 210653752 | 212801811.0 | ... | -122.464145 | POINT (-122.46414497098554 37.779090726308574) | 5.0 | | | | | | 4.0 | 8.0 |


In [4]:
list(df_incidents.columns)

['Incident Datetime',
 'Incident Date',
 'Incident Time',
 'Incident Year',
 'Incident Day of Week',
 'Report Datetime',
 'Row ID',
 'Incident ID',
 'Incident Number',
 'CAD Number',
 'Report Type Code',
 'Report Type Description',
 'Filed Online',
 'Incident Code',
 'Incident Category',
 'Incident Subcategory',
 'Incident Description',
 'Resolution',
 'Intersection',
 'CNN',
 'Police District',
 'Analysis Neighborhood',
 'Supervisor District',
 'Latitude',
 'Longitude',
 'Point',
 'Neighborhoods',
 'ESNCAG - Boundary File',
 'Central Market/Tenderloin Boundary Polygon - Updated',
 'Civic Center Harm Reduction Project Boundary',
 'HSOC Zones as of 2018-06-05',
 'Invest In Neighborhoods (IIN) Areas',
 'Current Supervisor Districts',
 'Current Police Districts']

Since the dataset has many columns that are not needed for visualisation, only the relevant ones are kept. Entries with a missing latitude, longitude, or incident category are also dropped.

In [5]:
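# Keep only the columns needed for the map
df_incidents = df_incidents[['Incident ID', 'Incident Category', 'Incident Subcategory', 'Incident Description', 'Latitude', 'Longitude']]
# Drop rows with no coordinates or category (reassigning avoids pandas' SettingWithCopyWarning)
df_incidents = df_incidents.dropna(subset=["Latitude", "Longitude", "Incident Category"])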
df_incidents.shape

(121618, 6)
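
To make the effect of this filter concrete, here is a small sketch on an invented toy frame. The column names follow the dataset, but the rows, the extra `Resolution` values, and the names `toy`, `wanted`, `required`, and `clean` are made up for illustration:

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame(
    {
        "Incident ID": [10, 11, 12, 13],
        "Incident Category": ["Robbery", None, "Assault", "Burglary"],
        "Latitude": [37.77, 37.78, np.nan, 37.75],
        "Longitude": [-122.42, -122.41, -122.40, -122.43],
        "Resolution": ["Open", "Open", "Closed", "Open"],  # not needed for the map
    }
)

wanted = ["Incident ID", "Incident Category", "Latitude", "Longitude"]
required = ["Latitude", "Longitude", "Incident Category"]

# Select the columns, then drop incomplete rows. dropna returns a new frame,
# so nothing is modified in place.
clean = toy[wanted].dropna(subset=required)

print(clean.shape)  # → (2, 4)
```

Row 11 is dropped because it has no category and row 12 because it has no latitude, which leaves two rows and four columns.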

Resetting the index and dropping the additional `index` column that `reset_index` adds by default.

In [6]:
df_incidents.reset_index(inplace=True)

In [7]:
df_incidents.drop(columns=["index"], inplace=True)
df_incidents.head()

| | Incident ID | Incident Category | Incident Subcategory | Incident Description | Latitude | Longitude |
|---|---|---|---|---|---|---|
| 0 | 1078220 | Lost Property | Lost Property | Lost Property | 37.779091 | -122.464145 |
| 1 | 1030069 | Robbery | Robbery - Other | Robbery, W/ Gun | 37.71543 | -122.4418 |
| 2 | 1030335 | Suspicious Occ | Suspicious Occ | Suspicious Occurrence | 37.79323 | -122.393181 |
| 3 | 1030322 | Other Miscellaneous | Trespass | Trespassing | 37.771296 | -122.405425 |
| 4 | 1030299 | Motor Vehicle Theft | Motor Vehicle Theft | Vehicle, Recovered, Stolen outside SF | 37.743124 | -122.403275 |
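
As an aside, the reset and the drop can probably be done in a single step. A minimal sketch on an invented toy frame (`toy` and `renumbered` are illustrative names), using the `drop=True` argument of `reset_index`:

```python
import pandas as pd

# Toy frame whose index has gaps, as it would after rows were dropped.
toy = pd.DataFrame({"Incident ID": [10, 13, 17]}, index=[0, 3, 7])

# drop=True discards the old index instead of inserting it as an "index" column.
renumbered = toy.reset_index(drop=True)

print(list(renumbered.index))    # → [0, 1, 2]
print(list(renumbered.columns))  # → ['Incident ID']
```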


In [8]:
# San Francisco coordinates
sf_latitude = 37.7749
sf_longitude = -122.4194

Creating a plain map of San Francisco using folium with its coordinates. Feel free to zoom in and out and explore!

In [9]:
SF_map = folium.Map(location=[sf_latitude, sf_longitude],zoom_start=13)
SF_map

In [10]:
#importing widgets and display modules

import ipywidgets as widgets
from IPython.display import display
from IPython.display import clear_output
import webbrowser
from folium import plugins

In [11]:
# Map-generating function
def generate_map(selected_categories):
    """Return a folium map with one clustered marker per incident in the selected categories."""
    print("Generating Map of", list(selected_categories))
    SF_map = folium.Map(location=[sf_latitude, sf_longitude], zoom_start=12)
    # Group nearby markers into clusters so the map stays readable when zoomed out
    incidents = plugins.MarkerCluster().add_to(SF_map)
    # Keep only the incidents whose category was selected
    temp_df_incidents = df_incidents[df_incidents["Incident Category"].isin(selected_categories)]
    for lat, lng, label in zip(temp_df_incidents.Latitude, temp_df_incidents.Longitude, temp_df_incidents["Incident Category"]):
        folium.Marker(
            location=[lat, lng],
            icon=None,
            popup=folium.Popup(label),
        ).add_to(incidents)
    print("Map Generated")
    return SF_map
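
The function above adds one marker per matching row, so large categories mean many markers. One possible way to keep that manageable, which is not part of the original notebook, is to cap the number of rows before plotting. The sketch below uses a hypothetical helper `select_incidents` with invented parameters `max_markers` and `seed`, and runs on made-up toy data:

```python
import pandas as pd


def select_incidents(df, categories, max_markers=None, seed=0):
    """Hypothetical helper: rows in the chosen categories, optionally down-sampled."""
    subset = df[df["Incident Category"].isin(categories)]
    # Only sample when a cap is given and the subset actually exceeds it.
    if max_markers is not None and len(subset) > max_markers:
        subset = subset.sample(n=max_markers, random_state=seed)
    return subset


toy = pd.DataFrame(
    {
        "Incident Category": ["Larceny Theft"] * 8 + ["Robbery"] * 2,
        "Latitude": [37.70 + 0.01 * i for i in range(10)],
        "Longitude": [-122.50 + 0.01 * i for i in range(10)],
    }
)

capped = select_incidents(toy, ["Larceny Theft"], max_markers=5)
everything = select_incidents(toy, ["Robbery"], max_markers=5)

print(len(capped), len(everything))  # → 5 2
```

Sampling trades completeness for speed: the map then shows a random subset of the incidents, so it is only suitable for getting an impression of where incidents concentrate.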

In [13]:
# Categories currently highlighted in the multi-select (empty until the user picks something)
selected_categories = ()

# On-change function for the multi-select
def ms_on_change(selection):
    global selected_categories
    # Only react to changes of the widget's `value` trait
    if selection['type'] == 'change' and selection['name'] == 'value':
        selected_categories = selection["new"]

# On-click function for the Generate Map button
def on_button_click(button):
    if len(selected_categories) > 0:
        incident_map = generate_map(selected_categories)
        print("Opening map...")
        output_file = "incident_map.html"
        incident_map.save(output_file)
        webbrowser.open(output_file, new=2)  # open in a new browser tab
    else:
        print("No category selected")


# NOTE: multiSelect and generate_button are created in the widget cell further down;
# run that cell before this one, otherwise these two lines raise a NameError.
multiSelect.observe(ms_on_change)
generate_button.on_click(on_button_click)
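
The change handler receives a dictionary describing which trait changed and its old and new values. The sketch below imitates that with plain dictionaries so it runs without any widgets, and keeps the selection on an object instead of in a global. The class `CategorySelection` and the sample change dictionaries are hypothetical and only illustrate the check performed above:

```python
class CategorySelection:
    """Hypothetical holder for the current selection, used instead of a global."""

    def __init__(self):
        self.categories = ()

    def on_change(self, change):
        # React only to changes of the widget's `value` trait.
        if change.get("type") == "change" and change.get("name") == "value":
            self.categories = tuple(change["new"])


selection = CategorySelection()

# Dictionaries shaped like the ones passed to observers (contents invented).
selection.on_change({"type": "change", "name": "index", "old": (), "new": (1,)})
selection.on_change(
    {"type": "change", "name": "value", "old": (), "new": ("Robbery", "Lost Property")}
)

print(selection.categories)  # → ('Robbery', 'Lost Property')
```

With a real widget, such a method could be registered with `multiSelect.observe(selection.on_change, names="value")`, in which case only `value` changes would reach the handler in the first place.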

A multi-select list of the categories is created below; feel free to play around and choose the categories of incidents you want to see visualised! Press and hold the Shift key (or Ctrl/Cmd) to select several entries and render a map of multiple categories.

PS: Some categories have a very high count (Larceny Theft: 34000+) and may take a long time to generate and render (or even hang your notebook in extreme cases!). All other categories have fewer than 10000 incidents and should render reasonably quickly. Be careful when selecting multiple categories.
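
To check which categories are large before selecting them, the per-category counts can be inspected with `value_counts`. A minimal sketch on invented toy data follows; the `threshold` of 3 is purely illustrative, and on the real data a value such as 10000 would match the note above:

```python
import pandas as pd

toy = pd.DataFrame(
    {"Incident Category": ["Larceny Theft"] * 5 + ["Robbery"] * 2 + ["Assault"]}
)

# Number of incidents per category, largest first.
counts = toy["Incident Category"].value_counts()

threshold = 3  # illustrative cut-off for "slow to render"
heavy = counts[counts > threshold].index.tolist()

print(counts.to_dict())  # → {'Larceny Theft': 5, 'Robbery': 2, 'Assault': 1}
print(heavy)             # → ['Larceny Theft']
```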

In [None]:
print("Select incident category/categories")
multiSelect = widgets.SelectMultiple(
    options=df_incidents["Incident Category"].unique(),
    rows=10,    
    disabled=False
)
display(multiSelect)

generate_button = widgets.Button(description="Generate Map")

display(generate_button)

Select the categories and click Generate Map! The map will open in a separate browser tab. Happy exploring!