# [SOC-88] Mapping Crime in San Francisco

### Professor David Harding

## Table of Contents

[Introduction](#intro)

[The Data](#data)

[Base Maps](#1)
   - [Question 1](#q1)

[Markers](#2)
   - [Question 2](#q2)
   - [Question 3](#q3)
   - [Question 4](#q4)
    
[Choropleth Maps](#3)
   - [Question 5](#q5)
   - [Question 6](#q6)



## Introduction <a id='intro'></a>

In this homework, you will practice different data mapping techniques you learned about in lecture and lab. The data has been taken from [SF Data](https://data.sfgov.org/), San Francisco's open data site. 

There are two main data files used in this assignment: **SFPD_incidents_2016.csv** and **sfpd-police-districts.geojson**. 

The first file, **SFPD_incidents_2016.csv**, has records of all police incidents that took place in 2016. Its columns include the latitude and longitude of each incident, the police district and neighborhood in which it occurred, the time and date of the report, and the type of crime.

The second file, **sfpd-police-districts.geojson**, contains geographic information about the boundaries of San Francisco police districts. These boundaries are necessary for making choropleth plots.

---


We will begin by running a code cell that will load the libraries you'll be using.

In [None]:
# load the necessary software
from datascience import *
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import datetime
import folium
import json

## The Data <a id='data'></a>


Our main dataset comes from the [city of San Francisco's open data portal](https://data.sfgov.org/Public-Safety/Police-Department-Incident-Reports-Historical-2003/tmnf-yvry). 

Run the next cell to load the incident data.

In [None]:
# load SF Police Incident Data, 2016
incidents = Table.read_table('data/SFPD_incidents_2016.csv')
incidents.show(5)

Each row in this table represents a different incident reported to the San Francisco Police Department (SFPD). Most of the columns are fairly intuitive, but we'll break down a few of particular interest:

- `IncidntNum` and `PdId` are identifiers for each incident, used for organization within the police department.

- `Category` classifies the incident as one of 37 types. We can see all possible categories using the `group` method.

In [None]:
# show the unique categories
incidents.group("Category").show()

- `Descript` gives more information on what occurred during the incident. You can think of the `Descript` as a subtype of the category in `Category`. There are too many unique descriptions to list them all; it makes more sense to select a particular category and then list the possible descriptions for only that category. In the next cell, you can view the possible `Descript` values for incidents falling under the category `"NON-CRIMINAL"`.

In [None]:
# show the unique incident descriptions for the NON-CRIMINAL category
incidents.where("Category", "NON-CRIMINAL").group('Descript').show()

- `Resolution` gives information about what the police did about the incident. Once again, we can view all possible resolution options using the `group` method.

In [None]:
# show the unique resolutions
incidents.group("Resolution").show()

- Finally, `X`, `Y`, and `Location` give geographic data about the incident. `X` represents the longitude, `Y` represents the latitude, and `Location` has both the latitude and longitude together.
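
The coordinate columns described above can be turned into a map location in two ways. The sketch below is illustrative only: it assumes `Location` is stored as a string of the form `"(latitude, longitude)"`, and `parse_location` is a hypothetical helper, not part of the `datascience` or `folium` libraries. The coordinate values are made up. Note that folium expects locations as `[latitude, longitude]`, so `Y` comes before `X`.

```python
import re

def parse_location(location):
    """Parse a string such as '(37.77, -122.42)' into [latitude, longitude].

    Hypothetical helper; assumes the string holds exactly two numbers,
    latitude first.
    """
    numbers = re.findall(r"-?\d+(?:\.\d+)?", location)
    if len(numbers) != 2:
        raise ValueError(f"expected two numbers, got: {location!r}")
    lat, lon = (float(n) for n in numbers)
    return [lat, lon]

# Made-up values near San Francisco, not a real incident
x, y = -122.42, 37.77                 # X = longitude, Y = latitude
from_xy = [y, x]                      # latitude first, then longitude
from_location = parse_location("(37.77, -122.42)")
```

Both routes should give the same `[latitude, longitude]` pair for a given row.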

## Base Maps <a id='1'></a>

### Question 1 <a id='q1'></a>
Create a base map centered on San Francisco. Choose an appropriate zoom start and tiles.

*Note: we're going to be creating several maps in this homework, so it's easier to create variables for the starting coordinates, zoom, and tiles, and use them over and over again, rather than rewrite them in every map we make.*

In [None]:
# add coordinates for San Francisco
sf_coordinates = [..., ...]
sf_zoom_start = ...
sf_tiles = ...

# create a map of San Francisco
sf_map = folium.Map(location=sf_coordinates, zoom_start=sf_zoom_start, tiles=sf_tiles)
sf_map

## Markers <a id='2'></a>

In the next two cells, we've isolated two police incidents for you.

In [None]:
incidentA = incidents.where("PdId", 16009503010045)
incidentA

In [None]:
incidentB = incidents.where("PdId", 16014933626039)
incidentB

### Question 2 <a id='q2'></a>
Create a marker for each of the above incidents, including:
* incident location
* an appropriate and informative pop-up (appears when you click on the marker)
* an appropriate and informative tooltip (appears when you hover over the marker)
* an appropriate color and type for the icon, given the type of incident

Hint: you can view the list of icon options at https://getbootstrap.com/docs/3.3/components/
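
One possible way to build marker text is to format it from the incident's own columns, so the labels always reflect the data. This is a rough sketch, not the expected answer: `describe_incident` is a hypothetical helper, the record is passed as a plain dictionary, and the example values are made up for illustration.

```python
def describe_incident(record):
    """Return (popup_text, tooltip_text) for one incident given as a dict.

    Hypothetical helper: a short label for the tooltip and a longer
    description for the popup, both built from the table's columns.
    """
    tooltip_text = record["Category"].title()
    popup_text = (
        f"{record['Descript'].title()} "
        f"(resolution: {record['Resolution'].title()})"
    )
    return popup_text, tooltip_text

# Made-up record using column names from the incidents table
example = {
    "Category": "NON-CRIMINAL",
    "Descript": "AIDED CASE, MENTAL DISTURBED",
    "Resolution": "NONE",
}
popup_text, tooltip_text = describe_incident(example)
```

The resulting strings could then be passed as the `popup` and `tooltip` arguments of `folium.Marker`.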

In [None]:
# a clean map for the markers
marker_map = folium.Map(location=sf_coordinates, zoom_start=sf_zoom_start, tiles=sf_tiles)

# For Incident A
coordinateA = [..., ...]
popupA = ...
tooltipA = ...
folium.Marker(location=coordinateA, popup=popupA, 
              tooltip=tooltipA, icon=folium.Icon(color=..., icon=...)).add_to(marker_map)

# view the map
marker_map

In [None]:
# For Incident B
coordinateB = [..., ...]
popupB = ...
tooltipB = ...
folium.Marker(location=coordinateB, popup=popupB, 
              tooltip=tooltipB, icon=folium.Icon(color=..., icon=...)).add_to(marker_map)
marker_map

Next, we'd like to map all incidents of disorderly conduct.

First, we make a table that only contains disorderly conduct incidents.

In [None]:
# filter for just disorderly conduct
disorderly = incidents.where("Category", "DISORDERLY CONDUCT")
disorderly.show(3)

### Question 3 <a id='q3'></a>
Fill in the code below to create markers for all disorderly conduct incidents.

As in question 2, choose the appropriate coordinates, popup, tooltip, color, and icon for the type of incident.

In [None]:
# create a clean map for the disorderly conduct incidents
disorderly_map = folium.Map(location=sf_coordinates, tiles=sf_tiles, zoom_start=sf_zoom_start)

# make a marker for each disorderly conduct incident
for i in range(disorderly.num_rows):
    incidentC = disorderly.take(i)
    coordinateC = [..., ...]
    popupC = ...
    tooltipC = ...
    folium.Marker(location=coordinateC, popup=popupC, 
                  tooltip=tooltipC, icon=folium.Icon(color=..., icon=...)).add_to(disorderly_map)
    
# show the map
disorderly_map

### Question 4 <a id='q4'></a>

Describe the features you chose for questions 1, 2, and 3, including:
* map tiles
* marker icon
* marker color
* marker popup and tooltip

Why were those features good for the data in those questions?

*Replace this line with your answer*

## Choropleth Maps <a id='3'></a>

In this section, you're going to create a choropleth map with the number of non-criminal mental health related incidents in each district.

First, we need to load the geojson file that gives the boundaries for the police districts. Run the next cell to load the geojson.

In [None]:
# load SFPD district boundaries
# open the file in a with-block so it is closed after loading
with open('data/sf-police-districts.geojson') as f:
    sf_districts = json.load(f)
sf_districts
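
The raw output of a geojson file can be hard to read. As a rough guide to its structure, the sketch below builds a tiny hand-written `FeatureCollection` and pulls out the pieces a choropleth needs. The district names and coordinates here are invented, and the `district` property name is assumed from the `key_on` setting used later in this notebook. Note that GeoJSON stores each point as `[longitude, latitude]`, the reverse of the order folium uses for `location`.

```python
import json

# A tiny hand-written FeatureCollection; names and coordinates are invented
toy_geojson = json.loads("""
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "properties": {"district": "MISSION"},
     "geometry": {"type": "Polygon",
                  "coordinates": [[[-122.43, 37.75], [-122.40, 37.75],
                                   [-122.40, 37.77], [-122.43, 37.75]]]}},
    {"type": "Feature",
     "properties": {"district": "SOUTHERN"},
     "geometry": {"type": "Polygon",
                  "coordinates": [[[-122.41, 37.77], [-122.39, 37.77],
                                   [-122.39, 37.79], [-122.41, 37.77]]]}}
  ]
}
""")

# Each feature pairs a set of properties with a geometry
district_names = [f["properties"]["district"] for f in toy_geojson["features"]]
geometry_types = {f["geometry"]["type"] for f in toy_geojson["features"]}

# The first point of the first polygon ring: longitude first, then latitude
lon, lat = toy_geojson["features"][0]["geometry"]["coordinates"][0][0]
```

The same list comprehensions should work on `sf_districts` if its features follow this layout.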

You can see the districts overlaid onto the San Francisco map by running the next cell.

In [None]:
# make the folium geojson object and add to a map of SF
m = folium.Map(sf_coordinates, zoom_start=sf_zoom_start, tiles=sf_tiles)

folium.GeoJson(
    sf_districts,
    style_function=lambda feature: {
        'fillColor': 'white',
        'color': 'blue',
        'weight': 2,
        'dashArray': '5, 5'
    }
).add_to(m)
m

To make our choropleth overlay, we must first get the counts of mental health incidents by district. First, we use the `where` method to select only the incidents that have a description of "AIDED CASE, MENTAL DISTURBED".

In [None]:
# mental health related incidents
mental_health = incidents.where("Descript", are.equal_to("AIDED CASE, MENTAL DISTURBED"))
mental_health.show(3)

Next, we use `group` to get the counts of mental health incidents per police district.

In [None]:
# get the counts of mental health incidents by district
mental_health_by_district = mental_health.group("PdDistrict")
mental_health_by_district
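
Conceptually, grouping by one column counts how many rows share each distinct value. The sketch below mimics that idea with Python's standard `collections.Counter` on a made-up list of district labels. It is an analogy for what `group` computes, not how the `datascience` library implements it, and the counts are not real.

```python
from collections import Counter

# Made-up district labels, one per incident (not real counts)
districts = ["MISSION", "SOUTHERN", "MISSION",
             "TENDERLOIN", "SOUTHERN", "MISSION"]

# Count how many times each label appears
counts = Counter(districts)

# One (district, count) pair per distinct label, sorted by label
rows = sorted(counts.items())
```

Each pair in `rows` plays the role of one row in the grouped table: the district name and its `count`.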

And finally, we convert the counts to a pandas DataFrame so that they work with folium.

In [None]:
# convert to DataFrame
mental_health_df = mental_health_by_district.to_df()
mental_health_df.head()

### Question 5 <a id='q5'></a>
Complete the following code to create a choropleth overlay showing the counts of mental health incidents for each police district. Choose an appropriate and informative:
* fill color (using a colormap)
* fill opacity
* legend name

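Hint: you can find colormap options at https://matplotlib.org/gallery/color/colormap_reference.html. Note that `folium.Choropleth` expects a ColorBrewer palette name, such as `'YlGn'` or `'BuPu'`; these appear among the sequential colormaps on that page.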

In [None]:
# create a clean map for the choropleth
m = folium.Map(sf_coordinates, zoom_start=sf_zoom_start, tiles=sf_tiles)

# create the choropleth overlay
folium.Choropleth(
    geo_data=sf_districts,
    data=mental_health_df,
    columns=['PdDistrict', 'count'],
    key_on='feature.properties.district',
    fill_color=...,
    fill_opacity=...,
    legend_name=...
).add_to(m)
m
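
If some districts come out unshaded, a common cause is that the values in the data column do not exactly match the property that `key_on` points to (for example, because of differences in capitalization). The sketch below shows one way to check for mismatches. `unmatched_keys` is a hypothetical helper, and the toy data is invented to show a capitalization mismatch.

```python
def unmatched_keys(data_keys, geojson, property_name="district"):
    """Compare data keys with a geojson property.

    Hypothetical helper. Returns two sorted lists: keys found only in the
    data, and keys found only in the geojson features.
    """
    geo_keys = {f["properties"][property_name] for f in geojson["features"]}
    data_keys = set(data_keys)
    return sorted(data_keys - geo_keys), sorted(geo_keys - data_keys)

# Invented example: "Southern" in the data does not match "SOUTHERN" on the map
toy_geojson = {"features": [
    {"properties": {"district": "MISSION"}},
    {"properties": {"district": "SOUTHERN"}},
]}
missing_from_map, missing_from_data = unmatched_keys(
    ["MISSION", "Southern"], toy_geojson
)
```

With the objects in this notebook, a call along the lines of `unmatched_keys(mental_health_df['PdDistrict'], sf_districts)` should return two empty lists when every district lines up.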

### Question 6 <a id='q6'></a>
Explain your design choices for the choropleth map. What options did you consider? What options did you end up choosing, and why? Be sure to reference the context of the data when you explain your choices.

*Replace this line with your response*

----


Data Science Modules: http://data.berkeley.edu/education/modules

Data Science Offerings at Berkeley: https://data.berkeley.edu/academics/undergraduate-programs/data-science-offerings



Notebook developed by: Keeley Takimoto