Classifying San Francisco Crime Incidents
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
app
data
01_zillow_api_avg_prices_by_zip_code.ipynb
02_bart_caltrain_locations.ipynb
03_police_station_locations.ipynb
04_cannabis_dispensary_locations.ipynb
05_health_care_facilities.ipynb
06_scrape_shelter_website.ipynb
07_data_prep.ipynb
08_exploration of data.ipynb
09_another_exploration_of_data.ipynb
README.md
add_elevation_data.ipynb
d3_screenshot.png
modelling_data.ipynb
run.py

README.md

Classifying Crime in San Francisco

In this repo is data I downloaded from a variety of sources in an attempt to classify the type of San Francisco crime incidents based on the location and time of day a crime occurred.

Data collected

  1. Police incidents recorded between Jan 2016 and April 2017 (category, description, location, police district, date, time)
    a. Assault, shoplifting, theft from cars, drugs, vandalism, robbery, vehicle theft, theft of property
  2. Geospatial data of Zip code boundaries
  3. Zillow house price data
  4. Bart and Caltrain station locations
  5. Police station locations
  6. Medical marijuana dispensary locations
  7. Health care facility locations
  8. Homeless shelter locations

Feature engineering on data collected

  1. Zip code in which each incident occurred
  2. Average house price, median income level and population density in each zip code
  3. Distance to Union Square
  4. Distance to nearest police station, train station, medical marijuana dispensary, healthcare facility, homeless shelter
  5. Total number of nearby medical marijuana dispensaries, train stations, health care facilities, homeless shelters

Final Model: Bagging Classifier (n=300)

Classification Precision Recall F1
Assault 0.42 0.48 0.45
Shoplifting 0.50 0.47 0.48
Theft from auto 0.57 0.71 0.63
Drugs/narcotic 0.55 0.50 0.52
Vandalism 0.21 0.14 0.17
Robbery 0.15 0.09 0.11
Vehicle theft 0.33 0.25 0.28
Theft of property 0.31 0.22 0.26
Avg / total 0.43 0.46 0.44

Visualization of Data

Inside the repo is a D3 visualization of the data contained within a Flask app.

SF Crime