

Project for Machine Learning for Public Policy, Spring 2016

A project to evaluate how well complainant demographic data predicts the outcome of police misconduct complaints in Chicago.


The original dataset was obtained through the Freedom of Information Act (FOIA) by the Invisible Institute, an investigative journalism nonprofit based in Chicago. The dataset contains roughly 56,000 misconduct complaint records for approximately 8,500 Chicago police officers since 2011. In total, we dropped 37,689 observations from the original dataset, reducing our base dataset from 56,384 to 18,695 observations. We kept observations dated from March 2011 through December 2014, as those are the most reliable. To generate additional features, we added 311 and crime data from the City of Chicago open data portal, American Community Survey (ACS) data, and Census TIGER shapefiles.
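The date-window restriction described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the project's actual cleaning code: the column name `incident_date`, the ISO date format, and the sample rows are all assumptions made up for the example, and the README does not say which criteria besides the date window led to observations being dropped.

```python
import csv
import io
from datetime import date, datetime

# Study window from the text: March 2011 through December 2014 (inclusive).
WINDOW_START = date(2011, 3, 1)
WINDOW_END = date(2014, 12, 31)


def filter_to_window(rows, date_field="incident_date"):
    """Keep rows whose date falls inside the window; count the rest as dropped.

    `date_field` and the "%Y-%m-%d" format are hypothetical, chosen for
    illustration; the real dataset may use different names and formats.
    """
    kept, dropped = [], 0
    for row in rows:
        try:
            day = datetime.strptime(row[date_field], "%Y-%m-%d").date()
        except (KeyError, ValueError, TypeError):
            dropped += 1  # missing or unparseable date
            continue
        if WINDOW_START <= day <= WINDOW_END:
            kept.append(row)
        else:
            dropped += 1
    return kept, dropped


# Tiny made-up sample, not real complaint data.
SAMPLE_CSV = """complaint_id,incident_date
1,2011-02-28
2,2011-03-01
3,2013-07-15
4,2014-12-31
5,2015-01-01
6,
"""

sample_rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
kept_rows, n_dropped = filter_to_window(sample_rows)
print(len(kept_rows), "kept;", n_dropped, "dropped")
```

Returning the dropped count alongside the kept rows makes it easy to check bookkeeping of the kind reported above (56,384 - 37,689 = 18,695).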

Data Pipeline

  • Use scripts in db_tools to upload data from CSV to a PostgreSQL database
  • Use scripts in features and geocoding to generate and save features
  • Use scripts in final_data to export joined tables to CSV for use with scikit-learn
  • Run experiments using
  • Evaluate results, output metrics, and generate plots using results
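As a rough sketch of the export step in the pipeline above (joining tables and writing the result to CSV), the example below uses the standard-library `sqlite3` module as a stand-in for PostgreSQL so that it runs without a database server. The table names, column names, and rows are hypothetical and made up for illustration; the scripts in final_data may work quite differently.

```python
import csv
import io
import sqlite3

# In-memory SQLite database standing in for the project's PostgreSQL database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables and made-up rows, for illustration only.
    CREATE TABLE complaints (complaint_id INTEGER PRIMARY KEY, outcome TEXT);
    CREATE TABLE features (complaint_id INTEGER, complainant_age INTEGER);
    INSERT INTO complaints VALUES (1, 'sustained'), (2, 'not sustained');
    INSERT INTO features VALUES (1, 34), (2, 51);
""")


def export_joined(connection, out_file):
    """Join the label and feature tables and write one CSV row per complaint."""
    cursor = connection.execute("""
        SELECT c.complaint_id    AS complaint_id,
               f.complainant_age AS complainant_age,
               c.outcome         AS outcome
        FROM complaints AS c
        JOIN features AS f ON f.complaint_id = c.complaint_id
        ORDER BY c.complaint_id
    """)
    writer = csv.writer(out_file, lineterminator="\n")
    # Column names come from the AS aliases in the query.
    writer.writerow([col[0] for col in cursor.description])
    n_written = 0
    for row in cursor:
        writer.writerow(row)
        n_written += 1
    return n_written


buffer = io.StringIO()  # swap for open("some_path.csv", "w", newline="") to write a file
n_rows = export_joined(conn, buffer)
exported = buffer.getvalue()
print(exported)
```

Writing a flat CSV with one row per complaint keeps the modeling step independent of the database: the file can be loaded with pandas or the `csv` module and passed on to the experiment code.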


  • Python 3.4
  • argparse
  • pickle
  • json
  • itertools
  • csv
  • matplotlib
  • pandas
  • numpy
  • scipy
  • pydoc
  • UnbalancedDataset


Thanks to the DSSG Cincinnati team for providing some useful code snippets for our pipeline, and to the Invisible Institute for providing additional background and context for the data.
