Note: This repository reflects the work completed during the summer of 2015. To see the continuation of the project, please visit the current repo.
About
First settled in 1788, Cincinnati is one of the oldest American cities west of the original colonies. Today, the city struggles with an aging housing stock, which stifles economic redevelopment in some neighborhoods.
DSSG is working with the City of Cincinnati to identify properties at risk of code violations or abandonment. We hope that early intervention strategies can prevent further damage and stimulate neighborhood revitalization. Read more about our project here.
Getting started
Get the code
git clone https://github.com/dssg/cincinnati2015-public.git
cd cincinnati2015-public
Install all prerequisites
conda create -n "cincinnati" --yes --file requirements.conda python=2.7
source activate cincinnati
Configure database
cp dbconfig.sample dbconfig.py
update database configuration in dbconfig.py
Load data into postgres
... see the etl directory
Create features from the data
... see the blight_risk_prediction directory
Run the modeling pipeline
Create output directories
mkdir results
mkdir predictions
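Note that plain `mkdir` reports an error if a directory already exists, which can happen when the pipeline is set up more than once. As a minimal sketch (the helper name `ensure_dirs` is hypothetical and not part of this repository), the same step can be made safe to repeat from Python:

```python
import os


def ensure_dirs(names, base="."):
    """Create each named directory under `base` unless it already exists.

    Returns the list of paths that were actually created.
    `ensure_dirs` is an illustrative helper, not part of the repository.
    """
    created = []
    for name in names:
        path = os.path.join(base, name)
        if not os.path.isdir(path):
            os.makedirs(path)
            created.append(path)
    return created
```

Calling `ensure_dirs(["results", "predictions"])` from the repository root would create the two output directories on the first call and do nothing on later calls.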
Configure the model
edit default.yaml (the available options are documented in the file itself)
Run the model
python -m blight_risk_prediction.model
Output
Each model run produces a pickle file which contains:
- the full list of parcels predicted to have violations
- the configuration file used to generate that model
- feature importances
These output files include a timestamp in their filenames so that they are not accidentally overwritten. They can be used with the evaluation web application in the evaluation directory.
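To illustrate the idea of a timestamped pickle output, here is a minimal sketch of writing one run and reading it back. The function names, the `run_` filename prefix, and the dictionary keys are all assumptions made for this example; the actual layout produced by the pipeline may differ, so inspect a real output file before relying on any of these names.

```python
import os
import pickle
import tempfile
from datetime import datetime


def save_run(output_dir, predictions, config, feature_importances):
    """Pickle one model run under a timestamped filename and return its path."""
    # A timestamp in the filename makes it unlikely that a later run
    # overwrites an earlier one.
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
    path = os.path.join(output_dir, "run_{}.pkl".format(stamp))
    # Hypothetical layout: the key names used by the real pipeline may differ.
    payload = {
        "predictions": predictions,
        "config": config,
        "feature_importances": feature_importances,
    }
    with open(path, "wb") as f:
        pickle.dump(payload, f)
    return path


def load_run(path):
    """Load a pickled run written by save_run."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Round trip in a throwaway directory; every value here is made up.
demo_dir = tempfile.mkdtemp()
demo_path = save_run(
    demo_dir,
    [("parcel-1", 0.9)],
    {"model": "example"},
    {"feature_a": 0.5},
)
demo_run = load_run(demo_path)
print(sorted(demo_run.keys()))
```

Only unpickle files you produced yourself or otherwise trust, since loading a pickle can execute arbitrary code.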
Repository layout
- blight_risk_prediction - our modeling pipeline
- docs - some additional documentation
- etl - scripts for loading the Cincinnati datasets into a postgres database
- evaluation - webapp we use for comparing different models
- postprocess - add details (e.g. address) about properties to predictions
- targeting_priority - re-rank predictions according to some targeting priority
- test - unit tests