Blight Prediction Project
The purpose of this project was to determine whether any features in a city's data can be used to predict blight at a given location.
The datasets were downloaded from the course site and are also available via the capstone project repo on GitHub. All of the data comes from the Socrata-powered Detroit Data Portal, https://data.detroitmi.gov/.
Data ETL & Preprocessing
Before analysis, all files were first cleaned in Excel Power Query to remove line breaks and standardize street numbers and addresses. The data was then loaded into FME to validate and standardize the geographic coordinates and to create well-formatted incident and unique-building files. Within the FME workflow, features were generated for each building: total 311 calls, crimes, and blight violations. A Property Values dataset from the Detroit Data Portal, which includes appraised and taxed values, sale price, tax status, and whether the property had ever been improved, was also joined to the data so those features could be included in the models. The join also limited the dataset: buildings lacking predictive features were filtered out of the final data.
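The project did the per-building aggregation and the property join in FME. As a rough illustration of the same logic, here is a minimal Python sketch. It is not the actual workflow: the function name, the field names (calls_311, crimes, blight_violations, sale_price), the incident type labels, and the toy records are all hypothetical, and it assumes that "filtered" means buildings with no matching property record are dropped.

```python
from collections import Counter


def build_features(building_ids, incidents, property_values):
    """Count incidents per building, then keep only buildings with property data.

    building_ids: iterable of unique building identifiers (hypothetical IDs).
    incidents: iterable of (building_id, incident_type) tuples, where
        incident_type is one of "311", "crime", "blight" (assumed labels).
    property_values: dict mapping building_id -> dict of property attributes.
    """
    counts = Counter(incidents)  # keyed by (building_id, incident_type)
    features = {}
    for building_id in building_ids:
        props = property_values.get(building_id)
        if props is None:
            # No property record: drop the building (inner-join behaviour).
            continue
        row = {
            "calls_311": counts[(building_id, "311")],
            "crimes": counts[(building_id, "crime")],
            "blight_violations": counts[(building_id, "blight")],
        }
        row.update(props)  # attach appraised value, sale price, etc.
        features[building_id] = row
    return features


# Toy records for illustration only; not taken from the Detroit data.
building_ids = ["B1", "B2", "B3"]
incidents = [
    ("B1", "311"), ("B1", "311"), ("B1", "crime"),
    ("B2", "blight"),
    ("B3", "crime"),
]
property_values = {
    "B1": {"sale_price": 10000},
    "B2": {"sale_price": 25000},
}

features = build_features(building_ids, incidents, property_values)
```

In this sketch B3 has an incident but no property record, so it is absent from the result, which mirrors how the property join reduced the final dataset.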