# Identify vacant buildings from satellite images

**Problem Statement:** Abandoned and vacant buildings are traditionally identified through a tedious and inefficient process of manual, on-the-ground inspection. More recent approaches estimate a building's likelihood of occupancy from utility data, for example whether the building has running water or an active account with a gas utility. These approaches train a predictive machine learning model on existing utility and building vacancy data. To date, I have found no attempt to identify vacant and abandoned buildings through a visual machine learning approach.

I will attempt to train a binary classification model that takes LiDAR imagery of Indianapolis, IN as input and classifies each building as either occupied or vacant. This is a preliminary step to see whether the approach yields results promising enough to justify further development. Given my limited time, limited access to data, and limited expertise in this field, I set my threshold for a potentially useful approach at a model accuracy at least 10% above the baseline model and a sensitivity at least 10% above the baseline model. If these goals are satisfied, I recommend training a more general model on cities across the United States so that the same classification can be made for any urban area in the country.
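The success threshold above can be sketched as a small check. This is a minimal illustration, not the project's actual evaluation code: the function names are hypothetical, the label `1` is assumed to mean "vacant", the margin is assumed to be 10 percentage points, and the baseline is assumed to be any set of predictions supplied by the caller (for instance, a model that always predicts "occupied").

```python
def accuracy(y_true, y_pred):
    """Fraction of buildings whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def sensitivity(y_true, y_pred, positive=1):
    """Fraction of truly vacant buildings that were predicted vacant."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if (tp + fn) else 0.0


def meets_threshold(y_true, y_model, y_baseline, margin=0.10):
    """True if the model beats the baseline by `margin` on both metrics.

    Assumption: `margin` is an absolute gain (percentage points),
    which is one possible reading of the 10% goal.
    """
    acc_gain = accuracy(y_true, y_model) - accuracy(y_true, y_baseline)
    sens_gain = sensitivity(y_true, y_model) - sensitivity(y_true, y_baseline)
    return acc_gain >= margin and sens_gain >= margin


# Toy example: 2 vacant buildings (1) among 10, baseline predicts all occupied (0).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_baseline = [0] * 10
y_model = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
print(meets_threshold(y_true, y_model, y_baseline))
```

Note that an always-occupied baseline can score high accuracy when vacant buildings are rare while having zero sensitivity, which is why the two metrics are checked together.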

**Audience:** Local governments, real estate developers

**Purpose:** Swiftly engage in cleanup efforts for abandoned buildings, particularly in impoverished areas, to minimize crime and social drift.

**Background:**  
* 2003 Vacant housing in Indianapolis
    * https://hub.arcgis.com/datasets/IndyGIS::2003-vacant-housing-inventory

**Data:**  
*Abandoned and Vacant housing in Indianapolis*  
 * [Source](https://data.indy.gov/datasets/abandoned-and-vacant-housing?geometry=-86.313%2C39.748%2C-85.983%2C39.794)
 * 7,219 labeled vacant or abandoned buildings
 * As of June 2019
    
*LiDAR imagery*  
 * [Source](https://lidar.jinha.org/download.php?cname=marion&clon=-86.13305839196093&clat=39.779844384833936&years=2011,2016)  
 * [Metadata source](https://www.dropbox.com/sh/ft35dwy9m5qe9f1/AACXW_W_DoWDiHeOUh00tAzja/2016%20Marion%20County?dl=0&subfolder_nav_tracking=1)

**Modeling:**
1. U-Net classifier
    * https://github.com/Esri/arcgis-python-api/blob/master/samples/04_gis_analysts_data_scientists/extracting_slums_from_satellite_imagery.ipynb

**Process Overview:**  
1. Overlay a map of abandoned or vacant homes on top of a satellite map of Indianapolis, IN.
2. Split the map into training and testing geographic regions.
3. Train a classification model (U-Net) in a supervised learning process to identify abandoned buildings.
4. Test the classification model on the holdout dataset.
5. Evaluate model success on accuracy, sensitivity, and ROC-AUC score, compared against the baseline model.
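
Steps 2 and 5 of the process above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the project's pipeline: buildings are assumed to be `(lon, lat, label)` tuples with label `1` meaning vacant, the split is assumed to work by assigning whole grid tiles to either the training or the testing region so that neighboring buildings stay together, and all function names, the tile size, and the test fraction are hypothetical.

```python
import hashlib


def tile_id(lon, lat, tile_deg=0.01):
    """Map a coordinate to the (column, row) of the grid tile containing it."""
    return (int(lon // tile_deg), int(lat // tile_deg))


def assign_split(tile, test_fraction=0.2):
    """Deterministically assign a whole tile to 'train' or 'test'."""
    digest = hashlib.md5(repr(tile).encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # value in [0, 1)
    return "test" if bucket < test_fraction else "train"


def split_buildings(buildings, tile_deg=0.01, test_fraction=0.2):
    """Split (lon, lat, label) tuples into geographic train/test regions."""
    splits = {"train": [], "test": []}
    for lon, lat, label in buildings:
        region = assign_split(tile_id(lon, lat, tile_deg), test_fraction)
        splits[region].append((lon, lat, label))
    return splits


def roc_auc(labels, scores):
    """ROC-AUC as the chance a vacant building outscores an occupied one.

    Ties count as half a win. This pairwise form is O(n^2) and is meant
    for illustration only.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one example of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Two nearby (made-up) points fall in the same tile, so they share a region.
points = [(-86.1331, 39.7798, 1), (-86.1335, 39.7792, 0)]
regions = split_buildings(points)
print(len(regions["train"]), len(regions["test"]))
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Splitting by tile rather than by individual building is one way to keep the holdout set geographically separate from the training set; a random per-building split would let adjacent buildings land on both sides.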