Impact Evaluation with Machine Learning and High-Resolution Satellite Images
The elimination of poverty worldwide is the No. 1 UN Sustainable Development Goal for 2030. To achieve it, we need more rigorous evaluations to learn which anti-poverty programs work and which don't. However, traditional data collection methods in developing countries (for example, household surveys) tend to be very expensive: a single evaluation can cost anywhere from 0.4 to 3 million USD (How Much Will an Impact Evaluation Cost?).
Promise of Satellite Imagery and Machine Learning
High-resolution satellite images and machine learning offer great promise for cheaper evaluations. Satellite images contain extremely rich information about households' economic status: the quality of their housing, ownership of assets such as cars and barns, agricultural productivity on their land, local infrastructure quality, and so on. With state-of-the-art machine learning models to process large volumes of image data, these features can serve as objective and reliable economic measures. Combined with econometric analysis that leverages experimental or quasi-experimental variation, they allow economists to evaluate programs at a fraction of the cost of traditional methods.
The setting for this project is rural Kenya, where poorer people live in self-built houses with thatched roofs (made of dry vegetation such as straw and palm branches). These roofs are not very durable and need to be replaced frequently. Richer people tend to live in houses with more durable, higher-quality metal roofs. Taking roof quality as a proxy for economic well-being, this project uses a machine learning model, DeepLabV3+, to identify buildings with metal roofs in high-resolution satellite images of these Kenyan villages.
Figure 1. Visualization of segmentation masks for high-quality (metal) roof on high-resolution satellite images in Kenyan villages.
To validate these measures, I test whether a large, randomized unconditional cash transfer (GiveDirectly) improved roof quality in rural Kenya. The answer is yes! Preliminary results (see Figure 2) show that program effects estimated from satellite-derived outcomes are consistent with those from traditional methods (household surveys).
Figure 2. Distribution of the proportion of pixels that are covered by high-quality (metal) roof for each household. Consistent with the survey findings, those who receive a large cash transfer (Treatment) live in houses with higher roof quality, compared to those who do not (Control). This difference is statistically significant.
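The outcome in Figure 2 can be illustrated with a minimal sketch: compute the share of metal-roof pixels in each household's mask, then compare treatment and control means. The function names, the toy masks, and the choice of a Welch t-statistic are assumptions made for illustration; the repo's own analysis lives in regress.R and may differ.

```python
from math import sqrt
from statistics import mean, variance


def metal_roof_share(mask):
    """Fraction of pixels in a binary mask (rows of 0/1) marked as metal roof."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask has no pixels")
    return sum(sum(row) for row in mask) / total


def welch_t(treated, control):
    """Difference in means and Welch t-statistic (unequal variances)."""
    diff = mean(treated) - mean(control)
    se = sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    return diff, diff / se


# Made-up 2x2 masks, one per household (hypothetical data, not project results).
treated_masks = [[[1, 1], [0, 0]], [[1, 1], [1, 0]], [[1, 0], [0, 1]]]
control_masks = [[[1, 0], [0, 0]], [[0, 0], [0, 0]], [[0, 0], [0, 1]]]

treated_shares = [metal_roof_share(m) for m in treated_masks]
control_shares = [metal_roof_share(m) for m in control_masks]
diff, t_stat = welch_t(treated_shares, control_shares)
print(f"difference in mean roof share: {diff:.3f}, Welch t: {t_stat:.2f}")
```

A real analysis would also account for the experimental design (for example, clustering by village), which this sketch ignores.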
These results are preliminary and incomplete. Please do not cite or distribute. For a more comprehensive description of the randomized controlled trial and the institutional contexts, see Michael Walker's Job Market Paper.
This project is a first step towards a broader research agenda, which seeks to leverage technological advances in data science to improve the effectiveness of aid projects in developing countries.
This repo is forked from pytorch-deeplab-xception, a beautifully written PyTorch implementation of the original model (the authors' own implementation is in TensorFlow). This repo also makes use of the official SpaceNet utilities (that code is in the spacenetutilities/ folder). I preprocessed the spatial datasets, added dataloaders to connect them with the model, and trained the model on a supercomputing facility. I also conducted some preliminary econometric analyses and data visualizations.
Building footprint segmentation is not one of the benchmarks in the original DeepLab paper series, and the annotated Google Earth image dataset is small, so I first pre-trained a DeepLabV3+ model on the SpaceNet dataset (Round 2, Khartoum, as this is the closest to a Kenyan setting). I then fine-tuned the model on a set of Google Earth images from the Kenyan villages enrolled in the GiveDirectly experiment. All the metal-roof buildings in the Google Earth images have been annotated, and the model produces predicted segmentation masks, as shown in Figure 1.
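As a rough illustration of how a predicted mask can be compared with an annotated one, the sketch below thresholds a per-pixel probability map and computes intersection-over-union (IoU). The 0.5 threshold, the function names, and the toy arrays are assumptions for illustration, not details taken from this repo's training or evaluation code.

```python
def threshold_mask(probs, threshold=0.5):
    """Turn per-pixel metal-roof probabilities into a binary mask."""
    return [[1 if p >= threshold else 0 for p in row] for row in probs]


def binary_iou(pred, truth):
    """Intersection-over-union between two binary masks of the same shape."""
    if len(pred) != len(truth) or any(len(p) != len(t) for p, t in zip(pred, truth)):
        raise ValueError("masks must have the same shape")
    inter = union = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks agree perfectly by convention.
    return inter / union if union else 1.0


# Made-up model output and annotation for a 2x3 image patch.
probs = [[0.9, 0.7, 0.1], [0.2, 0.8, 0.4]]
annotation = [[1, 0, 0], [0, 1, 1]]

pred_mask = threshold_mask(probs)
iou = binary_iou(pred_mask, annotation)
print(pred_mask, iou)
```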
The code is set up to run on a supercomputing facility with Slurm support, hence the headers in the bash scripts in scripts/. To run it, remember to set up the correct local path in
- Download SpaceNet Data (Round 2), particularly for Khartoum (with
aws s3 cp s3://spacenet-dataset/AOI_5_Khartoum/AOI_5_Khartoum_Train.tar.gz .
tar -xf AOI_5_Khartoum_Train.tar.gz
- Preprocess using the SpaceNetUtilities (with slurm sbatch), fixing a few bugs from the v3 branch in the official repo
cd scripts
sbatch preprocess.sh
- Train the model on SpaceNet first, and then fine-tune on Google Earth (modifying command-line arguments in train_mobilenet.sh) (CUDA support required)
- TensorBoard is supported; launch it during training
- Visualize a subset of images and masks (replicate Figure 1) (CUDA support required)
- Generate predictions for all images (CUDA support required)
- Merge model predictions with existing dataset
cd ../regress
python merge.py
- Subsequent econometric analysis and visualization of results are done in regress.R (replicate Figure 2)
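To make the merge step above concrete, here is a minimal sketch of joining per-household predictions onto an existing dataset before analysis. The field names (household_id, treat, metal_roof_share) and the function merge_predictions are hypothetical and are not taken from merge.py.

```python
def merge_predictions(households, predictions, key="household_id"):
    """Left-join predictions onto household records by a shared ID."""
    by_key = {row[key]: row for row in predictions}
    merged = []
    for hh in households:
        row = dict(hh)  # copy so the input records are not modified
        pred = by_key.get(hh[key], {})
        # None marks households for which no prediction is available.
        row["metal_roof_share"] = pred.get("metal_roof_share")
        merged.append(row)
    return merged


# Made-up records (hypothetical field names and values).
households = [
    {"household_id": 1, "treat": 1},
    {"household_id": 2, "treat": 0},
    {"household_id": 3, "treat": 0},
]
predictions = [
    {"household_id": 1, "metal_roof_share": 0.4},
    {"household_id": 2, "metal_roof_share": 0.1},
]

merged = merge_predictions(households, predictions)
for row in merged:
    print(row)
```

Keeping unmatched households in the output, rather than dropping them, makes any gaps in image coverage visible at the analysis stage.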