Data and code for the paper "Remote Sensing-Based Measurement of Living Environment Deprivation - Improving Classical Approaches with Machine Learning", by Dani Arribas-Bel, Jorge Patiño and Juanca Duque
code
data
.gitignore
.travis.yml
README.md
appveyor.yml

satellite_led_liverpool

This repository contains the data and code used in the paper "Remote Sensing-Based Measurement of Living Environment Deprivation - Improving Classical Approaches with Machine Learning", by Dani Arribas-Bel, Jorge Patiño and Juanca Duque, published in PLoS ONE in May 2017.

Citation

@article{10.1371/journal.pone.0176684,
    author = {Arribas-Bel, Daniel AND Patino, Jorge E. AND Duque, Juan C.},
    journal = {PLOS ONE},
    publisher = {Public Library of Science},
    title = {Remote sensing-based measurement of Living Environment Deprivation: Improving classical approaches with machine learning},
    year = {2017},
    month = {05},
    volume = {12},
    url = {https://doi.org/10.1371/journal.pone.0176684},
    pages = {1-25},
    abstract = {This paper provides evidence on the usefulness of very high spatial resolution (VHR) imagery in gathering socioeconomic information in urban settlements. We use land cover, spectral, structure and texture features extracted from a Google Earth image of Liverpool (UK) to evaluate their potential to predict Living Environment Deprivation at a small statistical area level. We also contribute to the methodological literature on the estimation of socioeconomic indices with remote-sensing data by introducing elements from modern machine learning. In addition to classical approaches such as Ordinary Least Squares (OLS) regression and a spatial lag model, we explore the potential of the Gradient Boost Regressor and Random Forests to improve predictive performance and accuracy. In addition to novel predicting methods, we also introduce tools for model interpretation and evaluation such as feature importance and partial dependence plots, or cross-validation. Our results show that Random Forest proved to be the best model with an R2 of around 0.54, followed by Gradient Boost Regressor with 0.5. Both the spatial lag model and the OLS fall behind with significantly lower performances of 0.43 and 0.3, respectively.},
    number = {5},
    doi = {10.1371/journal.pone.0176684}
}
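
The abstract above compares models by their R2 and lists cross-validation among the evaluation tools. As a rough illustration of that idea only, the sketch below computes a k-fold cross-validated R2 for a one-variable OLS fit using the Python standard library. It is not the code in this repository: the data are synthetic, and the names `fit_ols`, `r2_score` and `cross_val_r2` are hypothetical helpers introduced here for illustration.

```python
import random
from statistics import mean


def fit_ols(xs, ys):
    """Fit y = a + b*x by ordinary least squares; return (a, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def cross_val_r2(xs, ys, k=5, seed=0):
    """Return the out-of-sample R2 obtained on each of k folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint index subsets
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        # Fit on the training folds only, then score on the held-out fold.
        a, b = fit_ols([xs[i] for i in train], [ys[i] for i in train])
        preds = [a + b * xs[i] for i in fold]
        scores.append(r2_score([ys[i] for i in fold], preds))
    return scores


# Synthetic data (an assumption, not the Liverpool dataset):
# one image-derived feature and a noisy deprivation-like index.
rng = random.Random(42)
feature = [rng.uniform(0, 1) for _ in range(200)]
index = [2.0 * x + rng.gauss(0, 0.5) for x in feature]

scores = cross_val_r2(feature, index, k=5)
print("R2 per fold:", [round(s, 2) for s in scores])
print("Mean R2:", round(mean(scores), 2))
```

Scoring each fold on observations the model did not see during fitting is what makes figures like those quoted in the abstract comparable across models of different flexibility, such as OLS and Random Forest.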

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Computational environment

OS Status
Linux Build Status
Windows Build status