Replication code for "Supervised Classification of Built-up Areas in Sub-Saharan African Cities using Landsat Imagery and OpenStreetMap"
Latest commit 61c24d6 Jun 14, 2019



This repository contains all the code required to reproduce the results presented in the following paper:

  • Y. Forget, C. Linard, M. Gilbert. Supervised Classification of Built-up Areas in Sub-Saharan African Cities using Landsat Imagery and OpenStreetMap, 2018.

The results of the study can be explored in an interactive map.

Input, intermediary and output data can be downloaded from Zenodo.


Dependencies are listed in the environment.yml file at the root of the repository. Using the Anaconda distribution, a virtual environment containing all the required dependencies can be created automatically:

# Clone the repository
git clone
cd builtup-classification-osm

# Create the Python environment
conda env create --file environment.yml

# Activate the environment
source activate landsat-osm
# Or, depending on the system:
conda activate landsat-osm


Due to storage constraints, input data are not included in this repository. However, the input and intermediary files required to run the analysis can be downloaded from a Zenodo deposit. Alternatively, the output files of the study can be downloaded directly from this repository. To run the following code, the input and intermediary files must be downloaded into the /data folder. For example, on Linux:

# Create the data directory
cd builtup-classification-osm
mkdir data
cd data

# Download input and intermediary data
wget -O
wget -O

# Decompress the archives
rm *.zip

Likewise, the Global Human Settlement Layer (GHSL) is required to run the notebook 02-External_Datasets.ipynb:

cd builtup-classification-osm/data/input
mv GHS_BUILT_LDSMT_GLOBE_R2015B_3857_38_v1_0 ghsl
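Before opening the notebooks, it can help to confirm that the data folders described above are in place. The following is a minimal sketch, not part of the repository: the helper name `missing_dirs` and the `EXPECTED_DIRS` tuple are hypothetical, and the check only covers the directories named in this README (`data`, `data/input` and `data/input/ghsl`).

```python
from pathlib import Path

# Directories mentioned in this README, relative to the repository root.
# Hypothetical constant; the archives may contain other folders not checked here.
EXPECTED_DIRS = ("data", "data/input", "data/input/ghsl")


def missing_dirs(repo_root="."):
    """Return the expected directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
```

Calling `missing_dirs("builtup-classification-osm")` from the parent directory returns an empty list when all three folders exist, and otherwise the list of folders still to be created or downloaded.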


The code of the analysis is divided into two parts: the Python scripts and modules that support the analysis, and the notebooks in which the outputs of the analysis were produced.


  • notebooks/01-Evolution_of_OSM.ipynb : Analysis of OSM data availability and its evolution from 2011 to 2018. Please note that this analysis requires additional files and software: the full historic OSM data dump (available through the website for OpenStreetMap members) and the osmium command-line tool.
  • notebooks/02-External_Datasets.ipynb : Assessment of the GHSL and HBASE datasets in the context of our case studies.
  • notebooks/03-Buildings_Footprints.ipynb : Assessment of OSM buildings footprints as built-up training samples.
  • notebooks/04-Nonbuilt_Tags.ipynb : Assessment of OSM non-built polygons (leisure, natural or landuse objects) as non-built-up training samples.
  • notebooks/05-Urban_Blocks.ipynb : Assessment of OSM-based urban blocks as built-up training samples.
  • notebooks/06-Urban_Distance.ipynb : Assessment of OSM-based urban distance as non-built-up training samples.
  • notebooks/07-Comparative_Analysis.ipynb : Supervised classification of built-up areas.

To run the notebooks, the input and intermediary data must have been downloaded (see above). Additionally, the following pre-processing scripts must be executed:

python src/
python src/

The outputs of the other pre-processing scripts are already included in the intermediary Zenodo archive.
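Once the data and pre-processing outputs are available, the notebooks can also be executed without opening them. The sketch below is an illustration rather than part of the repository: the helper `notebook_commands` is hypothetical, and it assumes that the numeric prefixes (01 to 07) reflect the intended execution order, which this README does not state explicitly. It only builds the `jupyter nbconvert` commands and does not run them.

```python
from pathlib import Path


def notebook_commands(notebook_dir="notebooks"):
    """Build one `jupyter nbconvert` command per notebook, sorted by file name."""
    # Sorting by name puts 01-..., 02-..., etc. in ascending order.
    notebooks = sorted(Path(notebook_dir).glob("*.ipynb"))
    return [
        # --execute runs the notebook; --inplace writes the results back to it.
        ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace", str(nb)]
        for nb in notebooks
    ]
```

Each returned command can be passed to `subprocess.run(cmd, check=True)` from the repository root with the landsat-osm environment activated. Note that notebooks/01-Evolution_of_OSM.ipynb needs the additional OSM history files described above.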


  • src/ : script used for the acquisition of OSM data.
  • src/ : script used to generate a 40 km x 40 km area of interest for each case study.
  • src/ : script used to mask input Landsat data according to our areas of interest.
  • src/ : script used to produce the intermediary OSM products: rasterized buildings and non-built objects, urban blocks, urban distance, water mask...
  • src/ : script used to rasterize the reference polygons used as validation data.


  • src/ : used to access metadata specific to each case study.
  • src/ : processing of Landsat data.
  • src/ : raster processing functions.
  • src/ : supervised classification and performance assessment.