Resources for performing deep learning on satellite imagery


Introduction

This document primarily lists resources for performing deep learning (DL) on satellite imagery. To a lesser extent, machine learning (ML, e.g. random forests, stochastic gradient descent) is also discussed, as are classical image processing techniques.

Top links

Table of contents

Datasets

WorldView - SpaceNet

Sentinel

Landsat

Shuttle Radar Topography Mission (digital elevation maps)

Kaggle

Kaggle hosts several large satellite image datasets (> 1 GB). A list of general image datasets is here. A list of land-use datasets is here. The Kaggle blog is an interesting read.

Kaggle - Amazon from space - classification challenge

Kaggle - DSTL - segmentation challenge

Kaggle - Airbus Ship Detection Challenge

Kaggle - Draper - place images in order of time

Kaggle - Deepsat - classification challenge

Not satellite but airborne imagery. Each sample is an image patch size-normalized to 28x28 pixels, with 4 bands: red, green, blue and near infrared. The training and test labels are one-hot encoded 1x6 vectors. Data is in MATLAB .mat format. JPEG?

  • Imagery source
  • Sat4: 500,000 image patches covering four broad land cover classes - barren land, trees, grassland and a class that consists of all land cover classes other than the above three. Example notebook
  • Sat6: 405,000 image patches, each of size 28x28, covering 6 land cover classes - barren land, trees, grassland, roads, buildings and water bodies.
  • Deep Gradient Boosted Learning article
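
As a minimal sketch of working with the label format described above, the snippet below decodes one-hot label rows into class names with NumPy. The class order in `SAT6_CLASSES` and the one-row-per-sample array layout are assumptions for illustration; check both against the dataset documentation before relying on them. Loading the .mat file itself (for example with `scipy.io.loadmat`) is left out so the example stays self-contained, and synthetic arrays stand in for real data.

```python
import numpy as np

# Assumed class order, for illustration only; verify against the dataset docs.
SAT6_CLASSES = ["barren land", "trees", "grassland",
                "roads", "buildings", "water bodies"]


def decode_one_hot(labels, class_names):
    """Map one-hot label rows of shape (n_samples, n_classes) to class names."""
    labels = np.asarray(labels)
    if labels.ndim != 2 or labels.shape[1] != len(class_names):
        raise ValueError("expected shape (n_samples, %d)" % len(class_names))
    # argmax returns the position of the single 1 in each row.
    indices = labels.argmax(axis=1)
    return [class_names[i] for i in indices]


# Synthetic stand-ins: two labels and one 28x28 patch with 4 bands (R, G, B, NIR).
example_labels = np.array([[0, 1, 0, 0, 0, 0],
                           [0, 0, 0, 0, 0, 1]])
example_patch = np.zeros((28, 28, 4), dtype=np.uint8)

decoded = decode_one_hot(example_labels, SAT6_CLASSES)
print(decoded)
```

For Sat4 the same helper should apply with a four-entry class list, assuming its labels are one-hot vectors of length 4.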

Kaggle - other

Alternative datasets

There are a variety of datasets suitable for land classification problems.

UC Merced

AWS datasets

Quilt

  • Several people have uploaded datasets to Quilt

Google Earth Engine

Weather Datasets

Online computing resources

Generally a GPU is required for DL, and this section lists Jupyter environments with a GPU available. There is a good overview of online Jupyter environments on the fast.ai site.

Google Colab

  • Colaboratory notebooks with a free GPU backend, available for 12 hours at a time. Note that the GPU may be shared with other users, so if you aren't getting good performance try reloading.
  • Tensorflow available & pytorch can be installed, useful articles

Kaggle - also Google!

  • Free to use
  • GPU Kernels - may run for 1 hour
  • Tensorflow, pytorch & fast.ai available
  • Advantage that many datasets are already available
  • Read

Floydhub

Clouderizer

  • https://clouderizer.com/
  • Clouderizer costs $5 per month for 200 hours (Robbie plan)
  • Run projects locally, on cloud or both.
  • SSH terminal, Jupyter Notebooks and Tensorboard are securely accessible from Clouderizer Web Console.

Paperspace

Crestle

Interesting DL projects

Raster Vision by Azavea

RoboSat

RoboSat.Pink

DeepOSM

DeepNetsForEO - segmentation

Skynet-data

Production

Custom REST API

Tensorflow Serving

TensorFlow Serving makes it easy to deploy new algorithms and experiments, while keeping the same server architecture and APIs. Multiple models, or indeed multiple versions of the same model, can be served simultaneously. TensorFlow Serving comes with a scheduler that groups individual inference requests into batches for joint execution on a GPU.
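
To illustrate how a client might address a specific model and version, here is a minimal sketch that builds the URL and JSON body for a TensorFlow Serving REST predict call without sending it. The host, the model name `land_classifier` and the version number are hypothetical, and port 8501 is assumed because it is the REST port used in the TensorFlow Serving Docker examples. Request batching is configured on the server side and is not shown.

```python
import json


def build_predict_request(host, model_name, instances, version=None, port=8501):
    """Return (url, body) for a TensorFlow Serving REST predict call."""
    path = "/v1/models/%s" % model_name
    if version is not None:
        # Pin a specific model version; otherwise the server picks one.
        path += "/versions/%d" % version
    url = "http://%s:%d%s:predict" % (host, port, path)
    # The row format sends one entry per input example under "instances".
    body = json.dumps({"instances": instances})
    return url, body


# Hypothetical model name and inputs, for illustration only.
url, body = build_predict_request(
    "localhost", "land_classifier", [[0.1, 0.2], [0.3, 0.4]], version=2
)
print(url)
print(body)
```

The returned pair can be passed to any HTTP client as a POST request with a JSON content type.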

Floydhub

  • Allows exposing a model via a REST API

modeldepot

Image formats & catalogues

STAC - SpatioTemporal Asset Catalog

State of the art

What are companies doing?

Online platforms for Geo analysis

  • This article discusses some of the available platforms -> TLDR Pangeo rocks, but must BYO imagery
  • Pangeo - open source resources for parallel processing using Dask and Xarray http://pangeo.io/index.html
  • Airbus Sandbox -> will provide access to imagery
  • Descartes Labs -> access to EO imagery from a variety of providers via python API -> not clear which imagery is available (Airbus + others?) or pricing
  • DigitalGlobe have a cloud hosted Jupyter notebook platform called GBDX. Cloud hosting means they can guarantee the infrastructure supports their algorithms, and they appear to be close/closer to deploying DL. Tutorial notebooks here. Only Sentinel-2 and Landsat data on free tier.
  • Planet have a Jupyter notebook platform which can be deployed locally and requires an API key (14 days free). They have a Python wrapper (2.7..) to their REST API. No price is given for use after the 14 day trial.

Techniques

This section explores the different techniques (DL, ML & classical) people are applying to common problems in satellite imagery analysis. Classification problems are the simplest to address via DL, object detection is harder, and cloud detection is harder still (a niche interest).

Land classification

Semantic segmentation

Change detection

Image registration

Object detection

Cloud detection

  • A subset of the object detection problem, but surprisingly challenging
  • From this article on sentinelhub there are three popular classical algorithms that apply thresholds in multiple bands in order to identify clouds. In the same article they propose using semantic segmentation combined with a CNN for a cloud classifier (excellent review paper here), but state that this requires too many compute resources.
  • This article compares a number of ML algorithms: random forests, stochastic gradient descent, support vector machines and a Bayesian method.
  • DL..
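
The general idea behind band-threshold cloud detection can be sketched as follows. This is not any of the algorithms from the article referenced above: the choice of bands (blue and near infrared) and both threshold values are assumptions picked purely for illustration, relying only on the fact that clouds tend to be bright across visible and near infrared bands.

```python
import numpy as np


def simple_cloud_mask(blue, nir, blue_min=0.3, nir_min=0.3):
    """Flag pixels that are bright in both the blue and NIR bands.

    Inputs are reflectance arrays of the same shape. The thresholds are
    illustrative defaults, not values taken from a published algorithm.
    """
    blue = np.asarray(blue, dtype=float)
    nir = np.asarray(nir, dtype=float)
    return (blue > blue_min) & (nir > nir_min)


# Tiny synthetic 2x2 scene: only the top-right pixel is bright in both bands.
blue = np.array([[0.05, 0.45],
                 [0.50, 0.10]])
nir = np.array([[0.40, 0.50],
                [0.20, 0.35]])

mask = simple_cloud_mask(blue, nir)
print(mask)
```

Real classical detectors combine more bands and tests (for example to separate cloud from snow or bright sand), which is part of why the problem is harder than it first appears.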

Super resolution

Pansharpening

Stereo imaging for terrain mapping & DEMs

Lidar

NDVI - vegetation index
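
The normalized difference vegetation index is computed per pixel from the near infrared and red bands:

$$\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{Red}}{\mathrm{NIR} + \mathrm{Red}}$$

A minimal NumPy sketch of this formula is shown below. The function name `ndvi` and the choice to return 0 where both bands are 0 are assumptions made for illustration; other conventions (such as returning NaN) are equally valid.

```python
import numpy as np


def ndvi(nir, red):
    """Compute (NIR - Red) / (NIR + Red) per pixel."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    total = nir + red
    # Avoid division by zero: pixels where both bands are 0 get NDVI 0
    # (an assumed convention for this sketch).
    safe_total = np.where(total == 0, 1.0, total)
    return np.where(total == 0, 0.0, (nir - red) / safe_total)


# Synthetic reflectance values for a 2x2 scene.
nir = np.array([[0.6, 0.2],
                [0.0, 0.3]])
red = np.array([[0.2, 0.2],
                [0.0, 0.1]])

values = ndvi(nir, red)
print(values)
```

For non-negative reflectances, values lie between -1 and 1, with dense green vegetation typically giving higher values.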

SAR

For fun

Useful open source software

Useful github repos

  • torchvision-enhance -> Enhance PyTorch vision for semantic segmentation, multi-channel images and TIF file,...
  • dl-satellite-docker -> docker files for geospatial analysis, including tensorflow, pytorch, gdal, xgboost...

Useful References