PyTorch implementation to learn both semantics and height from the following aerial image datasets: IEEE Data Fusion Contest 2018 (DFC2018) and ISPRS-Vaihingen.
We suggest using Python 3.7 or higher. To run this code and visualize plots and intermediate training results with Visdom, you will need the following packages (see the file pre_install.sh for more details):
- PyTorch: Open source deep learning platform. Version 1.0 or higher
- Visdom: Framework for data visualization.
- Rasterio: Reads and writes files with geographic information (GeoTIFF and others) and provides a Python API based on NumPy N-dimensional arrays and GeoJSON.
- Geopandas: Open source project to make working with geospatial data in Python easier.
- Libiconv: Provides iconv for systems which don't have one.
- TQDM: Progress bar for Python.
Download/Fork/Clone this repository to your server or computer.
Download one or both datasets to start training/inference with this code. Our scripts expect the datasets to be placed in the datasets folder.
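The expected folder can be created before downloading; the dataset subfolder names below are illustrative assumptions, not mandated by the repository:

```shell
# Create the datasets folder the scripts look for, with one
# (hypothetical) subfolder per dataset you plan to download.
mkdir -p datasets/DFC2018 datasets/Vaihingen
```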
- DFC 2018: collection of multi-source optical imagery over Houston, Texas. In particular, it contains Very High Resolution (VHR) color images resampled at 5cm / pixel, hyperspectral images and LiDAR-derived products such as DSMs and Digital Elevation Models (DEMs) at a resolution of 50cm / pixel.
- ISPRS-Vaihingen: comprises IRRG (Infrared, Red and Green) images at 9cm / pixel, a DSM, and 2D and 3D semantically labeled maps for urban classification and 3D reconstruction.
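A common height target derived from the DFC2018 LiDAR products is the normalized DSM (nDSM), i.e. the per-pixel height above ground obtained by subtracting the DEM from the DSM. A minimal NumPy sketch, with toy arrays standing in for rasters that would in practice be read with Rasterio (e.g. `rasterio.open(path).read(1)`):

```python
import numpy as np

# Toy stand-ins for LiDAR-derived rasters at 50cm / pixel (values in meters).
dsm = np.array([[12.0, 13.5],
                [12.2, 15.0]])  # surface heights (buildings, trees, ...)
dem = np.array([[10.0, 10.1],
                [10.0, 10.2]])  # bare-earth elevation

# Normalized DSM: height above ground, a natural regression target.
ndsm = dsm - dem
# Small negative values can appear from sensor/interpolation noise; clamp them.
ndsm = np.clip(ndsm, 0.0, None)
```

This is only a sketch of the preprocessing idea; the repository's own data pipeline may handle nodata values, reprojection, and tiling as well.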
First, run an instance of Visdom and choose the desired port to publish results (8072 in our example):
$ visdom -port 8072
We've prepared several scripts to run our experiments in std_scripts. For example, to start a training run on DFC2018, run from the repository root:
$ sh std_scripts/grss_dfc/grss_train.sh 0 8072 1 eweights
The code expects a CUDA device with ID 0 and Visdom running on port 8072, and performs multi-task learning with uniform weighting. Check the script file for the standard parameters used.
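Uniform weighting (presumably what the `eweights` argument selects) means every task loss contributes equally to the total objective, rather than using learned or uncertainty-based task weights. A minimal sketch, where the function name and default are illustrative assumptions, not the repository's actual API:

```python
def combine_losses(losses, weights=None):
    """Weighted sum of per-task losses.

    With no weights given, each of the n tasks gets weight 1/n
    (uniform weighting). In the real training loop the entries of
    `losses` would be PyTorch scalar tensors, e.g. a semantic
    segmentation loss and a height regression loss.
    """
    if weights is None:
        weights = [1.0 / len(losses)] * len(losses)  # uniform weighting
    return sum(w * l for w, l in zip(weights, losses))

# e.g. segmentation loss 0.8 and height loss 0.4 combine to 0.6
total = combine_losses([0.8, 0.4])
```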
The code (scripts) is released under the GPLv3 license for non-commercial and research purposes only. For commercial purposes, please contact the authors.