This repository provides functionality to
- Query and download Sentinel data via the Copernicus API and Swisstopo data provided by the Swiss Federal Office of Topography.
- Query OpenStreetMap (OSM) for map data.
- Plot and create segmentation and instance maps of geotagged TIFFs using the OSM API.
- Split the TIFF files (10,000x10,000 pixels for Swisstopo) into patches of a desired size.
- Preprocess Swisstopo and Sentinel data jointly: match and align the tiles, convert CRS, etc.
- Unified data creation pipeline.
- Downsample or resize images.
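The repository's resize implementation is not shown here; as a minimal sketch, downsampling by an integer factor can be done with block averaging (the function name and cropping behaviour are our assumptions, not the repo's API):

```python
import numpy as np

def downsample(img, factor):
    """Downsample an (H, W, C) image by an integer factor via block averaging.

    Rows/columns that do not divide evenly are cropped from the bottom/right.
    """
    h = img.shape[0] // factor * factor
    w = img.shape[1] // factor * factor
    img = img[:h, :w]
    return img.reshape(h // factor, factor, w // factor, factor, -1).mean(axis=(1, 3))

tile = np.arange(4 * 4 * 1, dtype=np.float64).reshape(4, 4, 1)
print(downsample(tile, 2).shape)  # (2, 2, 1)
```

Block averaging acts as a simple anti-aliasing filter, which matters when shrinking 10 cm aerial imagery by large factors.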
We provide a single entry point, download_data.py, for data download. For all available options, such as the maximum number of rows to query or the ordering, run python download_data.py --help.
Swisstopo data download is straightforward. For example, the following query returns all images captured at 10 cm ground sampling distance (the other available option is 2 m) within Switzerland from January 1st, 2018 to December 31st, 2018. Note that the date_range argument must be valid JSON.
python download_data.py --swisstopo --bbox "[5,46,10,48]" --date_range "[\"2018-01-01\", \"2018-12-31\"]" \
--resolution 0.1 --save_dir "../out"
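The bbox and date_range arguments must parse as JSON arrays. A quick sanity check of the shapes used above (the [min_lon, min_lat, max_lon, max_lat] ordering is our reading of the example, not confirmed by the script):

```python
import json

# Shapes expected for the JSON-valued CLI arguments shown above
bbox = json.loads('[5, 46, 10, 48]')
date_range = json.loads('["2018-01-01", "2018-12-31"]')

assert len(bbox) == 4 and bbox[0] < bbox[2] and bbox[1] < bbox[3]
assert len(date_range) == 2 and date_range[0] <= date_range[1]
print("arguments parse as valid JSON")
```

Remember that the quotes inside date_range must be escaped (or single-quoted) so the shell passes them through intact.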
One can also query a single item using the following command. This is useful when the user wants a specific tile or has downloaded a CSV listing files from the Swisstopo website.
python download_data.py --swisstopo --id ID --save_dir "../out"
The 10 m, 20 m and 60 m bands of Sentinel-2 can be downloaded via the following command. For an extensive list of options, call the program with --help.
python download_data.py --sentinel --bbox "[5,46,10,48]" --date_range "[\"2022-01-01\", \"2022-01-05\"]" --save_dir "../out"
We use the OpenSearch API to query Sentinel data; other options, such as the Sentinelsat library and AWS, were considered but did not provide the flexibility we require.
See here for the dhusget.sh script and here for API options.
We use the same dataset to train both the pix2pixHD and Real-ESRGAN models, and have implemented a pipeline that
- queries and downloads Swisstopo and Sentinel data for a given bounding box or point
- fixes alignment issues between the two formats
- creates segmentation and instance maps for each tile
- patchifies the tiles (Swisstopo, Sentinel, segmentation map and instance map) into smaller PNGs (configurable; 256x256 by default) and saves each patch's bounding box as GeoJSON.
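The per-patch bounding boxes can be derived from the tile's bounding box by linear interpolation over pixel offsets. A sketch under our assumptions (helper name hypothetical; partial edge patches are skipped, and row 0 is taken to be the top of the image):

```python
def patch_bboxes(tile_bbox, tile_size=10_000, patch=256):
    """Yield (row, col, bbox) for each full patch of a square tile.

    tile_bbox is (min_x, min_y, max_x, max_y) in the tile's CRS; partial
    patches at the right/bottom edges are skipped.
    """
    min_x, min_y, max_x, max_y = tile_bbox
    px = (max_x - min_x) / tile_size   # CRS units per pixel, x direction
    py = (max_y - min_y) / tile_size   # CRS units per pixel, y direction
    n = tile_size // patch
    for row in range(n):
        for col in range(n):
            x0 = min_x + col * patch * px
            y1 = max_y - row * patch * py        # image row 0 maps to max_y
            yield row, col, (x0, y1 - patch * py, x0 + patch * px, y1)

boxes = list(patch_bboxes((0.0, 0.0, 10_000.0, 10_000.0)))
print(len(boxes))  # 39 * 39 = 1521 full 256-pixel patches per tile
```

Note that a 10,000-pixel tile yields 39 patches per axis at 256 pixels, leaving a 16-pixel strip uncovered on two edges.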
CSV files with WGS-84 coordinates in the columns x_center and y_center can be provided as input to create_imaginaire_dataset.py in order to use Swisstopo tiles showing these specific locations.
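The expected CSV layout can be illustrated with Python's standard csv module (the sample rows below are illustrative coordinates, not taken from the shipped CSVs):

```python
import csv
import io

# One WGS-84 point per row: longitude in x_center, latitude in y_center
sample = "x_center,y_center\n8.5476,47.3763\n6.5668,46.5191\n"
points = [(float(row["x_center"]), float(row["y_center"]))
          for row in csv.DictReader(io.StringIO(sample))]
print(points)  # [(8.5476, 47.3763), (6.5668, 46.5191)]
```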
We have prepared the following CSV files for demonstration:
- csvs/supremap_swisstopo_zurich_lausanne_interlaken_mini.csv: contains one point located in a Swisstopo tile showing ETH Zürich's Hauptgebäude
- csvs/supremap_swisstopo_zurich_lausanne_interlaken_small.csv: contains a selection of 5 points from Zurich, Lausanne and Interlaken
- csvs/supremap_swisstopo_zurich_lausanne_interlaken_large.csv: contains 309 points across Zurich, Lausanne and Interlaken
Use the following commands to create corresponding datasets:
- Mini dataset:
python src/create_imaginaire_dataset.py --csv=csvs/supremap_swisstopo_zurich_lausanne_interlaken_mini.csv --output-dir=datasets/supremap_swisstopo_zurich_lausanne_interlaken_mini
- Small dataset:
python src/create_imaginaire_dataset.py --csv=csvs/supremap_swisstopo_zurich_lausanne_interlaken_small.csv --output-dir=datasets/supremap_swisstopo_zurich_lausanne_interlaken_small
- Large dataset (creation will take a long time):
python src/create_imaginaire_dataset.py --csv=csvs/supremap_swisstopo_zurich_lausanne_interlaken_large.csv --output-dir=datasets/supremap_swisstopo_zurich_lausanne_interlaken_large
Please see create_imaginaire_dataset.py for more details.
We perform the train-test split while creating the data. The total number of pixel-wise instances of each class in our final dataset is shown below:
- Note that we have 19,177 beach pixels in the training set and 2,459 in the validation set.
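The repository's exact split logic is not reproduced here; one common way to split at creation time is to hash each patch identifier into a bucket, which keeps the assignment deterministic across runs (function name and fraction are our assumptions):

```python
import hashlib

def assign_split(patch_id, val_fraction=0.1):
    """Deterministically assign a patch to 'train' or 'val' by hashing its ID."""
    bucket = int(hashlib.md5(patch_id.encode()).hexdigest(), 16) % 1000
    return "val" if bucket < val_fraction * 1000 else "train"

splits = [assign_split(f"tile_{i}_{j}") for i in range(10) for j in range(10)]
print(splits.count("train"), splits.count("val"))
```

A hash-based split avoids leaking adjacent patches of the same tile across splits only if the tile ID (not the patch ID) is hashed; which level the repo splits at is not stated.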
See some samples from our dataset in the figures folder.
Please install the required Python packages via
pip install -r requirements.txt
The Sentinel API requires authentication; follow the instructions here to sign up, then add the following lines to your .bashrc:
export DHUS_USER="YOUR_USERNAME"
export DHUS_PASSWORD="YOUR_PASSWORD"
export DHUS_URL="https://apihub.copernicus.eu/apihub"
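A script can then pick up these credentials from the environment; a minimal sketch (how download_data.py actually reads them is not shown, and the fallback URL below simply mirrors the export above):

```python
import os

# Read the exported Copernicus credentials, falling back to the
# public API hub URL when DHUS_URL is unset.
user = os.environ.get("DHUS_USER", "")
password = os.environ.get("DHUS_PASSWORD", "")
url = os.environ.get("DHUS_URL", "https://apihub.copernicus.eu/apihub")

if not (user and password):
    print("Set DHUS_USER and DHUS_PASSWORD before querying Sentinel data.")
```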
If you have difficulty installing GDAL, try installing it via conda-forge:
conda install -c conda-forge gdal