FLAIR #1

Semantic segmentation and domain adaptation for land-cover from aerial imagery

Challenge proposed by the French National Institute of Geographical and Forest Information (IGN).

Participate in obtaining more accurate maps for a more comprehensive description and a better understanding of our environment! Come push the limits of state-of-the-art semantic segmentation approaches on a large and challenging dataset. Get in touch at 📧 flair@ign.fr

Links

Datapaper : https://arxiv.org/pdf/2211.12979.pdf
Dataset links : https://ignf.github.io/FLAIR/ or https://huggingface.co/datasets/IGNF/FLAIR
Challenge page : https://codalab.lisn.upsaclay.fr/competitions/8769 [🛑 closed!]

Context & Data

The FLAIR #1 dataset is sampled countrywide and comprises over 20 billion annotated pixels, acquired over three years and in different months (spatio-temporal domains). The dataset can be downloaded from the dataset links listed above. It consists of 512 x 512 patches with 13 (baselines) or 19 (full) semantic classes (see associated datapaper). Each patch has 5 channels (RGB-Infrared-Elevation).

ORTHO HR® aerial image cover of France (left), train and test spatial domains of the dataset (middle) and acquisition months defining temporal domains (right).

Example of input data (first three columns) and corresponding supervision masks (last column).

flair_data = {
1   : ['building','#db0e9a'] ,
2   : ['pervious surface','#938e7b'],
3   : ['impervious surface','#f80c00'],
4   : ['bare soil','#a97101'],
5   : ['water','#1553ae'],
6   : ['coniferous','#194a26'],
7   : ['deciduous','#46e483'],
8   : ['brushwood','#f3a60d'],
9   : ['vineyard','#660082'],
10  : ['herbaceous vegetation','#55ff00'],
11  : ['agricultural land','#fff30d'],
12  : ['plowed land','#e4df7c'],
13  : ['swimming_pool','#3de6eb'],
14  : ['snow','#ffffff'],
15  : ['clear cut','#8ab3a0'],
16  : ['mixed','#6b714f'],
17  : ['ligneous','#c5dc42'],
18  : ['greenhouse','#9999ff'],
19  : ['other','#000000'],
}
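The class table above can be turned into an RGB palette for visualising masks. The sketch below is illustrative only: `hex_to_rgb`, `build_palette` and `colorize` are hypothetical helper names that are not part of the FLAIR code, and only a three-class excerpt of `flair_data` is repeated to keep the example self-contained.

```python
# Excerpt of the class table above (the full table has 19 entries).
flair_data_excerpt = {
    1: ['building', '#db0e9a'],
    5: ['water', '#1553ae'],
    19: ['other', '#000000'],
}


def hex_to_rgb(hex_color):
    """Convert a '#rrggbb' string to an (R, G, B) tuple of ints in 0..255."""
    h = hex_color.lstrip('#')
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))


def build_palette(class_table):
    """Map each class value to the RGB colour given in the class table."""
    return {value: hex_to_rgb(color) for value, (_, color) in class_table.items()}


def colorize(mask, palette):
    """Replace each class value of a 2D mask (list of rows) by its RGB colour."""
    return [[palette[value] for value in row] for row in mask]


palette = build_palette(flair_data_excerpt)
rgb_mask = colorize([[1, 5], [19, 1]], palette)
```

With the complete `flair_data` dictionary, the same palette can be applied to any supervision or prediction mask whose pixel values are the class keys.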

Baseline model

The baselines use a U-Net architecture with a pre-trained ResNet34 encoder, implemented with the segmentation-models-pytorch library. The architecture allows the integration of patch-wise metadata and employs commonly used image data augmentation techniques. It has about 24.4M parameters. Results are evaluated with the Intersection over Union (IoU) metric, and a single mean IoU (mIoU) is reported (see associated datapaper).
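As a reminder of how the metric works, the sketch below computes per-class IoU and their mean from a confusion matrix, using only the standard definition IoU = TP / (TP + FP + FN). It is not the evaluation code of the challenge: the function names are hypothetical, classes are 0-based here, and any challenge-specific rule (for instance which classes enter the mean) is described in the datapaper and not reproduced.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are ground-truth classes, columns are predicted classes (0-based)."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def per_class_iou(cm):
    """IoU_c = TP_c / (TP_c + FP_c + FN_c); None when class c never occurs."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, true otherwise
        denom = tp + fp + fn
        ious.append(tp / denom if denom else None)
    return ious


def mean_iou(ious):
    """Average the IoU over the classes for which it is defined."""
    valid = [v for v in ious if v is not None]
    return sum(valid) / len(valid)


# Six pixels, three classes.
cm = confusion_matrix([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], n_classes=3)
ious = per_class_iou(cm)
miou = mean_iou(ious)
```

In this small example the per-class IoUs are 1/3, 2/3 and 1/2, so the mIoU is 0.5.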

The metadata strategy encodes the metadata with a shallow MLP and concatenates the encoded information to the U-Net encoder output. The augmentation strategy employs three typical geometric augmentations (see associated datapaper).
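The key property of geometric augmentation for segmentation is that the image and its mask must receive exactly the same transform. The sketch below illustrates this on small 2D grids. The choice of horizontal flip, vertical flip and 90-degree rotation is an assumption made for illustration (the three augmentations actually used are specified in the datapaper), and the function names are hypothetical.

```python
def hflip(grid):
    """Mirror a 2D grid (list of rows) left-right."""
    return [row[::-1] for row in grid]


def vflip(grid):
    """Mirror a 2D grid top-bottom."""
    return grid[::-1]


def rot90(grid):
    """Rotate a 2D grid by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def augment(image, mask, do_hflip, do_vflip, n_rot):
    """Apply the same geometric transforms to one image band and its mask."""
    for enabled, op in ((do_hflip, hflip), (do_vflip, vflip)):
        if enabled:
            image, mask = op(image), op(mask)
    for _ in range(n_rot % 4):
        image, mask = rot90(image), rot90(mask)
    return image, mask


band = [[1, 2], [3, 4]]
mask = [[7, 8], [9, 9]]
aug_band, aug_mask = augment(band, mask, do_hflip=True, do_vflip=False, n_rot=1)
```

During training, the three arguments `do_hflip`, `do_vflip` and `n_rot` would typically be drawn at random for each sample.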

Example of a semantic segmentation of an urban and coastal area in the D076 spatial domain, obtained with the baseline trained model:

Example of a semantic segmentation result using the baseline model.

Pre-trained models

Pre-trained models ⚡ with different modalities and architectures are available in IGNF's HuggingFace collection: huggingface.co/collections/IGNF/flair-models-landcover-semantic-segmentation
See datacards for more details about each model.

Lib usage

Installation 📌

# it's recommended to install on a conda virtual env
conda create -n my_env_name -c conda-forge python=3.11.6
conda activate my_env_name
git clone git@github.com:IGNF/FLAIR-1.git
cd FLAIR-1*
pip install -e .
# if torch.cuda.is_available() returns False, do the following :
# pip install torch==2.0.0 --extra-index-url=https://download.pytorch.org/whl/cu117

Tasks 🔎

This library comprises two main entry points:

📁 flair

The flair module is used for training, inference and metrics calculation at the patch level. To use this pipeline :

flair --conf=/my/conf/file.yaml

This performs the tasks specified in the configuration file. If ‘train’ is enabled, the model is trained and saved to the output folder. If ‘predict’ is enabled, the trained model (or a specified checkpoint if ‘train’ is not enabled) is loaded and used to predict on the test data. If ‘metrics’ is enabled, the mean Intersection over Union (mIoU) and other IoU metrics are computed from the predicted and ground-truth masks. A toy dataset (reduced size) is available to check that your installation and the information in the configuration file are correct. Note: the notebook used during the challenge is available in the legacy-torch branch, which uses different library versions and a different structure.
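The task logic described above can be summarised in a few lines. The sketch below is not the library's code: `planned_steps` is a hypothetical helper, and it assumes a flat dictionary holding the `train`, `predict`, `metrics` and `ckpt_model_path` keys of the configuration, whereas the actual YAML file may nest them in sections.

```python
def planned_steps(conf):
    """List, in order, the steps implied by the task flags of a configuration."""
    steps = []
    if conf.get('train'):
        steps.append('train')
    if conf.get('predict'):
        if conf.get('train'):
            source = 'model trained in this run'
        else:
            # Without training, a checkpoint is needed for prediction.
            ckpt = conf.get('ckpt_model_path')
            if not ckpt:
                raise ValueError("'predict' without 'train' needs ckpt_model_path")
            source = f'checkpoint {ckpt}'
        steps.append(f'predict with {source}')
    if conf.get('metrics'):
        steps.append('metrics')
    return steps


full_run = planned_steps({'train': True, 'predict': True, 'metrics': True})
inference_only = planned_steps(
    {'train': False, 'predict': True, 'metrics': False,
     'ckpt_model_path': '/my/model.ckpt'}  # hypothetical path
)
```

A check of this kind can be useful before launching a long job, since a missing checkpoint path would otherwise only surface once prediction starts.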

📁 zone_detect

This module runs inference with a pre-trained model at a larger scale than individual patches. It supports overlapping inferences through a margin argument. The module expects a single georeferenced TIFF file as input.

flair-detect --conf=/my/conf/file-detect.yaml

Configuration for flair 📄

The pipeline is configured using a YAML file (flair-1-config.yaml). The configuration file includes sections for data paths, tasks, model configuration, hyperparameters and computational resources.

out_folder: The path to the output folder where the results will be saved.
out_model_name: The name of the output model.
train_csv: Path to the CSV file containing paths to image-mask pairs for training.
val_csv: Path to the CSV file containing paths to image-mask pairs for validation.
test_csv: Path to the CSV file containing paths to image-mask pairs for testing.
ckpt_model_path: The path to the checkpoint file of the model for prediction if train is disabled.

train: If set to True, the model will be trained.
train_load_ckpt: Initialize model with given weights.

predict: If set to True, predictions will be made using the model.
metrics: If set to True, metrics will be calculated.
delete_preds: Remove prediction files after metrics calculation.

model_architecture: The architecture of the model to be used (e.g., ‘unet’).
encoder_name: The name of the encoder to be used in the model (e.g., ‘resnet34’).
use_augmentation: If set to True, data augmentation will be applied during training.

use_metadata: If set to True, metadata will be used. If you use a dataset other than FLAIR, see the metadata structure to be provided.
path_metadata_aerial: The path to the aerial metadata JSON file.

channels: The channels to read from your input images. Images are opened with rasterio, which numbers channels starting at 1.
seed: The seed for random number generation to ensure reproducibility.

batch_size: The batch size for training.
learning_rate: The learning rate for training.
num_epochs: The number of epochs for training.

use_weights: If set to True, class weights will be used during training.
classes: Dict of semantic classes with value in images as key and list [weight, classname] as value. See config file for an example.

norm_type: Normalization to be applied: scaling (linear interpolation in the range [0,1]), custom (center-reduced with provided means and standard deviations), or without.
norm_means: If custom, means for each input band.
norm_stds: If custom, standard deviations for each input band.
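The three `norm_type` options can be illustrated as follows. This is a minimal sketch rather than the library's implementation: the function names are hypothetical, bands are plain lists of pixel values, and the `scaling` branch assumes 8-bit inputs divided by 255, which is only one possible reading of "linear interpolation in the range [0,1]".

```python
def scale_band(values, max_value=255):
    """'scaling': map raw values linearly to [0, 1] (8-bit input assumed)."""
    return [v / max_value for v in values]


def standardize_band(values, mean, std):
    """'custom': centre and reduce one band with the provided statistics."""
    return [(v - mean) / std for v in values]


def normalize(bands, norm_type, norm_means=None, norm_stds=None):
    """Normalize a list of bands, each band being a flat list of pixel values."""
    if norm_type == 'scaling':
        return [scale_band(band) for band in bands]
    if norm_type == 'custom':
        return [
            standardize_band(band, mean, std)
            for band, mean, std in zip(bands, norm_means, norm_stds)
        ]
    if norm_type == 'without':
        return bands
    raise ValueError(f'unknown norm_type: {norm_type}')


scaled = normalize([[0, 255]], 'scaling')
custom = normalize([[10, 20], [0, 4]], 'custom', norm_means=[15, 2], norm_stds=[5, 2])
```

With `custom`, `norm_means` and `norm_stds` need one entry per band listed in `channels`, in the same order.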

georeferencing_output: If set to True, the output will be georeferenced.

accelerator: The type of accelerator to use (‘gpu’ or ‘cpu’).
num_nodes: The number of nodes to use for training.
gpus_per_node: The number of GPUs to use per node for training.
strategy: The strategy to use for distributed training (‘auto’,‘ddp’,...).
num_workers: The number of workers to use for data loading.

cp_csv_and_conf_to_output: Makes a copy of paths csv and config file to the output directory.
enable_progress_bar: If set to True, a progress bar will be displayed during training and inference.
progress_rate: The rate at which progress will be displayed.

Configuration for zone_detect 📄

The pipeline is configured using a YAML file (flair-1-config-detect.yaml).

output_path: path to output result.
output_name: name of resulting raster.

input_img_path : path to georeferenced raster.
bands : bands to be used in your raster file.

img_pixels_detection : size in pixels of inferred patches, default is 512.
margin : margin between patches for overlapping detection. For example, 128 means that a patch center is computed every 128*resolution step.
output_type : type of output: "class_prob" writes integers between 0 and 255 representing the output of the model, while "argmax" writes a single band with the index of the class.
n_classes : number of classes.
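The difference between the two `output_type` values can be shown on a single pixel. The sketch below is an illustration under assumptions: the helper names are hypothetical, the model output is taken to be a list of per-class probabilities in [0, 1], and multiplying by 255 is an assumed way of mapping them to integers between 0 and 255.

```python
def to_argmax(probs):
    """'argmax': index (0-based here) of the most probable class for one pixel."""
    return max(range(len(probs)), key=probs.__getitem__)


def to_class_prob(probs):
    """'class_prob': per-class scores as integers in 0..255 (assumed scaling)."""
    return [int(round(p * 255)) for p in probs]


pixel_probs = [0.0, 0.8, 0.2]             # hypothetical model output, 3 classes
single_band_value = to_argmax(pixel_probs)
multi_band_values = to_class_prob(pixel_probs)
```

In this reading, "argmax" yields one value per pixel, while "class_prob" yields one value per class and per pixel, which makes the output raster larger but keeps the model's confidence.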

model_weights : path to your model weights or checkpoint.
batch_size : size of batch in dataloader, default is 2.
use_gpu : boolean, whether to use the GPU (true) or the CPU (false) for inference, default is true.
model_name : name of the model in pytorch segmentation models, default is 'unet'.
encoder_name : name of the encoder from pytorch segmentation models, default is 'resnet34'.
num_worker : number of workers used by the dataloader. On Linux, do not set it higher than 2, because paved detection can have concurrency issues compared with traditional detection. On Mac and Windows, set it to 0 (issue with the GDAL implementation).

write_dataframe : whether to write the dataframe of raster slicing to a file.

norm_type: Normalization to be applied: scaling (linear interpolation in the range [0,1]) or custom (center-reduced with provided means and standard deviations).
norm_means: If custom, means for each input band.
norm_stds: If custom, standard deviations for each input band.

Leaderboard

Model	mIoU
baseline U-Net (ResNet34)	0.5443±0.0014
baseline U-Net (ResNet34) + metadata + augmentation	0.5570±0.0027

If you want to submit a new entry, you can open a new issue. Results of the challenge will be reported soon!

The baseline U-Net with ResNet34 backbone obtains the following confusion matrix:

Baseline confusion matrix of the test dataset normalized by rows.

Reference

Please include a citation to the following article if you use the FLAIR #1 dataset:

@article{garioud2022flair1,
  doi = {10.13140/RG.2.2.30183.73128/1},
  url = {https://arxiv.org/pdf/2211.12979.pdf},
  author = {Garioud, Anatol and Peillet, Stéphane and Bookjans, Eva and Giordano, Sébastien and Wattrelos, Boris},
  title = {FLAIR #1: semantic segmentation and domain adaptation dataset},
  publisher = {arXiv},
  year = {2022}
}

Acknowledgment

This work was performed using HPC/AI resources from GENCI-IDRIS (Grant 2022-A0131013803).

Dataset license

The "OPEN LICENCE 2.0/LICENCE OUVERTE" is a license created by the French government specifically to facilitate the dissemination of open data by public administrations. An English version of this license is available on the official GitHub page.

As stated by the license :

Applicable legislation

This licence is governed by French law.

Compatibility of this licence

This licence has been designed to be compatible with any free licence that at least requires an acknowledgement of authorship, and specifically with the previous version of this licence as well as with the following licences: United Kingdom’s “Open Government Licence” (OGL), Creative Commons’ “Creative Commons Attribution” (CC-BY) and Open Knowledge Foundation’s “Open Data Commons Attribution” (ODC-BY).
