
# PANGAEA Bench Hands-on Tutorial

## 🌟 What This Colab File Provides
This notebook serves as a quickstart guide for setting up and running experiments with PANGAEA's benchmarking framework. It includes:
1. **Environment Setup**: Clone the repository and install required dependencies.
2. **Data Loading**: Instructions on how to load datasets for geospatial tasks.
3. **Model Selection and Training**: Select a model and configure a pipeline for training.
4. **Running the Pipeline**: Complete an end-to-end experiment for evaluation.

Follow along to get started!

Before you begin, save your own copy (_File->Save a copy in Drive_) and
**change the runtime type to GPU**.
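
Once the runtime has been switched, a quick check can confirm that a GPU is actually visible. The sketch below only looks for the `nvidia-smi` tool through the standard library, so it works before any package is installed. The helper name `gpu_visible` is ours and is not part of PANGAEA.

```python
import shutil
import subprocess


def gpu_visible() -> bool:
    """Return True if `nvidia-smi` exists and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False  # driver tools not on PATH: most likely a CPU runtime
    try:
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    # `nvidia-smi -L` prints one line per device, each starting with "GPU"
    return result.returncode == 0 and "GPU" in result.stdout


if gpu_visible():
    print("GPU runtime detected")
else:
    print("No GPU found: change the runtime type before training")
```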

---

## 🎓 Designed by Valerio Marsocci

### Got questions or innovative ideas? Drop a message at Valerio.Marsocci@esa.int 

---

## Let's Start! 🛰️

---



## Clone the Repository

First clone the PANGAEA repository and navigate to the directory:

In [1]:
!git clone https://github.com/VMarsocci/pangaea-bench.git
%cd pangaea-bench

Cloning into 'pangaea-bench'...
remote: Enumerating objects: 4743, done.
remote: Counting objects: 100% (1370/1370), done.
remote: Compressing objects: 100% (437/437), done.
remote: Total 4743 (delta 1141), reused 933 (delta 933), pack-reused 3373 (from 2)
Receiving objects: 100% (4743/4743), 3.55 MiB | 19.12 MiB/s, done.
Resolving deltas: 100% (3220/3220), done.
/content/pangaea-bench


## Install Dependencies
We will set up the environment using pip.

In [2]:
!pip install -r requirements.txt

Collecting rasterio (from -r requirements.txt (line 4))
  Downloading rasterio-1.4.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (9.1 kB)
Collecting rioxarray (from -r requirements.txt (line 13))
  Downloading rioxarray-0.19.0-py3-none-any.whl.metadata (5.5 kB)
Collecting ptflops (from -r requirements.txt (line 15))
  Downloading ptflops-0.7.4-py3-none-any.whl.metadata (9.4 kB)
Collecting pyDataverse (from -r requirements.txt (line 18))
  Downloading pydataverse-0.3.4-py3-none-any.whl.metadata (4.5 kB)
Collecting yacs (from -r requirements.txt (line 20))
  Downloading yacs-0.1.8-py3-none-any.whl.metadata (639 bytes)
Collecting hydra-core (from -r requirements.txt (line 22))
  Downloading hydra_core-1.3.2-py3-none-any.whl.metadata (5.5 kB)
Collecting nvidia-cuda-nvrtc-cu12==12.4.127 (from torch>=2.1.0->-r requirements.txt (line 1))
  Downloading nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12

In [3]:
!pip install imagecodecs

Collecting imagecodecs
  Downloading imagecodecs-2025.3.30-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (20 kB)
Downloading imagecodecs-2025.3.30-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (45.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 45.6/45.6 MB 13.7 MB/s eta 0:00:00
Installing collected packages: imagecodecs
Successfully installed imagecodecs-2025.3.30


In [4]:
# You probably had to restart the runtime to finish installing the packages, so move back into the repository folder
%cd pangaea-bench
# Install the repository as a development package
!pip install --no-build-isolation --no-deps -e .

[Errno 2] No such file or directory: 'pangaea-bench'
/content/pangaea-bench
Obtaining file:///content/pangaea-bench
  Preparing metadata (setup.py) ... done
Installing collected packages: pangaea
  Running setup.py develop for pangaea
Successfully installed pangaea-1.0.0


## Adding a new dataset

We will implement a *toy* version of the [FLAIR dataset](https://arxiv.org/pdf/2211.12979). The data is already stored on [Google Drive](https://drive.google.com/drive/folders/1uAJ7-m7stdeGenKQMLkRaEkKhR4uVbzU?usp=sharing).

In [5]:
#let's download the dataset from this link https://drive.google.com/file/d/1U8woGV2YlzeGSZKN2BPdz4gAySkEsmiJ/view?usp=drive_link

!gdown --id 1U8woGV2YlzeGSZKN2BPdz4gAySkEsmiJ --output toyFLAIR.zip

Downloading...
From (original): https://drive.google.com/uc?id=1U8woGV2YlzeGSZKN2BPdz4gAySkEsmiJ
From (redirected): https://drive.google.com/uc?id=1U8woGV2YlzeGSZKN2BPdz4gAySkEsmiJ&confirm=t&uuid=b73e95b2-1bf9-4005-a54b-067be6dfdc4d
To: /content/pangaea-bench/toyFLAIR.zip
100% 226M/226M [00:02<00:00, 92.8MB/s]


In [6]:
!unzip toyFLAIR.zip

Archive:  toyFLAIR.zip
   creating: toyFLAIR/
   creating: toyFLAIR/test/
   creating: toyFLAIR/test/images/
  inflating: toyFLAIR/test/images/IMG_061946.tif  
  inflating: toyFLAIR/test/images/IMG_062207.tif  
  inflating: toyFLAIR/test/images/IMG_062393.tif  
  inflating: toyFLAIR/test/images/IMG_062524.tif  
  inflating: toyFLAIR/test/images/IMG_062871.tif  
  inflating: toyFLAIR/test/images/IMG_063104.tif  
  inflating: toyFLAIR/test/images/IMG_063120.tif  
  inflating: toyFLAIR/test/images/IMG_063366.tif  
  inflating: toyFLAIR/test/images/IMG_063620.tif  
  inflating: toyFLAIR/test/images/IMG_063812.tif  
  inflating: toyFLAIR/test/images/IMG_064026.tif  
  inflating: toyFLAIR/test/images/IMG_064027.tif  
  inflating: toyFLAIR/test/images/IMG_064234.tif  
  inflating: toyFLAIR/test/images/IMG_064890.tif  
  inflating: toyFLAIR/test/images/IMG_065011.tif  
  inflating: toyFLAIR/test/images/IMG_065120.tif  
  inflating: toyFLAIR/test/images/IMG_065229.tif  
  inflating: toyFLAIR/te

The data structure is as follows:
```
toyFLAIR/
├── train/
│   ├── images/
│   └── labels/
├── val/
│   ├── images/
│   └── labels/
└── test/
    ├── images/
    └── labels/
```
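
Before writing the dataset class, it can be worth confirming that the unzipped archive really matches this layout. The following is a minimal sketch using only the standard library. `check_layout` is a hypothetical helper, and the `*.tif` pattern for the label files is an assumption based on the image names shown in the unzip log.

```python
from __future__ import annotations

from pathlib import Path

SPLITS = ("train", "val", "test")
SUBDIRS = ("images", "labels")


def check_layout(root: str | Path) -> dict[str, dict[str, int]]:
    """Count the .tif files in every <split>/<subdir> folder under `root`.

    Raises FileNotFoundError if one of the expected folders is missing.
    """
    root = Path(root)
    counts: dict[str, dict[str, int]] = {}
    for split in SPLITS:
        counts[split] = {}
        for sub in SUBDIRS:
            folder = root / split / sub
            if not folder.is_dir():
                raise FileNotFoundError(f"Missing folder: {folder}")
            counts[split][sub] = len(list(folder.glob("*.tif")))
    return counts


# Only runs the check if the archive has been unzipped in the current folder
if Path("toyFLAIR").is_dir():
    for split, per_folder in check_layout("toyFLAIR").items():
        print(split, per_folder)
```

If the image and label counts of a split differ, some samples have no matching mask and should be inspected before training.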

How to add a new dataset? Follow [these instructions](https://github.com/VMarsocci/pangaea-bench/blob/main/.github/CONTRIBUTING.md#adding-a-new-downstream-dataset)

TL;DR:
you need to create two files:
 - a _config.yaml_ file, placed in _configs/dataset_
 - a _dataset.py_ file, placed in _pangaea/datasets_


Follow this code for the _dataset.py_ file:

In [None]:
import torch
from pangaea.datasets.base import RawGeoFMDataset

class MyDataset(RawGeoFMDataset):
    def __init__(
        self,
        split: str,
        dataset_name: str,
        multi_modal: bool,
        multi_temporal: int,
        root_path: str,
        classes: list,
        num_classes: int,
        ignore_index: int,
        img_size: int,
        bands: dict[str, list[str]],
        distribution: list[int],
        data_mean: dict[str, list[float]],
        data_std: dict[str, list[float]],
        data_min: dict[str, list[float]],
        data_max: dict[str, list[float]],
        download_url: str,
        auto_download: bool,
        temp: int, #newly added parameter
    ):
        super(MyDataset, self).__init__(
            split=split,
            dataset_name=dataset_name,
            multi_modal=multi_modal,
            multi_temporal=multi_temporal,
            root_path=root_path,
            classes=classes,
            num_classes=num_classes,
            ignore_index=ignore_index,
            img_size=img_size,
            bands=bands,
            distribution=distribution,
            data_mean=data_mean,
            data_std=data_std,
            data_min=data_min,
            data_max=data_max,
            download_url=download_url,
            auto_download=auto_download,
        )

        self.temp = temp  # newly added parameter
        # Initialize file lists or data structures here;
        # __len__ below expects self.file_list to exist
        self.file_list = []

    def __len__(self):
        # Return the total number of samples
        return len(self.file_list)

    def __getitem__(self, index):
        """Returns the item of the dataset at the given index.

        Args:
            index (int): index of the item

        Returns:
            dict[str, torch.Tensor | dict[str, torch.Tensor]]: output dictionary following the format
            {
                "image": {
                    "optical": torch.Tensor of shape (C T H W) (where T=1 if single-temporal dataset),
                    "sar": torch.Tensor of shape (C T H W) (where T=1 if single-temporal dataset),
                },
                "target": torch.Tensor of shape (H W) of type torch.int64 for segmentation,
                    torch.float for regression datasets,
                "metadata": dict,
            }
        """
        # Load your data and labels here
        image = ...  # Load image
        target = ...  # Load target label or mask

        # Convert to tensors
        image = torch.tensor(image, dtype=torch.float32)
        target = torch.tensor(target, dtype=torch.long)

        return {
            'image': {'optical': image},
            'target': target,
            'metadata': {}
        }

    @staticmethod
    def download(self, silent=False):
        # Implement if your dataset requires downloading
        pass

Some other practical hints/recommendations:
 - name your _dataset_ file ***toy_flair.py*** and put it in _pangaea/datasets_
 - name your _config_ file ***toyflair.yaml*** and put it in _configs/dataset_
 - HINT! Take inspiration from the FiveBillionPixels files for the structure of both files. The data structure is the same!
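
The "initialize file lists" step of the skeleton, which `__len__` relies on through `self.file_list`, could be sketched as follows. It assumes that an image and its label share the same numeric ID at the end of their file names (for example `IMG_061946.tif` and a label ending in `_061946.tif`). The label naming is an assumption, so check it against the unzipped folders. `build_file_list` is a hypothetical helper, not part of PANGAEA.

```python
from __future__ import annotations

import re
from pathlib import Path


def _numeric_id(path: Path) -> str | None:
    """Return the trailing digits of the file stem, or None if there are none."""
    match = re.search(r"(\d+)$", path.stem)
    return match.group(1) if match else None


def build_file_list(root: str | Path, split: str) -> list[tuple[Path, Path]]:
    """Pair every image of a split with the label that has the same numeric ID."""
    split_dir = Path(root) / split
    labels: dict[str, Path] = {}
    for label_path in (split_dir / "labels").glob("*.tif"):
        key = _numeric_id(label_path)
        if key is not None:
            labels[key] = label_path

    pairs = []
    for image_path in sorted((split_dir / "images").glob("*.tif")):
        key = _numeric_id(image_path)
        if key in labels:  # images without a matching label are skipped
            pairs.append((image_path, labels[key]))
    return pairs
```

Inside `__init__`, something like `self.file_list = build_file_list(root_path, split)` would then give `__getitem__` an `(image, label)` path pair for every index.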

Now you are ready to run the code on your newly added dataset. Run the following command line!

In [None]:
# Name of the config referring to the dataset
# Model: RemoteCLIP (lightweight, checkpoints downloaded automatically)
# Decoder: UperNet

!torchrun --nnodes=1 --nproc_per_node=1 pangaea/run.py \
  --config-name=train \
  dataset=toyflair \
  encoder=remoteclip \
  decoder=seg_upernet \
  preprocessing=seg_default \
  criterion=cross_entropy \
  task=segmentation \
  use_wandb=True \
  task.trainer.n_epochs=2

In [None]:
!torchrun pangaea/run.py --config-name=test ckpt_dir=/insert/your/results/directory #e.g. ckpt_dir=/content/pangaea-bench/20250716_143827_8ec411_remoteclip_seg_upernet_toyflair
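
The results directory name starts with a timestamp, as in the example above (`20250716_143827_8ec411_remoteclip_seg_upernet_toyflair`). Assuming every run follows that `YYYYMMDD_HHMMSS_...` pattern and is written to the repository root, a small helper can pick the most recent run to pass as `ckpt_dir`. `latest_run_dir` is hypothetical and not part of PANGAEA.

```python
from __future__ import annotations

import re
from pathlib import Path

# Matches the leading "YYYYMMDD_HHMMSS_" part of a run directory name
RUN_DIR = re.compile(r"^\d{8}_\d{6}_")


def latest_run_dir(base: str | Path = ".", suffix: str = "") -> Path | None:
    """Return the newest run directory under `base`, or None if there is none.

    `suffix` optionally restricts the search, e.g. "remoteclip_seg_upernet_toyflair".
    """
    candidates = [
        p
        for p in Path(base).iterdir()
        if p.is_dir() and RUN_DIR.match(p.name) and p.name.endswith(suffix)
    ]
    # The timestamp prefix sorts lexicographically in chronological order
    return max(candidates, key=lambda p: p.name[:15], default=None)


print(latest_run_dir(".", suffix="toyflair"))
```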

In [None]:
# Name of the config referring to the dataset
# Model: RemoteCLIP (lightweight, checkpoints downloaded automatically)
# Decoder: UperNet
# let's train it for longer!

!torchrun --nnodes=1 --nproc_per_node=1 pangaea/run.py \
  --config-name=train \
  dataset=toyflair \
  encoder=remoteclip \
  decoder=seg_upernet \
  preprocessing=seg_default \
  criterion=cross_entropy \
  task=segmentation \
  use_wandb=True \
  task.trainer.n_epochs=40

In [None]:
!torchrun pangaea/run.py --config-name=test ckpt_dir=/insert/your/results/directory
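
The training cells in this section differ only in a couple of Hydra overrides. As a sketch, the command can be assembled from a dictionary so that a variation is a single changed entry. The override keys and values are copied from the cells above, while `train_command` itself is a hypothetical helper.

```python
from __future__ import annotations

import shlex

# Overrides taken from the first training cell of this section
BASE_OVERRIDES = {
    "dataset": "toyflair",
    "encoder": "remoteclip",
    "decoder": "seg_upernet",
    "preprocessing": "seg_default",
    "criterion": "cross_entropy",
    "task": "segmentation",
    "use_wandb": "True",
    "task.trainer.n_epochs": "2",
}


def train_command(changes: dict | None = None) -> list[str]:
    """Build the torchrun argument list, replacing overrides listed in `changes`."""
    overrides = dict(BASE_OVERRIDES)
    for key, value in (changes or {}).items():
        overrides[key] = str(value)
    return [
        "torchrun",
        "--nnodes=1",
        "--nproc_per_node=1",
        "pangaea/run.py",
        "--config-name=train",
        *[f"{key}={value}" for key, value in overrides.items()],
    ]


# Print the command for a longer run; pass the list to subprocess.run to launch it
print(shlex.join(train_command({"task.trainer.n_epochs": 40})))
```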

In [None]:
# Name of the config referring to the dataset
# Model: RemoteCLIP (lightweight, checkpoints downloaded automatically)
# Decoder: UperNet
# This time, let's switch the loss to Dice loss!

!torchrun --nnodes=1 --nproc_per_node=1 pangaea/run.py \
  --config-name=train \
  dataset=toyflair \
  encoder=remoteclip \
  decoder=seg_upernet \
  preprocessing=seg_default \
  criterion=dice \
  task=segmentation \
  use_wandb=True \
  task.trainer.n_epochs=40

In [None]:
!torchrun pangaea/run.py --config-name=test ckpt_dir=/insert/your/results/directory
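
To see why swapping the criterion can change training behaviour, it helps to compute both losses by hand on a tiny example. The sketch below is plain Python for a binary mask. It is not the implementation PANGAEA uses behind `criterion=dice` or `criterion=cross_entropy`, and the smoothing constant `eps` is an arbitrary choice.

```python
import math


def cross_entropy(probs, target):
    """Mean binary cross-entropy; `probs` are foreground probabilities in (0, 1)."""
    total = 0.0
    for p, t in zip(probs, target):
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(target)


def dice_loss(probs, target, eps=1e-6):
    """Soft Dice loss: 1 - 2|P∩T| / (|P| + |T|), smoothed by `eps`."""
    intersection = sum(p * t for p, t in zip(probs, target))
    denominator = sum(probs) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


# A rare class: one foreground pixel among ten
target = [1] + [0] * 9
# A model that predicts "background" almost everywhere
all_background = [0.01] * 10

print("cross-entropy:", round(cross_entropy(all_background, target), 3))
print("dice loss:    ", round(dice_loss(all_background, target), 3))
```

On this rare-class example the cross-entropy stays moderate because it averages over the nine easy background pixels, whereas the Dice loss is close to 1 because the single foreground pixel is missed.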

## Adding a new model

We will implement [FG-MAE](https://github.com/zhu-xlab/FGMAE). The checkpoints (ViT-S) are already stored on [Google Drive](https://drive.google.com/file/d/14EJgdtyIJ9Gk8A4ritoF6JPkeCZmCeS_/view?usp=sharing)

In [17]:
# Let's download the checkpoint from this link: https://drive.google.com/file/d/14EJgdtyIJ9Gk8A4ritoF6JPkeCZmCeS_/view?usp=drive_link

!gdown --id 14EJgdtyIJ9Gk8A4ritoF6JPkeCZmCeS_ --output pretrained_models/B13_vits16_fgmae_ep99.pth

Downloading...
From (original): https://drive.google.com/uc?id=14EJgdtyIJ9Gk8A4ritoF6JPkeCZmCeS_
From (redirected): https://drive.google.com/uc?id=14EJgdtyIJ9Gk8A4ritoF6JPkeCZmCeS_&confirm=t&uuid=6b3f0b2a-f0c6-4fd0-96ff-df1293b36116
To: /content/pangaea-bench/pretrained_models/B13_vits16_fgmae_ep99.pth
100% 585M/585M [00:11<00:00, 52.5MB/s]


How to add a new model? Follow [these instructions](https://github.com/VMarsocci/pangaea-bench/blob/main/.github/CONTRIBUTING.md#adding-a-new-geospatial-foundation-model)

TL;DR:
you need to create two files:
 - a _config.yaml_ file, placed in _configs/encoder_
 - a _model.py_ file, placed in _pangaea/encoders_


Follow this code for the _model.py_ file:

In [None]:
from pathlib import Path

import torch
import torch.nn as nn

from pangaea.encoders.base import Encoder

class MyModel(Encoder):
    def __init__(
        self,
        encoder_weights: str | Path,
        input_size: int,
        input_bands: dict[str, list[str]],
        output_layers: int | list[int],
        in_chans: int,              #newly added parameter
    ) -> None:
        super().__init__(
            model_name="my_model_name",
            encoder_weights=encoder_weights,
            input_bands=input_bands,
            input_size=input_size,
            embed_dim=768,        # my_model_embed_dim, fixed parameters
            output_dim=768,       # my_model_output_dim, fixed parameters
            multi_temporal=False, # whether the model supports multi-temporal input, fixed parameter
            multi_temporal_output=False, # whether the output of the model has a temporal dimension
        )

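        self.in_chans = in_chans    # newly added parameter
        # forward() reads self.output_layers, so keep it as a list of layer indices
        self.output_layers = [output_layers] if isinstance(output_layers, int) else list(output_layers)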

        # Initialize your model architecture here
        # For example:
        self.backbone = nn.Sequential(
            nn.Conv2d(in_chans, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            # Add more layers as needed
        )
        # Specify output layers if applicable

    def load_encoder_weights(self, pretrained_path: str) -> None:
        # Load pretrained weights
        state_dict = torch.load(pretrained_path, map_location='cpu')
        self.load_state_dict(state_dict, strict=False)
        print(f"Loaded encoder weights from {pretrained_path}")

    def forward(self, x: dict[str, torch.Tensor]) -> list[torch.Tensor]:
        """Forward pass of the encoder.

        Args:
            x (dict[str, torch.Tensor]): encoder's input structured as a dictionary:
            x = {modality1: tensor1, modality2: tensor2, ...}, e.g. x = {"optical": tensor1, "sar": tensor2}.
            If the encoder is multi-temporal (self.multi_temporal==True), input tensor shape is (B C T H W) with C the
            number of bands required by the encoder for the given modality and T the number of time steps. If the
            encoder is not multi-temporal, input tensor shape is (B C H W) with C the number of bands required by the
            encoder for the given modality.

        Returns:
            list[torch.Tensor]: list of the embeddings for each modality. For single-temporal encoders, the list's
            elements are of shape (B, embed_dim, H', W'). For multi-temporal encoders, the list's elements are of shape
            (B, C', T, H', W') with T the number of time steps if the encoder does not have any time-merging strategy,
            else (B, C', H', W') if the encoder has a time-merging strategy (where C'==self.output_dim).
        """
        x = x['optical']  # select the optical modality from the input dict
        outputs = []
        # Forward pass through the model
        for idx, layer in enumerate(self.backbone):
            x = layer(x)
            if idx in self.output_layers:
                outputs.append(x)
        return outputs

Some other practical hints/recommendations:
 - name your _model_ file ***fgmae_encoder.py*** and put it in _pangaea/encoders_
 - name your _config_ file ***fgmae.yaml*** and put it in _configs/encoder_
 - HINT! Take inspiration from the SSL4EO_MAE files for the structure of both files. The structure is the same! This [link](https://github.com/zhu-xlab/FGMAE/blob/main/src/transfer_classification/models/mae/models_vit.py) can also be useful!
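
When adapting a ViT-style encoder such as FG-MAE, the token sequence usually has to be reshaped into the `(B, embed_dim, H', W')` feature map described in the `forward` docstring above. The arithmetic behind `H'` and `W'` is sketched below in plain Python. The patch size of 16 is read off the checkpoint name (`vits16`), the input size of 224 is only an example value, and the presence of a class token is an assumption. Check all three against your config and checkpoint.

```python
def token_grid(input_size: int, patch_size: int) -> tuple[int, int]:
    """Return (H', W'), the spatial size of the patch-token grid."""
    if input_size % patch_size != 0:
        raise ValueError("input_size must be a multiple of patch_size")
    side = input_size // patch_size
    return side, side


def num_tokens(input_size: int, patch_size: int, cls_token: bool = True) -> int:
    """Return the sequence length the transformer sees, including the class token."""
    height, width = token_grid(input_size, patch_size)
    return height * width + (1 if cls_token else 0)


height, width = token_grid(224, 16)
print(height, width, num_tokens(224, 16))  # prints: 14 14 197
```

With a class token present, the first token would be dropped before reshaping the remaining `H' * W'` tokens into the feature map.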

Now you are ready to run the code on your newly added model. Run the following command line!

In [None]:
# Use torchrun to launch training
# Dataset: toyflair (already added)
# Encoder: FG-MAE model
# Decoder: UPerNet
# Preprocessing: default settings
# Criterion: cross entropy loss
# Task: segmentation
!torchrun --nnodes=1 --nproc_per_node=1 pangaea/run.py \
  --config-name=train \
  dataset=toyflair \
  encoder=fgmae \
  decoder=seg_upernet \
  preprocessing=seg_default \
  criterion=cross_entropy \
  task=segmentation \
  use_wandb=True \
  task.trainer.n_epochs=40

In [None]:
!torchrun pangaea/run.py --config-name=test ckpt_dir=/insert/your/results/directory