# Exploring and Downloading Datasets and Models

Let's start by exploring the repository of datasets and models. 

You can do that at the different accessibility layers of EOTDL: the user interface, the API, the command line interface (CLI) and the Python library.

## The User Interface

The easiest way to get started with EOTDL is by exploring the user interface: [https://www.eotdl.com/](https://www.eotdl.com/). Through the UI you will be able to:

- Explore the datasets and models available in the repository (filtering by name, tags and likes).
- Edit the information of your own datasets and models.
- Read the tutorials on the blog.
- Read the documentation.
- Find useful links to other resources (GitHub, Discord, ...).

![web](images/web2.png)

## The Command Line Interface

Even though the UI is the easiest way to get started, it is not the most convenient for actually working with the datasets and models. For that we recommend installing the CLI.

If you are running this notebook locally, consider creating a virtual environment before installing the CLI to avoid conflicts with other packages.

With [uv](https://docs.astral.sh/uv/):

```
uv init
uv add eotdl
```

In [1]:
# uncomment to install

# !uv add eotdl

Once installed, you can execute the CLI with different commands. 

In [2]:
!eotdl --help

 Usage: eotdl [OPTIONS] COMMAND [ARGS]...

 Welcome to EOTDL. Learn more at https://www.eotdl.com/

╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --install-completion          Install completion for the current shell.      │
│ --show-completion             Show completion for the current shell, to copy │
│                               it or customize the installation.

In [3]:
!eotdl version

EOTDL Version: 2025.05.26-4


In [4]:
!eotdl datasets --help

 Usage: eotdl datasets [OPTIONS] COMMAND [ARGS]...

 Explore, ingest and download training datasets.

╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --help          Show this message and exit.                                  │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ───────────────────────────────────────────────────────────────────╮
│ inge

You can explore datasets with the following command:

In [5]:
!eotdl datasets list 

['HyperspectralSimForS2-waters', 'SatellogicDataset', 'MassachusettsRoadsS2', 'EuroCropsCloudNative', 'MSC-France', 'ESAWAAI', 'JPL-CH4-detection', 'HYPERVIEW2', 'PASTIS-HD', 'xView2', 'crop-type-mapping-south-sudan', 'Five-Billion-Pixels', 'CROPGRIDS', 'DynamicEarthNet', 'sen1floods11', 'SpaceNet7', 'ai4smallfarms', 'HLS-Burn-Scars', 'MADOS-Marine-Debris-Oil-Spill', 'SeeingBeyondTheVisible', 'OrbitalAI', 'IMAGINe', 'EnhancedS2Agriculture', 'AirQualityAndHealth', 'AI4Sen2Cor-Datasets', 'EuroSAT-Q1-small', 'UrbanSARFloods', 'MMFlood', 'Sen1Floods11', 'ship-segmentation-dataset', 'Sentinel-2-Ships', 'CloudSEN12', 'TAIGA', 'GlobalInventorySolarPhotovoltaic', 'AirbusShipDetection', 'xview3', 'ai4arctic-sea-ice-challenge-raw', 'ai4arctic-sea-ice-challenge-ready-to-train', 'AERONET', 'EuroSAT-RGB-small', 'Boadella-PhiLab24', 'SEN12MS-CR', 'DeepGlobeRoadExtraction', 'MassachusettsRoadsDataset', 'OpenEarthMap', 'ESA-Worldcover', 'AlignSAR-Groningen-Sentinel1-Q0', 'AI4EO-MapYourCity', 'Enhanced

In [6]:
!eotdl datasets list --help

 Usage: eotdl datasets list [OPTIONS]

 Retrieve a list with all the datasets in the EOTDL.

 If using --name, it will filter the results by name. If no name is provided,
 it will return all the datasets.
 If using --limit, it will limit the number of results. If no limit is
 provided, it will return all the datasets.

 Examples

In [7]:
!eotdl datasets list -n eurosat

['EuroSAT-Q1-small', 'EuroSAT-RGB-small', 'EuroSAT-RGB', 'EuroSAT-RGB-Q2', 'EuroSAT-RGB-STAC', 'EuroSAT']


As you may have guessed, you can stage a dataset with the following command:

In [8]:
!eotdl datasets get EuroSAT-RGB -v 1

Data available at /home/juan/.cache/eotdl/datasets/EuroSAT-RGB


The first time you run the command, you will be asked to log in (which requires creating an account if you don't have one yet). You can also log in explicitly with the following command:

In [12]:
!eotdl auth login

On your computer or mobile device navigate to:  https://earthpulse.eu.auth0.com/activate?user_code=DPNC-BCBB
Authenticated!
- Id Token: eyJhbGciOi...
Saved credentials to:  /home/juan/.cache/eotdl/creds.json
You are logged in as it@earthpulse.es


In [13]:
!eotdl auth --help

 Usage: eotdl auth [OPTIONS] COMMAND [ARGS]...

 Login to EOTDL.

╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --help          Show this message and exit.                                  │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ───────────────────────────────────────────────────────────────────╮
│ logi

In [14]:
!eotdl datasets get --help

 Usage: eotdl datasets get [OPTIONS] [DATASET]

 Download a dataset from the EOTDL.

 If using --path, it will download the dataset to the specified path. If no
 path is provided, it will download to ~/.eotdl/datasets.
 If using --version, it will download the specified version. If no version is
 provided, it will download the latest version.
 If using --assets when the dataset is STAC, it will also download the STA

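By default, datasets are staged to `$HOME/.cache/eotdl/datasets`, or to the path set in the `EOTDL_DOWNLOAD_PATH` environment variable. (The help text above mentions `~/.eotdl/datasets`, but the output of the earlier `get` command shows the `.cache` location.) You can override the destination with the `--path` argument.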

In [15]:
!eotdl datasets get EuroSAT-RGB -v 1 -p data

Data available at data/EuroSAT-RGB
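
The destination rules described above can be sketched in a few lines of Python. This is only an illustration, not the EOTDL implementation: the function name `resolve_download_dir` is hypothetical, and the precedence (explicit path first, then `EOTDL_DOWNLOAD_PATH`, then the default cache folder) is an assumption, as is using the environment variable verbatim.

```python
import os
from pathlib import Path


def resolve_download_dir(path=None, env=None):
    """Return the folder where datasets would be staged.

    Assumed precedence (not taken from the EOTDL source code):
    explicit path > EOTDL_DOWNLOAD_PATH > ~/.cache/eotdl/datasets.
    """
    env = os.environ if env is None else env
    if path:
        return Path(path)
    if env.get("EOTDL_DOWNLOAD_PATH"):
        return Path(env["EOTDL_DOWNLOAD_PATH"])
    return Path.home() / ".cache" / "eotdl" / "datasets"


# An explicit path wins, then the environment variable, then the default.
print(resolve_download_dir("data"))
print(resolve_download_dir(env={"EOTDL_DOWNLOAD_PATH": "/tmp/eotdl"}))
print(resolve_download_dir(env={}))
```

Passing `env` as a plain dictionary keeps the sketch easy to try without touching your real environment.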


You can choose a particular version to download with the `--version` argument. If you don't specify a version, the latest version will be downloaded.

In [16]:
!eotdl datasets get EuroSAT-RGB -p data -v 1

Dataset `EuroSAT-RGB` already exists at data/EuroSAT-RGB. To force download, use force=True or -f in the CLI.


If you try to re-stage a dataset that already exists locally, the CLI will complain. You can force a re-download with the `--force` argument.

In [23]:
!eotdl datasets get EuroSAT-RGB -p data -v 1 -f

Data available at data/EuroSAT-RGB


By default, the `get` command will only download the dataset metadata.

In [24]:
!ls data/EuroSAT-RGB

catalog.v1.parquet  README.md


The `README.md` file contains some basic information about the dataset that we display in the UI (authors, name, license, description, ...).

In [25]:
!cat data/EuroSAT-RGB/README.md

---
name: EuroSAT-RGB
license: open
source: http://km.com
thumbnail: https://images.unsplash.com/photo-1576158113421-5484e37d43f9?q=80&w=2080&auto=format&fit=crop&ixlib=rb-4.0.3&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D
authors:
  - juan
---
EuroSAT: A land use and land cover classification dataset based on Sentinel-2 satellite images.

https://arxiv.org/abs/1709.00029

Land use and land cover classification using Sentinel-2 satellite images.The Sentinel-2 satellite images are openly and freely accessible provided in the Earth observation program Copernicus. We present a novel dataset based on Sentinel-2 satellite images covering 13 spectral bands and consisting out of 10 classes with in total 27,000 labeled and geo-referenced images.
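
The `README.md` shown above starts with a metadata block delimited by `---` lines, followed by the description. The following is a minimal sketch for separating the two using only the standard library. It is not how EOTDL reads the file: the helper `split_front_matter` is hypothetical and only handles flat `key: value` pairs and simple `- item` lists, so a real YAML parser would be a safer choice for anything more complex.

```python
def split_front_matter(text):
    """Split a README into (metadata dict, body text).

    Only flat `key: value` pairs and `- item` lists are supported.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    # Find the closing `---` delimiter.
    end = next(
        (i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"),
        None,
    )
    if end is None:
        return {}, text
    meta, current_key = {}, None
    for line in lines[1:end]:
        stripped = line.strip()
        if stripped.startswith("- ") and isinstance(meta.get(current_key), list):
            # List item belonging to the most recent key.
            meta[current_key].append(stripped[2:].strip())
        elif ":" in line:
            key, _, value = line.partition(":")
            current_key = key.strip()
            # A key without a value opens a list.
            meta[current_key] = value.strip() if value.strip() else []
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


readme = """---
name: EuroSAT-RGB
license: open
source: http://km.com
authors:
  - juan
---
EuroSAT: A land use and land cover classification dataset based on Sentinel-2 satellite images.
"""

meta, body = split_front_matter(readme)
print(meta["name"], meta["authors"])
```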


The `catalog.v[version].parquet` file contains the STAC metadata for the dataset, including the links to the raw data (assets).

In [30]:
import geopandas as gpd

catalog = gpd.read_parquet('data/EuroSAT-RGB/catalog.v1.parquet')

catalog

| | type | stac_version | stac_extensions | datetime | id | bbox | geometry | assets | links | repository |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Feature | 1.0.0 | [] | 2025-03-25 13:03:24.765930 | EuroSAT-RGB.zip | {'xmax': 0.0, 'xmin': 0.0, 'ymax': 0.0, 'ymin'... | POLYGON EMPTY | {'asset': {'checksum': '632e9e4394c518a1d7d913... | [] | eotdl |


In [32]:
catalog.assets.values

array([{'asset': {'checksum': '632e9e4394c518a1d7d9137e569e3655ecc12051', 'href': 'https://api.eotdl.com/datasets/654515c5b6491c0a686d256d/stage/EuroSAT-RGB.zip', 'size': 94658966, 'timestamp': '2023-11-03T16:11:05.027000'}}],
      dtype=object)
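
Each entry in the `assets` column is a dictionary with `checksum`, `href`, `size` and `timestamp` fields, as the output above shows. A small sketch for listing the links and the total download size before staging anything could look like this. The helper `summarize_assets` is hypothetical, and it assumes every entry has the same `{'asset': {...}}` shape as the one displayed here.

```python
def summarize_assets(assets_column):
    """Collect (href, size) pairs and the total size in bytes.

    Assumes each entry looks like {'asset': {'href': ..., 'size': ...}}.
    """
    files = []
    for entry in assets_column:
        asset = entry["asset"]
        files.append((asset["href"], asset["size"]))
    total_bytes = sum(size for _, size in files)
    return files, total_bytes


# Sample entry copied from the catalog output above.
sample = [
    {
        "asset": {
            "checksum": "632e9e4394c518a1d7d9137e569e3655ecc12051",
            "href": "https://api.eotdl.com/datasets/654515c5b6491c0a686d256d/stage/EuroSAT-RGB.zip",
            "size": 94658966,
            "timestamp": "2023-11-03T16:11:05.027000",
        }
    }
]

files, total_bytes = summarize_assets(sample)
print(f"{len(files)} file(s), {total_bytes / 1024**2:.1f} MiB")
```

In the notebook, the same function should work with `summarize_assets(catalog.assets)`, since iterating over the column yields one dictionary per row.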

It is possible to stage data from the list of assets in the catalog using the library (see next section), but you probably want to stage all data at once. To do so, add the `--assets` argument to the `get` command using the CLI.

In [33]:
!eotdl datasets get EuroSAT-RGB -p data -v 1 -f -a

Staging assets: 100%|█████████████████████████████| 1/1 [00:01<00:00,  1.75s/it]
Data available at data/EuroSAT-RGB


In [35]:
!ls data/EuroSAT-RGB

catalog.v1.parquet  EuroSAT-RGB.zip  README.md
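
Since the catalog stores a `checksum` and a `size` for each asset, a staged file can be checked against them. The sketch below assumes the checksum is a SHA-1 hex digest. Its 40-character length suggests this, but nothing in the output confirms it, so treat a mismatch with caution. The helper `verify_asset` is hypothetical, and the demo uses a temporary file rather than the real archive.

```python
import hashlib
import os
import tempfile


def verify_asset(file_path, asset, chunk_size=1 << 20):
    """Compare a local file with the `size` and `checksum` of a catalog asset.

    Assumption: `checksum` is a SHA-1 hex digest of the file contents.
    """
    digest = hashlib.sha1()
    with open(file_path, "rb") as f:
        # Read in chunks so large archives do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return {
        "size_ok": os.path.getsize(file_path) == asset["size"],
        "checksum_ok": digest.hexdigest() == asset["checksum"],
    }


# Demo with a temporary file standing in for a staged asset.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.bin")
    with open(demo_path, "wb") as f:
        f.write(b"hello")
    demo_asset = {"size": 5, "checksum": hashlib.sha1(b"hello").hexdigest()}
    result = verify_asset(demo_path, demo_asset)
    bad_result = verify_asset(demo_path, {"size": 6, "checksum": "0" * 40})

print(result)
```

For the dataset above, the call would be along the lines of `verify_asset('data/EuroSAT-RGB/EuroSAT-RGB.zip', catalog.assets.values[0]['asset'])`.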


Working with models is very much the same at this point.

In [36]:
!eotdl models --help

 Usage: eotdl models [OPTIONS] COMMAND [ARGS]...

 Explore, ingest and download ML models.

╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --help          Show this message and exit.                                  │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ───────────────────────────────────────────────────────────────────╮
│ inge

In [37]:
!eotdl models list

['SuperResolutionUseCase', 'SCANEO', 'EuroCropsModel', 'MassachusettsRoadsS2Model', 'SSL4EO-S12', 'SpectralGPT', 'Scale-MAE', 'RemoteCLIP', 'Prithvi', 'GFM-Swin', 'DOFA', 'CROMA', 'CloudSEN12L2A', 'RoadSegmentationQ2', 'TropicalCyclonesWindSpeed', 'BigEarthNet-S2-Resnet50', 'BigEarthNet-S2-Resnet34', 'BigEarthNet-S1-Resnet50', 'BigEarthNet-S1-Resnet34', 'BigEarthNet-S1S2-Resnet50', 'BigEarthNet-S1S2-Resnet34', 'EuroSAT-RGB-Q2', 'EuroSAT-RGB', 'forest-map', 'EuroSAT-RGB-PhiLab24', 'RoadSegmentation', 'EuroSAT-RGB-BiDS23-Q1', 'MAVERICC', 'WALDO25', 'EuroSAT-RGB-BiDS23']


In [38]:
!eotdl models list --help

 Usage: eotdl models list [OPTIONS]

 Retrieve a list with all the models in the EOTDL.

 If using --name, it will filter the results by name. If no name is provided,
 it will return all the models.
 If using --limit, it will limit the number of results. If no limit is
 provided, it will return all the models.

 Examples

In [41]:
!eotdl models get RoadSegmentation -p data -a -f

Staging assets: 100%|█████████████████████████████| 2/2 [00:01<00:00,  1.22it/s]
Data available at data/RoadSegmentation


In [42]:
!eotdl models get --help

 Usage: eotdl models get [OPTIONS] [MODEL]

 Download a model from the EOTDL.

 If using --path, it will download the model to the specified path. If no path
 is provided, it will download to ~/.eotdl/models.
 If using --version, it will download the specified version. If no version is
 provided, it will download the latest version.
 If using --assets when the model is STAC, it will also download the STAC

We will explore how to ingest datasets and models in the next tutorials.

## The Library

Everything we have done so far with the CLI can also be done through the Python library, which is installed automatically along with the CLI.

In [43]:
import eotdl

eotdl.__version__

'2025.05.26-4'

In [44]:
from eotdl.datasets import retrieve_datasets

datasets = retrieve_datasets()
len(datasets)

90

In [45]:
retrieve_datasets("eurosat")

['EuroSAT-Q1-small',
 'EuroSAT-RGB-small',
 'EuroSAT-RGB',
 'EuroSAT-RGB-Q2',
 'EuroSAT-RGB-STAC',
 'EuroSAT']

With the library, results come back as regular Python objects (here, a list of names), so you can filter and process them however you like.

In [46]:
[d for d in datasets if "eurosat" in d.lower()]

['EuroSAT-Q1-small',
 'EuroSAT-RGB-small',
 'EuroSAT-RGB',
 'EuroSAT-RGB-Q2',
 'EuroSAT-RGB-STAC',
 'EuroSAT']
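
Because the result is a plain list of strings, more specific filters are easy to write. As a rough sketch, the hypothetical helper `filter_names` below combines a case-insensitive regular expression with an optional limit, loosely mirroring the `--name` and `--limit` options of the CLI. It works on any list of names and does not call EOTDL itself.

```python
import re


def filter_names(names, pattern, limit=None):
    """Return the names matching `pattern` (case-insensitive regex), up to `limit`."""
    regex = re.compile(pattern, re.IGNORECASE)
    matches = [name for name in names if regex.search(name)]
    return matches if limit is None else matches[:limit]


# A few names taken from the listings above.
names = [
    "EuroSAT-Q1-small",
    "EuroSAT-RGB-small",
    "EuroSAT-RGB",
    "EuroSAT-RGB-Q2",
    "EuroSAT-RGB-STAC",
    "EuroSAT",
    "xView2",
]

rgb_only = filter_names(names, r"^eurosat-rgb")
first_two = filter_names(names, "eurosat", limit=2)
print(rgb_only)
print(first_two)
```

In the notebook you could pass the `datasets` list retrieved earlier instead of the hard-coded sample.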

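You can stage datasets as well, but now you have to handle potential errors yourself. For example, staging a dataset that already exists locally raises an exception unless you pass `force=True`.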

In [48]:
from eotdl.datasets import stage_dataset

stage_dataset("EuroSAT-RGB")

Exception: Dataset `EuroSAT-RGB` already exists at /home/juan/.cache/eotdl/datasets/EuroSAT-RGB. To force download, use force=True or -f in the CLI.

In [49]:
stage_dataset("EuroSAT-RGB", version=1, force=True)

'/home/juan/.cache/eotdl/datasets/EuroSAT-RGB'
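
One way to handle the "already exists" error shown above is a small wrapper that retries with `force=True`. This is a sketch under assumptions: the wrapper `stage_or_force` is hypothetical, it detects the error by looking for the text "already exists" in the exception message (which may change between versions), and the demo uses a stand-in function so it runs without network access.

```python
def stage_or_force(stage, name, **kwargs):
    """Call `stage(name, **kwargs)`; if the data already exists, retry with force=True.

    `stage` is meant to be a function such as `eotdl.datasets.stage_dataset`.
    Any other error is re-raised unchanged.
    """
    try:
        return stage(name, **kwargs)
    except Exception as err:
        if "already exists" not in str(err):
            raise
        # Note: this re-downloads the data, like `-f` in the CLI.
        return stage(name, **{**kwargs, "force": True})


# Stand-in for the real staging function, for demonstration only.
def fake_stage(name, force=False, **kwargs):
    if not force:
        raise Exception(f"Dataset `{name}` already exists at /tmp/{name}.")
    return f"/tmp/{name}"


result = stage_or_force(fake_stage, "EuroSAT-RGB", version=1)
print(result)
```

With the real library, the call would be `stage_or_force(stage_dataset, "EuroSAT-RGB", version=1)`. If re-downloading is expensive, you may prefer to catch the exception and keep the existing copy instead.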

It is possible to stage particular assets by retrieving their links from the catalog.

In [60]:
path = stage_dataset("EuroSAT-RGB", force=True, path='data', assets=True)

catalog = gpd.read_parquet(f'{path}/catalog.v1.parquet')

catalog

Staging assets: 100%|██████████| 3/3 [00:02<00:00,  1.31it/s]


| | type | stac_version | stac_extensions | datetime | id | bbox | geometry | assets | links | repository |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Feature | 1.0.0 | [] | 2025-03-25 13:03:24.765930 | EuroSAT-RGB.zip | {'xmax': 0.0, 'xmin': 0.0, 'ymax': 0.0, 'ymin'... | POLYGON EMPTY | {'asset': {'checksum': '632e9e4394c518a1d7d913... | [] | eotdl |


In [65]:
from eotdl.datasets import stage_dataset_file

for _, row in catalog.iterrows():
    if row['id'] == 'EuroSAT-RGB.zip':
        print(row['assets']['asset'])
        stage_dataset_file(row['assets']['asset']['href'], 'data')

{'checksum': '632e9e4394c518a1d7d9137e569e3655ecc12051', 'href': 'https://api.eotdl.com/datasets/654515c5b6491c0a686d256d/stage/EuroSAT-RGB.zip', 'size': 94658966, 'timestamp': '2023-11-03T16:11:05.027000'}


In [66]:
!ls data

EuroSAT-RGB  EuroSAT-RGB.zip  RoadSegmentation


In fact, the CLI is built on top of the library.

The same functions are available for models:

In [52]:
from eotdl.models import retrieve_models

retrieve_models()

['SuperResolutionUseCase',
 'SCANEO',
 'EuroCropsModel',
 'MassachusettsRoadsS2Model',
 'SSL4EO-S12',
 'SpectralGPT',
 'Scale-MAE',
 'RemoteCLIP',
 'Prithvi',
 'GFM-Swin',
 'DOFA',
 'CROMA',
 'CloudSEN12L2A',
 'RoadSegmentationQ2',
 'TropicalCyclonesWindSpeed',
 'BigEarthNet-S2-Resnet50',
 'BigEarthNet-S2-Resnet34',
 'BigEarthNet-S1-Resnet50',
 'BigEarthNet-S1-Resnet34',
 'BigEarthNet-S1S2-Resnet50',
 'BigEarthNet-S1S2-Resnet34',
 'EuroSAT-RGB-Q2',
 'EuroSAT-RGB',
 'forest-map',
 'EuroSAT-RGB-PhiLab24',
 'RoadSegmentation',
 'EuroSAT-RGB-BiDS23-Q1',
 'MAVERICC',
 'WALDO25',
 'EuroSAT-RGB-BiDS23']

In [58]:
from eotdl.models import stage_model 

path = stage_model("RoadSegmentation", force=True, path='data', assets=True)
path

Staging assets: 100%|██████████| 2/2 [00:01<00:00,  1.40it/s]


'data/RoadSegmentation'

In [59]:
import os 

os.listdir(path)

['README.md', 'model.onnx', 'catalog.v4.parquet']
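
The catalog file name includes the version (`catalog.v1.parquet` for the dataset, `catalog.v4.parquet` for this model), so hard-coding it is fragile. A minimal sketch for locating the highest-versioned catalog in a staged folder follows. The helper `latest_catalog` is hypothetical, and whether several catalog versions can coexist in one folder is an assumption made for the demo, which uses empty placeholder files.

```python
import re
import tempfile
from pathlib import Path

# Matches file names such as catalog.v4.parquet and captures the version number.
CATALOG_PATTERN = re.compile(r"^catalog\.v(\d+)\.parquet$")


def latest_catalog(folder):
    """Return (version, path) of the highest-versioned catalog, or None if absent."""
    best = None
    for entry in Path(folder).iterdir():
        match = CATALOG_PATTERN.match(entry.name)
        if match:
            version = int(match.group(1))
            if best is None or version > best[0]:
                best = (version, entry)
    return best


# Demo with empty placeholder files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["README.md", "model.onnx", "catalog.v1.parquet", "catalog.v4.parquet"]:
        (Path(tmp) / name).touch()
    version, catalog_path = latest_catalog(tmp)
    catalog_name = catalog_path.name

with tempfile.TemporaryDirectory() as empty_tmp:
    missing = latest_catalog(empty_tmp)

print(version, catalog_name)
```

The returned path can then be handed to `gpd.read_parquet`, as in the earlier cells.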

## Discussion and Contribution Opportunities

Feel free to ask questions now (live or through Discord) and make suggestions for future improvements.


- What features would you like to see for exploration and downloading?