[![Radiant MLHub Logo](https://radiant-assets.s3-us-west-2.amazonaws.com/PrimaryRadiantMLHubLogo.png)](https://mlhub.earth/)

# Models

This notebook will walk you through a few common techniques for working with Radiant MLHub model, including:

* Listing available models
* Fetching a model
* Fetching assets and collections associated with a model

A **Model** in the Radiant MLHub API is a STAC item implementing the
[ML Model extension](https://github.com/stac-extensions/ml-model).
The goal of the STAC ML Model Extension is to provide a way of cataloging 
machine learning (ML) models that operate on Earth observation (EO) data 
described as a STAC catalog.


### Import Libraries

In [1]:
from pprint import pprint
import itertools as it

from radiant_mlhub import MLModel, client

## `MLModel` Class

Using the `radiant_mlhub.MLModel` class is the recommended method for working with models as Python objects (see [Low-Level Client](#Low-Level-Client) docs below for how to work with raw API responses). The `MLModel` class has some convenient methods for listing and fetching models, as well as fetching the assets associated with those models.

### List Models

You can use the `MLModel.list` class method to list all models available through the Radiant MLHub API. Each instance has `id` and `title` attributes that you can inspect to get more information about the model, and an `assets` property that you can use to get the links associated with the models.

In [2]:
models = MLModel.list()

# List the title and ID of first 10 datasets returned by the API
for model in models:
    print(f'{dataset.title} ({dataset.id})')

BigEarthNet (bigearthnet_v1)
Chesapeake Land Cover (microsoft_chesapeake)
CV4A Kenya Crop Type Competition (ref_african_crops_kenya_02)
Dalberg Data Insights Crop Type Uganda (ref_african_crops_uganda_01)
Great African Food Company Crop Type Tanzania (ref_african_crops_tanzania_01)
LandCoverNet (landcovernet_v1)
Open Cities AI Challenge (open_cities_ai_challenge)
PlantVillage Crop Type Kenya (ref_african_crops_kenya_01)
Semantic Segmentation of Crop Type in Ghana (su_african_crops_ghana)
Semantic Segmentation of Crop Type in South Sudan (su_african_crops_south_sudan)


### Fetch a Dataset

If you know the ID of a dataset, you can also fetch it directly using the `Dataset.fetch` method. This method returns a `Dataset` instance.

In [3]:
spacenet1_dataset = Dataset.fetch('spacenet1')

print(spacenet1_dataset.title)

Spacenet 1


### Get Dataset Collections

Once you have a dataset, you can list its collections. Datasets are comprised of 1 or more collections and each of these collections may contain source imagery, labels, or (rarely) both.

You can access all collections associated with a dataset using the `collections` property. If you want to access only collections of a certain type, you can use either `collections.source_imagery` or `collections.labels`.

In [4]:
# The SpaceNet 1 dataset contains only a single collection...
print(f'Total Collections: {spacenet1_dataset.collections}')

# ...that catalogs both source imagery and labels
print(f'Source Imagery Collections: {spacenet1_dataset.collections.source_imagery}')
print(f'Labels Collections: {spacenet1_dataset.collections.labels}')

# Note that the IDs are identical,
# and that len(dataset.collections) != len(dataset.collections.source_imagery) + len(dataset.collections.labels)


Total Collections: [<Collection id=sn1_AOI_1_RIO>]
Source Imagery Collections: [<Collection id=sn1_AOI_1_RIO>]
Labels Collections: [<Collection id=sn1_AOI_1_RIO>]


Each of these collections is a `radiant_mlhub.Collection` instance. In the next section, we walk through how to work with these `Collection` instances.

## `Collection` Class

Using the `radiant_mlhub.Collection` class is the recommended method for working with Collections from the Radiant MLHub API (see [Low-Level Client](#Low-Level-Client) docs below for how to work with these Collections as Python data types).

The `radiant_mlhub.Collection` class inherits from the [`pystac.Collection` class](https://pystac.readthedocs.io/en/latest/api.html#collection) and adds a few convenience methods for working with the Radiant MLHub API:

* `Collection.list`: A class method for listing the collections available from the API
* `Collection.fetch`: A class method for fetching a collection from the API by ID.

### List Collections

The `Collection.list` method is a generator that yields `Collection` instances. We can use the attributes provided by `pystac.Collection` to inspect the collection.

In [5]:
collections = Collection.list()

# Print info for the first 10 collections
for collection in it.islice(collections, 10):
    print(f'{collection.id}: {collection.description}')


ref_african_crops_kenya_01_labels: African Crops Kenya
ref_african_crops_kenya_01_source: African Crops Kenya Source Imagery
ref_african_crops_tanzania_01_labels: African Crops Tanzania
ref_african_crops_tanzania_01_source: African Crops Tanzania Source Imagery
ref_african_crops_uganda_01_labels: African Crops Uganda
ref_african_crops_uganda_01_source: African Crops Uganda Source Imagery
microsoft_chesapeake_landsat_leaf_off: Microsoft Chesapeake Landsat 8 Leaf-Off Composite
microsoft_chesapeake_buildings: Microsoft Chesapeake Buildings
sn4_AOI_6_Atlanta: SpaceNet 4 Atlanta Chipped Training Dataset
ref_african_crops_kenya_02_labels: No Description


### Fetch a Collection

If you have the ID of a collection, you can also fetch it directly.

In [6]:
bigearthnet_labels = Collection.fetch('bigearthnet_v1_labels')

pprint(bigearthnet_labels.to_dict())

{'assets': {},
 'description': 'BigEarthNet v1.0',
 'extent': {'spatial': {'bbox': [[-9.00023345437725,
                                  1.7542686833884724,
                                  83.44558248555553,
                                  68.02168200047284]]},
            'temporal': {'interval': [['2017-06-13T10:10:31Z',
                                       '2018-05-29T11:54:01Z']]}},
 'id': 'bigearthnet_v1_labels',
 'keywords': [],
 'license': 'CDLA-Permissive-1.0',
 'links': [{'href': 'https://api.radiant.earth/mlhub/v1/collections/bigearthnet_v1_labels',
            'rel': 'self',
            'type': 'application/json'},
           {'href': 'https://api.radiant.earth/mlhub/v1',
            'rel': 'root',
            'type': 'application/json'}],
 'properties': {},
 'providers': [{'name': 'BigEarthNet',
                'roles': ['processor', 'licensor'],
                'url': 'https://api.radiant.earth/mlhub/v1/download/gAAAAABgIX8K2iFTj0GC3CNdQ3_8L5bV8f5WLtm49yMoHlm89N6EjB

### Download a Collection Archive

The simplest way to get assets (imagery and/or labels) associated with a Collection is to download the full archive for that Collection. Collection archives are gzipped tarballs containing all assets for a given collection. You can download these archives using the `Collection.download` method:

**Note that if you are running this notebook remotely using Binder this archive will be downloaded to the remote file system and not your local machine. To download locally, clone the repo and run this notebook locally.**

In [7]:
# Download to the current working directory
archive_path = bigearthnet_labels.download('.')

# Print the path and file size
size_gb = round(archive_path.stat().st_size / 1000000., 1)
print(f'{str(archive_path)} ({size_gb} MB)')

  0%|          | 0/173.0 [00:00<?, ?M/s]

/Users/jduckworth/Code/ml-hub/radiant-mlhub/examples/bigearthnet_v1_labels.tar.gz (173.0 MB)


## Low-Level Client

The low-level client functions also provide a way of interacting with the Radiant MLHub API `/collections` and `/dataset` endpoints using Python. These methods return native Python data types (e.g. `list`, `dict`, etc.) rather than the `Collection` and `Datast` instances documented above.

All low-level client functions are contained in the `radiant_mlhub.client` module (imported above). All of these methods accept the `profile` and `api_key` keyword arguments, which are passed directly to `radiant_mlhub.get_session`, if provided.

### List Datasets

You can use the `list_datasets` method to loop through all of the available datasets. This method makes requests to the `/datasets` endpoint, which returns paginated responses (with a `next` link). The `list_datasets` method will continue to make requests for the next page of responses, as needed, and yields a dictionary for each dataset object.

In [8]:
datasets = client.list_datasets()

first_dataset = datasets[0]
pprint(first_dataset)

{'collections': [{'id': 'bigearthnet_v1_source', 'types': ['source_imagery']},
                 {'id': 'bigearthnet_v1_labels', 'types': ['labels']}],
 'id': 'bigearthnet_v1',
 'title': 'BigEarthNet'}


### Fetch Dataset

You can use the `get_dataset` method to fetch a dataset by ID. This method returns a Python dictionary representing the dataset object.

In [9]:
bigearthnet_dataset = client.get_dataset('bigearthnet_v1')
pprint(bigearthnet_dataset)

{'collections': [{'id': 'bigearthnet_v1_source', 'types': ['source_imagery']},
                 {'id': 'bigearthnet_v1_labels', 'types': ['labels']}],
 'id': 'bigearthnet_v1',
 'title': 'BigEarthNet'}


### List Collections

You can use the `radiant_mlhub.client.list_collections` method to loop through all of the collections available through the Radiant MLHub API. This method makes requests to the `/collections` endpoint, which returns paginated responses. The `list_collections` method will make paginated requests to the endpoint to retrieve all collections and will yield these collections as Python dictionaries.

In [10]:
collections = client.list_collections()

for collection in it.islice(collections, 10):
    print(collection['id'] + ": " + collection['description'])

ref_african_crops_kenya_01_labels: African Crops Kenya
ref_african_crops_kenya_01_source: African Crops Kenya Source Imagery
ref_african_crops_tanzania_01_labels: African Crops Tanzania
ref_african_crops_tanzania_01_source: African Crops Tanzania Source Imagery
ref_african_crops_uganda_01_labels: African Crops Uganda
ref_african_crops_uganda_01_source: African Crops Uganda Source Imagery
microsoft_chesapeake_landsat_leaf_off: Microsoft Chesapeake Landsat 8 Leaf-Off Composite
microsoft_chesapeake_buildings: Microsoft Chesapeake Buildings
sn4_AOI_6_Atlanta: SpaceNet 4 Atlanta Chipped Training Dataset
ref_african_crops_kenya_02_labels: No Description


### Fetch a Collection

You can use the `get_collection` method to fetch a collection by ID. This method returns a Python dictionary representing the collection object.

In [11]:
bigearthnet_v1_source = client.get_collection('bigearthnet_v1_source')
pprint(bigearthnet_v1_source)

{'description': 'BigEarthNet v1.0',
 'extent': {'spatial': {'bbox': [[-9.00023345437725,
                                  1.7542686833884724,
                                  83.44558248555553,
                                  68.02168200047284]]},
            'temporal': {'interval': [['2017-06-13T10:10:31Z',
                                       '2018-05-29T11:54:01Z']]}},
 'id': 'bigearthnet_v1_source',
 'keywords': [],
 'license': 'CDLA-Permissive-1.0',
 'links': [{'href': 'https://api.radiant.earth/mlhub/v1/collections/bigearthnet_v1_source',
            'rel': 'self',
            'title': None,
            'type': 'application/json'},
           {'href': 'https://api.radiant.earth/mlhub/v1',
            'rel': 'root',
            'title': None,
            'type': 'application/json'}],
 'properties': {},
 'providers': [{'description': None,
                'name': 'BigEarthNet',
                'roles': ['processor', 'licensor'],
                'url': 'https://api.radiant.ea