<img src='https://radiant-assets.s3-us-west-2.amazonaws.com/PrimaryRadiantMLHubLogo.png' alt='Radiant MLHub Logo' width='300'/>

# How to use the Radiant MLHub API to browse and download the LandCoverNet South America dataset

This Jupyter notebook, which you may copy and adapt for any use, shows basic examples of how to use the API to download labels and source imagery for the LandCoverNet dataset. Full documentation for the API is available at [docs.mlhub.earth](http://docs.mlhub.earth).

We'll show you how to set up your authorization, list collection properties, and retrieve the items (the data contained within them) from those collections.

Each item in our collections is described in JSON that complies with the STAC label extension definition.

## Citation

Radiant Earth Foundation (2022) "LandCoverNet South America: A Geographically Diverse Land Cover Classification Training Dataset", Version 1.0, Radiant MLHub. [Date Accessed] [https://doi.org/10.34911/rdnt.6a27yv](https://doi.org/10.34911/rdnt.6a27yv)

## Dependencies

This notebook utilizes the [`radiant-mlhub` Python client](https://pypi.org/project/radiant-mlhub/) for interacting with the API. See the official [`radiant-mlhub` docs](https://radiant-mlhub.readthedocs.io/) for more documentation of the full functionality of that library.

Please see the [`mlhub-tutorials README.md`](https://github.com/radiantearth/mlhub-tutorials/blob/Fix/version-pinning/README.md) for information on how to install dependencies for the notebooks in this repository.

## Authentication

### Create an API Key

Access to the Radiant MLHub API requires an API key. To get your API key, go to [mlhub.earth](https://mlhub.earth/). If you have not used Radiant MLHub before, you will need to sign up and create a new account. Otherwise, sign in. In the **API Keys** tab, you'll be able to create API key(s), which you will need. *Do not share* your API key with others: your usage may be limited and sharing your API key is a security risk.

### Configure the Client

Once you have your API key, you need to configure the `radiant_mlhub` library to use that key. There are a number of ways to configure this (see the [Authentication docs](https://radiant-mlhub.readthedocs.io/en/latest/authentication.html) for details). 

For these examples, we will set the `MLHUB_API_KEY` environment variable. Run the cell below to save your API key as an environment variable that the client library will recognize.

*If you are running this notebook locally and have configured a profile as described in the [Authentication docs](https://radiant-mlhub.readthedocs.io/en/latest/authentication.html), then you do not need to execute this cell.*


In [1]:
import os
from radiant_mlhub import Dataset

os.environ['MLHUB_API_KEY'] = 'YOUR API KEY'
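
Hard-coding the key in a notebook cell makes it easy to commit or share by accident. As an alternative, here is a minimal sketch that reuses an existing `MLHUB_API_KEY` value and prompts for one only when it is missing. The helper name `ensure_api_key` is hypothetical and is not part of the `radiant_mlhub` library.

```python
import os
from getpass import getpass


def ensure_api_key(env_var="MLHUB_API_KEY", prompt=getpass):
    """Return the API key from the environment, prompting only if it is missing.

    `ensure_api_key` is an illustrative helper, not a radiant_mlhub function.
    """
    key = os.environ.get(env_var)
    if not key:
        # getpass hides the typed value, so the key never appears in cell output.
        key = prompt(f"Enter {env_var}: ").strip()
        if not key:
            raise ValueError("an API key is required")
        os.environ[env_var] = key
    return key
```

Calling `ensure_api_key()` once, before `Dataset.fetch`, would leave the environment variable set for the rest of the session.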

## Listing Collection Properties

The following cell requests the LandCoverNet South America dataset from the API and prints a few important properties: its title, DOI, and citation, plus the ID and license of each collection it contains.

In [2]:
dataset = Dataset.fetch('ref_landcovernet_sa_v1')

print(f'Title: {dataset.title}')
print(f'DOI: {dataset.doi}')
print(f'Citation: {dataset.citation}')
print('\nCollection IDs and License:')
for collection in dataset.collections:
    print(f'    {collection.id} : {collection.license}')

Title: LandCoverNet South America
DOI: 10.34911/rdnt.6a27yv
Citation: Radiant Earth Foundation (2022) "LandCoverNet South America: A Geographically Diverse Land Cover Classification Training Dataset", Version 1.0, Radiant MLHub. [Date Accessed] https://doi.org/10.34911/rdnt.6a27yv

Collection IDs and License:
    ref_landcovernet_sa_v1_source_sentinel_2 : CC-BY-4.0
    ref_landcovernet_sa_v1_source_sentinel_1 : CC-BY-4.0
    ref_landcovernet_sa_v1_source_landsat_8 : CC-BY-4.0
    ref_landcovernet_sa_v1_labels : CC-BY-4.0
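
The collection IDs printed above appear to follow a naming pattern: source imagery collections contain `_source_` and the labels collection ends in `_labels`. Assuming that pattern holds (it is inferred from this output, not from documented API behavior), a small sketch for separating the two groups could look like this. The helper name `split_collections` is hypothetical.

```python
def split_collections(collection_ids):
    """Split collection IDs into (labels, sources) using the naming pattern.

    Assumption: label collections end in '_labels' and source imagery
    collections contain '_source_', as in the output printed above.
    """
    labels = [cid for cid in collection_ids if cid.endswith("_labels")]
    sources = [cid for cid in collection_ids if "_source_" in cid]
    return labels, sources


# IDs copied from the output above.
collection_ids = [
    "ref_landcovernet_sa_v1_source_sentinel_2",
    "ref_landcovernet_sa_v1_source_sentinel_1",
    "ref_landcovernet_sa_v1_source_landsat_8",
    "ref_landcovernet_sa_v1_labels",
]

labels, sources = split_collections(collection_ids)
print("labels:", labels)
print("sources:", sources)
```

With a fetched dataset, the same function could be fed `[c.id for c in dataset.collections]`.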


## Downloading Assets

> **NOTE:** If you are running these notebooks using Binder, these resources will be downloaded to the remote file system that the notebooks are running on and **not to your local file system.** If you want to download the files to your machine, you will need to clone the repo and run the notebook locally.

The next two cells define an area of interest (AOI) as a GeoJSON feature and then call the dataset's `download` method with two filters: `intersects` limits the download to items that intersect the AOI, and `collection_filter` limits it to the `labels` assets within the `ref_landcovernet_sa_v1_labels` collection. For more information about filtering downloads, see the [collection and asset key filtering method in the Python client documentation](https://radiant-mlhub.readthedocs.io/en/latest/datasets.html#filter-by-collection-and-asset-keys).

In [3]:
aoi = {
  "type": "Feature",
  "properties": {},
  "geometry": {
    "type": "Polygon",
    "coordinates": [
      [
        [
          -74.970703125,
          -49.32512199104001
        ],
        [
          -66.26953125,
          -49.32512199104001
        ],
        [
          -66.26953125,
          -21.207458730482642
        ],
        [
          -74.970703125,
          -21.207458730482642
        ],
        [
          -74.970703125,
          -49.32512199104001
        ]
      ]
    ]
  }
}
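
Before passing an AOI to the download call, it can help to sanity-check it. GeoJSON positions are ordered `[longitude, latitude]`, and a polygon ring must end on its starting point. The sketch below repeats the AOI from the cell above in compact form and computes its bounding box. The helper name `aoi_bbox` is hypothetical and is not part of the client library.

```python
def aoi_bbox(feature):
    """Return (min_lon, min_lat, max_lon, max_lat) for a GeoJSON Polygon feature.

    Illustrative helper: only the exterior ring is inspected.
    """
    ring = feature["geometry"]["coordinates"][0]
    if ring[0] != ring[-1]:
        raise ValueError("polygon ring must be closed (first point == last point)")
    lons = [point[0] for point in ring]  # GeoJSON order is [lon, lat]
    lats = [point[1] for point in ring]
    return (min(lons), min(lats), max(lons), max(lats))


# Same coordinates as the AOI defined above.
aoi = {
    "type": "Feature",
    "properties": {},
    "geometry": {
        "type": "Polygon",
        "coordinates": [[
            [-74.970703125, -49.32512199104001],
            [-66.26953125, -49.32512199104001],
            [-66.26953125, -21.207458730482642],
            [-74.970703125, -21.207458730482642],
            [-74.970703125, -49.32512199104001],
        ]],
    },
}

print(aoi_bbox(aoi))
```

A bounding box with swapped axes (for example, a "longitude" below -90 in this region) is a quick hint that latitude and longitude were entered in the wrong order.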

In [4]:
asset_filter = dict(
    ref_landcovernet_sa_v1_labels=['labels']
)

dataset.download(intersects=aoi, collection_filter=asset_filter)

ref_landcovernet_sa_v1: fetch stac catalog: 57255KB [00:17, 3188.97KB/s]       
INFO:radiant_mlhub.client.catalog_downloader:unarchive ref_landcovernet_sa_v1.tar.gz ...
unarchive ref_landcovernet_sa_v1.tar.gz: 100%|█| 345596/345596 [00:41<00:00, 82
INFO:radiant_mlhub.client.catalog_downloader:create stac asset list (please wait) ...
INFO:radiant_mlhub.client.catalog_downloader:1488780 unique assets in stac catalog.
INFO:radiant_mlhub.client.catalog_downloader:filter by collection ids and asset keys
filter by collection ids and asset keys: 1057999872it [00:01, 1438185980.70it/s]INFO:radiant_mlhub.client.catalog_downloader:1200 assets after collection filter.
filter by collection ids and asset keys: 1087640519it [00:04, 238687359.84it/s] 
INFO:radiant_mlhub.client.catalog_downloader:filter by intersects
filter by intersects:   0%|                           | 0/1200 [00:00<?, ?it/s]INFO:radiant_mlhub.client.catalog_downloader:166 assets after intersects filter.
filter by intersects: 3424i
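
Once a download finishes, a quick way to see what arrived is to count files by extension under the output directory. This is a sketch using only the standard library. The helper name `count_files_by_suffix` is hypothetical, and the directory name in the commented usage line is an assumption about where the client writes its output.

```python
from collections import Counter
from pathlib import Path


def count_files_by_suffix(root):
    """Count regular files under `root`, grouped by lowercase file extension."""
    root = Path(root)
    return Counter(
        path.suffix.lower() or "<none>"  # files without an extension
        for path in root.rglob("*")
        if path.is_file()
    )


# Example usage (directory name is an assumption, adjust to your output path):
# print(count_files_by_suffix("ref_landcovernet_sa_v1"))
```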

### Download All Assets

If you need the entire dataset instead of a subset, call the `dataset.download` function without any filters to download the `ref_landcovernet_sa_v1` dataset to the current working directory.

In this notebook, the line of code to download the entire `ref_landcovernet_sa_v1` dataset would be:  
`dataset.download()`

However, this dataset is about 215 GB, so we do not recommend downloading it in full unless you are sure you have the storage space and a need for the complete dataset.
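
One way to check the storage requirement before starting is to compare the free disk space against the stated dataset size. The sketch below uses `shutil.disk_usage` from the standard library. The helper name `has_room_for` and the 10% headroom factor are illustrative assumptions, not recommendations from the client library.

```python
import shutil

DATASET_SIZE_GB = 215  # approximate size stated above


def has_room_for(size_gb, path=".", headroom=1.1):
    """Return True if `path` has at least `size_gb` * `headroom` gigabytes free.

    The headroom factor is an arbitrary safety margin (assumption).
    """
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= size_gb * headroom


if has_room_for(DATASET_SIZE_GB):
    print("Enough free space for the full dataset.")
else:
    print("Not enough free space; consider a filtered download instead.")
```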