<img src='https://radiant-assets.s3-us-west-2.amazonaws.com/PrimaryRadiantMLHubLogo.png' alt='Radiant MLHub Logo' width='300'/>

# How to use the Radiant MLHub API to browse and download a sample dataset



This Jupyter notebook, which you may copy and adapt for any use, uses the [`radiant-mlhub` Python client](https://pypi.org/project/radiant-mlhub/) to show how to download labels and source imagery for the NASA Flood Extent Detection dataset.

We'll show you how to set up your authorization, explore the dataset properties, and retrieve the data from Radiant MLHub.

Radiant MLHub uses the [STAC](https://stacspec.org/) standard for cataloging training datasets. Each item in our collections is described in JSON format compliant with the STAC [label extension](https://github.com/radiantearth/stac-spec/tree/master/extensions/label) definition.

## Authentication

### Create an API Key

Access to the Radiant MLHub API requires an API key. To get your API key, go to [mlhub.earth](https://mlhub.earth/). If you have not used Radiant MLHub before, you will need to sign up and create a new account. Otherwise, sign in. In the **API Keys** tab, you can create the API key you need. *Do not share* your API key with others: your usage may be limited and sharing your API key is a security risk.

### Configure the Client

Once you have your API key, you need to configure the `radiant_mlhub` library to use that key. There are a number of ways to configure this (see the [Authentication docs](https://radiant-mlhub.readthedocs.io/en/latest/authentication.html) for details). 

For these examples, we will set the `MLHUB_API_KEY` environment variable. Replace 'YOUR API KEY' with your API key and run the cell below to save your API key as an environment variable that the client library will recognize.

*If you are running this notebook locally and have configured a profile as described in the [Authentication docs](https://radiant-mlhub.readthedocs.io/en/latest/authentication.html), then you do not need to execute this cell.*

In [None]:
import os
from radiant_mlhub import Dataset, Collection
from dateutil.parser import parse

os.environ['MLHUB_API_KEY'] = 'YOUR API KEY'
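
Hard-coding a key in a notebook cell makes it easy to leak the key when the notebook is shared. As an alternative, here is a minimal sketch that uses only the standard library and prompts for the key when the environment variable is missing or still holds the placeholder. `ensure_api_key` is a hypothetical helper written for this example; it is not part of `radiant_mlhub`.

```python
import os
from getpass import getpass


def ensure_api_key(env_var="MLHUB_API_KEY", prompt=getpass):
    """Return the API key from the environment, prompting for it if absent.

    Hypothetical helper for illustration; not part of radiant_mlhub.
    """
    key = os.environ.get(env_var, "").strip()
    if not key or key == "YOUR API KEY":
        # Ask for the key without echoing it into the notebook output.
        key = prompt(f"Enter a value for {env_var}: ").strip()
        if not key:
            raise ValueError(f"No value provided for {env_var}")
        os.environ[env_var] = key
    return key
```

Calling `ensure_api_key()` once before the first request leaves the key in `MLHUB_API_KEY`, which is the variable the cell above sets by hand.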

In [None]:
# List of all Datasets
datasets = Dataset.list()
for dataset in datasets:
    print(dataset)

## Listing Dataset Properties

The following cell requests the properties of the NASA Flood Extent Detection dataset from the API and prints a few important ones.

In [None]:
dataset = Dataset.fetch('nasa_floods_v1')

print(f'Title: {dataset.title}')
print(f'DOI: {dataset.doi}')
print(f'Citation: {dataset.citation}')
print('\nCollection IDs and License:')
for collection in dataset.collections:
    print(f'    {collection.id} : {collection.license}')

## Downloading Assets

The next cells download a subset of the `nasa_floods_v1` dataset. `Dataset.download()` is called with all three filter options: a spatial filter (`intersects`), a temporal filter (`datetime`), and a collection filter (`collection_filter`) that restricts the download to the `raster_label` assets in the labels collection and the `VV` assets in the source collection. For more information about filtering downloads with `radiant-mlhub`, see the [collection and asset key filtering section of the Python client documentation](https://radiant-mlhub.readthedocs.io/en/latest/datasets.html#filter-by-collection-and-asset-keys).

In [None]:
# Defining the Area of Interest (aoi) for the spatial filter
aoi = {
  "type": "Feature",
  "properties": {},
  "geometry": {
    "type": "Polygon",
    "coordinates": [
      [
        [
          88.79150390625,
          22.705255477207526
        ],
        [
          93.49365234375,
          22.705255477207526
        ],
        [
          93.49365234375,
          27.068909095463365
        ],
        [
          88.79150390625,
          27.068909095463365
        ],
        [
          88.79150390625,
          22.705255477207526
        ]
      ]
    ]
  }
}
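
A hand-written GeoJSON polygon is easy to get subtly wrong: the exterior ring must end on the same position it starts on, and each position is `[longitude, latitude]`, not the reverse. The sketch below checks both and returns the bounding box, which is a quick way to confirm the area of interest is where you expect. `polygon_bbox` and `example_aoi` are hypothetical names introduced for this example; they are not part of `radiant_mlhub`.

```python
def polygon_bbox(feature):
    """Return (min_lon, min_lat, max_lon, max_lat) for a GeoJSON Polygon feature.

    Hypothetical helper for illustration; not part of radiant_mlhub.
    """
    geometry = feature["geometry"]
    if geometry["type"] != "Polygon":
        raise ValueError("expected a Polygon geometry")
    ring = geometry["coordinates"][0]  # exterior ring
    if len(ring) < 4 or ring[0] != ring[-1]:
        raise ValueError("exterior ring must have at least 4 positions and be closed")
    lons = [lon for lon, lat in ring]
    lats = [lat for lon, lat in ring]
    if not all(-180 <= lon <= 180 for lon in lons):
        raise ValueError("longitude out of range; positions are [longitude, latitude]")
    if not all(-90 <= lat <= 90 for lat in lats):
        raise ValueError("latitude out of range; positions are [longitude, latitude]")
    return (min(lons), min(lats), max(lons), max(lats))


# Same rectangle as the aoi defined above, written compactly.
example_aoi = {
    "type": "Feature",
    "properties": {},
    "geometry": {
        "type": "Polygon",
        "coordinates": [[
            [88.79150390625, 22.705255477207526],
            [93.49365234375, 22.705255477207526],
            [93.49365234375, 27.068909095463365],
            [88.79150390625, 27.068909095463365],
            [88.79150390625, 22.705255477207526],
        ]],
    },
}

print(polygon_bbox(example_aoi))
```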

In [None]:
# Defining the dates for the temporal filter
my_start_date = parse("2017-06-01T00:00:00+0000")
my_end_date = parse("2017-06-30T00:00:00+0000")

# Defining the desired collections for the collection filter
asset_filter = dict(
    nasa_floods_v1_labels=['raster_label'],
    nasa_floods_v1_source=['VV']
)
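
Note that the end date above is midnight at the start of 30 June, so anything timestamped later that day falls outside the range. If the intent is to cover the whole month, the sketch below builds the bounds with the standard library only. `month_range` is a hypothetical helper, not part of `radiant_mlhub`. Because it returns timezone-aware `datetime` objects, like `dateutil`'s `parse` does for these strings, the pair should be usable in place of `(my_start_date, my_end_date)`; treat that as an assumption to verify against the client documentation.

```python
from datetime import datetime, timedelta, timezone


def month_range(year, month):
    """Return timezone-aware (start, end) datetimes spanning one calendar month in UTC.

    Hypothetical helper for illustration; not part of radiant_mlhub.
    """
    start = datetime(year, month, 1, tzinfo=timezone.utc)
    # First instant of the following month (rolls over to January after December).
    next_month = datetime(year + month // 12, month % 12 + 1, 1, tzinfo=timezone.utc)
    # Step back one second to land on the last second of the requested month.
    end = next_month - timedelta(seconds=1)
    return start, end


start, end = month_range(2017, 6)
print(start.isoformat(), end.isoformat())
```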

In [None]:
dataset.download(intersects=aoi, datetime=(my_start_date, my_end_date), collection_filter=asset_filter)
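
Once the download finishes, it can be useful to confirm what actually landed on disk. The folder layout the client produces is not described in this notebook, so the sketch below makes no assumption about it: it simply walks a directory and counts files by extension. `summarize_downloads` is a hypothetical helper, not part of `radiant_mlhub`.

```python
from collections import Counter
from pathlib import Path


def summarize_downloads(root="."):
    """Count files under `root` by extension and total their size in bytes.

    Hypothetical helper for illustration; not part of radiant_mlhub.
    """
    counts = Counter()
    total_bytes = 0
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Group by lower-cased suffix; files without one go under "(none)".
            counts[path.suffix.lower() or "(none)"] += 1
            total_bytes += path.stat().st_size
    return dict(counts), total_bytes
```

Pass the folder the download was written to, for example `counts, total = summarize_downloads('.')` for the current working directory, and compare the counts before and after applying a filter.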

### Download All Assets

If no filters are defined, the `dataset.download` function will download the entire `nasa_floods_v1` dataset to the current working directory. If you would like to explore the entire dataset, run the cell below.

In [None]:
dataset.download()