# API Request for Connection to Radiant MLHub <img align="right" src="../Supplementary_data/DE_Africa_Logo_Stacked_RGB_small.jpg">

* **Products used:**
[Sentinel-2 image patches](http://bigearth.net/)

*The dataset and tool are external to the Digital Earth Africa platform.*


## Description 

This notebook shows basic examples of how to use the Radiant MLHub API to download labels and source imagery for the BigEarthNet dataset. Full documentation for the API is available at [docs.mlhub.earth](https://github.com/radiantearth/mlhub-tutorials). Each item in a collection is described in JSON that complies with the STAC label extension.

By running this notebook you will be able to achieve the following:
1.  Set up API authorization, query the list of available collections and datasets, and retrieve the items from a collection
2.  Connect to the Radiant MLHub API to access open Earth imagery training data for machine learning applications
***

## How to Access RadiantEarth API and Collection 

To access the Open Library for Earth Observations Machine Learning, sign up for the Radiant MLHub API [here](https://www.mlhub.earth/). Each account comes with a dashboard that links to the API, the documentation and the tutorial page, and that provides the user's personal access token.

## Getting started 

The cells below set up authentication with the access token, send requests to the API, and read the responses.

In [26]:
import requests
ACCESS_TOKEN = 'PASTE_YOUR_ACCESS_TOKEN_HERE'

In [13]:
# these headers will be used in each request
headers = {
    'Authorization': f'Bearer {ACCESS_TOKEN}',
    'Accept':'application/json'
}
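Rather than pasting the token into the notebook, it can be read from the environment so that it is not saved with the file. The following is a minimal sketch of that idea; the helper `build_headers` and the variable name `MLHUB_ACCESS_TOKEN` are assumptions made for illustration and are not defined by the Radiant MLHub API.

```python
import os

def build_headers(token=None):
    """Return request headers for a bearer token (hypothetical helper)."""
    # MLHUB_ACCESS_TOKEN is an assumed variable name, not one set by the API
    token = token or os.environ.get('MLHUB_ACCESS_TOKEN')
    if not token:
        raise ValueError('No access token provided')
    return {
        'Authorization': f'Bearer {token}',
        'Accept': 'application/json',
    }

# Example with a placeholder token; a real token comes from the dashboard
example_headers = build_headers('example-token')
```

Failing early when the token is missing gives a clearer error than an authorization failure returned by a later request.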

In [14]:
# get list of all collections
r = requests.get('https://api.radiant.earth/mlhub/v1/collections', headers=headers)
h = r.json()
collections = h['collections']

# print the list of collections 
for c in collections:
    print(f'ID:       {c["id"]}\nLicense:  {c.get("license", "N/A")}\nCitation: {c.get("sci:citation", "N/A")}\n')

ID:       ref_african_crops_uganda_01
License:  CC-BY-SA-4.0
Citation: Bocquet, C., Dalberg Data Insights. (2019) Dalberg Data Insights Uganda Crop Classification, Version 1. [Indicate subset used]. Radiant ML Hub. [Date Accessed]

ID:       microsoft_chesapeake_nlcd
License:  CDLA-permissive-1.0
Citation: Robinson C, Hou L, Malkin K, Soobitsky R, Czawlytko J, Dilkina B, Jojic N. Large Scale High-Resolution Land Cover Mapping with Multi-Resolution Data. Proceedings of the 2019 Conference on Computer Vision and Pattern Recognition (CVPR 2019).

ID:       ref_african_crops_tanzania_01
License:  CC-BY-SA-4.0
Citation: Great African Food Company. (2019) Great African Food Company Tanzania Ground Reference Crop Type Dataset, Version 1. [Indicate subset used]. Radiant ML Hub. [Date Accessed]

ID:       ref_african_crops_kenya_02_source
License:  CC-BY-SA-4.0
Citation: Radiant Earth Foundation (2020) CV4A Competition Kenya Crop Type Dataset, Version 1. [Indicate subset used]. Radiant ML Hub. 

In [15]:
# paste the id of the collection you are interested in here:
collectionId = 'ref_african_crops_kenya_01'
# use these optional parameters to control what items are returned. maximum limit is 10000
limit = 10
bounding_box = []
date_time = []

# retrieve the items in the collection together with their metadata
r = requests.get(
    f'https://api.radiant.earth/mlhub/v1/collections/{collectionId}/items',
    params={'limit': limit, 'bbox': bounding_box, 'datetime': date_time},
    headers=headers
)
# stop here if the request failed (for example, because of an invalid token)
r.raise_for_status()
collection = r.json()
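The optional `bbox` and `datetime` parameters are left empty above. As a sketch of how they might be filled in, the helper below formats them the way the STAC API specification describes: a comma-separated `minx,miny,maxx,maxy` bounding box and a `start/end` time interval, with `..` for an open end. The helper name `build_item_params`, the example coordinates and the dates are hypothetical, and it is an assumption that this API accepts exactly these formats.

```python
def build_item_params(limit=10, bbox=None, start=None, end=None):
    """Build query parameters for an items request (hypothetical helper)."""
    params = {'limit': limit}
    if bbox is not None:
        if len(bbox) != 4:
            raise ValueError('bbox must be [min_lon, min_lat, max_lon, max_lat]')
        # STAC-style bounding box: comma-separated coordinates
        params['bbox'] = ','.join(str(v) for v in bbox)
    if start or end:
        # STAC-style interval; '..' marks an open-ended side
        params['datetime'] = f"{start or '..'}/{end or '..'}"
    return params

# Illustrative values only: a rough box and a three-month window
example_params = build_item_params(
    limit=5,
    bbox=[33.9, -4.7, 41.9, 5.5],
    start='2019-06-01T00:00:00Z',
    end='2019-08-31T23:59:59Z',
)
print(example_params)
```

The resulting dictionary can be passed as `params=` to `requests.get` in place of the inline dictionary used above.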

In [16]:
selected_item = None
assets = None
for feature in collection.get('features', []):
    selected_item = feature
    assets = list(feature.get('assets').keys())
    # For demo purposes we only want the first item
    break
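When a collection holds more items than `limit`, the remaining items have to be fetched page by page. In STAC responses a further page is conventionally advertised through an entry in `links` whose `rel` is `next`; whether this API includes such a link is an assumption here. The helper `find_next_link` is hypothetical and only inspects a response that has already been parsed.

```python
def find_next_link(item_collection):
    """Return the href of the 'next' page link, or None if there is none."""
    for link in item_collection.get('links', []):
        if link.get('rel') == 'next':
            return link.get('href')
    return None

# Made-up response fragment used only to show the lookup
example_response = {
    'features': [],
    'links': [
        {'rel': 'self', 'href': 'https://example.org/items?page=1'},
        {'rel': 'next', 'href': 'https://example.org/items?page=2'},
    ],
}
print(find_next_link(example_response))
```

A download loop could request the returned URL with the same `headers` and repeat until the function returns `None`.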

In [17]:
import re

# List all assets which don't match the pattern "year_month_day_*"
for asset in assets:
    if not re.match(r'\d{4}_\d{2}_\d{2}_.*', asset):
        print(asset)

documentation
labels
property_descriptions
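The asset keys that do match the `year_month_day_*` pattern are the source imagery. As a minimal sketch, they can be grouped by acquisition date to see which bands are available for each date. The helper `group_assets_by_date` is hypothetical, and apart from `2019_07_31_tci` the example keys are made up for illustration.

```python
import re
from collections import defaultdict

# year_month_day_name, for example 2019_07_31_tci
DATE_KEY = re.compile(r'^(\d{4})_(\d{2})_(\d{2})_(.+)$')

def group_assets_by_date(asset_keys):
    """Map 'YYYY-MM-DD' to the list of asset names found for that date."""
    grouped = defaultdict(list)
    for key in asset_keys:
        match = DATE_KEY.match(key)
        if match:
            year, month, day, name = match.groups()
            grouped[f'{year}-{month}-{day}'].append(name)
    return dict(grouped)

# Illustrative keys; in the notebook, pass the `assets` list instead
example_keys = ['labels', 'documentation', '2019_07_31_tci',
                '2019_07_31_B04', '2019_08_05_tci']
print(group_assets_by_date(example_keys))
```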


In [18]:
from urllib.parse import urlparse

def get_download_url(item, asset_key, headers):
    asset = item.get('assets', {}).get(asset_key, None)
    if asset is None:
        print(f'Asset "{asset_key}" does not exist in this item')
        return None
    r = requests.get(asset.get('href'), headers=headers, allow_redirects=False)
    return r.headers.get('Location')

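def download_file(url):
    # name the local file after the last segment of the URL path
    filename = urlparse(url).path.split('/')[-1]
    # stream the response so large files are written in 512 KiB chunks
    with requests.get(url, stream=True) as r:
        r.raise_for_status()
        with open(filename, 'wb') as f:
            for chunk in r.iter_content(chunk_size=512 * 1024):
                if chunk:
                    f.write(chunk)
    print(f'Downloaded {filename}')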

In [19]:
download_file(get_download_url(selected_item, 'labels', headers))

Downloaded ref_african_crops_kenya_01_tile_001.geojson


In [20]:
download_file(get_download_url(selected_item, 'documentation', headers))
download_file(get_download_url(selected_item, 'property_descriptions', headers))

Downloaded Kenya_Documentation.pdf
Downloaded Kenya_properties.csv


In [21]:
import boto3
AWS_ACCESS_KEY_ID = 'PASTE_YOUR_AWS_ACCESS_KEY_ID'
AWS_SECRET_KEY = 'PASTE_YOUR_AWS_SECRET_KEY'

def download_s3_file(url, access_key, secret_key):
    parsed_url = urlparse(url)
    
    bucket = parsed_url.hostname.split('.')[0]
    path = parsed_url.path[1:]
    filename = path.split('/')[-1]
    
    s3 = boto3.client(
        's3',
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key
    )
    
    s3.download_file(bucket, path, filename, ExtraArgs={'RequestPayer': 'requester'})
    print(f'Downloaded s3://{bucket}/{path}')
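To make the URL handling in `download_s3_file` easier to follow, the parsing step can be looked at on its own. The sketch below mirrors that logic in a hypothetical helper, `split_s3_url`, and runs it on made-up URLs; the bucket and path names are illustrative and do not refer to real data.

```python
from urllib.parse import urlparse

def split_s3_url(url):
    """Split an S3 URL into (bucket, key, filename) without contacting AWS."""
    parsed = urlparse(url)
    # the bucket is the first label of the host name
    bucket = parsed.hostname.split('.')[0]
    # the object key is the path without its leading slash
    key = parsed.path.lstrip('/')
    filename = key.split('/')[-1]
    return bucket, key, filename

# Works for s3:// URLs and for virtual-hosted-style https URLs
print(split_s3_url('s3://example-bucket/tiles/001/image.tif'))
print(split_s3_url('https://example-bucket.s3.amazonaws.com/tiles/001/image.tif'))
```

Both calls yield the same bucket, key and file name, which is why taking the first host label is enough for either URL form.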

In [24]:
# an AWS access key and secret key must be set above before running this cell
true_color_asset_url = get_download_url(selected_item, '2019_07_31_tci', headers)
download_s3_file(true_color_asset_url, AWS_ACCESS_KEY_ID, AWS_SECRET_KEY)