# Harmonised API examples for downloading EUM / ESA / CMEMS data

    Version: 1.0
    Date:    27/09/2019
    Author:  Ben Loveday (Plymouth Marine Laboratory) and Hayley Evers-King (EUMETSAT)
    Credit:  This code was developed for EUMETSAT under contracts for the Copernicus 
             programme.
    License: This code is offered as open source and free-to-use in the public domain, 
             with no warranty.

**What is this notebook for?**

This notebook shows some examples of how to download data from different sources using the harmonised data access API (HDA-API). The companion notebook *samples/How_To_Guide-Harmonized_Data_Access-v0.1.3.ipynb* walks through the process in more detail; for general use, we have refactored the code into a series of functions, which can be found in *ocean-wekeo-jpyhub/Hub_Tools/harmonised_data_access_api_tools.py*.


**What specific tools does this notebook use?**

The harmonised data access API

***

Python code is organised into modules, each of which contains functions for specific tasks. The cell below imports all of the modules we need to complete this task.

In [None]:
import os
import sys
import json
import time
from zipfile import ZipFile
# make the Hub_Tools directory (one level above the working directory) importable
sys.path.append(os.path.join(os.path.dirname(os.getcwd()), 'Hub_Tools'))
import harmonised_data_access_api_tools as hapi

WEkEO provides access to a huge number of datasets through its 'harmonised-data-access' API. This allows us to query the full data catalogue and download data quickly and directly onto our Jupyter Hub. You can search for what data is available here: https://www.wekeo.eu/dataset-navigator/start.

In order to use the HDA-API we need to provide some authentication credentials, which come in the form of an API key (*api_key*). You can get your key from here: https://www.wekeo.eu/api-keys. If you click on the 'show hidden keys' button at the bottom of the page, it will reveal a number of keys. The one you need is in the top grey box, on the following line:

-H "Authorization: Basic "**YOUR API KEY**"

Replace "YOUR API KEY" in the cell below with the text you copy from the "**YOUR API KEY**" position on that line (N.B. you need to keep the quotation marks).

We will also define a few other parameters, including where to download the data to and whether we want the HDA-API functions to be verbose. **Lastly, we will tell the notebook where to find the query we will use to find the data.** These 'JSON' queries are what we use to ask WEkEO for data. They have a very specific form, but give us quite fine-grained control over what data to get. You can find an example query here: **JSON_templates/RGB/EO_EUM_DAT_SENTINEL-3_OL_1_EFR___.json**

In [None]:
api_key = "YOUR API KEY"
download_dir_path = "/home/jovyan/work/products"
JSON_query_dir = os.path.join(os.getcwd(),'JSON_templates')
verbose = False
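
Hard-coding a key in a notebook makes it easy to share by accident. As an alternative, here is a minimal sketch that reads the key from an environment variable. The variable name `WEKEO_API_KEY` and the helper `load_api_key` are hypothetical choices for this example; neither is defined by WEkEO or by the HDA-API tools.

```python
import os

def load_api_key(env_var="WEKEO_API_KEY", default="YOUR API KEY"):
    """Return the API key stored in an environment variable.

    env_var is a hypothetical name chosen for this example. If it is not
    set, the placeholder string used in the cell above is returned.
    """
    return os.environ.get(env_var, default)

api_key = load_api_key()
if api_key == "YOUR API KEY":
    print("No key found in the environment; set api_key by hand instead.")
```

If the variable is not set, `api_key` keeps the placeholder value, so the notebook behaves exactly as before until you edit it.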

Start a timer so we can see how long a test download takes.

In [None]:
t0 = time.time()

Each WEkEO-hosted data set has a unique identifier. A number of these are shown below as examples.

In [None]:
# example data sets available: codes from here >> https://www.wekeo.eu/dataset-navigator/start

# ---------------------------- CMEMS options --------------------------
#dataset_id = "EO:MO:DAT:SEAICE_GLO_SEAICE_L4_NRT_OBSERVATIONS_011_001" # CMEMS SEA-ICE
#dataset_id = "EO:MO:DAT:OCEANCOLOUR_GLO_CHL_L4_REP_OBSERVATIONS_009_093" # CMEMS OC-CCI CHL
dataset_id = "EO:MO:DAT:SST_GLO_SST_L4_NRT_OBSERVATIONS_010_001" # CMEMS OSTIA SST

# ---------------------------- C3S options ----------------------------
#dataset_id = "EO:ECMWF:DAT:ERA5_HOURLY_DATA_ON_SINGLE_LEVELS_1979_PRESENT" # ERA5

# ---------------------------- CAMS options ---------------------------
#dataset_id = "EO:ECMWF:DAT:CAMS_SOLAR_RADIATION_TIMESERIES" # SOLAR RAD

# ----------------- EUMETSAT COPERNICUS MARINE options ----------------
#dataset_id = "EO:EUM:DAT:SENTINEL-3:SR_1_SRA___" # SRAL L1B
#dataset_id = "EO:EUM:DAT:SENTINEL-3:SR_1_SRA_BS___" #SRAL L1BS
#dataset_id = "EO:EUM:DAT:SENTINEL-3:SR_2_WAT___" # SRAL L2
#dataset_id = "EO:EUM:DAT:SENTINEL-3:OL_1_ERR___" # OLCI ERR
#dataset_id = "EO:EUM:DAT:SENTINEL-3:OL_1_EFR___" # OLCI EFR
#dataset_id = "EO:EUM:DAT:SENTINEL-3:OL_2_WRR___" # OLCI WRR
#dataset_id = "EO:EUM:DAT:SENTINEL-3:OL_2_WFR___" # OLCI WFR
#dataset_id = "EO:EUM:DAT:SENTINEL-3:SL_1_RBT___" # SLSTR L1 RBT
#dataset_id = "EO:EUM:DAT:SENTINEL-3:SL_2_WST___" # SLSTR L2 WST

# ------------------------- EUMETSAT other options -------------------
#dataset_id = "EO:EUM:SV:EUMETSAT:V01"
                
# ------------------------- ESA MARINE options ------------------------
#dataset_id = "EO:ESA:DAT:SENTINEL-2:MSI1C"       # MSI L1C tested
#dataset_id = "EO:ESA:DAT:SENTINEL-1:L1_GRD"      # S1 L1 GRD tested
#dataset_id = "EO:ESA:DAT:SENTINEL-1:L1_SLC"      # S1 L1 SLC tested

In order to get the data we need to construct a JSON query to send to the WEkEO data server via the harmonised data access API. There are a number of ways to do this, but to make the queries easy to construct and edit, we have chosen to store them as text files. The *../JSON_templates/* directory contains query files for all of the examples above. By default, we derive the name of the relevant query file from the *dataset_id*, replacing each colon with an underscore and adding a .json extension.

In [None]:
# find query file
JSON_query_file = os.path.join(JSON_query_dir,dataset_id.replace(':','_')+".json")
if not os.path.exists(JSON_query_file):
    print('Query file ' + JSON_query_file + ' does not exist')
    print('Script will stop after showing metadata, to aid in creating a query file.')
    need_meta = True
else:
    print('Found JSON query file, you may want to adapt it.')
    need_meta = False
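
To make the naming rule concrete, the sketch below applies the same colon-to-underscore mapping to one of the identifiers listed earlier. The helper name `query_file_for` is hypothetical and only wraps the expression used in the cell above.

```python
import os

def query_file_for(dataset_id, query_dir):
    """Build the default query file path for a dataset identifier."""
    # colons are replaced with underscores, then ".json" is appended
    return os.path.join(query_dir, dataset_id.replace(":", "_") + ".json")

example_path = query_file_for("EO:EUM:DAT:SENTINEL-3:OL_1_EFR___", "JSON_templates")
print(os.path.basename(example_path))
# prints: EO_EUM_DAT_SENTINEL-3_OL_1_EFR___.json
```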

Now that we have our query file, we can get our data. You can find more detailed information in *samples/How_To_Guide-Harmonized_Data_Access-v0.1.3.ipynb*. We proceed as follows:

    i)   initialise a dictionary to hold all of our API variables.
    ii)  use our API key to get an access token for our data request
    iii) accept the WEkEO terms and conditions
    iv)  optionally print out the metadata associated with dataset_id (this is very useful for creating queries).
    v)   load the query file into the notebook

In [None]:
# i) initialise
HAPI_dict = hapi.init(dataset_id, api_key, download_dir_path, verbose=verbose)
# ii) get token
HAPI_dict = hapi.get_access_token(HAPI_dict)
# iii) accept T&C
HAPI_dict = hapi.accept_TandC(HAPI_dict)
# iv) check meta data for the dataset product >> query generation
if need_meta:
    HAPI_dict = hapi.query_metadata(HAPI_dict)
    sys.exit()
# v) load the query
with open(JSON_query_file, 'r') as f:
    query = json.load(f)

print('--------------------------------')
print('Elapsed time: %s' % (time.time() - t0))
print('--------------------------------')
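
The query loaded in step v is an ordinary Python dictionary, so it can be inspected or adjusted before it is launched. The sketch below illustrates this with a made-up query written to a temporary file. The field names (`datasetId`, `dateRangeSelectValues`, `name`, `start`, `end`) and their values are assumptions for illustration only; check the files in the JSON_templates directory for the actual structure each data set expects.

```python
import json
import os
import tempfile

# hypothetical query content, for illustration only
example_query = {
    "datasetId": "EO:EUM:DAT:SENTINEL-3:OL_1_EFR___",
    "dateRangeSelectValues": [
        {
            "name": "dtrange",
            "start": "2019-09-01T00:00:00.000Z",
            "end": "2019-09-02T00:00:00.000Z",
        }
    ],
}

with tempfile.TemporaryDirectory() as tmp_dir:
    query_path = os.path.join(tmp_dir, "example_query.json")
    # write the example query to disk so there is a file to read back
    with open(query_path, "w") as f:
        json.dump(example_query, f, indent=2)
    # load it in the same way as step v above
    with open(query_path, "r") as f:
        demo_query = json.load(f)

# list the top-level keys to see what the query controls
print(sorted(demo_query.keys()))

# extend the (assumed) end date by one day before launching the query
demo_query["dateRangeSelectValues"][0]["end"] = "2019-09-03T00:00:00.000Z"
print(json.dumps(demo_query, indent=2))
```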

Now we can launch our query and get our data:

    vi)   launch the query, which will prepare our data
    vii)  wait for the data preparation to complete
    viii) get our list of results
    ix)   get our download links
    x)    download the data

In [None]:
# vi) launch job
HAPI_dict = hapi.launch_query(HAPI_dict, query)
# vii) wait for jobs to complete
hapi.check_job_status(HAPI_dict)
# viii) get the query results
HAPI_dict = hapi.get_results_list(HAPI_dict)
# ix) get the download links
HAPI_dict = hapi.get_download_links(HAPI_dict)
# x) download data
HAPI_dict = hapi.download_data(HAPI_dict, skip_existing=True)

print('--------------------------------')
print('Elapsed time: %s' % (time.time() - t0))
print('--------------------------------')
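
Step vii is a polling pattern: ask for the job status, wait a little, and ask again until the job has finished. The sketch below shows the general idea with a stand-in status function. It is not the implementation of `hapi.check_job_status`; the helper `wait_until_complete` and the status strings are hypothetical.

```python
import time

def wait_until_complete(get_status, poll_interval=0.01, timeout=5.0):
    """Poll get_status() until it returns 'completed' or the timeout expires.

    Returns True if the job completed and False if the timeout was reached.
    The status strings used here are assumptions made for this example.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status()
        if status == "completed":
            return True
        if status == "failed":
            raise RuntimeError("job failed")
        # pause between requests so the server is not flooded
        time.sleep(poll_interval)
    return False

# stand-in for a real status request: 'running' twice, then 'completed'
fake_statuses = iter(["running", "running", "completed"])
finished = wait_until_complete(lambda: next(fake_statuses))
print("Job finished:", finished)
```

For a real service, a poll interval of several seconds and a generous timeout would be more appropriate than the short values used here to keep the example fast.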

All done! Your data product should now be downloaded to the specified download path (*download_dir_path*).
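
The imports at the top of this notebook include `ZipFile`, which is not used in the steps above. If your products arrive as zip archives (an assumption; this depends on the data set), a minimal sketch for unpacking them follows. The helper `unzip_products` is hypothetical and not part of the HDA-API tools, and the demonstration runs on a throwaway directory rather than on *download_dir_path*.

```python
import os
import tempfile
from zipfile import ZipFile

def unzip_products(directory):
    """Extract every .zip archive found directly inside directory.

    Returns the list of archive member names that were extracted.
    """
    extracted = []
    for name in sorted(os.listdir(directory)):
        if name.lower().endswith(".zip"):
            with ZipFile(os.path.join(directory, name), "r") as archive:
                archive.extractall(directory)
                extracted.extend(archive.namelist())
    return extracted

# demonstration on a temporary directory containing one small archive
with tempfile.TemporaryDirectory() as demo_dir:
    with ZipFile(os.path.join(demo_dir, "product.zip"), "w") as archive:
        archive.writestr("product/data.txt", "example")
    unpacked = unzip_products(demo_dir)
    unpacked_exists = os.path.exists(os.path.join(demo_dir, "product", "data.txt"))

print(unpacked)
```

To unpack real downloads, you would call `unzip_products(download_dir_path)` once the download step has finished.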