<img src='./img/LogoWekeo_Copernicus_RGB_0.png' alt='' align='centre' width='10%'></img>

# WEkEO Harmonized Data Access (HDA) API - How-To

<a href='https://www.wekeo.eu/' target='_blank'>WEkEO</a> is the EU Copernicus DIAS (Data and Information Access Service) reference service for environmental data, virtual processing environments and skilled user support.

WEkEO offers access to a variety of data, including different parameters sensed by Sentinel-1, Sentinel-2 and Sentinel-3. It also offers access to climate reanalysis and seasonal forecast data.

The <a href='https://www.wekeo.eu/docs/harmonised-data-access-api' target='_blank'>Harmonised Data Access (HDA) API</a>, a REST interface, allows users to subset and download datasets from WEkEO.

This [notebook](https://www.wekeo.eu/docs/using-jupyter) is a step-by-step guide on how to search for and download data from WEkEO using the `HDA API`.

Follow these steps:

 - [1. Install the WEkEO HDA client](#wekeo_hda_install)
 - [2. Search for datasets on WEkEO](#wekeo_search)
 - [3. Get the Dataset ID](#wekeo_dataset_id)
 - [4. Configure the WEkEO API Authentication](#wekeo_hda_auth)
 - [5. Load data descriptor file and request data](#wekeo_json)
 - [6. Download requested data](#wekeo_download)

#### Load required libraries

In [1]:
import os
import sys
import json
import time
import base64
from IPython.display import HTML  # IPython.core.display is a deprecated import path

import requests
import warnings
warnings.filterwarnings('ignore')

### <a id='wekeo_hda_install'></a>1. Install the WEkEO HDA client

The WEkEO HDA client is a Python 3 based library that facilitates access to WEkEO data. It requires the following packages.

   - Python 3
   - requests
   - tqdm

If you are working in the WEkEO JupyterLab, the HDA client is already installed. Otherwise, you can install it using the command below.

`pip install -U hda`
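
Before installing, it can be useful to check whether the client is already available in the current environment. The sketch below uses only the Python standard library; the helper name `is_installed` is made up for this example and is not part of the `hda` package.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found in this environment."""
    return importlib.util.find_spec(package_name) is not None


if is_installed("hda"):
    print("The hda client is available.")
else:
    print("The hda client was not found; install it with: pip install -U hda")
```

The check only looks the package up and does not import it, so it is cheap to run at the top of a notebook.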

#### Load WEkEO HDA client

The `hda` client is a Python 3 library that can be used to search for and download products through the WEkEO Harmonised Data Access API.
HDA is a RESTful interface that allows users to search and download WEkEO datasets.
Documentation about its usage can be found at the <a href='https://www.wekeo.eu/' target='_blank'>WEkEO website</a> and the associated <a href='https://github.com/wekeo/hda' target = '_blank'>WEkEO GitHub repository</a>.

In [3]:
import hda

<hr>

### <a id='wekeo_search'></a>2. Search for datasets on WEkEO

Datasets can be searched under <a href='https://wekeo.eu/data?view=catalogue' target='_blank'>WEkEO DATA</a>. Clicking the + to add a layer opens a catalogue search where you can use free text. For example, you can search for *`sentinel-3`* and browse through the results. You can also use the filter options on the left to refine your search by satellite platform, sensor, Copernicus service, area (region of interest), general time period (past or future), and a variety of flags.


You can click on the dataset you are interested in and you will be guided to a range of details including the dataset temporal and spatial extent, collection ID, and metadata.

<br>

<img src='./img/wekeo_data_search2.png' width='80%'></img>

### <a id='wekeo_dataset_id'></a>3. Get the Dataset ID 

The dataset description provides the following information:
- **Abstract**, containing a general description of the dataset,
- **Classification**, including the Dataset ID,
- **Resources**, such as a link to the Product Data Format Specification guide, and JSON metadata
- **Contacts**, where you can find further information about the data source from its provider.  

You need the `Dataset ID` to request data from the Harmonised Data Access API. 

For `OLCI Level 1B Full Resolution - Sentinel-3` data, for example, the Dataset ID (shown in the catalogue as the collection ID) is `EO:EUM:DAT:SENTINEL-3:OL_1_EFR___`.

<br>

<img src='./img/wekeo_collection_id2.png' width='60%' />

<br>

Let's store the Dataset ID as a variable called `dataset_id` to be used later.

In [4]:
dataset_id = "EO:EUM:DAT:SENTINEL-3:OL_1_EFR___"

### <a id='wekeo_hda_auth'></a>4. Configure the WEkEO API Authentication

To download data using the WEkEO HDA API, we need to provide our credentials. We can do this in two ways:

* **Option 1** - by creating a configuration file (*recommended*)
* **Option 2** - by supplying our credentials directly in this script (*not recommended, but sometimes useful*)

#### Option 1: creating a credentials file.

By default, the HDA client expects the configuration file to be called `.hdarc` and to reside in our home directory. On most systems the home directory can be found at a path such as `C:\Users\username`, `/Users/username`, or `/home/username`, depending on your operating system. The file must contain the following information, in exactly this format:

```
user: <your_user_name>
password: <your_password>
```

You must replace `<your_user_name>` and `<your_password>` with the information from your WEkEO account (if you don't have one yet, register <a href="https://www.wekeo.eu/" target="_blank">here</a>). Once you have entered these credentials in the file, the `hda` client will automatically read them from the file when you use it.

##### Creating a credentials file on the WEkEO JupyterLab

In the WEkEO JupyterLab, the easiest way to do this is to open a text file using the option in the Launcher (always accessible via the '+' on the top left). Then you can copy the text above into the file and adapt it with the credentials of your WEkEO account. Save the file with whatever name you like.

Once you've done this you can open a terminal and run the command below to move the file to the right location, and rename it:

`mv <your_file_name>  ~/.hdarc`

##### Creating a credentials file on your own system

You are free to use whatever method you like to create and save your file, but remember that, if you are using the default (`~/.hdarc`), it should have no extension (note: Windows sometimes adds one without telling you!).
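
One way to avoid the file-extension problem is to create the file from Python. The following is a minimal sketch using only the standard library; the function name `write_hdarc` is made up for this example, and the file content follows the two-line format shown earlier in this section.

```python
import os
from pathlib import Path


def write_hdarc(user, password, path=None):
    """Write a credentials file in the two-line user/password format."""
    target = Path(path) if path is not None else Path.home() / ".hdarc"
    target.write_text(f"user: {user}\npassword: {password}\n", encoding="utf-8")
    # Restrict the file to the current user (this has limited effect on Windows).
    os.chmod(target, 0o600)
    return target


# Example usage (uncomment to run; getpass avoids echoing the password):
# from getpass import getpass
# write_hdarc(input("WEkEO user name: "), getpass("WEkEO password: "))
```

Calling the function without a `path` overwrites any existing `~/.hdarc`, so check first if you already have one.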

##### Using the HDA

If you are using the default approach, you can create an instance of the client like so:

`c = hda.Client()`

Alternatively, if you wish to specify your own configuration file, you can do so by adapting the code line below. The format should be the same as specified above.

`c = hda.Client(hda.Configuration(path=<your_config_file>))`

Here you should replace `<your_config_file>` with the path to your configuration file, for example:
* `"myconfig.txt"` if it is in this directory.
* `"/users/username/myconfig.txt"` as an example of an absolute path to a file on Linux and/or macOS.
* `os.path.join("users", "username", "myconfig.txt")` as an example of a path built in a platform-independent way (note that, without a leading root, this path is relative to the current directory).
* `os.path.join(os.path.expanduser("~"), "myconfig.txt")` if it is in your home directory, on all operating systems.
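
Whichever form you choose, it can help to resolve the path and confirm that the file exists before handing it to the client. The sketch below uses `pathlib` from the standard library; `resolve_config_path` is a hypothetical helper written for this example.

```python
from pathlib import Path


def resolve_config_path(path_like):
    """Expand '~', make the path absolute, and check that the file exists."""
    path = Path(path_like).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"No configuration file found at {path}")
    return path


# Example usage with the client call shown above (requires hda and an existing file):
# c = hda.Client(hda.Configuration(path=str(resolve_config_path("~/myconfig.txt"))))
```

Failing early with a clear message is usually easier to debug than an authentication error later on.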

#### Option 2: provide credentials directly

You can provide your credentials directly as follows:

`c = hda.Client(hda.Configuration(user="<your_user_name>", password="<your_password>"))`

*Note: this method is convenient in the short term, but is not really recommended as you have to put your user name and password in this notebook, and run the risk of accidentally sharing them. This method also requires you to authenticate on a notebook-by-notebook basis.*
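
If you do use Option 2, one way to reduce the risk of sharing credentials is to keep them out of the notebook text altogether. The sketch below reads them from environment variables and falls back to an interactive prompt. The variable names `WEKEO_USER` and `WEKEO_PASSWORD` and the helper `read_credentials` are hypothetical choices for this example, not names defined by the `hda` client.

```python
import os
from getpass import getpass


def read_credentials(env=None):
    """Return (user, password) from environment variables, prompting if unset."""
    env = os.environ if env is None else env
    # WEKEO_USER and WEKEO_PASSWORD are hypothetical names chosen for this sketch.
    user = env.get("WEKEO_USER") or input("WEkEO user name: ")
    password = env.get("WEKEO_PASSWORD") or getpass("WEkEO password: ")
    return user, password


# Example usage with the Option 2 call shown above (requires hda):
# user, password = read_credentials()
# c = hda.Client(hda.Configuration(user=user, password=password))
```

With this approach the password never appears in a saved cell or its output.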

In [5]:
c = hda.Client()

### <a id='wekeo_json'></a>5. Load data descriptor file and request data

The Harmonised Data Access API can read your data request from a `JSON` file, in which you describe the dataset you are interested in downloading. The file is, in principle, a dictionary. For the OLCI example used in this notebook, the following keys are defined:
- `dataset_id`: the dataset's collection ID
- `dtstart` and `dtend`: the start and end of the time period for which you would like to retrieve data
- `bbox`: optional, defines a geographic subset (a bounding box)
- `type` and `timeliness`: the product type and its timeliness, e.g. `NT` (Non Time Critical)

See an example of a `data descriptor` file [here](./olci_data_descriptor.json). You can also get a specific example of a `JSON` file for a particular query from the <a href='https://wekeo.eu/data?view=catalogue' target='_blank'>WEkEO DATA</a> portal when you search as above - you just need to click on API request and the information needed for the `JSON` file will be displayed.

<br>

<img src='./img/Mindule_eg_viewer3.png' width='100%' />

<br>

<br>

<img src='./img/Mindule_API_request.png' width='60%' />
<br>

You can load the `JSON` file with `json.load()`. Alternatively, you can copy and paste the dictionary describing your data into a cell, as done below. Make sure to choose one option if you are inputting your own example!

In [6]:
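try:
    # Load the request into `query`, the variable used by the search step below
    with open('./olci_data_descriptor.json', 'r') as f:
        query = json.load(f)
    print('Your JSON file:')
    print(query)
except (OSError, json.JSONDecodeError):
    # OSError covers a missing or unreadable file; JSONDecodeError covers malformed JSON
    print('Your JSON file is not in the correct format, or is not found, please check it!')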

Your JSON file:
{'dataset_id': 'EO:EUM:DAT:SENTINEL-3:OL_1_EFR___', 'dtstart': '2021-09-29T00:00:00.000Z', 'dtend': '2021-09-30T00:00:00.000Z', 'bbox': [134.2477516188146, 22.00625971252353, 137.58597074610276, 25.18812983958223], 'type': 'OL_1_EFR___', 'timeliness': 'NT'}
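
Before sending a request, it can help to sanity-check the descriptor. The sketch below is one possible check, not part of the `hda` client: `check_query` is a hypothetical helper, the set of keys it treats as mandatory is an assumption based on this OLCI example, and the bounding box order (west, south, east, north) is inferred from the example values.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"  # matches e.g. 2021-09-29T00:00:00.000Z


def check_query(query):
    """Return a list of problems found in a query dictionary (empty if none)."""
    problems = []
    for key in ("dataset_id", "dtstart", "dtend"):
        if key not in query:
            problems.append(f"missing key: {key}")
    if "dtstart" in query and "dtend" in query:
        start = datetime.strptime(query["dtstart"], TIME_FORMAT)
        end = datetime.strptime(query["dtend"], TIME_FORMAT)
        if start >= end:
            problems.append("dtstart must be earlier than dtend")
    if "bbox" in query:
        # Assumed order: west, south, east, north (inferred from the example values).
        bbox = query["bbox"]
        if len(bbox) != 4:
            problems.append("bbox must contain four numbers")
        else:
            west, south, east, north = bbox
            if not (-180 <= west < east <= 180):
                problems.append("bbox longitudes out of order or out of range")
            if not (-90 <= south < north <= 90):
                problems.append("bbox latitudes out of order or out of range")
    return problems


example = {
    "dataset_id": "EO:EUM:DAT:SENTINEL-3:OL_1_EFR___",
    "dtstart": "2021-09-29T00:00:00.000Z",
    "dtend": "2021-09-30T00:00:00.000Z",
    "bbox": [134.2477516188146, 22.00625971252353, 137.58597074610276, 25.18812983958223],
}
print(check_query(example))  # prints []
```

Note that this simple check would flag a bounding box that crosses the antimeridian, where west is greater than east, so adapt it if your region of interest requires that.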


In [None]:
query = {
  "dataset_id": "EO:EUM:DAT:SENTINEL-3:OL_1_EFR___",
  "dtstart": "2021-09-29T00:00:00.000Z",
  "dtend": "2021-09-30T00:00:00.000Z",
  "bbox": [
        134.2477516188146,
        22.00625971252353,
        137.58597074610276,
        25.18812983958223
  ],
  "type": "OL_1_EFR___",
  "timeliness": "NT"
}
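
A query like the one above can also be assembled from Python objects, which is convenient when looping over dates or regions. This is a minimal sketch: `to_hda_time` and `build_query` are hypothetical helpers, the key names mirror this OLCI example (other datasets may need different keys), and datetimes without time zone information are assumed to be in UTC.

```python
from datetime import datetime


def to_hda_time(moment):
    """Format a datetime like the example timestamps, e.g. 2021-09-29T00:00:00.000Z."""
    return moment.strftime("%Y-%m-%dT%H:%M:%S.") + f"{moment.microsecond // 1000:03d}Z"


def build_query(dataset_id, start, end, bbox=None, **filters):
    """Assemble a query dictionary with the same keys as the example above."""
    query = {
        "dataset_id": dataset_id,
        "dtstart": to_hda_time(start),
        "dtend": to_hda_time(end),
    }
    if bbox is not None:
        query["bbox"] = list(bbox)
    query.update(filters)  # dataset-specific keys such as type or timeliness
    return query


query_from_dates = build_query(
    "EO:EUM:DAT:SENTINEL-3:OL_1_EFR___",
    datetime(2021, 9, 29),
    datetime(2021, 9, 30),
    bbox=[134.2477516188146, 22.00625971252353, 137.58597074610276, 25.18812983958223],
    type="OL_1_EFR___",
    timeliness="NT",
)
print(query_from_dates)
```

The resulting dictionary has the same content as the hand-written one, so it can be used in its place.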

### <a id='wekeo_download'></a>6. Download requested data

As a final step, you can use the WEkEO HDA API to request data from the datasets listed in the WEkEO catalogue and to download it. 

In [8]:
matches = c.search(query)
matches.download()
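
Downloads over a network can fail for transient reasons, so you may want to retry the request a few times. The sketch below wraps the two calls used above, `search` and `download`, in a simple retry loop. The helper `download_with_retry` is hypothetical, and it catches all exceptions because no assumption is made here about which error types the client raises.

```python
import time


def download_with_retry(client, query, attempts=3, wait_seconds=10.0):
    """Run search and download, retrying if an exception is raised."""
    for attempt in range(1, attempts + 1):
        try:
            matches = client.search(query)
            matches.download()
            return matches
        except Exception as error:  # broad on purpose: error types are not assumed
            print(f"Attempt {attempt} of {attempts} failed: {error}")
            if attempt == attempts:
                raise
            time.sleep(wait_seconds)


# Example usage with the client and query defined above:
# matches = download_with_retry(c, query)
```

Each retry repeats the search as well as the download, so files fetched during a failed attempt may be requested again.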


<hr>
<a href="https://github.com/wekeo/wekeo4data" target="_blank">View on GitHub</a> | <a href="mailto:support@wekeo.eu" target="_blank">Contact WEkEO for support</a>