# HR-VPP Harmonized Data Access (HDA) API demo

Author: T. Jacobs (VITO), on behalf of HR-VPP consortium

This notebook demonstrates the use of the WEkEO Harmonized Data Access (HDA) API and its Python client package
for programmatic discovery, search and download of HR-VPP and other related datasets that are available on WEkEO.

Useful links & resources:  
- [HDA client package from ECMWF](https://github.com/ecmwf/hda)  
- [HDA API documentation](https://www.wekeo.eu/docs/harmonised-data-access-api)  
- [Swagger API doc for developers](https://wekeo-broker.apps.mercator.dpi.wekeo.eu/databroker/ui/)  

## Step 1: set up the Python environment

In [1]:
import os, sys
import json
from IPython.display import HTML
from IPython.display import clear_output

import requests
import warnings
warnings.filterwarnings('ignore')

import getpass

In [2]:
!pip install -U hda
clear_output()

In [3]:
from hda import Client

## Step 2: set up your WEkEO user credentials  

The URL, username and password are typically stored in the `.hdarc` file in your user's home folder.  
Alternatively, they can be passed explicitly when initializing the Client, as shown here.
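
As a minimal sketch of the file-based option, the snippet below writes a small credentials file and reads it back. It assumes a plain `key: value` layout with the keys `url`, `user` and `password`; check the HDA client README for the exact format expected by your installed version. The helper names `write_hdarc` and `read_hdarc` are hypothetical, the credentials are placeholders, and a temporary folder stands in for the home folder.

```python
import os
import tempfile


def write_hdarc(path, url, user, password):
    """Write credentials as 'key: value' lines (assumed .hdarc layout)."""
    with open(path, "w") as f:
        f.write(f"url: {url}\nuser: {user}\npassword: {password}\n")
    # The file holds a plain-text password, so restrict access to the owner.
    os.chmod(path, 0o600)


def read_hdarc(path):
    """Parse 'key: value' lines into a dict, splitting on the first colon only."""
    config = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or ":" not in line:
                continue
            key, value = line.split(":", 1)
            config[key.strip()] = value.strip()
    return config


# Demo in a temporary folder instead of the real home folder.
with tempfile.TemporaryDirectory() as tmp:
    rc_path = os.path.join(tmp, ".hdarc")
    write_hdarc(
        rc_path,
        "https://wekeo-broker.apps.mercator.dpi.wekeo.eu/databroker",
        "my_user",       # placeholder
        "my_password",   # placeholder
    )
    hdarc = read_hdarc(rc_path)

print(sorted(hdarc))
```

Splitting on the first colon only keeps the `https://` URL intact.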

In [4]:
username=input()

 demo4clms


In [5]:
password = getpass.getpass()

 ···········


## Step 3: discover the HR-VPP datasets on WEkEO

#### Define the HDA Client

In [6]:
c = Client(
    url='https://wekeo-broker.apps.mercator.dpi.wekeo.eu/databroker',
    user=username,
    password=password,
    timeout=15,
    sleep_max=20,
    retry_max=5,
    debug=False,
    quiet=True,
)

#### To find the dataset IDs, navigate to the wekeo.eu/data portal, search for "hrvpp" and copy the identifier from the dataset description

![WEkEO portal](WEkEO_portal_screenshot1.png)  
![Identifier code in the dataset description](WEkEO_portal_screenshot2_datasetID.png)

#### For convenience, the IDs for all available HR-VPP datasets are listed here:

In [7]:
## IDs for HR-VPP's datasets on WEkEO:
VEGETATION_INDICES='EO:EEA:DAT:CLMS_HRVPP_VI'

SEASONAL_TRAJECTORIES_UTM='EO:EEA:DAT:CLMS_HRVPP_ST'
SEASONAL_TRAJECTORIES_LAEA='EO:EEA:DAT:CLMS_HRVPP_ST-LAEA'

VPP_PARAMS_UTM='EO:EEA:DAT:CLMS_HRVPP_VPP'
VPP_PARAMS_LAEA='EO:EEA:DAT:CLMS_HRVPP_VPP-LAEA'

#### See the dataset description in detail

In [8]:
c.dataset(VEGETATION_INDICES)

{'abstract': 'Vegetation Indices (VI) comprises four daily vegetation indices (PPI, NDVI, LAI and FAPAR) and quality information, that are part of the Copernicus Land Monitoring Service (CLMS) HR-VPP product suite. \n\nThe 10m resolution, daily updated Plant Phenology Index (PPI), Normalized Difference Vegetation Index (NDVI), Leaf Area Index (LAI) and Fraction of Absorbed Photosynthetically Active Radiation (fAPAR) are derived from Copernicus Sentinel-2 satellite observations.\n\nThey are provided together with a related quality indicator (QFLAG2) that flags clouds, shadows, snow, open water and other areas where the VI retrieval is less reliable.\n\nThese Vegetation Indices are made available as a set of raster files with 10 x 10m resolution, in UTM/WGS84 projection corresponding to the Sentinel-2 tiling grid, for those tiles that cover the EEA38 countries and the United Kingdom and for the period from 2017 until today, with daily updates.\n\nThe Vegetation Indices are part of the pa

#### To know the options available for filtering data requests, see the dataset's metadata

For the Seasonal Trajectories dataset, we can filter the data based on bounding box (rectangle), time interval (date range), product type (PPI or QFLAG) and so on.

In [9]:
s=c.metadata(datasetId=SEASONAL_TRAJECTORIES_UTM)
print(json.dumps(s,indent=2))

{
  "datasetId": "EO:EEA:DAT:CLMS_HRVPP_ST",
  "parameters": {
    "boundingBoxes": [
      {
        "comment": "Bounding Box",
        "details": {
          "crs": "EPSG:4326",
          "extent": []
        },
        "isRequired": false,
        "label": "Bounding Box",
        "name": "bbox"
      }
    ],
    "dateRangeSelects": [
      {
        "comment": "Temporal interval to search",
        "details": {
          "defaultEnd": null,
          "defaultStart": null,
          "end": null,
          "start": null
        },
        "isRequired": false,
        "label": "Temporal interval to search",
        "name": "temporal_interval"
      },
      {
        "comment": "The dateTime when the resource described by the entry was created.",
        "details": {
          "defaultEnd": null,
          "defaultStart": null,
          "end": null,
          "start": null
        },
        "isRequired": false,
        "label": "processingDate",
        "name": "processingDate"
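
The metadata response groups the available filters by kind under `parameters` (for example `boundingBoxes` and `dateRangeSelects`). As a rough sketch, the filter names can be listed with a small helper. The `sample_metadata` dict below is a trimmed, hand-copied stand-in for the real response, and `list_filters` is a hypothetical helper, not part of the HDA client.

```python
# Trimmed copy of the metadata structure printed by c.metadata() (illustrative only).
sample_metadata = {
    "datasetId": "EO:EEA:DAT:CLMS_HRVPP_ST",
    "parameters": {
        "boundingBoxes": [
            {"name": "bbox", "label": "Bounding Box", "isRequired": False},
        ],
        "dateRangeSelects": [
            {"name": "temporal_interval", "label": "Temporal interval to search", "isRequired": False},
            {"name": "processingDate", "label": "processingDate", "isRequired": False},
        ],
    },
}


def list_filters(metadata):
    """Return (group, name, is_required) tuples for every filter in the metadata."""
    filters = []
    for group, entries in metadata.get("parameters", {}).items():
        # Skip groups that are empty or not lists; the full response may contain such entries.
        if not isinstance(entries, list):
            continue
        for entry in entries:
            if isinstance(entry, dict) and "name" in entry:
                filters.append((group, entry["name"], entry.get("isRequired", False)))
    return filters


filters = list_filters(sample_metadata)
for group, name, required in filters:
    print(f"{group:18s} {name:20s} required={required}")
```

With a live client, the same helper could be applied to the `s` object returned by `c.metadata(...)`, assuming it has the shape shown in the output.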
    

## Step 4: actual data requests to search for & download files

The JSON payload for the data request can be constructed from the dataset metadata.
However, it is easier to perform a first request on the wekeo.eu/data portal and then copy the API request code into a JSON file.  
![WEkEO portal API payload example](WEkEO_portal_screenshot3_APIrequest.png)
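
For the first option, a payload can also be assembled in code. The sketch below builds a dict with the keys used by the Seasonal Trajectories request in this step (`datasetId`, `boundingBoxValues`, `dateRangeSelectValues`, `stringChoiceValues`). The helper `build_query` is hypothetical, the rounded coordinates are illustrative, and the west/south/east/north ordering of the bounding box is an assumption inferred from the example values; other datasets may require different parameters, so check their metadata first.

```python
import json


def build_query(dataset_id, bbox, start, end, product_type):
    """Assemble a request payload with the same keys as the portal export."""
    return {
        "datasetId": dataset_id,
        "boundingBoxValues": [{"name": "bbox", "bbox": list(bbox)}],
        "dateRangeSelectValues": [
            {"name": "temporal_interval", "start": start, "end": end}
        ],
        "stringChoiceValues": [{"name": "productType", "value": product_type}],
    }


query = build_query(
    "EO:EEA:DAT:CLMS_HRVPP_ST",
    bbox=(5.14, 52.17, 5.86, 52.48),  # assumed order: west, south, east, north (EPSG:4326)
    start="2021-05-01T00:00:00.000Z",
    end="2021-05-11T00:00:00.000Z",
    product_type="PPI",
)
print(json.dumps(query, indent=2))
```

A dict built this way could be passed to `c.search(...)` in place of one loaded from a JSON file.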

#### Load the API payload JSON file

In [10]:
with open('hda_st_example.json') as f:
    st_query = json.load(f)

In [11]:
print(json.dumps(st_query,indent=2))

{
  "datasetId": "EO:EEA:DAT:CLMS_HRVPP_ST",
  "boundingBoxValues": [
    {
      "name": "bbox",
      "bbox": [
        5.139675926686226,
        52.17217026352144,
        5.8635591352846435,
        52.48021777571928
      ]
    }
  ],
  "dateRangeSelectValues": [
    {
      "name": "temporal_interval",
      "start": "2021-05-01T00:00:00.000Z",
      "end": "2021-05-11T00:00:00.000Z"
    }
  ],
  "stringChoiceValues": [
    {
      "name": "productType",
      "value": "PPI"
    }
  ]
}


#### Search for products matching the data request (query) parameters

In [12]:
matches = c.search(st_query)
print(matches)

SearchResults[items=4,volume=720.4M,jobId=2Tp4Zi3hwb6jhH29FamOKIZSk-c]


#### Download the matching product files

In [13]:
matches.download()

                                                   

## Step 5: repeat the above steps to download other relevant datasets

Here's an example for the Agrometeorological indicators from 1979 to present derived from reanalysis (AgERA5) temperature data.  

The same approach can be followed to download other Copernicus Land Monitoring Service (CLMS) datasets, such as CORINE Land Cover.
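
When several payload files have been collected from the portal, loading them can be wrapped in a small helper. The following is only a sketch: `load_queries` is a hypothetical function, and the file names and dataset IDs in the demo are placeholders written to a temporary folder, not real WEkEO identifiers.

```python
import json
import os
import tempfile


def load_queries(folder):
    """Load every *.json payload in `folder`, keyed by file name.

    Files that do not contain a top-level "datasetId" are skipped.
    """
    queries = {}
    for file_name in sorted(os.listdir(folder)):
        if not file_name.endswith(".json"):
            continue
        with open(os.path.join(folder, file_name)) as f:
            payload = json.load(f)
        if isinstance(payload, dict) and "datasetId" in payload:
            queries[file_name] = payload
    return queries


# Demo with placeholder payloads in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    demo_files = [
        ("a_example.json", "EXAMPLE:DATASET:A"),
        ("b_example.json", "EXAMPLE:DATASET:B"),
    ]
    for file_name, dataset_id in demo_files:
        with open(os.path.join(tmp, file_name), "w") as f:
            json.dump({"datasetId": dataset_id}, f)
    queries = load_queries(tmp)

for file_name, payload in queries.items():
    print(file_name, "->", payload["datasetId"])
    # With a configured client `c`, each payload could then be used as in Step 4:
    # matches = c.search(payload); matches.download()
```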

In [14]:
# AgERA5 example
with open('hda_agera5_example.json') as f:
    agera5_query = json.load(f)

In [15]:
matches = c.search(agera5_query)
print(matches)

SearchResults[items=1,volume=64.2K,jobId=bzW3jVrRXX_vyyOoamUOwrQA3C8]


In [61]:
matches.download()

                                           