![](https://github.com/destination-earth/DestinE-DataLake-Lab/blob/main/img/DestinE-banner.jpg?raw=true)

**Author**: EUMETSAT <br>
**Copyright**: 2024 EUMETSAT <br>
**Licence**: MIT <br>

# DEDL - Hook Tutorial

This notebook demonstrates how to use the Hook service.

The detailed API reference, including the definition of each endpoint and its parameters, is available in the **OnDemand Processing API OData v1** OpenAPI documentation at:
https://odp.data.destination-earth.eu/odata/docs

Further documentation is available at:
https://destine-data-lake-docs.data.destination-earth.eu/en/latest/dedl-big-data-processing-services/Hook-service/Hook-service.html

## Install Python package requirements and import environment variables

In [1]:
# Note: the destinelab Python package (which helps with authentication) is already available if you are using the Python DEDL kernel.
# Otherwise, it can be installed by uncommenting the following line:
# %pip install destinelab
# python-dotenv provides load_dotenv(...), which is used below to import environment variables
%pip install python-dotenv


Note: you may need to restart the kernel to use updated packages.


In [2]:
import os
import json
import requests
from dotenv import load_dotenv
from getpass import getpass
import destinelab

# load environment variables from root .env file.
load_dotenv()

# Load other specific .env file
load_dotenv('./.env_tutorial')



True

## Authentication - Get token

In [3]:
# Enter DESP credentials.
# .env file (ignored by git) at root of workspace can be used to set common environment variables, e.g. credentials. Note: if values change you may need to restart your kernel.
# Otherwise, the user sets the variables manually here
DESP_USERNAME = os.getenv('DESP_USERNAME') or input("Please input your DESP username or email: ")
DESP_PASSWORD = os.getenv('DESP_PASSWORD') or getpass("Please input your DESP password: ")

token = destinelab.AuthHandler(DESP_USERNAME, DESP_PASSWORD)          
access_token = token.get_token()
 
# Check the status of the request
if access_token is not None:
    print("DEDL/DESP Access Token Obtained Successfully")
    #Save API headers
    api_headers = {'Authorization': 'Bearer ' + access_token}
else:
    print("Failed to Obtain DEDL/DESP Access Token") 

Response code: 200
DEDL/DESP Access Token Obtained Successfully


## Setup static variables

In [4]:
# Hook service url (ending with odata/v1/ - e.g. https://odp.data.destination-earth.eu/odata/v1/)
hook_service_root_url = "https://odp.data.destination-earth.eu/odata/v1/"


## List available workflows

Next, we can check which workflows are available to us by querying the endpoint  
```https://odp.data.destination-earth.eu/odata/v1/Workflows```

In [5]:
# Send the request and return a JSON object listing all provided workflows, ordered by Id
result = requests.get(hook_service_root_url + "Workflows?$orderby=Id asc", headers=api_headers).json()

# Print the name and display name of each provided workflow
print("List of available DEDL provided Hooks")
for wf in result['value']:
    print(f"Name:{str(wf['Name']).ljust(20, ' ')}DisplayName:{wf['DisplayName']}")

List of available DEDL provided Hooks
Name:card_bs             DisplayName:Sentinel-1: Terrain-corrected backscatter
Name:lai                 DisplayName:Sentinel-2: SNAP-Biophysical
Name:odp-test            DisplayName:ODP Test
Name:card_cohinf         DisplayName:Sentinel-1 Coherence/Interferometry
Name:c2rcc               DisplayName:Sentinel-2: C2RCC
Name:copdem              DisplayName:Copernicus DEM Mosaic
Name:dedl_hello_world    DisplayName:DEDL Hello World
Name:data-harvest        DisplayName:Data harvest
Name:sen2cor             DisplayName:Sentinel-2: Sen2Cor
Name:maja                DisplayName:Sentinel-2: MAJA Atmospheric Correction


In [6]:
# Print result JSON object: containing provided workflow list
workflow_details = json.dumps(result, indent=4)
print(workflow_details)

{
    "@odata.context": "$metadata#Workflows/$entity",
    "value": [
        {
            "Id": "1",
            "Uuid": null,
            "Name": "card_bs",
            "DisplayName": "Sentinel-1: Terrain-corrected backscatter",
            "Documentation": "",
            "Description": "Sentinel-1 CARD BS (Copernicus Analysis Ready Data Backscatter) processor generates terrain-corrected geocoded Sentinel-1 Level 2 backscattering by removing the radiometric effect imposed by relief (provided by DEM). This allows comparability of images, e.g. for analysis of changes in land cover. This processor provided by the Joint Research Centre is based on a GPT graph that can be run with ESA SNAP.",
            "InputProductType": "GRD",
            "InputProductTypes": [
                "GRD",
                "IW_GRDH_1S",
                "IW_GRDM_1S",
                "EW_GRDH_1S",
                "EW_GRDM_1S",
                "WV_GRDM_1S",
                "GRD-COG",
                "IW_GRDH_

## Select a workflow and see parameters

If we want to see the details of a specific workflow, including the parameters that can be set for it, we can add a filter to the query as follows:

```https://odp.data.destination-earth.eu/odata/v1/Workflows?$expand=WorkflowOptions&$filter=(Name eq 'data-harvest')```   

**\\$expand=WorkflowOptions** shows all parameters accepted by the workflow   
**\\$filter=(Name eq 'data-harvest')** narrows the result to the workflow called "data-harvest"
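
As a minimal sketch of how such a query string can be assembled, the helper below builds the filtered URL from a workflow name. The function name `workflows_query_url` is hypothetical and not part of any DEDL library; it only relies on the OData convention that string literals are single-quoted and that an embedded quote is doubled.

```python
from urllib.parse import quote


def workflows_query_url(root_url, workflow_name, expand="WorkflowOptions"):
    """Build a Workflows query URL filtered by workflow name (hypothetical helper)."""
    # OData string literals are single-quoted; a quote inside the value is doubled.
    literal = "'" + workflow_name.replace("'", "''") + "'"
    filter_expr = f"(Name eq {literal})"
    # Percent-encode spaces and quotes, but keep the parentheses readable.
    encoded_filter = quote(filter_expr, safe="()")
    return f"{root_url}Workflows?$expand={expand}&$filter={encoded_filter}"


print(workflows_query_url("https://odp.data.destination-earth.eu/odata/v1/", "data-harvest"))
```

In the notebook, the returned URL could be passed to `requests.get(..., headers=api_headers)` in place of the hand-written f-string.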

In [7]:
# Select workflow
workflow = "data-harvest" # Here we set the data-harvest workflow

# Send request
result = requests.get(f"{hook_service_root_url}Workflows?$expand=WorkflowOptions&$filter=(Name eq '{workflow}')",headers=api_headers).json()
workflow_details = json.dumps(result, indent=4)
print(workflow_details) # print formatted workflow_details, a JSON string 



{
    "@odata.context": "$metadata#Workflows/$entity",
    "value": [
        {
            "Id": "11",
            "Uuid": null,
            "Name": "data-harvest",
            "DisplayName": "Data harvest",
            "Documentation": null,
            "Description": "Data-harvest is a workflow allows to download data from external sources. It requires URL to the external catalogue, credentials and data to download. The workflow is mainly used to download data from HDA (https://hda.data.destination-earth.eu/) using STAC.",
            "InputProductType": null,
            "InputProductTypes": [],
            "OutputProductType": null,
            "OutputProductTypes": [],
            "WorkflowVersion": "0.0.1",
            "WorkflowOptions": [
                {
                    "Name": "output_storage",
                    "Description": "Output storage type, with viable values being: 'PRIVATE', 'TEMPORARY'. If equal to 'PRIVATE' then all 'output_s3_*' parameters are required.",

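The full JSON is long, so it can help to reduce it to a list of option names and descriptions. The sketch below assumes only the fields visible in the response above (`value`, `Name`, `WorkflowOptions`, `Description`); the helper name `summarise_options` and the trimmed `sample` dictionary are illustrative, not part of the service.

```python
def summarise_options(workflows_response, workflow_name):
    """Return (option name, description) pairs for one workflow (hypothetical helper)."""
    for wf in workflows_response.get("value", []):
        if wf.get("Name") == workflow_name:
            return [
                (opt.get("Name"), opt.get("Description", ""))
                for opt in wf.get("WorkflowOptions", [])
            ]
    # Workflow not present in the response
    return []


# Trimmed sample shaped like the response above (description shortened for illustration)
sample = {
    "value": [
        {
            "Name": "data-harvest",
            "WorkflowOptions": [
                {"Name": "output_storage", "Description": "Output storage type"}
            ],
        }
    ]
}

for name, description in summarise_options(sample, "data-harvest"):
    print(f"{name}: {description}")
```

With the live response, `summarise_options(result, workflow)` would list every parameter the workflow accepts, including the `output_s3_*` ones mentioned in the `output_storage` description.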

## Order provided workflow: Data-Harvest

### Select workflow
We select the workflow `data-harvest`:

- It places an order to 'harvest data' using the Harmonised Data Access (HDA) API.
- That is, data from an input source can be transferred to a private bucket or to a temporary storage bucket.

In [8]:
#Select workflow
workflow = "data-harvest"

### Name your order

In [9]:
# ID of the run. This can be used to easily identify the running process, and to filter orders further on.
order_name = os.getenv('ORDER_NAME') or input("Name your order: ")
print(f"order_name:{order_name}")


order_name:jess11


### Define output storage

Among other workflow parameters, the storage from which the result will be retrieved has to be provided.  
**Two possibilities:**
1. Use your user storage
2. Use a temporary storage

#### 1 - Your user storage (provided by DEDL ISLET service)

Example using an S3 bucket created with the ISLET Storage service; the result will be available in this bucket
> *workflow parameter: {"Name": "output_storage", "Value": "PRIVATE"}*

In [10]:
# Output storage - Islet service
# URL of the S3 endpoint in the Central Site 
output_storage_url = "https://s3.central.data.destination-earth.eu"
# name of the object storage bucket where the results will be stored
output_bucket = "your-bucket-name"
# Islet object storage credentials (openstack ec2 credentials)
output_storage_access_key = "your-access-key"
output_storage_secret_key = "your-secret-key"
output_prefix = "dedl_" + order_name

#### 2 - Use temporary storage

The result of processing will be stored in shared storage, and a download link is provided in the output product details
> *workflow parameter: {"Name": "output_storage", "Value": "TEMPORARY"}*

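The choice between the two storage types can be sketched as a small helper that returns the corresponding `WorkflowOptions` entries. This is an illustration under stated assumptions: `storage_options` is a hypothetical name, and the exact `output_s3_*` option names are deliberately not guessed here; the caller passes them in, taking the names from the workflow's `WorkflowOptions` listing.

```python
VALID_OUTPUT_STORAGE = ("PRIVATE", "TEMPORARY")


def storage_options(storage_type, private_options=None):
    """Build the storage-related WorkflowOptions entries (hypothetical helper).

    private_options maps option names to values; for PRIVATE storage it should
    hold the workflow's 'output_s3_*' options, whose names are not assumed here.
    """
    storage_type = storage_type.upper()
    if storage_type not in VALID_OUTPUT_STORAGE:
        raise ValueError(f"output_storage must be one of {VALID_OUTPUT_STORAGE}")
    options = [{"Name": "output_storage", "Value": storage_type}]
    if storage_type == "PRIVATE":
        if not private_options:
            raise ValueError("PRIVATE storage requires the workflow's 'output_s3_*' options")
        options += [{"Name": name, "Value": value} for name, value in private_options.items()]
    return options


print(storage_options("TEMPORARY"))
```

The returned list can be concatenated with the source-related options when the order body is built.
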
### Define parameters and send order

In [11]:
# The data has already been discovered and searched for
STAC_HDA_API_URL = "https://hda.data.destination-earth.eu/stac"



In [12]:
# Set the collection where the item can be found
COLLECTION_ID = "EO.ESA.DAT.SENTINEL-2.MSI.L1C"
print(STAC_HDA_API_URL + "/collections/" + COLLECTION_ID)

# Data to retrieve
data_id = "S2A_MSIL1C_20230910T050701_N0509_R019_T47VLH_20230910T074321.SAFE"

# Build your order body: using the DESP source type for simplified configuration
order_body_custom_bucket = {
    "Name": "Tutorial " + workflow + " - " + order_name,
    "WorkflowName": workflow,
    "IdentifierList": [data_id],
    "WorkflowOptions": [
        {"Name": "output_storage", "Value": "TEMPORARY"},
        {"Name": "source_type", "Value": "DESP"},
        {"Name": "desp_source_username", "Value": DESP_USERNAME},
        {"Name": "desp_source_password", "Value": DESP_PASSWORD},
        {"Name": "desp_source_collection", "Value": COLLECTION_ID}
    ]
}

# Send the order
order_response = requests.post(hook_service_root_url + "BatchOrder/OData.CSC.Order",
                               data=json.dumps(order_body_custom_bucket), headers=api_headers)

# If order_response.status_code is 201, the order has been successfully sent
order_request = order_response.json()

# Print the order_request JSON object, which contains the order details
order_request_details = json.dumps(order_request, indent=4)
print(order_request_details)

https://hda.data.destination-earth.eu/stac/collections/EO.ESA.DAT.SENTINEL-2.MSI.L1C
{
    "@odata.context": "#metadata/Odata.CSC.BatchOrder",
    "value": {
        "Name": "Tutorial data-harvest - jess11",
        "Priority": 1,
        "WorkflowName": "data-harvest",
        "NotificationEndpoint": null,
        "NotificationEpUsername": null,
        "NotificationStatus": null,
        "WorkflowOptions": [
            {
                "Name": "platform",
                "Value": "dedl"
            },
            {
                "Name": "version",
                "Value": "0.0.1"
            },
            {
                "Name": "output_storage",
                "Value": "TEMPORARY"
            },
            {
                "Name": "source_type",
                "Value": "DESP"
            },
            {
                "Name": "desp_source_username",
                "Value": "eum-dedl-user"
            },
            {
                "Name": "desp_source_password",
    

It is possible to order multiple products using the endpoint:
```https://odp.data.destination-earth.eu/odata/v1/BatchOrder/OData.CSC.Order```   
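
A minimal sketch of an order body covering several products is shown below. It reuses the field names from the order above (`Name`, `WorkflowName`, `IdentifierList`, `WorkflowOptions`); the helper name `build_batch_order` is hypothetical, and passing more than one identifier in `IdentifierList` is an assumption based on the field being a list.

```python
import json


def build_batch_order(order_name, workflow, identifiers, options):
    """Assemble a BatchOrder body for one or more product identifiers (hypothetical helper)."""
    identifiers = list(identifiers)
    if not identifiers:
        raise ValueError("at least one product identifier is required")
    return {
        "Name": f"Tutorial {workflow} - {order_name}",
        "WorkflowName": workflow,
        "IdentifierList": identifiers,
        # options is a plain dict; it is converted to the Name/Value pairs used by the API
        "WorkflowOptions": [{"Name": name, "Value": value} for name, value in options.items()],
    }


body = build_batch_order(
    "demo",
    "data-harvest",
    # The second identifier is a placeholder for illustration only
    ["S2A_MSIL1C_20230910T050701_N0509_R019_T47VLH_20230910T074321.SAFE", "ANOTHER_PRODUCT.SAFE"],
    {"output_storage": "TEMPORARY", "source_type": "DESP"},
)
print(json.dumps(body, indent=4))
```

The resulting dictionary can be serialised with `json.dumps` and posted to the endpoint exactly as in the single-product order.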

## Check the status of the order

Possible statuses:
- queued (i.e. queued for treatment but not started)
- in_progress (i.e. order being treated)
- completed (i.e. order is complete and data ready)
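
Since an order moves through these statuses over time, the check is usually repeated until the order leaves the waiting states. The sketch below shows one way to poll; `wait_for_completion`, its parameters and the default intervals are assumptions made for illustration. It takes any callable that returns the current status string, so it can be tried without contacting the service, and it hands back any status other than `queued` or `in_progress` to the caller, since the service may report values not listed here.

```python
import time


def wait_for_completion(fetch_status, interval_s=30.0, timeout_s=1800.0, sleep=time.sleep):
    """Poll fetch_status() until the order is no longer queued or in progress."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status not in ("queued", "in_progress"):
            # 'completed', or any status this tutorial does not list
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"order still '{status}' after {timeout_s} s")
        sleep(interval_s)


# Dry run with a fake status source instead of a real request
fake_statuses = iter(["queued", "in_progress", "completed"])
print(wait_for_completion(lambda: next(fake_statuses), sleep=lambda seconds: None))
```

In the notebook, `fetch_status` could wrap the `ProductionOrders` request used in the next cell and return the `Status` field of the order of interest.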

In [13]:
requests_status = requests.get(hook_service_root_url + "ProductionOrders?$orderby=Id desc&$filter=(endswith(Name,'" + order_name + "'))", headers=api_headers).json()


for order in requests_status['value']:
    print(f"\norder id: {order['Id']}")
    print(f"Status: {order['Status']}")

requests_status  # display the full response


order id: 25380
Status: queued

order id: 25379
Status: completed

order id: 25377
Status: completed

order id: 25376
Status: completed


{'@odata.context': '$metadata#ProductionOrder/$entity',
 'value': [{'Id': '25380',
   'Status': 'queued',
   'StatusMessage': 'request is queued for processing',
   'SubmissionDate': '2024-09-02T16:17:22.942Z',
   'Name': 'Tutorial data-harvest - jess11',
   'EstimatedDate': '2024-09-02T16:22:51.958Z',
   'InputProductReference': {'Reference': 'S2A_MSIL1C_20230910T050701_N0509_R019_T47VLH_20230910T074321.SAFE',
    'ContentDate': None},
   'WorkflowOptions': [{'Name': 'platform', 'Value': 'dedl'},
    {'Name': 'version', 'Value': '0.0.1'},
    {'Name': 'output_storage', 'Value': 'TEMPORARY'},
    {'Name': 'source_type', 'Value': 'DESP'},
    {'Name': 'desp_source_username', 'Value': 'eum-dedl-user'},
    {'Name': 'desp_source_collection',
     'Value': 'EO.ESA.DAT.SENTINEL-2.MSI.L1C'}],
   'WorkflowName': 'data-harvest',
   'WorkflowId': 11,
   'Priority': 1,
   'NotificationEndpoint': None,
   'NotificationEpUsername': None,
   'NotificationStatus': None},
  {'Id': '25379',
   'Status

## Access workflow output

### Private storage
Let us now check our private storage using the boto3 script below (commented out by default).
You can also check this in the Islet service using the Horizon user interface.

In [14]:
# import boto3

# s3 = boto3.client('s3',aws_access_key_id=output_storage_access_key, aws_secret_access_key=output_storage_secret_key, endpoint_url=output_storage_url,)

# paginator = s3.get_paginator('list_objects_v2')
# pages = paginator.paginate(Bucket=output_bucket, Prefix=output_prefix + '/')

# for page in pages:
#     try:
#         for obj in page['Contents']:
#             print(obj['Key'])
#     except KeyError:
#         print("No files exist")
#         exit(1)

### Temporary storage

In [21]:
# List order items within a production order

# Check the status again using order_name. This can return multiple order ids (if the hook was executed multiple times with the same order_name)
requests_status = requests.get(hook_service_root_url + "ProductionOrders?$orderby=Id desc&$filter=(endswith(Name,'" + order_name + "'))", headers=api_headers).json()

# Iterate over the returned orders
for order in requests_status['value']:
    # order_id = input("Order id: ")
    order_id = order['Id']
    status = order['Status']
    print(f"\norder id: {order_id}")
    print(f"Status: {status}")

    if status == 'completed':
        # List the products of the completed order
        response = requests.get(f"{hook_service_root_url}BatchOrder({order_id})/Products", headers=api_headers).json()
        print(json.dumps(response, indent=4))
    else:
        print(f"Status for order:{order_id} is not 'completed'. status:{status}")


order id: 25380
Status: queued
Status for order:25380 is not 'completed'. status:queued

order id: 25379
Status: completed
{
    "@odata.context": "#metadata/OData.CSC.BatchorderItem",
    "value": [
        {
            "Id": 34079,
            "BatchOrderId": 25379,
            "InputProductReference": "S2A_MSIL1C_20230910T050701_N0509_R019_T47VLH_20230910T074321.SAFE",
            "SubmissionDate": "2024-09-02T16:05:42.497Z",
            "Status": "completed",
            "ProcessedName": "S2A_MSIL1C_20230910T050701_N0509_R019_T47VLH_20230910T074321.SAFE",
            "ProcessedSize": 779499257,
            "OutputUUID": null,
            "StatusMessage": "Processing finished successfully",
            "CompletedDate": "2024-09-02T16:10:33.081Z",
            "DownloadLink": "https://s3.central.data.destination-earth.eu/swift/v1/tmp-storage/20240902_34079_v3wQnhFs.zip?temp_url_sig=54f64fe8e64c65b819eb23931e6b3084e0eec2c5&temp_url_expires=1726503018",
            "NotificationStatus

In [16]:
# Download output product

# Retrieve the item id from the previous items request and replace YYYY below with the item ID of interest (from the list)

# result is stored in output.zip and number of transferred bytes is printed
#url = 'https://odp.data.destination-earth.eu/odata/v1/BatchOrder('+order_id+')/Product(YYYY)/$value'
#r = requests.get(url, headers=api_headers, allow_redirects=True)

#open('output.zip', 'wb').write(r.content)
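
For temporary storage, the `Products` response already carries a `DownloadLink` per item. The sketch below pulls those links out of a response and reads the `temp_url_expires` query parameter seen in the link above. The helper names are hypothetical, and interpreting `temp_url_expires` as a Unix timestamp in seconds is an assumption based on how such temporary URLs are commonly formed.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlparse


def completed_downloads(products_response):
    """Return (processed name, download link) pairs for completed items."""
    return [
        (item.get("ProcessedName"), item["DownloadLink"])
        for item in products_response.get("value", [])
        if item.get("Status") == "completed" and item.get("DownloadLink")
    ]


def link_expiry(download_link):
    """Return the link's expiry as a UTC datetime, or None if it has no expiry parameter."""
    values = parse_qs(urlparse(download_link).query).get("temp_url_expires")
    if not values:
        return None
    # Assumption: the value is a Unix timestamp in seconds
    return datetime.fromtimestamp(int(values[0]), tz=timezone.utc)


# Sample item copied from the output above (only the fields used here)
sample_products = {
    "value": [
        {
            "Status": "completed",
            "ProcessedName": "S2A_MSIL1C_20230910T050701_N0509_R019_T47VLH_20230910T074321.SAFE",
            "DownloadLink": "https://s3.central.data.destination-earth.eu/swift/v1/tmp-storage/20240902_34079_v3wQnhFs.zip?temp_url_sig=54f64fe8e64c65b819eb23931e6b3084e0eec2c5&temp_url_expires=1726503018",
        }
    ]
}

for name, link in completed_downloads(sample_products):
    print(name, "expires:", link_expiry(link))
```

With a live response, `completed_downloads(response)` gives the links to fetch, and checking the expiry first avoids requesting a link that is no longer valid.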