HDA - Earthkit integration
==========================

This notebook outlines how to chain the WEkEO HDA client, EOCanvas and earthkit into a single integrated workflow.

In [None]:
from eocanvas.api import Input, Config, ConfigOption
from eocanvas.datatailor.chain import Chain
from eocanvas.processes import DataTailorProcess
from hda import Client
import earthkit.data


Let's walk through a Data Tailor example.
The URL of the input product can be retrieved with the HDA Python client.

In [None]:
c = Client()

q = {
  "dataset_id": "EO:EUM:DAT:SENTINEL-3:OL_2_WFR___",
  "dtstart": "2024-07-05T09:28:00.000Z",
  "dtend": "2024-07-05T09:30:00.000Z",
  "timeliness": "NT"
}

r = c.search(q)
url = r.get_download_urls()[0]
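
Indexing the list of download URLs with [0] raises a bare IndexError when the search matches no products. A minimal sketch of a more explicit guard follows; the helper name first_download_url is hypothetical and is not part of the HDA client:

```python
def first_download_url(urls):
    """Return the first download URL, or fail with a readable message.

    `urls` is any iterable of URL strings, for example the list returned
    by the search results in the cell above. This helper is illustrative
    only and does not belong to the hda package.
    """
    urls = list(urls)
    if not urls:
        raise ValueError(
            "The search returned no products; check the dataset_id "
            "and the dtstart/dtend window."
        )
    return urls[0]
```

In the notebook this could be used as url = first_download_url(r.get_download_urls()), assuming r is the search result obtained above.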

We can load a Data Tailor chain from a YAML file.
The inputs are configured using the results from the HDA request:

In [None]:
chain = Chain.from_file("olci_resample.yaml")
inputs = Input(key="img1", url=url)

process = DataTailorProcess(epct_chain=chain, epct_input=inputs)

Once the process has been configured, we can submit it.
Instead of simply running it, we first submit it to obtain a reference to the job (we'll need it later) and then run it with the "download" flag set to False, so that the results are not downloaded at this stage.

In [None]:
job = process.submit()
process.run(job, download=False)

The job is now complete, and we can retrieve the download URL of its first result:

In [None]:
url = job.results[0].full_url

Finally, we use earthkit-data to load the product using a "url" type of source.
This is why we needed the reference to the EOCanvas job: its authorization header is passed to earthkit so that it is allowed to download the data.

In [None]:
data = earthkit.data.from_source("url", url, http_headers=job.api.auth.header)
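
To make the role of the headers concrete, here is a rough standard-library sketch of what attaching an authorization header to a download request amounts to. It does not reproduce how earthkit-data performs the download internally; the URL, the token value and the assumption that the header is a dictionary with an "Authorization" key using a Bearer scheme are placeholders for illustration only:

```python
from urllib.request import Request


def build_authenticated_request(url, headers):
    """Build (but do not send) a GET request carrying the given headers."""
    return Request(url, headers=dict(headers))


# Placeholder values: not a real endpoint and not real credentials.
example_url = "https://example.invalid/results/output.nc"
example_headers = {"Authorization": "Bearer <token>"}

req = build_authenticated_request(example_url, example_headers)

# The header travels with the request, so the server can authorize it.
print(req.get_header("Authorization"))
```

Without such a header, a server that protects job results would be expected to reject the request, which is why the plain URL alone is not enough here.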

And now the data are locally available and can be manipulated as needed:

In [None]:
data.to_xarray()