## General flow

Generating a WorldCereal product from an existing model comes down to running the corresponding user-defined process (UDP) on the CDSE openEO federation.
There are multiple ways to do this, depending on the user's needs and preferences, so the [online documentation](https://documentation.dataspace.copernicus.eu/Applications/PlazaDetails/ExecuteService.html) is the best starting point. We nevertheless describe a few approaches here to illustrate the possibilities.

Note that all approaches require that the user has either requested access to the service via the ESA Network of Resources or uses the public service credits
that are available free of charge on CDSE.

The WorldCereal VDM will allow users to trigger processing via a web UI. This is a purpose-built interface for WorldCereal that helps the user set the correct
parameters. The ESA APEx portal is expected to offer a similar capability, but based on a generic user interface. The generated results can be downloaded as GeoTIFF files for visualization in QGIS.

When the user wants to generate maps for larger areas, there is the option to use the APEx 'upscaling service', which is built for this purpose. Alternatively, the user
can resort to Python-based tools such as GFMap to run a script that generates the map. In both cases, the actual execution again happens on CDSE openEO.
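To make the idea of scripted large-area processing more concrete, the sketch below splits a bounding box into smaller tiles that could each be submitted as a separate job. It is only an illustration in plain Python: the function name `split_bbox`, the tile size, and the example coordinates are assumptions, and it does not reflect how GFMap or the APEx upscaling service actually partition an area.

```python
import math
from typing import Dict, List


def split_bbox(west: float, south: float, east: float, north: float,
               tile_size: float) -> List[Dict[str, float]]:
    """Split a bounding box into tiles of at most `tile_size` degrees per side."""
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    if east <= west or north <= south:
        raise ValueError("bounding box must have positive width and height")

    n_cols = math.ceil((east - west) / tile_size)
    n_rows = math.ceil((north - south) / tile_size)

    tiles = []
    for row in range(n_rows):
        for col in range(n_cols):
            tile_west = west + col * tile_size
            tile_south = south + row * tile_size
            tiles.append({
                "west": tile_west,
                "south": tile_south,
                # Clip the last row/column to the requested extent.
                "east": min(tile_west + tile_size, east),
                "north": min(tile_south + tile_size, north),
            })
    return tiles


# Hypothetical area of interest: 1.0 x 0.5 degrees, split into 0.5-degree tiles.
tiles = split_bbox(west=4.0, south=51.0, east=5.0, north=51.5, tile_size=0.5)
for tile in tiles:
    print(tile)
```

Each resulting dictionary has the shape of an openEO spatial extent, so a script could loop over the tiles and start one processing job per tile.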

Finally, a Python notebook that is part of the WorldCereal toolbox will show how to generate results.
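Whatever the entry point, running a UDP ultimately means submitting an openEO process graph that references the process by its identifier and namespace. The sketch below builds such a graph as a plain dictionary. The process id, the namespace URL, and the parameter names `spatial_extent` and `temporal_extent` are placeholders chosen for illustration; the actual values should be taken from the service documentation.

```python
import json
from typing import Any, Dict, List

# Hypothetical placeholders, not the real WorldCereal identifiers.
UDP_ID = "worldcereal_udp_placeholder"
UDP_NAMESPACE = "https://example.com/worldcereal_udp_placeholder.json"


def build_udp_process_graph(spatial_extent: Dict[str, float],
                            temporal_extent: List[str]) -> Dict[str, Any]:
    """Build a process graph with a single node that invokes the UDP."""
    return {
        "process_graph": {
            "worldcereal1": {
                "process_id": UDP_ID,
                "namespace": UDP_NAMESPACE,
                # Parameter names are assumptions for this sketch.
                "arguments": {
                    "spatial_extent": spatial_extent,
                    "temporal_extent": temporal_extent,
                },
                "result": True,
            }
        }
    }


process = build_udp_process_graph(
    spatial_extent={"west": 4.0, "south": 51.0, "east": 4.5, "north": 51.5},
    temporal_extent=["2021-01-01", "2021-12-31"],
)
print(json.dumps(process, indent=2))
```

With the openEO Python client, a node of this kind is what `connection.datacube_from_process(process_id, namespace=..., ...)` constructs, so the dictionary mainly serves to show what is sent to the backend.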

## On premise execution

Users who wish to generate results on their own infrastructure will need a local openEO deployment.

To achieve this, the CDSE openEO backend software is also made available as a Docker image, which makes it possible to start an openEO application on a single node. This assumes that the node has sufficient CPU and memory resources. Based on current experience, a node with 32 cores and 256 GB of memory is sufficient to run the workflow,
which should match commonly available server-class hardware. The development of this Docker image and its supporting documentation is expected to be carried out in the ESA EOEPCA project.
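As a quick sanity check before deploying, a candidate node can be compared against the figures quoted above. The sketch below uses only the Python standard library; the helper names and the 5% slack (to allow for memory that the operating system does not report as available) are assumptions of this example, not project requirements.

```python
import os
from typing import Optional

# Figures quoted in the text above.
MIN_CORES = 32
MIN_MEMORY_GB = 256.0


def detect_memory_gb() -> Optional[float]:
    """Return physical memory in GiB, or None where sysconf cannot report it."""
    try:
        page_size = os.sysconf("SC_PAGE_SIZE")
        page_count = os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        return None
    if page_size <= 0 or page_count <= 0:
        return None
    return page_size * page_count / 1024 ** 3


def node_is_sufficient(cores: Optional[int], memory_gb: Optional[float],
                       slack: float = 0.05) -> bool:
    """Check a node against the minimum figures, with some slack on memory."""
    if cores is None or memory_gb is None:
        return False
    return cores >= MIN_CORES and memory_gb >= MIN_MEMORY_GB * (1.0 - slack)


cores = os.cpu_count()
memory_gb = detect_memory_gb()
print(f"cores={cores}, memory_gb={memory_gb}, "
      f"sufficient={node_is_sufficient(cores, memory_gb)}")
```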

For data access, the easiest option is to provide a configuration that enables remote access to the Copernicus Data Space Ecosystem catalog and object storage. Alternatively, users can set up local STAC catalogs that mirror the relevant parts of the Copernicus Data Space Ecosystem catalog. The local openEO deployment can then reference the local
catalog and datasets. For mirroring data, we refer to the [EODAG](https://eodag.readthedocs.io/) tool as one option, which is recommended by the ESA EOEPCA project. Note, however, that efficient and production-ready data mirroring is beyond the scope of WorldCereal and is therefore the full responsibility of users who wish to run processing locally.
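One small part of setting up a local mirror is making the mirrored STAC items point to the local copies of the data. The sketch below rewrites asset links of a STAC item from a remote prefix to a local one. The item contents, the prefixes, and the function name `localize_item` are made up for illustration, and downloading the data itself (for example with EODAG) is not covered.

```python
import copy
from typing import Any, Dict


def localize_item(item: Dict[str, Any], remote_prefix: str,
                  local_prefix: str) -> Dict[str, Any]:
    """Return a copy of a STAC item whose asset hrefs point to the local mirror."""
    local_item = copy.deepcopy(item)
    for asset in local_item.get("assets", {}).values():
        href = asset.get("href", "")
        if href.startswith(remote_prefix):
            asset["href"] = local_prefix + href[len(remote_prefix):]
    return local_item


# Hypothetical, heavily trimmed STAC item.
remote_item = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "example-scene",
    "properties": {"datetime": "2021-06-01T10:00:00Z"},
    "assets": {
        "B04": {"href": "s3://remote-bucket/example-scene/B04.tif"},
        "B08": {"href": "s3://remote-bucket/example-scene/B08.tif"},
        "thumbnail": {"href": "https://example.com/thumb.png"},
    },
}

local_item = localize_item(remote_item, "s3://remote-bucket/", "/data/mirror/")
for name, asset in local_item["assets"].items():
    print(name, asset["href"])
```

Assets outside the remote prefix are left untouched, and the original item is not modified, so the same function can be applied repeatedly while a mirror is being built up.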



## Production workflow

WorldCereal products are generated by openEO workflows. The workflow requires a trained CatBoost model, which is passed as a parameter because users may want to use
their own models.

The pseudocode below outlines the general steps of the inference pipeline.

In [1]:
#| label: fig-inference
#| echo: true
#| fig-cap: "The WorldCereal inference pipeline"


import openeo
from openeo import UDF
from openeo.processes import ProcessBuilder
from openeo.rest.mlmodel import MlModel

connection = openeo.connect("openeo.dataspace.copernicus.eu")

# Monthly mean composites of Sentinel-2 L2A.
l2A = connection.load_collection("SENTINEL2_L2A").aggregate_temporal_period(period="month", reducer="mean")  # <1>

# Sentinel-1 backscatter, resampled to 20 m, as monthly mean composites.
sentinel1 = connection.load_collection("SENTINEL1_GRD")
bs = (
    sentinel1.sar_backscatter(coefficient="sigma0-ellipsoid")
    .resample_spatial(resolution=20)
    .aggregate_temporal_period(period="month", reducer="mean")
)

# <2>

# Placeholder UDF: the real code computes Presto features from the monthly time series.
feature_udf = UDF(code="", runtime="Python")
features_cube = l2A.merge_cubes(bs).apply_dimension(dimension="t", process=feature_udf, target_dimension="bands")

# Load the trained CatBoost model from its STAC metadata (example URL).
model = MlModel.load_ml_model(connection=connection, id="http://myhost.com/my_catboost_stac_metadata.json")


def catboost_classifier(data, context):
    """Reducer that applies the CatBoost model passed in via `context`."""
    return ProcessBuilder.process("predict_catboost", data=data, model=context)


worldcereal_product = features_cube.reduce_dimension(dimension="bands", reducer=catboost_classifier, context=model)

worldcereal_product

1. Instead of aggregate_temporal_period, we will use more advanced compositing, such as max-NDVI.
2. We will also need to add the AgERA5 and DEM bands.
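To illustrate what max-NDVI compositing means, the sketch below selects, for a single pixel and a single month, the observation with the highest NDVI, computed as (NIR - red) / (NIR + red). It is a plain-Python illustration with made-up reflectance values and is not the compositing implementation used by WorldCereal.

```python
from typing import Dict, List, Optional

Observation = Dict[str, float]  # keys: "red" and "nir" reflectances


def ndvi(obs: Observation) -> Optional[float]:
    """Return the NDVI of one observation, or None if it is undefined."""
    denominator = obs["nir"] + obs["red"]
    if denominator == 0:
        return None
    return (obs["nir"] - obs["red"]) / denominator


def max_ndvi_composite(observations: List[Observation]) -> Optional[Observation]:
    """Pick the observation with the highest NDVI, or None if there is none."""
    best, best_ndvi = None, None
    for obs in observations:
        value = ndvi(obs)
        if value is not None and (best_ndvi is None or value > best_ndvi):
            best, best_ndvi = obs, value
    return best


# Made-up observations of one pixel within one month.
month = [
    {"red": 0.10, "nir": 0.30},  # partly hazy
    {"red": 0.05, "nir": 0.45},  # clear, dense vegetation
    {"red": 0.20, "nir": 0.22},  # cloud-contaminated
]
composite = max_ndvi_composite(month)
print(composite, ndvi(composite))
```

Compared to a monthly mean, this approach keeps the values of one actual observation, which tends to favour cloud-free acquisitions over vegetated surfaces.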

## Exporting results to workspace

The openEO backend can store generated products directly in a custom object storage location.
This step is optional, but it is convenient for avoiding unnecessary file copies.

In addition to storing the files, it is also important to update and store the STAC metadata.

In [2]:
#| label: fig-export
#| echo: true
#| fig-cap: "Workflow steps to export results to object storage"

# Write the product as GeoTIFF.
stac_metadata = worldcereal_product.save_result(format="GTiff")
# Update the STAC metadata of the result.
stac_metadata = connection.datacube_from_process("stac_update", data=stac_metadata)  # TODO: add custom metadata

# Export the result to the workspace; `merge` points to the target collection.
connection.datacube_from_process("export_workspace", data=stac_metadata, workspace="my_workspace", merge="pointer_to_worldcereal_collection")
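The custom metadata step is still open in the workflow above. As a rough idea of what it could involve, the sketch below adds extra properties to a STAC item represented as a dictionary. The property names with the `worldcereal:` prefix, the item contents, and the function name are hypothetical and only serve as an example.

```python
import copy
from typing import Any, Dict

# Standard STAC datetime fields that this sketch refuses to overwrite.
RESERVED_PROPERTIES = {"datetime", "start_datetime", "end_datetime"}


def add_custom_properties(item: Dict[str, Any],
                          extra: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a STAC item with extra properties merged in."""
    conflicts = RESERVED_PROPERTIES & set(extra)
    if conflicts:
        raise ValueError(f"refusing to overwrite: {sorted(conflicts)}")
    updated = copy.deepcopy(item)
    updated.setdefault("properties", {}).update(extra)
    return updated


# Hypothetical, heavily trimmed STAC item for a generated product.
product_item = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "example-product",
    "properties": {"datetime": "2021-12-31T00:00:00Z"},
    "assets": {},
}

updated_item = add_custom_properties(
    product_item,
    {
        "worldcereal:product": "example-product-type",  # hypothetical key
        "worldcereal:model": "http://myhost.com/my_catboost_stac_metadata.json",
    },
)
print(updated_item["properties"])
```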