# Sentinel-2 Workflow of Workflows

## Goal 

Prepare the CWL Workflow orchestrating a sub-workflow.

This notebook is linked to: 
https://eoap.github.io/mastering-app-package/cwl-workflow/scatter-cloud-native/

The cloud-native Workflow chains the `crop`, `norm_diff`, `otsu` and `stac` steps and processes a single STAC Item. Its input parameters are:

- a SpatioTemporal Asset Catalog (STAC) Item
- a bounding box area of interest (AOI)
- the EPSG code of the bounding box area of interest
- a list of common band names (["`green`", "`nir`"])

CWL can run a sub-workflow as a step of another Workflow.

We want to process a list of STAC Items and then generate a single STAC catalog describing the water bodies detected in each of them.

## Setup

In [None]:
export WORKSPACE=/workspace/mastering-app-package
export RUNTIME=${WORKSPACE}/runs
mkdir -p ${RUNTIME}
cd ${RUNTIME}
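
The cells below call `cwltool` and `yq`, so it can help to confirm that both are on the `PATH` before going further. A minimal pre-flight sketch, assuming a POSIX shell; the `missing_tools` helper is a hypothetical name, not part of the tutorial:

```shell
# Hypothetical helper: print the name of every command that is NOT on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the two tools this notebook uses; nothing is printed when both exist.
for tool in $(missing_tools cwltool yq); do
    echo "missing: ${tool}"
done
```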

## CWL Workflow

The CWL document now has a `$graph` list with several CWL descriptions: two `Workflow` and four `CommandLineTool`:

In [None]:
cat ${WORKSPACE}/cwl-workflow/app-water-bodies-cloud-native.cwl | yq e .'$graph[].class' -

The `CommandLineTool` ids are those of the CommandLineTools created in the previous step:

In [None]:
cat ${WORKSPACE}/cwl-workflow/app-water-bodies-cloud-native.cwl | yq e .'$graph[2].id' -
cat ${WORKSPACE}/cwl-workflow/app-water-bodies-cloud-native.cwl | yq e .'$graph[3].id' -
cat ${WORKSPACE}/cwl-workflow/app-water-bodies-cloud-native.cwl | yq e .'$graph[4].id' -
cat ${WORKSPACE}/cwl-workflow/app-water-bodies-cloud-native.cwl | yq e .'$graph[5].id' -

The second Workflow, at `$graph[1]`, chains the `crop`, `norm_diff` and `otsu` steps:


In [None]:
cat ${WORKSPACE}/cwl-workflow/app-water-bodies-cloud-native.cwl | yq e .'$graph[1]' -

There is an additional Workflow at `$graph[0]`, which orchestrates the sub-workflow.

Its `requirements` list includes one more entry, `SubworkflowFeatureRequirement`.

This requirement allows a `Workflow` step to run a sub-workflow.

In [None]:
cat ${WORKSPACE}/cwl-workflow/app-water-bodies-cloud-native.cwl | yq e .'$graph[0]' -
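
To make the role of `SubworkflowFeatureRequirement` concrete, here is a minimal sketch of the pattern, written to a scratch file so it does not touch the tutorial's CWL. It is not the content of `app-water-bodies-cloud-native.cwl`: the ids `main`, `sub`, `echo_item`, `node_sub` and `node_echo` are hypothetical, and an `echo` tool stands in for the real processing.

```shell
# Write an illustrative CWL document to a temporary directory.
subwf_dir=$(mktemp -d)
subwf_sketch="${subwf_dir}/subworkflow-sketch.cwl"

cat > "${subwf_sketch}" <<'EOF'
cwlVersion: v1.0
$graph:
  # Outer Workflow: one of its steps runs another Workflow.
  - class: Workflow
    id: main
    requirements:
      SubworkflowFeatureRequirement: {}
    inputs:
      item: string
    outputs:
      result:
        type: File
        outputSource: node_sub/result
    steps:
      node_sub:
        run: "#sub"          # reference to the sub-workflow below
        in:
          item: item
        out: [result]

  # Sub-workflow: an ordinary Workflow, here with a single step.
  - class: Workflow
    id: sub
    inputs:
      item: string
    outputs:
      result:
        type: File
        outputSource: node_echo/result
    steps:
      node_echo:
        run: "#echo_item"
        in:
          item: item
        out: [result]

  # Placeholder tool standing in for the real processing steps.
  - class: CommandLineTool
    id: echo_item
    baseCommand: echo
    inputs:
      item:
        type: string
        inputBinding:
          position: 1
    outputs:
      result:
        type: stdout
    stdout: out.txt
EOF

# Show which descriptions the sketch contains.
grep 'class:' "${subwf_sketch}"
```

The only difference between `main` and a Workflow that runs tools is that `node_sub` points its `run` field at another `Workflow`, which is what the requirement permits.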

Let's look at the `inputs` element.

These are the Application Package inputs: 

- a list of SpatioTemporal Asset Catalog (STAC) Items: `stac_items` of type `string[]`
- a bounding box area of interest (AOI): `aoi`
- the EPSG code of the bounding box area of interest: `epsg`
- a list of common band names (["`green`", "`nir`"]): `bands`

In [None]:
cat ${WORKSPACE}/cwl-workflow/app-water-bodies-cloud-native.cwl | yq e .'$graph[0].inputs' -
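
cwltool can also read these inputs from a YAML job file passed after the workflow reference, instead of from command-line flags. A sketch of such a file, assuming the four input ids listed above and reusing values from the `cwltool` invocation in this notebook (with a single item for brevity); the file name `params.yml` and the scratch directory are arbitrary choices:

```shell
# Write a hypothetical job file to a scratch directory.
job_dir=$(mktemp -d)
job_file="${job_dir}/params.yml"

cat > "${job_file}" <<'EOF'
# Values copied from the cwltool invocation in this notebook;
# aoi and epsg are assumed to be plain strings.
stac_items:
  - "https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a-cogs/items/S2B_10TFK_20210713_0_L2A"
aoi: "-121.399,39.834,-120.74,40.472"
epsg: "EPSG:4326"
bands:
  - green
  - nir
EOF

# Display the job file that was written.
cat "${job_file}"
```

Such a file would be given as the last argument, after the `...cloud-native.cwl#water-bodies` reference, in place of the individual flags.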

This time, the `node_stac` step takes as input the water bodies detected by the sub-workflow:

In [None]:
cat ${WORKSPACE}/cwl-workflow/app-water-bodies-cloud-native.cwl | yq e .'$graph[0].steps' -

Let's look at the `outputs` element.

The output is a STAC catalog with the id `stac_catalog`; its source is the `node_stac` step.

`node_stac` is the last step of the outer `Workflow`, the one that invokes the sub-workflow detecting water bodies in a single STAC Item:


In [None]:
cat ${WORKSPACE}/cwl-workflow/app-water-bodies-cloud-native.cwl | yq e .'$graph[0].outputs' -

The first step, `node_water_bodies`, applies the fan-out pattern to the `stac_items` input, which is a list.

The `in` mapping maps the step inputs to the Workflow inputs.

In [None]:
cat ${WORKSPACE}/cwl-workflow/app-water-bodies-cloud-native.cwl | yq e .'$graph[0].steps["node_water_bodies"]' -
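
In CWL, fan-out over a list is declared with the `scatter` field on a step, together with `ScatterFeatureRequirement`. The sketch below shows that pattern in isolation. It is an assumption about the general shape rather than a copy of `node_water_bodies`, and the ids `items`, `node_each` and the inline `echo` tool are hypothetical.

```shell
# Write an illustrative scatter Workflow to a temporary directory.
scatter_dir=$(mktemp -d)
scatter_sketch="${scatter_dir}/scatter-sketch.cwl"

cat > "${scatter_sketch}" <<'EOF'
cwlVersion: v1.0
class: Workflow
requirements:
  ScatterFeatureRequirement: {}
inputs:
  items: string[]     # the list to fan out over
  aoi: string         # shared by every invocation
outputs:
  results:
    type: File[]      # one result per list element
    outputSource: node_each/result
steps:
  node_each:
    scatter: item     # run the step once per element of `items`
    in:
      item: items     # step input <- Workflow input (the list)
      aoi: aoi        # step input <- Workflow input (same for all runs)
    out: [result]
    run:
      class: CommandLineTool
      baseCommand: echo
      inputs:
        item:
          type: string
          inputBinding:
            position: 1
        aoi:
          type: string
          inputBinding:
            position: 2
      outputs:
        result:
          type: stdout
      stdout: out.txt
EOF

# Show the lines that implement the fan-out.
grep -n -i 'scatter' "${scatter_sketch}"
```

If `cwltool` is installed, `cwltool --validate "${scatter_sketch}"` can be used to check the sketch.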

## Running the Workflow

In [None]:
cwltool \
    --outdir ${WORKSPACE}/runs \
    ${WORKSPACE}/cwl-workflow/app-water-bodies-cloud-native.cwl#water-bodies \
    --stac_items "https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a-cogs/items/S2B_10TFK_20210713_0_L2A" \
    --stac_items "https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a-cogs/items/S2B_10TFK_20210713_0_L2A" \
    --aoi="-121.399,39.834,-120.74,40.472" \
    --epsg "EPSG:4326" \
    --bands green \
    --bands nir
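
As the command above shows, a list input such as `stac_items` or `bands` is passed by repeating its flag once per element. For a longer list, the flags can be generated rather than typed. A minimal sketch, assuming item URLs without spaces; the `stac_item_flags` helper and the `example.com` URLs are hypothetical placeholders:

```shell
# Hypothetical helper: print "--stac_items" and a URL, one token per line,
# for every STAC Item URL given as an argument.
stac_item_flags() {
    for url in "$@"; do
        printf '%s\n' --stac_items "${url}"
    done
}

# Dry run: echo the resulting command line instead of calling cwltool.
stac_item_flags \
    "https://example.com/items/item-a" \
    "https://example.com/items/item-b" |
    xargs echo cwltool
```

Replacing `echo cwltool` with the full `cwltool` invocation and its remaining flags would run the Workflow on every listed item.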