# Run the CWL Workflow with calrissian

## Goal

Run the released application package `app-water-bodies-cloud-native.1.0.0.cwl` using `calrissian`, a CWL runner for Kubernetes.

While `cwltool` creates a container for each command line tool, `calrissian` creates a Kubernetes pod for each command line tool of a Workflow processing step.

This notebook is linked to https://eoap.github.io/mastering-app-package/kubernetes/calrissian/

## Setup

In [None]:
export WORKSPACE=/workspace/mastering-app-package
export RUNTIME=${WORKSPACE}/runs
rm -fr ${WORKSPACE}/runs
mkdir -p ${RUNTIME}
cd ${RUNTIME}

## Download the released Application package

In [None]:
version="1.0.0"

wget \
    -O ${WORKSPACE}/runs/app-water-bodies-cloud-native.${version}.cwl \
    https://github.com/eoap/mastering-app-package/releases/download/${version}/app-water-bodies-cloud-native.${version}.cwl

## Execute the Application Package

`calrissian` takes a few more arguments than `cwltool`:

- `--max-ram` is the maximum amount of cluster RAM that the pods running command line tools can consume
- `--max-cores` is the maximum amount of cluster CPUs that the pods running command line tools can consume

`calrissian` requires setting:

- `--tmp-outdir-prefix` is the folder in an RWX Kubernetes volume where the command line tools running in pods write temporary results
- `--outdir` is the folder in an RWX Kubernetes volume where the command line tools running in pods write the final results

`calrissian` produces a resource consumption report in JSON if `--usage-report` is set.

`calrissian` writes the command line tool logs if `--tool-logs-basepath` is set to a folder.

Finally, `--pod-nodeselectors` points to a YAML file that tells `calrissian` which Kubernetes node pool the pods running the command line tools are assigned to.
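Because `--tmp-outdir-prefix`, `--outdir` and `--tool-logs-basepath` all point to folders that must be writable, it can help to check them before launching the run. The helper below is a minimal sketch: `require_writable_dir` is a hypothetical name, not part of `calrissian`, and it only verifies that the folder is writable from the current shell, not that the underlying volume is actually RWX.

```bash
# Hypothetical helper (not part of calrissian): fail early if a folder
# cannot be created or written to from the current shell.
require_writable_dir() {
    local dir="$1"
    local probe

    # Create the folder if it does not exist yet
    mkdir -p "${dir}" 2>/dev/null || {
        echo "cannot create ${dir}" >&2
        return 1
    }

    # Try to create, then remove, a temporary probe file inside it
    probe=$(mktemp "${dir}/.write-test.XXXXXX" 2>/dev/null) || {
        echo "cannot write to ${dir}" >&2
        return 1
    }
    rm -f "${probe}"

    echo "ok: ${dir}"
}
```

With the paths used in this notebook, the check could be run as `for d in /calrissian/tmp /calrissian/results /calrissian/logs; do require_writable_dir "$d" || break; done`.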

In [None]:
mkdir -p /calrissian/logs

version="1.0.0"

calrissian \
    --stdout /calrissian/results.json \
    --stderr /calrissian/app.log \
    --max-ram 4G \
    --max-cores "8" \
    --tmp-outdir-prefix /calrissian/tmp \
    --outdir /calrissian/results \
    --usage-report /calrissian/usage.json \
    --tool-logs-basepath /calrissian/logs \
    --pod-nodeselectors /etc/calrissian/pod-node-selector.yaml \
    /workspace/mastering-app-package/runs/app-water-bodies-cloud-native.${version}.cwl \
    --stac_items "https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a-cogs/items/S2A_10TFK_20210708_0_L2A" \
    --stac_items "https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a-cogs/items/S2B_10TFK_20210713_0_L2A" \
    --stac_items "https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a-cogs/items/S2A_10TFK_20210718_0_L2A" \
    --stac_items "https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a-cogs/items/S2A_10TFK_20220524_0_L2A" \
    --stac_items "https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a-cogs/items/S2A_10TFK_20220514_0_L2A" \
    --stac_items "https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a-cogs/items/S2A_10TFK_20220504_0_L2A" \
    --aoi="-121.399,39.834,-120.74,40.472" \
    --epsg "EPSG:4326"
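The command above repeats `--stac_items` once per STAC item. As a sketch of an alternative, the same flags can be built from a bash array, which keeps the item list easier to edit. The variable names `base`, `items` and `stac_args` are illustrative assumptions; the URLs are the ones used above.

```bash
# Base URL of the STAC items used in this notebook
base="https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a-cogs/items"

# Item identifiers, in the same order as in the command above
items=(
    S2A_10TFK_20210708_0_L2A
    S2B_10TFK_20210713_0_L2A
    S2A_10TFK_20210718_0_L2A
    S2A_10TFK_20220524_0_L2A
    S2A_10TFK_20220514_0_L2A
    S2A_10TFK_20220504_0_L2A
)

# Build one "--stac_items <url>" pair per item
stac_args=()
for item in "${items[@]}"; do
    stac_args+=( --stac_items "${base}/${item}" )
done

# Print the resulting arguments, one per line, for inspection
printf '%s\n' "${stac_args[@]}"
```

The array can then replace the six explicit flags in the `calrissian` call by passing `"${stac_args[@]}"` after the CWL file path.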

While the workflow is running, open a terminal and run:

```
watch -n 2 kubectl get pods
```



Inspect the results:

In [None]:
tree $( cat /calrissian/results.json | jq -r .stac_catalog.path )


## Submit a Kubernetes job

Below is the manifest for a Kubernetes job that runs `calrissian`:

In [3]:
cat ${WORKSPACE}/practice-labs/Kubernetes/k8s-job.yaml | yq e . -


Prepare the parameters:

In [None]:
cat << EOF > /calrissian/params.yaml
stac_items:
- "https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a-cogs/items/S2A_10TFK_20210708_0_L2A"
- "https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a-cogs/items/S2B_10TFK_20210713_0_L2A"
- "https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a-cogs/items/S2A_10TFK_20210718_0_L2A"
- "https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a-cogs/items/S2A_10TFK_20220524_0_L2A"
- "https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a-cogs/items/S2A_10TFK_20220514_0_L2A"
- "https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a-cogs/items/S2A_10TFK_20220504_0_L2A"
aoi: "-121.399,39.834,-120.74,40.472"
epsg: "EPSG:4326"
EOF

Put all needed files in the RWX volume:

In [None]:
cp /etc/calrissian/pod-node-selector.yaml /calrissian/pod-node-selector.yaml
cp ${WORKSPACE}/runs/app-water-bodies-cloud-native.${version}.cwl /calrissian/app-water-bodies-cloud-native.${version}.cwl

Submit the job:

In [None]:
kubectl apply -f ${WORKSPACE}/practice-labs/Kubernetes/k8s-job.yaml

Monitor the job until completion:

In [None]:
kubectl wait --for=condition=complete --timeout=600s job/water-bodies-detection
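Note that `kubectl wait --for=condition=complete` keeps waiting until the timeout if the job fails instead of completing. The sketch below polls the job conditions and returns as soon as the job is either complete or failed. `wait_for_job` is a hypothetical helper name, and the 5-second default polling interval is an arbitrary assumption.

```bash
# Hypothetical helper: poll a Kubernetes job until it completes, fails,
# or the timeout (in seconds) expires.
# Returns 0 on completion, 1 on failure, 2 on timeout.
wait_for_job() {
    local job="$1"
    local timeout="${2:-600}"
    local interval="${3:-5}"
    local elapsed=0
    local status

    while (( elapsed < timeout )); do
        # Collect the types of all job conditions whose status is "True"
        status=$(kubectl get job "${job}" \
            -o jsonpath='{.status.conditions[?(@.status=="True")].type}')

        case "${status}" in
            *Complete*) echo "job ${job} completed"; return 0 ;;
            *Failed*)   echo "job ${job} failed" >&2; return 1 ;;
        esac

        sleep "${interval}"
        elapsed=$(( elapsed + interval ))
    done

    echo "timed out waiting for job ${job}" >&2
    return 2
}
```

For the job submitted above, this could be used as `wait_for_job water-bodies-detection 600 || kubectl logs job/water-bodies-detection`, which prints the job logs when the run does not complete.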