# CWL Workflow for Training 
This notebook provides step-by-step instructions for wrapping the `training` step as a Common Workflow Language (CWL) workflow and executing it with two CWL runners: `cwltool` and `calrissian`.

> Note: Before proceeding, make sure to select the correct kernel. In the top-right corner of the notebook, choose the Jupyter kernel named `Bash`.

## Setup

In [1]:
export WORKSPACE=/workspace/machine-learning-process
export RUNTIME=${WORKSPACE}/runs
mkdir -p ${RUNTIME}
cd ${RUNTIME}
printenv | grep RUNTIME
pwd

XDG_RUNTIME_DIR=/workspace/.local
RUNTIME=/workspace/machine-learning-process/runs
/workspace/machine-learning-process/runs


## Inspect `tile-sat-training`

The CWL document `tile-sat-training.cwl` defines the `tile-sat-training` workflow. The cell below lists the inputs that must be provided to this workflow.


In [2]:
yq '.["$graph"][0].inputs' ${WORKSPACE}/training/app-package/tile-sat-training.cwl

{
  "MLFLOW_TRACKING_URI": {
    "label": "MLFLOW_TRACKING_URI",
    "type": "string"
  },
  "stac_reference": {
    "label": "stac_reference",
    "doc": "STAC Item label url",
    "type": "string"
  },
  "BATCH_SIZE": {
    "label": "BATCH_SIZE",
    "default": 4,
    "doc": "BATCH_SIZE- model metadata",
    "type": "int[]"
  },
  "CLASSES": {
    "label": "CLASSES"

Inspect the Docker reference:

In [3]:
yq '.["$graph"][] | select(.class == "CommandLineTool") | .hints.DockerRequirement.dockerPull' ${WORKSPACE}/training/app-package/tile-sat-training.cwl

"ghcr.io/eoap/machine-learning-process/training@sha256:cbb97e479c9c5ca3b15257d034b0fce4ac5cba4e60e4b128b0fbe18f657a743f"


Update the CWL document, and with it the Docker reference, to the latest release:

In [4]:
VERSION=$(curl -s https://api.github.com/repos/eoap/machine-learning-process/releases/latest | jq -r '.tag_name')
curl -L -o ${WORKSPACE}/training/app-package/tile-sat-training.cwl \
  "https://github.com/eoap/machine-learning-process/releases/download/${VERSION}/tile-sat-training.${VERSION}.cwl"

echo "Updated DockerPull: " && yq '.["$graph"][] | select(.class == "CommandLineTool") | .hints.DockerRequirement.dockerPull' ${WORKSPACE}/training/app-package/tile-sat-training.cwl


  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  5546  100  5546    0     0   6044      0 --:--:-- --:--:-- --:--:-- 3261k
Updated DockerPull: 
"ghcr.io/eoap/machine-learning-process/training@sha256:cbb97e479c9c5ca3b15257d034b0fce4ac5cba4e60e4b128b0fbe18f657a743f"
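
The update step overwrites the local CWL file with whatever the download returns, so it can be worth checking the release tag first. `jq -r '.tag_name'` prints the literal string `null` when the key is missing from the response, and prints nothing when it receives no input. A minimal sketch of such a check follows; the helper name `is_valid_tag` is hypothetical and not part of the repository.

```bash
# Hypothetical helper: succeed only if the tag is non-empty and not "null".
is_valid_tag() {
    local tag="$1"
    [[ -n "${tag}" && "${tag}" != "null" ]]
}

# Demonstrate the check on a plausible tag and on the two failure values.
for tag in "1.2.3" "null" ""; do
    if is_valid_tag "${tag}"; then
        echo "usable tag: '${tag}'"
    else
        echo "skip download for tag: '${tag}'"
    fi
done
```

In the update cell, the `curl -L -o ...` download could then be wrapped in `if is_valid_tag "${VERSION}"; then ... fi` so that a failed lookup leaves the existing file untouched.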


## Train a tile-based classifier with EuroSAT
The cells below offer two options for running the `tile-sat-training` application package:
- cwltool
- Calrissian

In [None]:
cwltool \
    --debug \
    --outdir ${WORKSPACE}/runs \
    ${WORKSPACE}/training/app-package/tile-sat-training.cwl#tile-sat-training \
    ${WORKSPACE}/practice-labs/3-CWL-Workflows/params_training.yaml 

In [None]:
calrissian --debug \
    --stdout /calrissian/out.json \
    --stderr /calrissian/stderr.log \
    --usage-report /calrissian/report.json \
    --parallel \
    --max-ram 10G \
    --max-cores 2 \
    --tmp-outdir-prefix /calrissian/tmp/ \
    --outdir ${WORKSPACE}/runs \
    --tool-logs-basepath /calrissian/logs \
    ${WORKSPACE}/training/app-package/tile-sat-training.cwl#tile-sat-training \
    ${WORKSPACE}/practice-labs/3-CWL-Workflows/params_training.yaml 
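
Only one of the two runners may be installed in a given environment. As a small sketch, the hypothetical helper `pick_runner` below prints the first of its arguments that is found on `PATH`. It only checks that the command exists; it does not verify that the runner is usable (for example, that `calrissian` can reach a Kubernetes cluster), and the two runners take different options, so the matching cell above still has to be run by hand.

```bash
# Hypothetical helper: print the first candidate command found on PATH.
# Returns non-zero if none of the candidates is available.
pick_runner() {
    local candidate
    for candidate in "$@"; do
        if command -v "${candidate}" >/dev/null 2>&1; then
            printf '%s\n' "${candidate}"
            return 0
        fi
    done
    return 1
}

# Prefer calrissian, fall back to cwltool, otherwise leave RUNNER empty.
RUNNER="$(pick_runner calrissian cwltool)" || RUNNER=""
echo "Selected runner: ${RUNNER:-none found}"
```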

List the outputs:

In [13]:
tree ${WORKSPACE}/runs

/workspace/machine-learning-process/runs
└── train.log

0 directories, 1 file
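
The listing shows a single `train.log` file. A quick way to see how the run ended is to print the last lines of that log. The sketch below assumes nothing about the log's format; the helper name `show_log_tail` and the default of 20 lines are choices made for this example.

```bash
# Hypothetical helper: print the last N lines (default 20) of a log file,
# and fail with a message if the file is missing or empty.
show_log_tail() {
    local log_file="$1"
    local lines="${2:-20}"
    if [[ ! -s "${log_file}" ]]; then
        echo "missing or empty log: ${log_file}" >&2
        return 1
    fi
    tail -n "${lines}" "${log_file}"
}
```

For the run above, it would be called as `show_log_tail ${WORKSPACE}/runs/train.log`.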


The user may train several tile-based classifiers using the `tile-based-training` module. One of the artifacts tracked through MLflow is the model's weights. The next step is to retrieve the best model, according to the chosen evaluation metric, from the MLflow artifact registry and convert it to the ONNX format. This activity is explained in ["Export the Best Model to ONNX Format"](./ExtractModel.ipynb). Finally, the converted model can be integrated into the inference application package.

> **Note:** This process has already been completed. However, users may need to repeat it with their own candidate models.


## Clean-up 

In [None]:
rm -fr ${RUNTIME}
# Uncomment the line below to remove all local Docker images
# docker rmi -f $(docker images -aq)