## Import the modules

In [1]:
from pycalrissian.context import CalrissianContext
from pycalrissian.job import CalrissianJob
from pycalrissian.execution import CalrissianExecution
import base64
import os
import yaml
from kubernetes.client.models.v1_job import V1Job


## Create the image pull secrets

Two secrets are needed, one for Docker Hub and one for the GitLab container registry, because the CWL description to run refers to container images published on both registries.

In [3]:
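# Registry credentials (left blank here; fill in your own)
username = ""
password = ""

# Docker stores credentials as base64("username:password")
auth = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode(
    "utf-8"
)

# One entry per registry; set each "auth" to the encoded string for that
# registry (for example the `auth` value computed above)
secret_config = {
    "auths": {
        "registry.gitlab.com": {
            "auth": ""
        },
        "https://index.docker.io/v1/": {
            "auth": ""
        },
    }
}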

In [4]:
secret_config

{'auths': {'registry.gitlab.com': {'auth': 'Z2l0bGFiK2RlcGxveS10b2tlbi0xODgxNzY4Oko1dHU4eXhkU3E0a19RMXpOOGhv'},
  'https://index.docker.io/v1/': {'auth': 'ZmFicmljZWJyaXRvOmRja3JfcGF0X1E3NFRFZWhBZWVsSW9BWUFLamVUcmlRTldqZw=='}}}

**Take away messages about image pull secrets**

* they're created as a dictionary with the same structure as your `~/.docker/config.json` file
* you can use the username/password pair or the auth string 
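
As a minimal sketch of the points above, the dictionary can be assembled from username/password pairs with the standard library alone. The helper names `make_auth` and `make_secret_config` and the credentials are hypothetical and only serve the illustration; they are not part of pycalrissian.

```python
import base64
import json


def make_auth(username: str, password: str) -> str:
    """Return the base64 'auth' string for a username/password pair."""
    return base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("utf-8")


def make_secret_config(credentials: dict) -> dict:
    """Build a config.json-style dictionary from {registry: (username, password)}."""
    return {
        "auths": {
            registry: {"auth": make_auth(user, pwd)}
            for registry, (user, pwd) in credentials.items()
        }
    }


# Placeholder credentials, for illustration only
example_config = make_secret_config(
    {
        "registry.gitlab.com": ("deploy-user", "deploy-token"),
        "https://index.docker.io/v1/": ("docker-user", "docker-token"),
    }
)
print(json.dumps(example_config, indent=2))
```

The resulting dictionary has the same shape as `secret_config` above, with one `auth` entry per registry.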

## Create the CalrissianContext

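The `CalrissianContext` creates a Kubernetes namespace on the cluster.

Note: the storage class must provide RWX (ReadWriteMany) volumes. Our Kubernetes cluster uses the `longhorn` RWX storage class, while the snippets below use `openebs-kernel-nfs-scw`; adapt `storage_class` to your cluster configuration.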

### Using existing secrets

In addition to the `imagePullSecrets`, you can also specify an `additionalImagePullSecrets` key in the dictionary.

`additionalImagePullSecrets` is an array of references, by name, to secrets already defined in the same namespace where the ZOO-Project-DRU Helm chart is deployed.

Example: 

From the command line, you can create a secret in any namespace.

````bash
# Create a secret named my-secret in the namespace given-namespace
kubectl create secret docker-registry my-secret \
  --docker-email=tiger@acme.example \
  --docker-username=tiger \
  --docker-password=pass1234 \
  --docker-server=my-registry.example:5000 \
  -n given-namespace
````

With the secret defined, you can reference it from your `CalrissianContext` as presented below.


````python
# Create a CalrissianContext using this secret
session = CalrissianContext(
            namespace=namespace_name,
            storage_class="openebs-kernel-nfs-scw",
            volume_size="10G",
            image_pull_secrets={
                "imagePullSecrets": secret_config,
                "additionalImagePullSecrets": [
                    {"name": "my-secret"},
                    {"name": "my-other-secret"}
                ]
            },
)
````

This is useful if you have multiple secrets that you want to use to pull images from different registries.

The `additionalImagePullSecrets` key is optional and can be omitted if you only want to use the `imagePullSecrets` entry.
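
Since the key is optional, one way to keep the two cases in a single code path is a small helper that adds `additionalImagePullSecrets` only when secret names are given. This is a sketch: `build_image_pull_secrets` is a hypothetical helper, not a pycalrissian function, and it only builds the dictionary passed as `image_pull_secrets`.

```python
from typing import Iterable, Optional


def build_image_pull_secrets(
    secret_config: dict,
    existing_secret_names: Optional[Iterable[str]] = None,
) -> dict:
    """Build the image_pull_secrets dictionary, adding the optional key only if needed."""
    secrets = {"imagePullSecrets": secret_config}
    names = list(existing_secret_names or [])
    if names:
        # Each entry references an existing Kubernetes secret by name
        secrets["additionalImagePullSecrets"] = [{"name": name} for name in names]
    return secrets


# Placeholder inline configuration, for illustration only
inline_only = build_image_pull_secrets({"auths": {}})
with_existing = build_image_pull_secrets(
    {"auths": {}}, ["my-secret", "my-other-secret"]
)
print(inline_only)
print(with_existing)
```

The returned dictionary can then be passed as the `image_pull_secrets` argument shown in the snippet above.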



In [None]:
namespace_name = "job-namespace-n"

session = CalrissianContext(
            namespace=namespace_name,
            storage_class="openebs-kernel-nfs-scw",
            volume_size="10G",
            image_pull_secrets={"imagePullSecrets": secret_config},
)



Now trigger the `CalrissianContext` initialisation with:

In [9]:
session.initialise()

2023-03-22 11:04:26.594 | INFO     | pycalrissian.context:initialise:65 - create namespace job-namespace-n
2023-03-22 11:04:26.642 | INFO     | pycalrissian.context:create_namespace:281 - creating namespace job-namespace-n
2023-03-22 11:04:31.757 | INFO     | pycalrissian.context:create_namespace:294 - namespace job-namespace-n created
2023-03-22 11:04:31.760 | INFO     | pycalrissian.context:initialise:82 - create role pod-manager-role
2023-03-22 11:04:36.912 | INFO     | pycalrissian.context:create_role:333 - role pod-manager-role created
2023-03-22 11:04:36.914 | INFO     | pycalrissian.context:initialise:91 - create role binding for role pod-manager-role
2023-03-22 11:04:42.069 | INFO     | pycalrissian.context:create_role_binding:373 - role binding pod-manager-default-binding created
2023-03-22 11:04:42.070 | INFO     | pycalrissian.context:initialise:82 - create role log-reader-role
2023-03-22 11:04:47.228 | INFO     | pycalrissian.context:create_role:333 - role log-reader-role c

## Read the CWL document

Now load a CWL document and create a dictionary with the parameters:


In [10]:
with open("../tests/app-s2-composites.0.1.0.cwl", "r") as stream:
    cwl = yaml.safe_load(stream)

params = {
    "post_stac_item": "https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a-cogs/items/S2B_53HPA_20210723_0_L2A", # noqa: E501
    "pre_stac_item": "https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a-cogs/items/S2B_53HPA_20210703_0_L2A", # noqa: E501
    "aoi": "136.659,-35.96,136.923,-35.791",
}

**Take away messages**

* The CWL description is loaded into a Python dictionary
* The parameters are a Python dictionary

So you can discover the CWL Workflow parameters with something like:

In [11]:
cwl["$graph"][0]["inputs"]

{'pre_stac_item': {'doc': 'Pre-event Sentinel-2 item', 'type': 'string'},
 'post_stac_item': {'doc': 'Post-event Sentinel-2 item', 'type': 'string'},
 'aoi': {'doc': 'area of interest as a bounding box', 'type': 'string?'},
 'bands': {'type': 'string[]', 'default': ['B8A', 'B12', 'SCL']}}
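
Building on this, a quick pre-flight check can compare the parameters dictionary against the workflow inputs before submitting anything. The sketch below assumes the inputs use the map form shown above (a dictionary keyed by input name) and treats an input as optional when its type ends with `?`, lists `"null"`, or has a `default`; `missing_required_inputs` is a hypothetical helper, not part of pycalrissian.

```python
def missing_required_inputs(inputs: dict, params: dict) -> list:
    """Return the names of required workflow inputs that params does not provide."""
    missing = []
    for name, spec in inputs.items():
        cwl_type = spec.get("type")
        # "string?" and ["null", "string"] both denote an optional input
        optional = (isinstance(cwl_type, str) and cwl_type.endswith("?")) or (
            isinstance(cwl_type, list) and "null" in cwl_type
        )
        if name not in params and not optional and "default" not in spec:
            missing.append(name)
    return missing


# Inputs copied from the workflow description shown above
inputs = {
    "pre_stac_item": {"doc": "Pre-event Sentinel-2 item", "type": "string"},
    "post_stac_item": {"doc": "Post-event Sentinel-2 item", "type": "string"},
    "aoi": {"doc": "area of interest as a bounding box", "type": "string?"},
    "bands": {"type": "string[]", "default": ["B8A", "B12", "SCL"]},
}

incomplete = missing_required_inputs(inputs, {"aoi": "136.659,-35.96,136.923,-35.791"})
print(incomplete)
```

With only `aoi` supplied, the two STAC item inputs are reported as missing, while `bands` is covered by its default.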

## Create the `CalrissianJob`

In [12]:
os.environ["CALRISSIAN_IMAGE"] = "docker.io/terradue/calrissian:0.12.0"
job = CalrissianJob(
    cwl=cwl,
    params=params,
    runtime_context=session,
    cwl_entry_point="dnbr",
    max_cores=2,
    max_ram="4G",
    tool_logs=True,
)

2023-03-22 11:05:17.787 | INFO     | pycalrissian.job:__init__:68 - using default security context {'runAsUser': 0, 'runAsGroup': 0, 'fsGroup': 0}
2023-03-22 11:05:17.788 | INFO     | pycalrissian.job:__init__:79 - job name: job-1679483117788287-73087b7b-3801-4ef0-9662-8f8b48f93c70
2023-03-22 11:05:17.788 | INFO     | pycalrissian.job:__init__:80 - create CWL config map
2023-03-22 11:05:22.973 | INFO     | pycalrissian.context:create_configmap:467 - config map cwl-workflow created
2023-03-22 11:05:22.974 | INFO     | pycalrissian.job:__init__:82 - create processing parameters config map
2023-03-22 11:05:28.127 | INFO     | pycalrissian.context:create_configmap:467 - config map params created


The `CalrissianJob` object is constructed with:

* a CWL dictionary
* a parameters dictionary
* a runtime context, a `CalrissianContext` object
* the CWL entry point, here `dnbr`
* the maximum number of cores the pods can use
* the maximum amount of RAM the pods can use

The CalrissianJob can be serialized to a Kubernetes Job object:

In [13]:
isinstance(job.to_k8s_job(), V1Job)

2023-03-22 11:05:43.200 | INFO     | pycalrissian.job:_get_calrissian_container:417 - using Calrissian image: docker.io/terradue/calrissian:0.12.0


True

Or to a Kubernetes Job manifest in YAML:

In [14]:
job.to_yaml("job.yml")

2023-03-22 11:05:44.398 | INFO     | pycalrissian.job:_get_calrissian_container:417 - using Calrissian image: docker.io/terradue/calrissian:0.12.0
2023-03-22 11:05:44.407 | INFO     | pycalrissian.job:to_yaml:143 - job job-1679483117788287-73087b7b-3801-4ef0-9662-8f8b48f93c70 serialized to job.yml


**Note** The Calrissian pod image can be defined with the environment variable `CALRISSIAN_IMAGE`.

At this stage, you could run `kubectl -n job-namespace-n apply -f job.yml` to submit the job to Kubernetes.

## Create the `CalrissianExecution` 


In [15]:
execution = CalrissianExecution(job=job, runtime_context=session)

Submit the job with:

In [16]:
execution.submit()

2023-03-22 11:05:49.797 | INFO     | pycalrissian.execution:submit:32 - submit job job-1679483117788287-73087b7b-3801-4ef0-9662-8f8b48f93c70
2023-03-22 11:05:49.800 | INFO     | pycalrissian.job:_get_calrissian_container:417 - using Calrissian image: docker.io/terradue/calrissian:0.12.0
2023-03-22 11:05:49.869 | INFO     | pycalrissian.execution:submit:38 - job job-1679483117788287-73087b7b-3801-4ef0-9662-8f8b48f93c70 submitted


Monitor the execution with:

In [17]:
execution.monitor(interval=20)

2023-03-22 11:05:56.327 | INFO     | pycalrissian.execution:monitor:210 - job job-1679483117788287-73087b7b-3801-4ef0-9662-8f8b48f93c70 is active
2023-03-22 11:06:16.413 | INFO     | pycalrissian.execution:monitor:210 - job job-1679483117788287-73087b7b-3801-4ef0-9662-8f8b48f93c70 is active
2023-03-22 11:06:36.482 | INFO     | pycalrissian.execution:monitor:210 - job job-1679483117788287-73087b7b-3801-4ef0-9662-8f8b48f93c70 is active
2023-03-22 11:06:56.566 | INFO     | pycalrissian.execution:monitor:210 - job job-1679483117788287-73087b7b-3801-4ef0-9662-8f8b48f93c70 is active
2023-03-22 11:07:16.643 | INFO     | pycalrissian.execution:monitor:210 - job job-1679483117788287-73087b7b-3801-4ef0-9662-8f8b48f93c70 is active
2023-03-22 11:07:36.725 | INFO     | pycalrissian.execution:monitor:210 - job job-1679483117788287-73087b7b-3801-4ef0-9662-8f8b48f93c70 is active
2023-03-22 11:07:56.812 | INFO     | pycalrissian.execution:monitor:210 - job job-1679483117788287-73087b7b-3801-4ef0-9662-8

Get the execution log

In [18]:
log = execution.get_log()
print(log)

INFO calrissian 0.12.0 (cwltool 3.1.20230201224320)
INFO Resolved '/workflow-input/workflow.cwl#dnbr' to 'file:///workflow-input/..2023_03_22_11_05_50.1680561479/workflow.cwl#dnbr'
../workflow-input/..2023_03_22_11_05_50.1680561479/workflow.cwl:9:7:  Source 'aoi' of type ["null",
                                                                      "string"] may be incompatible
../workflow-input/..2023_03_22_11_05_50.1680561479/workflow.cwl:51:9:   with sink 'aoi' of type
                                                                        "string"
INFO [workflow ] starting step node_nbr
INFO [step node_nbr] start
INFO [workflow node_nbr] starting step node_stac_2
INFO [step node_stac_2] start
INFO [step node_stac_2] start
INFO [step node_stac_2] start
INFO [step node_nbr] start
INFO [workflow node_nbr_2] starting step node_stac_3
INFO [step node_stac_3] start

Get the usage report

In [19]:
usage = execution.get_usage_report()
usage

copy /calrissian/report.json to .
STDERR: tar: removing leading '/' from member names



{'cores_allowed': 2.0,
 'ram_mb_allowed': 4000.0,
 'children': [{'cpus': 1.0,
   'ram_megabytes': 268.435456,
   'disk_megabytes': 0.0,
   'exit_code': 0,
   'name': 'node_stac',
   'start_time': '2023-03-22T11:06:52+00:00',
   'finish_time': '2023-03-22T11:06:52+00:00',
   'elapsed_hours': None,
   'elapsed_seconds': 0.0,
   'ram_megabyte_hours': None,
   'cpu_hours': None},
  {'cpus': 1.0,
   'ram_megabytes': 268.435456,
   'disk_megabytes': 0.0,
   'exit_code': 0,
   'name': 'node_stac_2',
   'start_time': '2023-03-22T11:06:53+00:00',
   'finish_time': '2023-03-22T11:06:53+00:00',
   'elapsed_hours': None,
   'elapsed_seconds': 0.0,
   'ram_megabyte_hours': None,
   'cpu_hours': None},
  {'cpus': 1.0,
   'ram_megabytes': 268.435456,
   'disk_megabytes': 0.0,
   'exit_code': 0,
   'name': 'node_stac_3',
   'start_time': '2023-03-22T11:06:55+00:00',
   'finish_time': '2023-03-22T11:06:55+00:00',
   'elapsed_hours': None,
   'elapsed_seconds': 0.0,
   'ram_megabyte_hours': None,
   'cp
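
The usage report is a plain dictionary, so it can be condensed with a few lines of Python. The sketch below assumes the structure shown above (a `children` list with `name`, `exit_code`, `elapsed_seconds` and `ram_megabytes` keys) and uses a shortened copy of the report as sample data; `summarise_usage` is a hypothetical helper.

```python
def summarise_usage(usage: dict) -> dict:
    """Condense a usage report into a few headline figures."""
    children = usage.get("children", [])
    return {
        "steps": len(children),
        # Steps whose exit code is anything other than 0
        "failed_steps": [c["name"] for c in children if c.get("exit_code") != 0],
        # Missing or None values count as 0
        "total_elapsed_seconds": sum(c.get("elapsed_seconds") or 0.0 for c in children),
        "peak_ram_megabytes": max(
            (c.get("ram_megabytes") or 0.0 for c in children), default=0.0
        ),
    }


# Shortened sample with two of the steps reported above
sample_usage = {
    "cores_allowed": 2.0,
    "ram_mb_allowed": 4000.0,
    "children": [
        {"name": "node_stac", "exit_code": 0, "elapsed_seconds": 0.0, "ram_megabytes": 268.435456},
        {"name": "node_stac_2", "exit_code": 0, "elapsed_seconds": 0.0, "ram_megabytes": 268.435456},
    ],
}

summary = summarise_usage(sample_usage)
print(summary)
```

On the real report, the same function would be called as `summarise_usage(usage)`.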

Get the execution output

In [20]:
output = execution.get_output()
output

copy /calrissian/output.json to .
STDERR: tar: removing leading '/' from member names



{'stac': {'location': 'file:///calrissian/qk45d6lp',
  'basename': 'qk45d6lp',
  'class': 'Directory',
  'listing': [{'class': 'File',
    'location': 'file:///calrissian/qk45d6lp/dnbr-item.json',
    'basename': 'dnbr-item.json',
    'checksum': 'sha1$1c0a635ad501c599ab258019d05c7b276515c565',
    'size': 818,
    'path': '/calrissian/qk45d6lp/dnbr-item.json'},
   {'class': 'File',
    'location': 'file:///calrissian/qk45d6lp/catalog.json',
    'basename': 'catalog.json',
    'checksum': 'sha1$a5d1d9821e889aa125778e4f2e14a788ff1512ce',
    'size': 225,
    'path': '/calrissian/qk45d6lp/catalog.json'},
   {'class': 'File',
    'location': 'file:///calrissian/qk45d6lp/dnbr.tif',
    'basename': 'dnbr.tif',
    'checksum': 'sha1$87a3dfee0d055453dad525e8edd8a216121d808c',
    'size': 1402218,
    'path': '/calrissian/qk45d6lp/dnbr.tif'}],
  'path': '/calrissian/qk45d6lp'}}
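
The output follows the CWL `File`/`Directory` object layout, so the produced files can be listed by walking the dictionary. This is a sketch under the assumption that every file carries a `path` key, as in the output above; `iter_files` is a hypothetical helper and the sample is a shortened copy of that output (checksums and sizes dropped).

```python
def iter_files(node):
    """Recursively yield the path of every File object in a CWL output structure."""
    if isinstance(node, list):
        for item in node:
            yield from iter_files(item)
    elif isinstance(node, dict):
        if node.get("class") == "File":
            yield node["path"]
        elif node.get("class") == "Directory":
            # A Directory lists its content under "listing"
            yield from iter_files(node.get("listing", []))
        else:
            # Top-level mapping of output names to File/Directory objects
            for value in node.values():
                yield from iter_files(value)


sample_output = {
    "stac": {
        "class": "Directory",
        "path": "/calrissian/qk45d6lp",
        "listing": [
            {"class": "File", "path": "/calrissian/qk45d6lp/dnbr-item.json"},
            {"class": "File", "path": "/calrissian/qk45d6lp/catalog.json"},
            {"class": "File", "path": "/calrissian/qk45d6lp/dnbr.tif"},
        ],
    }
}

paths = list(iter_files(sample_output))
print(paths)
```

On the real result, `list(iter_files(output))` would give the paths on the volume mounted at `/calrissian`.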

Get a few details about the execution

In [21]:
print(execution.get_start_time())
print(execution.get_completion_time())

2023-03-22 11:05:49+00:00
2023-03-22 11:08:20+00:00


In [22]:
print(f"complete {execution.is_complete()}")
print(f"succeeded {execution.is_succeeded()}")

complete True
succeeded True
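
These two methods are enough to write a custom wait loop, for instance when a timeout is needed. The sketch below relies only on `is_complete()` and `is_succeeded()` as used above; `wait_until_complete` and the `FakeExecution` stand-in are hypothetical and exist so the example runs without a cluster.

```python
import time


def wait_until_complete(execution, interval=20.0, timeout=3600.0, sleep=time.sleep):
    """Poll execution.is_complete() and return execution.is_succeeded() once done."""
    waited = 0.0
    while not execution.is_complete():
        if waited >= timeout:
            raise TimeoutError(f"job not complete after {waited} seconds")
        sleep(interval)
        waited += interval
    return execution.is_succeeded()


class FakeExecution:
    """Stand-in for an execution object that completes after a few polls."""

    def __init__(self, polls_before_done):
        self._remaining = polls_before_done

    def is_complete(self):
        if self._remaining > 0:
            self._remaining -= 1
            return False
        return True

    def is_succeeded(self):
        return True


# No real waiting here: the sleep function is replaced by a no-op
succeeded = wait_until_complete(
    FakeExecution(polls_before_done=2), interval=0.0, sleep=lambda seconds: None
)
print(succeeded)
```

With a real job, the call would be `wait_until_complete(execution, interval=20, timeout=1800)`, where the timeout value is only an example.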


In [23]:
execution.get_tool_logs()

copy /calrissian/report.json to .
STDERR: tar: removing leading '/' from member names

copy /calrissian/node_stac.log to .
STDERR: tar: removing leading '/' from member names

copy /calrissian/node_stac_2.log to .
STDERR: tar: removing leading '/' from member names

copy /calrissian/node_stac_3.log to .
STDERR: tar: removing leading '/' from member names

copy /calrissian/node_stac_4.log to .
STDERR: tar: removing leading '/' from member names

copy /calrissian/node_stac_6.log to .
STDERR: tar: removing leading '/' from member names

copy /calrissian/node_stac_5.log to .
STDERR: tar: removing leading '/' from member names

copy /calrissian/node_subset.log to .
STDERR: tar: removing leading '/' from member names

copy /calrissian/node_subset_2.log to .
STDERR: tar: removing leading '/' from member names

copy /calrissian/node_subset_3.log to .
STDERR: tar: removing leading '/' from member names

copy /calrissian/node_subset_4.log to .
STDERR: tar: removing leading '/' from member names


['./node_stac.log',
 './node_stac_2.log',
 './node_stac_3.log',
 './node_stac_4.log',
 './node_stac_6.log',
 './node_stac_5.log',
 './node_subset.log',
 './node_subset_2.log',
 './node_subset_3.log',
 './node_subset_4.log',
 './node_subset_6.log',
 './node_subset_5.log',
 './node_nbr.log',
 './node_nbr_2.log',
 './node_cog.log',
 './node_cog_2.log',
 './node_dnbr.log',
 './node_cog_3.log',
 './node_stac_7.log']

Delete the Kubernetes namespace with:

In [17]:
session.dispose()

2023-03-22 11:03:48.238 | INFO     | pycalrissian.context:dispose:121 - delete pod job-167948062876755-9399351e-e736-47f6-830d-89f9432b6fd4-h6brf
2023-03-22 11:03:48.313 | INFO     | pycalrissian.context:dispose:121 - delete pod job-1679481953421835-58349bec-3178-45b6-827e-37ee74804919-44xdc
2023-03-22 11:03:48.380 | INFO     | pycalrissian.context:dispose:121 - delete pod job-1679482405334652-5e620474-376e-413f-99ed-f40ca19ba08a-ftg9w
2023-03-22 11:03:48.478 | INFO     | pycalrissian.context:dispose:121 - delete pod job-1679482833004388-941219c5-9755-46ce-9ebf-917dfda774a4-4fnd9
2023-03-22 11:03:48.810 | INFO     | pycalrissian.context:dispose:121 - delete pod job-1679482833004388-941219c5-9755-46ce-9ebf-917dfda774a4-5kfdw
2023-03-22 11:03:49.228 | INFO     | pycalrissian.context:dispose:121 - delete pod job-1679482833004388-941219c5-9755-46ce-9ebf-917dfda774a4-x889h
2023-03-22 11:03:49.323 | INFO     | pycalrissian.context:dispose:124 - dispose namespace job-namespace
2023-03-22 11:0

{'api_version': 'v1',
 'code': None,
 'details': None,
 'kind': 'Namespace',
 'message': None,
 'metadata': {'_continue': None,
              'remaining_item_count': None,
              'resource_version': '18983565861',
              'self_link': None},
 'reason': None,
 'status': "{'phase': 'Terminating'}"}