# STAC Zarr 

Invoke a Common Workflow Language (CWL) CommandLineTool to generate the STAC Zarr catalog.

This notebook is linked to: https://eoap.github.io/zarr-cloud-native-format/cwl-cli/stac-zarr



## Setup

In [1]:
export WORKSPACE=/workspace/zarr-cloud-native-format
export RUNTIME=${WORKSPACE}/runs
mkdir -p ${RUNTIME}
cd ${RUNTIME}

curl -q -L https://github.com/eoap/zarr-cloud-native-format/releases/download/0.5.0/app-water-bodies.0.5.0.cwl > ${WORKSPACE}/cwl-workflow/app-water-bodies.cwl 2> /dev/null
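
Because the download above sends stderr to `/dev/null`, a failed request can go unnoticed. As a minimal sketch, the hypothetical helper `looks_like_cwl` (not part of the notebook) checks that the saved file is non-empty and mentions `cwlVersion`, which CWL documents declare at their root:

```shell
# Hypothetical helper: succeed only if the file exists, is non-empty,
# and contains the cwlVersion key expected in a CWL document.
looks_like_cwl() {
    local path="$1"
    [ -s "${path}" ] && grep -q 'cwlVersion' "${path}"
}

# Possible usage after the download (path taken from the Setup cell):
#   looks_like_cwl "${WORKSPACE}/cwl-workflow/app-water-bodies.cwl" \
#       || echo "download failed or file is not CWL" >&2
```

This is only a sanity check; it does not validate the document against the CWL schema.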

## Run the STAC Zarr generation CommandLineTool

Inspect the `stac-zarr` CommandLineTool definition before running it with `cwltool`:


In [2]:
cat ${WORKSPACE}/cwl-workflow/app-water-bodies.cwl | yq e '.["$graph"][7]' -

class: CommandLineTool
id: stac-zarr
requirements:
  InlineJavascriptRequirement: {}
  ResourceRequirement:
    coresMax: 1
    ramMax: 512
hints:
  DockerRequirement:
    dockerPull: ghcr.io/eoap/zarr-cloud-native-format/stac-zarr@sha256:459a06153db8fffb2fc5b672bb9caf96ba689c7e731000c65e9ba047b0644a69
baseCommand: ["stac-zarr"]
arguments: []
inputs:
  stac_catalog:
    type: Directory
    inputBinding:
      prefix: --stac-catalog
outputs:
  zarr_stac_catalog:
    outputBinding:
      glob: .
    type: Directory


Before running the CWL description, prepare its parameters.

The previous step generated the STAC catalog describing the detected water bodies; its results file records where that catalog lives:

In [3]:
cat stac-generation-results.json 

{
    "temp_stac_catalog": {
        "location": "file:///workspace/zarr-cloud-native-format/runs/hicmgmcc",
        "basename": "hicmgmcc",
        "class": "Directory",
        "listing": [
            {
                "class": "Directory",
                "location": "file:///workspace/zarr-cloud-native-format/runs/hicmgmcc/water-bodies",
                "basename": "water-bodies",
                "listing": [
                    {
                        "class": "File",
                        "location": "file:///workspace/zarr-cloud-native-format/runs/hicmgmcc/water-bodies/collection.json",
                        "basename": "collection.json",
                        "size": 1351,
                        "checksum": "sha1$299a72c299d255d14b7655e76824d9119a72cc6e",
                        "path": "/workspace/zarr-cloud-native-format/runs/hicmgmcc/water-bodies/collection.json"
                    },
                    {
                        "class": "Directory",
            

Let's build the job parameters file:

In [4]:
cat <<EOF > stac-zarr-generation-params.yaml
stac_catalog: 
  class: Directory
  path: $( cat stac-generation-results.json | jq -r '.temp_stac_catalog.path' )
EOF

cat stac-zarr-generation-params.yaml | yq .

stac_catalog:
  class: Directory
  path: /workspace/zarr-cloud-native-format/runs/hicmgmcc


In [5]:


cwltool \
    --podman \
    --outdir ${WORKSPACE}/runs \
    ${WORKSPACE}/cwl-workflow/app-water-bodies.cwl#stac-zarr \
    stac-zarr-generation-params.yaml > stac-zarr-generation-results.json 2> stac-zarr-generation.log
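
With stdout and stderr both redirected to files, a failed run produces no visible message. The sketch below wraps any command in a hypothetical `run_logged` helper that keeps the same redirections but reports a non-zero exit code:

```shell
# Hypothetical helper: run a command with stdout and stderr captured in
# the given files, and report on the terminal if it exits non-zero.
run_logged() {
    local out="$1" log="$2"
    shift 2
    local status=0
    "$@" > "${out}" 2> "${log}" || status=$?
    if [ "${status}" -ne 0 ]; then
        echo "command failed with exit code ${status}, see ${log}" >&2
    fi
    return "${status}"
}

# Possible usage with the invocation from the cell above:
#   run_logged stac-zarr-generation-results.json stac-zarr-generation.log \
#       cwltool --podman --outdir "${WORKSPACE}/runs" \
#       "${WORKSPACE}/cwl-workflow/app-water-bodies.cwl#stac-zarr" \
#       stac-zarr-generation-params.yaml
```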

Let's look at the content of the stderr:

In [6]:
cat stac-zarr-generation.log | grep -Ev "WARNING|JSHINT"

INFO /home/fbrito/.local/bin/cwltool 3.1.20250110105449
INFO Resolved '/workspace/zarr-cloud-native-format/cwl-workflow/app-water-bodies.cwl#stac-zarr' to 'file:///workspace/zarr-cloud-native-format/cwl-workflow/app-water-bodies.cwl#stac-zarr'
Error: no such object: "ghcr.io/eoap/zarr-cloud-native-format/stac-zarr@sha256:459a06153db8fffb2fc5b672bb9caf96ba689c7e731000c65e9ba047b0644a69"
INFO ['podman', 'pull', 'ghcr.io/eoap/zarr-cloud-native-format/stac-zarr@sha256:459a06153db8fffb2fc5b672bb9caf96ba689c7e731000c65e9ba047b0644a69']
Trying to pull ghcr.io/eoap/zarr-cloud-native-format/stac-zarr@sha256:459a06153db8fffb2fc5b672bb9caf96ba689c7e731000c65e9ba047b0644a69...
Getting image source signatures
Copying blob sha256:1d93c12cf2b0c0ba2b2892f57feb93e5c351daba20fb5f23f83cb3e2b9019630
Copying blob sha256:8ec988941d6694de13ed8cb1505c0eb38bf3777bab0acc157ff18974d7350470
Copying blob sha256:ce6203c8c201f1d0c673adf51ba833952b0f23bcb3a27022803769de34192c15
Copyin
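
The log levels in this file are wrapped in ANSI color sequences, which clutter the text when the log is viewed outside a terminal. A minimal sketch for removing them, using a hypothetical `strip_ansi` filter built on `sed`:

```shell
# Hypothetical filter: remove ANSI color sequences (ESC [ ... m) from
# whatever is piped in on stdin.
strip_ansi() {
    local esc
    esc="$(printf '\033')"
    sed "s/${esc}\[[0-9;]*m//g"
}

# Possible usage:
#   strip_ansi < stac-zarr-generation.log | grep -Ev "WARNING|JSHINT"
```

Only color and style sequences ending in `m` are handled; other terminal control sequences would pass through unchanged.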

Let's inspect the stdout produced:

In [7]:
cat stac-zarr-generation-results.json | jq . -

{
  "zarr_stac_catalog": {
    "location": "file:///workspace/zarr-cloud-native-format/runs/lq_o_dwl",
    "basename": "lq_o_dwl",
    "class": "Directory",
    "listing": [
      {
        "class": "Directory",
        "location": "file:///workspace/zarr-cloud-native-format/runs/lq_o_dwl/water-bodies",
        "basename": "water-bodies",
        "listing": [
          {
            "class": "Directory",
            "location": "file:///workspace/zarr-cloud-native-format/runs/lq_o_dwl/water-bodies/water-bodies.zarr",
     