# STAC Catalog 

Invoke a Common Workflow Language (CWL) CommandLineTool to generate the STAC Catalog.

This notebook is linked to: https://eoap.github.io/zarr-cloud-native-format/cwl-cli/stac



## Setup

In [16]:
export WORKSPACE=/workspace/zarr-cloud-native-format
export RUNTIME=${WORKSPACE}/runs
mkdir -p ${RUNTIME}
cd ${RUNTIME}

curl -q -L https://github.com/eoap/zarr-cloud-native-format/releases/download/0.5.0/app-water-bodies.0.5.0.cwl > ${WORKSPACE}/cwl-workflow/app-water-bodies.cwl 2> /dev/null
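
Because the `curl` call above discards stderr, a failed download can go unnoticed and leave an empty file behind. A minimal sketch of a guard follows; `require_nonempty_file` is a hypothetical helper name, not part of the notebook, and it only checks that the file exists and is non-empty (an HTTP error page would still pass, which `curl --fail` is meant to prevent).

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption, not from the notebook):
# succeed only if the given file exists and has a size greater than zero.
require_nonempty_file() {
    if [ ! -s "$1" ]; then
        echo "error: $1 is missing or empty" >&2
        return 1
    fi
}

# Possible usage after the download:
#   require_nonempty_file "${WORKSPACE}/cwl-workflow/app-water-bodies.cwl"
```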

## Run the STAC Catalog generation CommandLineTool

Inspect the STAC Catalog generation definition, then run it with `cwltool`:


In [17]:
cat ${WORKSPACE}/cwl-workflow/app-water-bodies.cwl | yq e '.["$graph"][6]' -

class: CommandLineTool
id: stac-collection
requirements:
  InlineJavascriptRequirement: {}
  ResourceRequirement:
    coresMax: 1
    ramMax: 512
hints:
  DockerRequirement:
    dockerPull: ghcr.io/eoap/zarr-cloud-native-format/stac-collection@sha256:a6ab00463338b4069eb57b3944d92cf39fd2307176ad5b5cb357836c56a22ca2
baseCommand: ["stac-collection"]
arguments: []
inputs:
  item:
    type:
      type: array
      items: string
      inputBinding:
        prefix: --input-item
  rasters:
    type:
      type: array

Before running the CWL description, prepare its parameters.

The previous step generated two GeoTIFFs, the detected water bodies (`otsu.tif`) and the NDWI (`norm_diff.tif`):

In [18]:
cat water-bodies-results.json 

{
    "detected_water_body": {
        "location": "file:///workspace/zarr-cloud-native-format/runs/otsu.tif",
        "basename": "otsu.tif",
        "class": "File",
        "checksum": "sha1$a7f9a22f096d7bb5e2a0ecfe0abba2eab2350f1f",
        "size": 1100785,
        "path": "/workspace/zarr-cloud-native-format/runs/otsu.tif"
    },
    "ndwi": {
        "location": "file:///workspace/zarr-cloud-native-format/runs/norm_diff.tif",
        "basename": "norm_diff.tif",
        "class": "File",
        "checksum": "sha1$9e77047b2de97a3c2bf78e18e29867463c418c53",
        "size": 233064610,
        "path": "/workspace/zarr-cloud-native-format/runs/norm_diff.tif"
    }
}


Build the job parameters file from the two GeoTIFFs (`otsu.tif` and `norm_diff.tif`) and the associated STAC Item.

In [26]:
ostu_tif=$(cat water-bodies-results.json  | jq '.detected_water_body.path')
ndwi_tif=$(cat water-bodies-results.json  | jq '.ndwi.path')
echo ${ostu_tif}
echo ${ndwi_tif}

"/workspace/zarr-cloud-native-format/runs/otsu.tif"
"/workspace/zarr-cloud-native-format/runs/norm_diff.tif"
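
The values above keep their double quotes because `jq` prints JSON-encoded strings by default. That is harmless in the YAML parameters file, where a quoted scalar is valid, but the quotes would become part of the path if the variable were passed to a shell command. `jq -r` prints raw strings instead. As an alternative, here is a minimal pure-shell sketch; `strip_json_quotes` is a hypothetical helper name, not part of the notebook.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption, not from the notebook):
# remove one leading and one trailing double quote from a value,
# such as a string printed by `jq` without the -r option.
strip_json_quotes() {
    local value="$1"
    value="${value#\"}"   # drop a leading quote, if present
    value="${value%\"}"   # drop a trailing quote, if present
    printf '%s\n' "${value}"
}

# Example with the first value shown above
strip_json_quotes '"/workspace/zarr-cloud-native-format/runs/otsu.tif"'
```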


In [20]:
item=$( cat convert-search-results.json | jq '.items[0]' )

echo ${item}

"https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a/items/S2A_10TFK_20210728_0_L2A"


In [27]:
cat <<EOF > stac-generation-params.yaml
item: 
- ${item}
rasters:
- class: File
  path: ${ostu_tif}
ndwis:
- class: File
  path: ${ndwi_tif}
EOF

cat stac-generation-params.yaml | yq .

item:
  - "https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a/items/S2A_10TFK_20210728_0_L2A"
rasters:
  - class: File
    path: "/workspace/zarr-cloud-native-format/runs/otsu.tif"
ndwis:
  - class: File
    path: "/workspace/zarr-cloud-native-format/runs/norm_diff.tif"

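Since the parameters file references local files, a quick pre-flight check can catch a wrong path before the container starts. The sketch below assumes the simple layout shown above, with one `path:` entry per line; `missing_paths` is a hypothetical helper name, and the `sed` expression is not a general YAML parser.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (an assumption, not from the notebook):
# read every `path:` entry of a job parameters file and report the
# entries that do not point to an existing regular file.
missing_paths() {
    local params_file="$1" path status=0
    while IFS= read -r path; do
        [ -n "${path}" ] || continue
        path="${path#\"}"   # tolerate JSON-style quoted values
        path="${path%\"}"
        if [ ! -f "${path}" ]; then
            echo "missing: ${path}" >&2
            status=1
        fi
    done <<EOF
$(sed -n 's/^[[:space:]]*path:[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*$/\1/p' "${params_file}")
EOF
    return "${status}"
}

# Possible usage before invoking cwltool:
#   missing_paths stac-generation-params.yaml || echo "fix the parameters first"
```
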

In [29]:


cwltool \
    --podman \
    --outdir ${WORKSPACE}/runs \
    ${WORKSPACE}/cwl-workflow/app-water-bodies.cwl#stac-collection \
    stac-generation-params.yaml > stac-generation-results.json 2> stac-generation.log
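
With stdout and stderr both redirected to files, nothing on screen shows whether the run worked. `cwltool` returns a non-zero exit code on failure and reports the final process status in its log. A minimal sketch of a log check follows, assuming the status line reads "Final process status is success" as in recent `cwltool` releases; `run_succeeded` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption, not from the notebook):
# succeed if the cwltool log contains the success status line.
# The exact wording of that line is assumed from recent cwltool releases.
run_succeeded() {
    grep -q "Final process status is success" "$1"
}

# Possible usage after the run:
#   run_succeeded stac-generation.log || echo "run failed, see stac-generation.log"
```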

Inspect the stderr log, filtering out the warnings and JSHINT messages:

In [None]:
cat stac-generation.log | grep -Ev "WARNING|JSHINT"

INFO /home/fbrito/.local/bin/cwltool 3.1.20250110105449
INFO Resolved '/workspace/zarr-cloud-native-format/cwl-workflow/app-water-bodies.cwl#stac' to 'file:///workspace/zarr-cloud-native-format/cwl-workflow/app-water-bodies.cwl#stac'
INFO [job stac] /tmp/m9tf6uoo$ podman \
    run \
    -i \
    --userns=keep-id \
    --mount=type=bind,source=/tmp/m9tf6uoo,target=/oAmJOc \
    --mount=type=bind,source=/tmp/9est9nzk,target=/tmp \
    --mount=type=bind,source=/workspace/zarr-cloud-native-format/runs/otsu.tif,target=/var/lib/cwl/stg0dfb665e-3b44-48b1-943e-f38b4ff6d03c/otsu.tif,readonly \
    --workdir=/oAmJOc \
    --read-only=true \
    --user=1000:1000 \
    --rm \
    --cidfile=/tmp/qowl65pu/20260114122904-265269.cid \
    --env=TMPDIR=/tmp \
    --env=HOME=/oAmJOc \
    --env=PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
    --env=PYTHONPATH=/app \
    stac:bb \
    python \
    -m \
    app \
    --input-item \
    https://earth-

Inspect the stdout produced. The `temp_stac_catalog` output is a `Directory` whose listing includes the `water-bodies` collection and its `collection.json`:

In [30]:
cat stac-generation-results.json | jq . -

{
  "temp_stac_catalog": {
    "location": "file:///workspace/zarr-cloud-native-format/runs/hicmgmcc",
    "basename": "hicmgmcc",
    "class": "Directory",
    "listing": [
      {
        "class": "Directory",
        "location": "file:///workspace/zarr-cloud-native-format/runs/hicmgmcc/water-bodies",
        "basename": "water-bodies",
        "listing": [
          {
            "class": "File",
            "location": "file:///workspace/zarr-cloud-native-format/runs/hicmgmcc/water-bodies/collection.json",