# STAC Catalog 

Invoke a Common Workflow Language (CWL) `CommandLineTool` to generate the STAC Catalog.

This notebook is linked to: https://eoap.github.io/zarr-cloud-native-format/cwl-cli/stac



## Setup

In [1]:
export WORKSPACE=/workspace/zarr-cloud-native-format
export RUNTIME=${WORKSPACE}/runs
mkdir -p ${RUNTIME}
cd ${RUNTIME}

curl -q -L https://github.com/eoap/zarr-cloud-native-format/releases/download/0.3.0/app-water-bodies.0.3.0.cwl > ${WORKSPACE}/cwl-workflow/app-water-bodies.cwl 2> /dev/null
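
The `curl` call above discards stderr and, by default, does not fail on HTTP errors, so a failed download can leave an empty or non-CWL file behind. A minimal sketch of a sanity check follows. The helper name `check_cwl_file` is hypothetical (it is not part of the notebook), and the check assumes the packed CWL document has a top-level `cwlVersion:` key. It is demonstrated on throwaway files rather than the real download:

```shell
# Hypothetical helper: succeed only if the file exists, is non-empty,
# and contains a top-level cwlVersion key (an assumption about the CWL file).
check_cwl_file() {
    [ -s "$1" ] && grep -q '^cwlVersion:' "$1"
}

# Demonstration on throwaway files (stand-ins for the downloaded CWL document).
demo_dir=$(mktemp -d)
printf 'cwlVersion: v1.2\n$graph: []\n' > "${demo_dir}/good.cwl"
: > "${demo_dir}/empty.cwl"

check_cwl_file "${demo_dir}/good.cwl" && good_status=ok || good_status=bad
check_cwl_file "${demo_dir}/empty.cwl" && empty_status=ok || empty_status=bad
echo "good.cwl: ${good_status}, empty.cwl: ${empty_status}"
```

In the notebook this could be used as `check_cwl_file ${WORKSPACE}/cwl-workflow/app-water-bodies.cwl || echo "download failed" >&2`.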

## Run the STAC Catalog generation CommandLineTool

Inspect the STAC Catalog generation definition, then run it with `cwltool`:


In [2]:
cat ${WORKSPACE}/cwl-workflow/app-water-bodies.cwl | yq e '.["$graph"][6]' -

class: CommandLineTool
id: stac
requirements:
  InlineJavascriptRequirement: {}
  EnvVarRequirement:
    envDef:
      PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
      PYTHONPATH: /app
  ResourceRequirement:
    coresMax: 1
    ramMax: 512
hints:
  DockerRequirement:
    dockerPull: ghcr.io/eoap/mastering-app-package/stac@sha256:cb54ab738a7d6544f2037368604c1c40e3f5af3eac1250ed14d365e5acb0c6b5
baseCommand: ["python", "-m", "app"]
arguments: []
inputs:
  item:
    type:
      type: array
      items:

Before running the CWL description, prepare its parameters.

The previous step generated the water bodies detection GeoTIFF:

In [3]:
cat water-bodies-results.json 

{
    "detected_water_body": {
        "location": "file:///workspace/zarr-cloud-native-format/runs/otsu.tif",
        "basename": "otsu.tif",
        "class": "File",
        "checksum": "sha1$a7f9a22f096d7bb5e2a0ecfe0abba2eab2350f1f",
        "size": 1100785,
        "path": "/workspace/zarr-cloud-native-format/runs/otsu.tif"
    }
}


Let's build the job parameters file from the `otsu.tif` file and the associated STAC Item input:

In [10]:
ostu_tif=$(cat water-bodies-results.json  | jq '.detected_water_body.path')

echo ${ostu_tif}

"/workspace/zarr-cloud-native-format/runs/otsu.tif"

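The echoed value keeps its surrounding double quotes because `jq` prints JSON-encoded strings unless it is called with `-r` (`--raw-output`). That is harmless when the value is pasted into YAML, but a shell test such as `[ -f ... ]` needs the bare path. A small sketch of stripping the quotes with parameter expansion, where the variable names `quoted` and `bare` are illustrative only:

```shell
# Value as jq prints it without -r: a JSON string, quotes included.
quoted='"/workspace/zarr-cloud-native-format/runs/otsu.tif"'

# Remove one leading and one trailing double quote, if present.
bare=${quoted#\"}
bare=${bare%\"}

echo "${bare}"
```

With the bare path, a check like `[ -f "${bare}" ]` can confirm the raster exists before it is referenced in the parameters file.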

In [9]:
item=$( cat convert-search-results.json | jq '.items[0]' )

echo ${item}

"https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a/items/S2A_10TFK_20210728_0_L2A"


In [14]:
cat <<EOF > stac-generation-params.yaml
item: 
- ${item}
rasters:
- class: File
  path: ${ostu_tif}
EOF

cat stac-generation-params.yaml | yq .

item:
  - "https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a/items/S2A_10TFK_20210728_0_L2A"
rasters:
  - class: File
    path: "/workspace/zarr-cloud-native-format/runs/otsu.tif"


In [15]:


cwltool \
    --podman \
    --outdir ${WORKSPACE}/runs \
    ${WORKSPACE}/cwl-workflow/app-water-bodies.cwl#stac \
    stac-generation-params.yaml > stac-generation-results.json 2> stac-generation.log
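
Because both output streams are redirected to files, a failing run produces no visible message in the notebook. A hedged sketch of a wrapper that keeps the same redirections but reports the exit status is shown here. `run_logged` is a hypothetical helper name, and the demonstration uses `sh -c` stand-ins rather than `cwltool`:

```shell
# Hypothetical helper: run a command, send stdout to $1 and stderr to $2,
# and print the tail of the log if the command fails.
run_logged() {
    out_file=$1
    log_file=$2
    shift 2
    if "$@" > "${out_file}" 2> "${log_file}"; then
        echo "ok: results in ${out_file}"
    else
        status=$?
        echo "failed with exit code ${status}; last log lines:" >&2
        tail -n 5 "${log_file}" >&2
        return "${status}"
    fi
}

# Demonstration with stand-in commands (not cwltool).
demo_dir=$(mktemp -d)
ok_status=0
run_logged "${demo_dir}/out.json" "${demo_dir}/run.log" sh -c 'echo "{}"' || ok_status=$?
fail_status=0
run_logged "${demo_dir}/out.json" "${demo_dir}/run.log" sh -c 'echo "boom" >&2; exit 3' || fail_status=$?
echo "ok_status=${ok_status} fail_status=${fail_status}"
```

Under these assumptions, the run above could be wrapped as `run_logged stac-generation-results.json stac-generation.log cwltool --podman --outdir ${WORKSPACE}/runs ${WORKSPACE}/cwl-workflow/app-water-bodies.cwl#stac stac-generation-params.yaml`.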

Let's look at the content of stderr, filtering out the WARNING and JSHINT lines:

In [16]:
cat stac-generation.log | grep -Ev "WARNING|JSHINT"

INFO /home/fbrito/.local/bin/cwltool 3.1.20250110105449
INFO Resolved '/workspace/zarr-cloud-native-format/cwl-workflow/app-water-bodies.cwl#stac' to 'file:///workspace/zarr-cloud-native-format/cwl-workflow/app-water-bodies.cwl#stac'
INFO [job stac] /tmp/giqo2xko$ podman \
    run \
    -i \
    --userns=keep-id \
    --mount=type=bind,source=/tmp/giqo2xko,target=/RhFOvY \
    --mount=type=bind,source=/tmp/p0mjuywr,target=/tmp \
    --mount=type=bind,source=/workspace/zarr-cloud-native-format/runs/otsu.tif,target=/var/lib/cwl/stg026e359f-2aa4-4147-8064-d8afa89091ca/otsu.tif,readonly \
    --workdir=/RhFOvY \
    --read-only=true \
    --user=1000:1000 \
    --rm \
    --cidfile=/tmp/7jv6tx7d/20250915144715-676009.cid \
    --env=TMPDIR=/tmp \
    --env=HOME=/RhFOvY \
    --env=PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
    --env=PYTHONPATH=/app \
    ghcr.io/eoap/mastering-app-package/stac@sha256:cb54ab738a7d6544f2037368604c1c40
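
`cwltool` wraps its log levels in ANSI color escape sequences, which get in the way when the log is searched or pasted elsewhere. A minimal sketch of stripping them with `sed`, shown on a single sample line rather than the real log file; the variable names are illustrative:

```shell
# Build the ESC character portably (sed's \x1b escape is a GNU extension).
esc=$(printf '\033')

# Sample line shaped like the first cwltool log line, with color codes around INFO.
sample=$(printf '\033[1;30mINFO\033[0m %s' "/home/fbrito/.local/bin/cwltool 3.1.20250110105449")

# Drop every "ESC [ digits-and-semicolons m" sequence.
clean=$(printf '%s\n' "${sample}" | sed "s/${esc}\[[0-9;]*m//g")

echo "${clean}"
```

Applied to the log file, the same expression could precede the filter: `sed "s/${esc}\[[0-9;]*m//g" stac-generation.log | grep -Ev "WARNING|JSHINT"`.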

Let's inspect the stdout produced. It contains a single `temp_stac_catalog` output of class `Directory`, whose listing includes the generated `catalog.json`.

In [17]:
cat stac-generation-results.json | jq . -

{
  "temp_stac_catalog": {
    "location": "file:///workspace/zarr-cloud-native-format/runs/giqo2xko",
    "basename": "giqo2xko",
    "class": "Directory",
    "listing": [
      {
        "class": "File",
        "location": "file:///workspace/zarr-cloud-native-format/runs/giqo2xko/catalog.json",
        "basename": "catalog.json",
        "size": 363,
        "checksum": "sha1$60975eb0efd2cb8615bfa4b181774ea961ee7d54",
        "path": "/workspace/zarr-cloud-native-format/runs/giqo2xko/catalog.json"
      }