# Convert-search

This step runs a Common Workflow Language (CWL) `CommandLineTool` whose shell script uses `yq` to:
- extract the `self` link `href` of each discovered STAC Item; these hrefs are the inputs to the water bodies detection sub-workflow
- extract the AOI bounding box from the `search_request` input parameter

This notebook is linked to: https://eoap.github.io/zarr-cloud-native-format/cwl-cli/convert-search/



## Setup

In [8]:
export WORKSPACE=/workspace/zarr-cloud-native-format
export RUNTIME=${WORKSPACE}/runs
mkdir -p ${RUNTIME}
cd ${RUNTIME}

curl -q -L https://github.com/eoap/zarr-cloud-native-format/releases/download/0.3.0/app-water-bodies.0.3.0.cwl > ${WORKSPACE}/cwl-workflow/app-water-bodies.cwl 2> /dev/null
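
Because the `curl` call above discards stderr and does not pass `--fail`, a failed download can leave an empty file or an error page where the CWL document should be. A minimal sanity check could look like the sketch below. It assumes the CWL document declares a top-level `cwlVersion:` key at the start of a line; the helper name `looks_like_cwl` is hypothetical and not part of the original notebook.

```shell
# Hypothetical helper: succeed only if the file is non-empty and
# declares a top-level cwlVersion key (an assumption about the file layout).
looks_like_cwl() {
    [ -s "$1" ] && grep -q '^cwlVersion:' "$1"
}

# Fall back to the notebook's workspace path if WORKSPACE is not exported.
cwl_file="${WORKSPACE:-/workspace/zarr-cloud-native-format}/cwl-workflow/app-water-bodies.cwl"

if ! looks_like_cwl "$cwl_file"; then
    echo "download looks incomplete: $cwl_file" >&2
fi
```

The check is deliberately shallow: it catches an empty or obviously wrong file, not an invalid workflow.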

## Run the convert-search step

Inspect the CommandLineTool definition, then run it with `cwltool`:


In [9]:
cat ${WORKSPACE}/cwl-workflow/app-water-bodies.cwl | yq e '.["$graph"][1]' -

class: CommandLineTool
id: convert-search
label: Gets the item self hrefs
doc: Gets the item self hrefs from a STAC search result
baseCommand: ["/bin/sh", "run.sh"]
arguments: []
hints:
  DockerRequirement:
    dockerPull: ghcr.io/eoap/zarr-cloud-native-format/yq@sha256:401655f3f4041bf3d03b05f3b24ad4b9d18cfcf908c3b44f5901383621d0688a
requirements:
  - class: InlineJavascriptRequirement
  - class: SchemaDefRequirement
    types:
      - $import: https://raw.githubusercontent.com/eoap/schemas/main/string_format.yaml
      - $import: https://raw.githubusercontent.com/eoap/schemas/main/geojson.yaml
      - $import: |-
          htt

The shell script that the tool runs (`run.sh`) is:

In [10]:
yq e '.["$graph"][1].requirements[]
      | select(.class == "InitialWorkDirRequirement")
      | .listing[0].entry' \
  "${WORKSPACE}/cwl-workflow/app-water-bodies.cwl"

#!/usr/bin/env sh
set -x
set -euo pipefail

yq '[.features[].links[] | select(.rel=="self") | .href]' "$(inputs.search_results.path)" > items.json

echo "$(inputs.search_request)" | yq '.bbox | @csv' - > aoi.txt
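
The second command flattens the `bbox` array into one comma-separated line, which for the search request used in this notebook is `-121.399,39.834,-120.74,40.472`. A consumer that needs the individual coordinates can split that string again. The sketch below is plain POSIX shell and is not part of the original workflow; the variable names `west`, `south`, `east` and `north` are illustrative and follow the usual GeoJSON bbox order.

```shell
# The aoi value produced for this notebook's search request.
aoi="-121.399,39.834,-120.74,40.472"

# Split on commas; IFS is changed only for this read command.
IFS=, read -r west south east north <<EOF
$aoi
EOF

echo "west=$west south=$south east=$east north=$north"
```

Setting `IFS` only on the `read` command keeps the rest of the script's word splitting unchanged.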


Run the CWL description. First, write the input parameters file:

In [12]:
cat <<EOF > convert-search-params.yaml
search_request:
  bbox:
  - -121.399
  - 39.834
  - -120.74
  - 40.472
  collections:
  - sentinel-2-l2a
  datetime_interval:
    end:
      value: '2021-08-01T23:59:59'
    start:
      value: '2021-06-01T00:00:00'
  limit: 20
  max-items: 10

search_results:
  class: File
  path: "${WORKSPACE}/runs/discovery-output.json"
EOF

cat convert-search-params.yaml | yq .

search_request:
  bbox:
    - -121.399
    - 39.834
    - -120.74
    - 40.472
  collections:
    - sentinel-2-l2a
  datetime_interval:
    end:
      value: '2021-08-01T23:59:59'
    start:
      value: '2021-06-01T00:00:00'
  limit: 20
  max-items: 10
search_results:
  class: File
  path: "/workspace/zarr-cloud-native-format/runs/discovery-output.json"


In [14]:


cwltool \
    --podman \
    --outdir ${WORKSPACE}/runs \
    ${WORKSPACE}/cwl-workflow/app-water-bodies.cwl#convert-search \
    convert-search-params.yaml > convert-search-results.json 2> convert-search.log
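
The redirections above rely on `cwltool` printing the output object as JSON on stdout and its log messages on stderr, so each stream can go to its own file. The pattern can be sketched with a stand-in function. `fake_tool` is hypothetical and only mimics that split; it does not run any workflow.

```shell
# Hypothetical stand-in for a tool that reports results on stdout
# and diagnostics on stderr.
fake_tool() {
    echo '{"aoi": "-121.399,39.834,-120.74,40.472"}'
    echo 'INFO starting job' >&2
}

workdir=$(mktemp -d)

# stdout goes to the results file, stderr to the log file.
fake_tool > "$workdir/results.json" 2> "$workdir/run.log"
status=$?

# Only trust the results file if the tool exited successfully.
if [ "$status" -ne 0 ]; then
    echo "tool failed, see $workdir/run.log" >&2
fi
```

Keeping the log out of the results file means the JSON can be passed to `jq` without any filtering.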

Let's look at the stderr log, filtering out the warning and JSHINT lines:

In [15]:
cat convert-search.log | egrep -v "WARNING|JSHINT"

INFO /home/fbrito/.local/bin/cwltool 3.1.20250110105449
INFO Resolved '/workspace/zarr-cloud-native-format/cwl-workflow/app-water-bodies.cwl#convert-search' to 'file:///workspace/zarr-cloud-native-format/cwl-workflow/app-water-bodies.cwl#convert-search'
INFO [job convert-search] /tmp/z4_3c9ik$ podman \
    run \
    -i \
    --userns=keep-id \
    --mount=type=bind,source=/tmp/z4_3c9ik,target=/xSpKRG \
    --mount=type=bind,source=/tmp/lasrxu0p,target=/tmp \
    --mount=type=bind,source=/workspace/zarr-cloud-native-format/runs/discovery-output.json,target=/var/lib/cwl/stg68b29ab9-0a96-4f10-8ce9-3b59912147e3/discovery-output.json,readonly \
    --workdir=/xSpKRG \
    --read-only=true \
    --user=1000:1000 \
    --rm \
    --cidfile=/tmp/xwp01rr9/20250915142619-690236.cid \
    --env=TMPDIR=/tmp \
    --env=HOME=/xSpKRG \
    ghcr.io/eoap/zarr-cloud-native-format/yq@sha256:401655f3f4041bf3d03b05f3b24ad4b9d18cfcf908c3b44f5901383621d0688a \
    /bin/sh \


Let's inspect the stdout produced. It is a JSON object with two outputs, `aoi` and `items`.

These are the inputs that the water bodies detection sub-workflow expects.

In [16]:
cat convert-search-results.json | jq . -

{
  "aoi": "-121.399,39.834,-120.74,40.472",
  "items": [
    "https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a/items/S2A_10TFK_20210728_0_L2A",
    "https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a/items/S2A_10TFK_20210728_1_L2A",
    "https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a/items/S2B_10TFK_20210723_1_L2A",
    "https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a/items/S2B_10TFK_20210723_0_L2A",
    "https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a/items/S2A_10TFK_20210718_0_L2A",
    "https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a/items/S2B_10TFK_20210713_1_L2A",
    "https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a/items/S2B_10TFK_20210713_0_