# Demonstrator 2023-ecp-f2f

Refer to `README.md` for context.

## 0. Configuration

Let's start off by sourcing secrets and defining the locations of the various required services:

In [None]:
# source secrets
source .env

# define TES instances
unset TES_INSTANCES
declare -A TES_INSTANCES
TES_INSTANCES["Kubernetes @ ELIXIR-CZ"]="https://tesk-na.cloud.e-infra.cz/"
TES_INSTANCES["Kubernetes @ ELIXIR-FI"]="https://csc-tesk-noauth.rahtiapp.fi/"
TES_INSTANCES["Kubernetes @ ELIXIR-GR"]="https://tesk-eu.hypatia-comp.athenarc.gr/"
TES_INSTANCES["OpenPBS @ ELIXIR-CZ"]="https://funnel.cloud.e-infra.cz/"
TES_INSTANCES["Slurm @ ELIXIR-FI"]="https://vm4816.kaj.pouta.csc.fi/"

# define storage instance
export FTP_INSTANCE="ftp://ntc.ics.muni.cz/"

## 1. Executing tasks via the GA4GH TES API

In this section, we will use both the shell and a dedicated Python library to
send tasks to the defined TES instances.

### Using the shell

Here, we will use the `curl` command-line tool to send requests to the TES
APIs. It should be easy to adapt the calls for use with other tools, such as
Postman or your favorite programming language's HTTP request libraries.

#### **List TES instances**

Let's see what TES instances we have defined:

In [None]:
for key in "${!TES_INSTANCES[@]}"; do
    export TES="${TES_INSTANCES[$key]}"
    echo "$key: $TES"
done

Let's also export these key-value pairs for later use in Python:

In [None]:
rm -f .tes_instances
for key in "${!TES_INSTANCES[@]}"; do
  printf '%s\0' "$key" "${TES_INSTANCES[$key]}"
done > .tes_instances
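
The file written above holds alternating names and URLs, each terminated by a NUL byte, so instance names containing spaces or `@` survive intact. As a minimal sketch of how the Python side could read it back (the helper name `load_tes_instances` is hypothetical and not necessarily what `task_submission.py` does):

```python
from pathlib import Path


def load_tes_instances(path=".tes_instances"):
    """Parse NUL-terminated key/value pairs into a dict (hypothetical helper)."""
    raw = Path(path).read_bytes().decode("utf-8")
    # Every field ends with "\0", so the final split element is empty.
    fields = raw.split("\0")
    if fields and fields[-1] == "":
        fields.pop()
    if len(fields) % 2 != 0:
        raise ValueError("expected an even number of NUL-terminated fields")
    # Even positions are instance names, odd positions are their URLs.
    return dict(zip(fields[0::2], fields[1::2]))
```

Calling `load_tes_instances()` in the notebook's working directory would then return a dictionary mapping each instance name to its URL.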

#### **Running a minimal task**

Now we will submit a very simple task to each of these instances. The task we
use here defines no inputs or outputs, so we do not need to read from or write
to any storage instances.

The payload for the task needs to be provided in JSON format. Nicely formatted,
it looks like this:

```json
{
  "executors": [
    {
      "image": "alpine",
      "command": [
        "echo",
        "hello"
      ]
    }
  ]
}
```

With these instructions, we are asking the TES instance to execute the command
`echo hello` in an Alpine Linux container. Since the image name carries no tag,
the default (`latest`) version of the image is used.

Let's assign the payload to a variable:

In [None]:
export PAYLOAD='{"executors":[{"image":"alpine","command":["echo","hello"]}]}'
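
If you would rather not minify JSON by hand, the same string can be generated. The following sketch uses only Python's standard `json` module; the variable names `task` and `payload` are just for illustration:

```python
import json

task = {
    "executors": [
        {
            "image": "alpine",
            "command": ["echo", "hello"],
        }
    ]
}

# separators=(",", ":") drops the whitespace json.dumps adds by default.
payload = json.dumps(task, separators=(",", ":"))
print(payload)
```

The printed string is identical to the value assigned to `PAYLOAD` above.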

Now we are ready to submit the tasks:

In [None]:
unset TASKS
declare -A TASKS
for key in "${!TES_INSTANCES[@]}"; do
    export TES="${TES_INSTANCES[$key]}"
    echo "Submitting task to $key ($TES)..."
    TASK_ID=$( \
        curl \
            --silent \
            --request "POST" \
            --header "accept: application/json" \
            --header "Content-Type: application/json" \
            --user "${FUNNEL_SERVER_USER}:${FUNNEL_SERVER_PASSWORD}" \
            --data "$PAYLOAD" \
            "${TES%/}/v1/tasks" | \
        jq ".id" - | \
        tr -d '"'
    )
    if [ -z "$TASK_ID" ] || [ "$TASK_ID" == "null" ]; then
        echo "FAILED"
    else
        echo "Task ID: $TASK_ID"
        TASKS["$TASK_ID"]="$TES"
    fi
done
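
The loop above does two things worth spelling out: it normalizes the base URL before appending `/v1/tasks`, and it treats a response without an `id` as a failure. A rough Python sketch of that logic, without any network access, might look like this (both function names are hypothetical):

```python
import json


def tasks_url(base_url):
    """Mirror "${TES%/}/v1/tasks": drop one trailing slash, then append."""
    if base_url.endswith("/"):
        base_url = base_url[:-1]
    return base_url + "/v1/tasks"


def extract_task_id(body):
    """Return the task ID from a response body, or None if there is none."""
    try:
        doc = json.loads(body)
    except json.JSONDecodeError:
        return None  # empty or non-JSON response, e.g. an HTML error page
    if not isinstance(doc, dict):
        return None
    task_id = doc.get("id")
    return task_id if isinstance(task_id, str) and task_id else None
```

Returning `None` for empty, malformed, and `id`-less responses alike keeps the caller's failure check to a single comparison.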

Let's see how the execution of successfully submitted tasks is progressing:

In [None]:
for TASK_ID in "${!TASKS[@]}"; do
    export TES="${TASKS[$TASK_ID]}"
    echo "Checking state of task '$TASK_ID' ($TES)..."
    TASK_STATE=$( \
        curl \
            --silent \
            --request "GET" \
            --header "accept: application/json" \
            --header "Content-Type: application/json" \
            --user "${FUNNEL_SERVER_USER}:${FUNNEL_SERVER_PASSWORD}" \
            "${TES%/}/v1/tasks/${TASK_ID}" | \
        jq ".state" - | \
        tr -d '"'
    )
    echo "Task state: $TASK_STATE"
done
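
The cell above checks each task once. To wait until a task has finished, one could poll until it reaches a terminal state. The sketch below assumes the state names from the TES 1.0 specification (newer versions may define additional ones) and takes the actual HTTP call as a callable, so `fetch_state` and `wait_for_task` are hypothetical names:

```python
import time

# Assumption: states after which a task no longer changes (TES 1.0 names).
TERMINAL_STATES = {"COMPLETE", "EXECUTOR_ERROR", "SYSTEM_ERROR", "CANCELED"}


def wait_for_task(fetch_state, interval=5.0, max_polls=60, sleep=time.sleep):
    """Poll fetch_state() until a terminal state or the poll budget runs out."""
    state = "UNKNOWN"
    for attempt in range(max_polls):
        state = fetch_state()
        if state in TERMINAL_STATES:
            break
        # Do not sleep after the final attempt.
        if attempt < max_polls - 1:
            sleep(interval)
    return state
```

Here `fetch_state` would wrap the `GET` request shown above and return the value of the `state` field. The function returns the last state seen, so the caller can tell a finished task from one that was still running when polling stopped.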

#### **Running a task with inputs and outputs**

Let's try a slightly more realistic task with an input (fetched from the web)
and an output (written to an FTP instance).

We define the following payload:

```json
{
  "name": "md5sum",
  "description": "calculate md5sum of input file and write to output file",
  "tags": {
    "project": "2023-ecp-f2f Demonstrator",
    "project_owner": "ELIXIR Cloud & AAI"
  },
  "executors": [
    {
      "command": [
        "md5sum",
        "/data/input"
      ],
      "image": "alpine",
      "stdout": "/data/output",
      "workdir": "/data"
    }
  ],
  "inputs": [
    {
      "url": "{{INPUT_FILE}}",
      "path": "/data/input"
    }
  ],
  "outputs": [
    {
      "path": "/data/output",
      "url": "{{FTP_INSTANCE}}/2023-ecp-f2f/md5sum",
      "type": "FILE"
    }
  ],
  "resources": {
    "cpu_cores": 1,
    "disk_gb": 1,
    "preemptible": false,
    "ram_gb": 1
  }
}
```

As you can see, here we determine the MD5 sum of an input file, write it to an
output file inside the container, and finally copy it over to an FTP server.

Let's minify that and replace the placeholders `{{INPUT_FILE}}` and
`{{FTP_INSTANCE}}` with some actual values.

> Note that because Funnel currently only allows [passing FTP storage
> credentials via the FTP
> URL](https://ohsu-comp-bio.github.io/funnel/docs/storage/ftp/) and TESK does
> not support FTP URLs with credentials, we need to use different payloads for
> the two services!

In [None]:
PAYLOAD_RAW='{"name":"md5sum","description":"calculate md5sum of input file and write to output file","tags":{"project":"2023-ecp-f2f Demonstrator","project_owner":"ELIXIR Cloud & AAI"},"executors":[{"command":["md5sum","/data/input"],"image":"alpine","stdout":"/data/output","workdir":"/data"}],"inputs":[{"url":"{{INPUT_FILE}}","path":"/data/input"}],"outputs":[{"path":"/data/output","url":"{{FTP_INSTANCE}}/2023-ecp-f2f/md5sum","type":"FILE"}],"resources":{"cpu_cores":1,"disk_gb":1,"preemptible":false,"ram_gb":1}}'
PAYLOAD_TMP=$(sed 's#{{INPUT_FILE}}#https://raw.githubusercontent.com/elixir-cloud-aai/elixir-cloud-demos/df5be391faf992ebcd5ec2b2aad581c99de26101/LICENSE#' <<< "$PAYLOAD_RAW")
export PAYLOAD_TESK=$(sed "s|{{FTP_INSTANCE}}|${FTP_INSTANCE%/}|" <<< "$PAYLOAD_TMP")
# NOTE: this substitution breaks if FTP_USER or FTP_PASSWORD contain '|', '&' or '\'
export PAYLOAD_FUNNEL=$(sed "s|ftp://|ftp://${FTP_USER}:${FTP_PASSWORD}@|g" <<< "$PAYLOAD_TESK")
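
The `sed` calls above work as long as the substituted values contain no characters that are special to `sed` or to URLs. As a sketch of a more defensive variant, the same two steps could be done in Python, percent-encoding the credentials before placing them in the URL. Both helper names are hypothetical, and whether a given TES server decodes percent-encoded credentials is an assumption you would need to verify:

```python
from urllib.parse import quote, urlsplit, urlunsplit


def fill_placeholders(template, values):
    """Replace each {{NAME}} placeholder with its value (no JSON escaping)."""
    for name, value in values.items():
        template = template.replace("{{" + name + "}}", value)
    return template


def with_credentials(url, user, password):
    """Return url with percent-encoded user:password@ in front of the host."""
    parts = urlsplit(url)
    userinfo = quote(user, safe="") + ":" + quote(password, safe="")
    return urlunsplit(parts._replace(netloc=userinfo + "@" + parts.netloc))
```

For example, `with_credentials("ftp://ntc.ics.muni.cz/2023-ecp-f2f/md5sum", user, password)` yields the kind of URL the Funnel payload needs, while the TESK payload keeps the URL without credentials.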

Let's submit as before (but setting the payload according to the service):

In [None]:
unset TASKS
declare -A TASKS
for key in "${!TES_INSTANCES[@]}"; do
    export TES="${TES_INSTANCES[$key]}"
    if [[ $key =~ "Kubernetes" ]]; then
        export PAYLOAD="$PAYLOAD_TESK"
        export SERVICE="TESK"
    else
        export PAYLOAD="$PAYLOAD_FUNNEL"
        export SERVICE="Funnel"
    fi
    echo "Submitting task to $key ($SERVICE deployed at $TES)..."
    TASK_ID=$( \
        curl \
            --silent \
            --request "POST" \
            --header "accept: application/json" \
            --header "Content-Type: application/json" \
            --user "${FUNNEL_SERVER_USER}:${FUNNEL_SERVER_PASSWORD}" \
            --data "$PAYLOAD" \
            "${TES%/}/v1/tasks" | \
        jq ".id" - | \
        tr -d '"'
    )
    if [ -z "$TASK_ID" ] || [ "$TASK_ID" == "null" ]; then
        echo "FAILED"
    else
        echo "Task ID: $TASK_ID"
        TASKS["$TASK_ID"]="$TES"
    fi
done

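Now let's check on the tasks again, this time printing each task record in full rather than just its state (set `VIEW=FULL` to also include executor logs such as `stdout` and `stderr`):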

In [None]:
export VIEW=BASIC
for TASK_ID in "${!TASKS[@]}"; do
    export TES="${TASKS[$TASK_ID]}"
    echo "================================================================================"
    echo "Logs of task '$TASK_ID' ($TES): "
    echo "================================================================================"
    TASK_STATE=$( \
        curl \
            --silent \
            --request "GET" \
            --header "accept: application/json" \
            --header "Content-Type: application/json" \
            --user "${FUNNEL_SERVER_USER}:${FUNNEL_SERVER_PASSWORD}" \
            "${TES%/}/v1/tasks/${TASK_ID}?view=${VIEW}" | \
        jq . - | \
        sed "s/${FTP_USER}:${FTP_PASSWORD}@//g"  # remove FTP credentials from logs
    )
    echo -e "$TASK_STATE\n"
done
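
The `sed` filter above removes one known `user:password@` string from the output. A pattern-based alternative, sketched here in Python, strips the credentials part of any `ftp://` URL without needing to know the credentials themselves (the function name is hypothetical):

```python
import re

# Matches everything between "ftp://" and the last "@" before the first "/".
_FTP_USERINFO = re.compile(r'(ftp://)[^/\s"]*@')


def redact_ftp_credentials(text):
    """Strip any userinfo component from ftp:// URLs found in text."""
    return _FTP_USERINFO.sub(r"\1", text)
```

Because the pattern never embeds the password, it also avoids problems with passwords that contain characters special to regular expressions.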

#### **Other TES operations**

We have seen how we can submit tasks and get summary or detailed information on
individual tasks.

The full list of currently supported operations is:

| HTTP Method | Endpoint | Description |
| --- | --- | --- |
| GET | `/service-info` | Fetch information about the service and its optional capabilities |
| POST | `/tasks` | Create a task |
| GET | `/tasks` | Fetch a list of all tasks |
| GET | `/tasks/{task_id}` | Fetch details about a specific task |
| POST | `/tasks/{task_id}:cancel` | Cancel a task |

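To make the task-related operations concrete, here is a small sketch that builds (but does not send) the corresponding requests with Python's standard library. The helper name `tes_request` is hypothetical, and the `/v1` prefix is an assumption taken from the `curl` calls earlier in this notebook. Note that cancellation is a `POST` in the TES specification:

```python
from urllib.request import Request


def tes_request(base_url, operation, task_id=None, api_prefix="/v1"):
    """Build (but do not send) a request for one of the task operations."""
    routes = {
        "create": ("POST", "/tasks"),
        "list": ("GET", "/tasks"),
        "get": ("GET", "/tasks/{task_id}"),
        "cancel": ("POST", "/tasks/{task_id}:cancel"),
    }
    method, path = routes[operation]
    if "{task_id}" in path and not task_id:
        raise ValueError(f"operation '{operation}' requires a task_id")
    url = base_url.rstrip("/") + api_prefix + path.format(task_id=task_id)
    return Request(url, method=method, headers={"Accept": "application/json"})
```

The returned object could be passed to `urllib.request.urlopen` to actually perform the call. A `create` request would additionally need the JSON payload as `data` and a `Content-Type: application/json` header.
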
### Using the `py-tes` Python library

In this section, we will submit the simple task from above using the Python
TES client `py-tes`.

Usage is simple (from the [`py-tes` repository](https://github.com/ohsu-comp-bio/py-tes)):

```python
import tes

task = tes.Task(
    executors=[
        tes.Executor(
            image="alpine",
            command=["echo", "hello"]
        )
    ]
)

cli = tes.HTTPClient("http://funnel.example.com", timeout=5)
task_id = cli.create_task(task)
res = cli.get_task(task_id)
```

To do so, we will execute the Python script `task_submission.py`, which
triggers the execution of our task on the TES instances configured at the top
of the notebook.

In [None]:
./task_submission.py