## 1. Configuration

Refer to `README.md` for installation instructions.

Let's start by sourcing our secrets and defining the locations of the required services:

In [None]:
# source secrets
source .env
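
Because later cells pass several variables on to Snakemake, it can help to check right after sourcing that they are actually set. The following is a minimal sketch, not part of the original demonstrator: the helper name `missing_vars` is hypothetical, and it assumes that the variables used further down (including `TES_GATEWAY`) are all meant to come from `.env`.

```bash
# Hypothetical helper: print the name of every listed variable that is
# unset or empty. Uses bash indirect expansion (${!name}).
missing_vars() {
    local name
    for name in "$@"; do
        if [[ -z "${!name:-}" ]]; then
            echo "$name"
        fi
    done
}

# Assumption: these are the variables the later cells rely on.
missing=$(missing_vars ACCESS_KEY_ID SECRET_ACCESS_KEY ENDPOINT_URL BUCKET_PATH TES_GATEWAY)
if [[ -n "$missing" ]]; then
    # $missing is left unquoted on purpose so the names print on one line
    echo "Missing configuration variables:" $missing >&2
fi
```

If nothing is printed, all listed variables have a non-empty value.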

## 2. List TES instances

Let's see what TES instances we have defined:

In [None]:
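# Start from a clean associative array mapping instance names to URLs
unset TES_INSTANCES
declare -A TES_INSTANCES
# Each line of .tes_instances has the form "KEY,URL"; the "|| [[ -n ... ]]"
# clause also picks up a last line that lacks a trailing newline
while IFS=',' read -r KEY URL || [[ -n "$KEY" ]]; do
    [[ -z "$KEY" ]] && continue  # skip blank lines
    TES_INSTANCES["$KEY"]="$URL"
done < .tes_instances

# Print all configured instances (associative arrays have no defined order)
for KEY in "${!TES_INSTANCES[@]}"; do
    echo "$KEY: ${TES_INSTANCES[$KEY]}"
done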
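
Before submitting any work, one might want to know whether the listed instances respond at all. The sketch below is an illustration under assumptions: it supposes that each instance exposes the GA4GH TES v1 service-info endpoint under the `/ga4gh/tes/v1` prefix (older deployments may use a different path), and the function names `tes_service_info_url` and `check_tes_instances` are hypothetical.

```bash
# Build the service-info URL for a TES base URL.
# Assumption: the instance uses the /ga4gh/tes/v1 path prefix.
tes_service_info_url() {
    local base="${1%/}"   # drop one trailing slash, as done for --tes
    echo "${base}/ga4gh/tes/v1/service-info"
}

# Print the HTTP status code returned by each instance in TES_INSTANCES,
# or "unreachable" if curl itself fails (timeout, DNS error, ...).
check_tes_instances() {
    local key status
    for key in "${!TES_INSTANCES[@]}"; do
        status=$(curl --silent --output /dev/null --max-time 10 \
            --write-out '%{http_code}' \
            "$(tes_service_info_url "${TES_INSTANCES[$key]}")") || status="unreachable"
        echo "$key: $status"
    done
}
```

Calling `check_tes_instances` after the cell above prints one line per instance. A status other than `200` does not necessarily mean the instance is down; it may simply use another path prefix or require authentication.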

## 3. Executing workflows via the TES network

Building on a [previous
demonstrator](https://github.com/elixir-cloud-aai/elixir-cloud-demos) that showcased the
[cwl-tes](https://github.com/ohsu-comp-bio/cwl-tes) workflow engine, we will demonstrate
how a workflow engine with a TES backend can execute a workflow across a network of
different TES instances.

In this demonstrator, we will use the
[Snakemake](https://github.com/snakemake/snakemake) workflow engine.

### 3.1 Running Snakemake workflows

We will use a simple workflow with a scatter and a gather step:

![workflow schema](images/wf-federated.svg)

The workflow will be executed once for each of our defined TES instances. In each case,
all workflow steps ("rules") will be executed on the same TES instance (TESK or Funnel).

Note that existing output files will be overwritten, since the workflow is run with
`--forceall`.

In [None]:
export HOME=/tmp
for KEY in "${!TES_INSTANCES[@]}"; do
    TES="${TES_INSTANCES[$KEY]}"
    echo "Submitting task to $KEY ($TES)..."
    snakemake \
        --directory wf-federated \
        --snakefile wf-federated/Snakefile \
        --jobs 1 \
        --cores 1 \
        --tes "${TES%/}" \
        --forceall \
        --rerun-incomplete \
        --envvars HOME ACCESS_KEY_ID SECRET_ACCESS_KEY ENDPOINT_URL BUCKET_PATH
    echo "================================================================================"
done
echo "DONE"
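
The loop above moves on to the next instance regardless of whether a run succeeded. If a per-instance summary is wanted, the exit codes could be collected along the way. This is a minimal sketch, not part of the original demonstrator; the names `RUN_STATUS`, `run_and_record` and `summarize_runs` are hypothetical.

```bash
declare -A RUN_STATUS   # hypothetical: instance key -> exit code of its run

# Run the given command on behalf of an instance and remember its exit code.
run_and_record() {
    local key="$1" rc=0
    shift
    "$@" || rc=$?
    RUN_STATUS["$key"]=$rc
}

# Print one line per recorded instance; return non-zero if any run failed.
summarize_runs() {
    local key failed=0
    for key in "${!RUN_STATUS[@]}"; do
        if [[ "${RUN_STATUS[$key]}" -eq 0 ]]; then
            echo "$key: ok"
        else
            echo "$key: failed (exit ${RUN_STATUS[$key]})"
            failed=1
        fi
    done
    return "$failed"
}
```

Inside the loop, the `snakemake` call would then be prefixed with `run_and_record "$KEY"`, and `summarize_runs` would be called once after `done`.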

### 3.2 Task federation via Snakemake and the proTES gateway

Now let's make it a bit more interesting by pointing Snakemake not to one of the TES
instances, but rather to an instance of the TES gateway
[proTES](https://github.com/elixir-cloud-aai/proTES). proTES accepts incoming TES
requests, applies one or more middlewares to them, and then relays the (possibly
modified) requests to actual TES instances.

In our case, we use proTES to distribute the workloads associated with each of the
workflow steps across the network of TES instances, so that a given step is always
executed on the TES instance that is physically closest to its input data.
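
To illustrate the selection idea only: given a distance from the input data to each candidate instance, the routing decision amounts to picking the minimum. The sketch below is not how proTES is implemented; the helper name `pick_closest`, the `KEY,DISTANCE` input format (modeled on `.tes_instances`) and the example distances are all assumptions made for illustration.

```bash
# Hypothetical helper: read "KEY,DISTANCE_KM" lines on stdin and print the
# key with the smallest distance (the first one wins on ties).
pick_closest() {
    awk -F',' '
        NR == 1 || $2 + 0 < min { min = $2 + 0; name = $1 }
        END { if (NR > 0) print name }
    '
}

# Made-up distances between the input data and three instances
printf '%s\n' 'site-a,300' 'site-b,120' 'site-c,450' | pick_closest
# prints: site-b
```

How the distances themselves are obtained is left out here; that is the job of the gateway's middleware.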

The figure below visualizes the request and data flow of the workflow for a setup of
five different TES instances across three different locations:

![request and data flow](images/wf-federated_flow.svg)

Okay, let's go:

In [None]:
export HOME=/tmp
snakemake \
    --directory wf-federated \
    --snakefile wf-federated/Snakefile \
    --jobs 1 \
    --cores 1 \
    --tes "${TES_GATEWAY%/}" \
    --forceall \
    --rerun-incomplete \
    --envvars HOME ACCESS_KEY_ID SECRET_ACCESS_KEY ENDPOINT_URL BUCKET_PATH
echo "================================================================================"
echo "DONE"