# SoS Syntax

## Terminology & Grammar

A SoS **script** defines one or more **workflows**, and each workflow consists of one or more **steps**. 

![workflow](../media/workflow.png)

A SoS script contains **comments**, **statements**, and one or more SoS **steps**. A SoS **step** consists of a **header**
with one or more step names and optional options. The body of a SoS step consists of optional **comments**, 
**statements**, **input**, **output**, **depends** directives, **parameter** definitions, followed by step **process**. 

More precisely,

* **Script**: A SoS script that defines one or more workflows.
* **Section**: A group of statements with a header that defines one or more SoS steps.
* **Workflow**: A sequence of steps that can be executed to complete a certain task.
* **Step**: A step of a workflow that performs one piece of the workflow.
* **Target**: Objects that are the input or result of a SoS step. They are usually files, but can also be other objects such as an executable command (with variable locations) or a SoS variable.
* **Step options**: Options of the step that assist the definition of the workflow.
* **Step input**: Specifies the input files of the step.
* **Step output**: Specifies the output files and targets of the step.
* **Step dependencies**: Specifies the files and targets that are required by the step.
* **Step process**: The process that a step executes to complete the specified work, specified as one or more Python statements.
* **Task**: Part or all of a step process that will be executed and monitored outside of SoS. These are usually resource-intensive jobs that take a long time to complete.
* **Action**: SoS or user-defined Python functions. They differ from regular Python functions in that they may behave differently in different running modes of SoS (e.g. be ignored when executed in dryrun mode).

More formally defined, the SoS syntax obeys the following grammar, given in extended Backus-Naur form (EBNF):

```
Script         = {comment}, {statement}, {step};
comment        = "#", text, NEWLINE
assignment     = name, "=", expression, NEWLINE
```

with SoS steps defined as

```
step           = step_header,
                 {comment}, {{statement}, [input | output | depends ]},
                 [process, NEWLINE, {script} ]
step_header    = "[", section_names, [":", names | options], "]", NEWLINE
parameter      = "parameter", ":", assignment
input          = "input", ":", [expressions], [",", options], NEWLINE
output         = "output", ":", [expressions], [",", options], NEWLINE
depends        = "depends", ":", [expressions], [",", options], NEWLINE
task           = "task", ":",  [options]
action         = func_format | script_format
func_format    = name, "(", [options], ")"
script_format  = name, ":", [options], NEWLINE, script 
section_names  = section_name, {",", section_name}
section_name   = name, ["(", text, ")"]
names          = name, {",", name}
workflow       = name, [":", steps], {"+", name, [":", steps]}
assignment     = name, "=", expression, NEWLINE
expressions    = expression, {",", expression}
options        = option, {",", option}
option         = name, "=", expression
```

Here `name`, `expression`, and `statement` are arbitrary [Python 3](http://www.python.org) names, expressions, and statements with added SoS features.

## Native SoS file format (`.sos`)

A SoS script can be defined in a plain text file. A `.sos` suffix is recommended but not required. A SoS script consists of **sections** that define **steps** of one or more **workflows**.

A SoS script usually starts with lines

```python
#!/usr/bin/env sos-runner
#fileformat=SOS1.0
```

The first line allows the script to be run by the `sos-runner` command when it is executed as an executable script. The second line tells SoS the format of the script. The `#fileformat` line does not have to be the first or second line, but it should be in the first comment block. The latest version of the SoS format is assumed if no format line is present, so it is good practice to specify the file format version to make sure the script is interpreted correctly.

### Global section and default variables

A global section can be defined without section header in a `.sos` file as statements before any section, or as a regular section with header `[global]`. The global section is usually the first section in the script although it can be defined anywhere in the script if it contains a header.

Definitions in the global section are shared by all sections so it is usually used to define global variables and parameters. SoS implicitly defines the following variables in the global section:

* **`SOS_VERSION`**: version of SoS interpreter.
* **`CONFIG`**: A dictionary of configurations specified by the global sos configuration file (`~/.sos/config.yml`), host configuration file (`~/.sos/hosts.yml`), local configuration file (`./config.yml`) and a configuration file specified by command line option `-c`. The configuration files should be in [YAML format](http://www.yaml.org/start.html). Dictionaries defined in all these configuration files are merged to form a single dictionary `CONFIG`. Local and user-specified configurations override global configurations if dictionary values conflict.
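
The merging of configuration dictionaries can be pictured with a small Python sketch. This is not SoS's own code: the helper `merge_config` and the sample dictionaries are hypothetical, and merging nested dictionaries key by key (rather than replacing them wholesale) is an assumption.

```python
from copy import deepcopy


def merge_config(*configs):
    """Merge dictionaries from left to right; later ones override earlier ones."""
    merged = {}
    for cfg in configs:
        for key, value in cfg.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                # assumption: nested dictionaries are merged key by key
                merged[key] = merge_config(merged[key], value)
            else:
                merged[key] = deepcopy(value)
    return merged


# hypothetical contents of a global and a local configuration file
global_cfg = {'resource_dir': '~/.sos/resources',
              'hosts': {'cluster': {'queue': 'short'}}}
local_cfg = {'resource_dir': './resources',
             'hosts': {'cluster': {'walltime': '01:00:00'}}}

CONFIG = merge_config(global_cfg, local_cfg)
print(CONFIG['resource_dir'])      # the local value overrides the global one
print(CONFIG['hosts']['cluster'])  # keys from both files are kept
```

Passing the dictionaries in order of increasing priority keeps the override rule in one place: whatever comes last wins.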

### SoS Sections

A SoS section is marked by a section header in the format of

```
[names: options]
```

The header should start with a `[` from the beginning of a line and end with a `]`. It can contain one or more names with optional description (for each step) and section options (for all steps).

Section names of a section follow the following rules:

| format | example | usage |
|--------|---| -------|
| **`name_index`** |`human_10`|Defines step `index` of workflow `name`. Here `name` can be any name with alpha-numeric characters and `-` and `_`. `index` should be a non-negative number.|
| **`name`** |`update-website`| Section name without index is equivalent to `name_0` |
| **`index`** |`10`| Section name without workflow name is equivalent to `default_index`|
| **`pattern_index`** |`*_0`, `human*_10`| Equivalent to step `index` of all matching workflows defined in the script. The `pattern` should follow [Unix filename matching](https://docs.python.org/2/library/fnmatch.html)|
| **`stepname (desc)`**| `10 (align)`| Optional short description can be used to describe the goal of the step|
| **`name1,name2,...`** |`human_10,mouse_10`| Comma separated names define multiple steps for one or more workflows|
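
The naming rules in the table can be summarized with a short Python sketch. The functions `parse_section_name` and `parse_section_names` are hypothetical helpers written for illustration, not part of SoS, and the sketch assumes that descriptions do not contain commas.

```python
def parse_section_name(text):
    """Return (workflow or pattern, index, description) for one section name."""
    text = text.strip()
    desc = None
    if text.endswith(')') and '(' in text:
        # `stepname (desc)`: split off the optional description
        text, desc = text[:-1].split('(', 1)
        text, desc = text.strip(), desc.strip()
    if text.isdigit():
        # `index` alone is equivalent to `default_index`
        return 'default', int(text), desc
    workflow, sep, index = text.rpartition('_')
    if sep and workflow and index.isdigit():
        # `name_index` or `pattern_index`
        return workflow, int(index), desc
    # `name` alone is equivalent to `name_0`
    return text, 0, desc


def parse_section_names(names):
    """Parse the comma-separated names of a section header."""
    return [parse_section_name(name) for name in names.split(',')]


print(parse_section_names('human_10,mouse_10'))  # [('human', 10, None), ('mouse', 10, None)]
print(parse_section_name('update-website'))      # ('update-website', 0, None)
print(parse_section_name('10 (align)'))          # ('default', 10, 'align')
print(parse_section_name('human*_10'))           # ('human*', 10, None)
```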

A section can have arbitrary Python statements and SoS-specific statements that define the input, output, and dependent targets, and external tasks of the step. These statements start with keywords `input:`, `output:`, `depends:`, and `task:`. Please refer to [SoS step](SoS_Step.html) for more details about different types of steps, step options and these statements.

For example, the following script defines a global section without header, and a workflow `gff` with two steps

```sos
#!/usr/bin/env sos-runner
#fileformat=SOS1.0

local_resource = '~/Resource/'
data_dir       = '~/Data/bams/'
resource_dir   = "${local_resource}/resources/hg19/Ensembl/Genes"

# samples to be processed
parameter: samples = ['s312', 's315', 's312a', 's315a']

[gff_0]
# download gene models from the MISO website
output: "${resource_dir}/Homo_sapiens.GRCh37.65.gff.zip"
download: dest_dir=resource_dir, decompress=True
    http://genes.mit.edu/burgelab/miso/annotations/gene-models/Homo_sapiens.GRCh37.65.gff.zip

[gff_1]
# Index gtf file using index_gff
output: "${resource_dir}/${hg19_gff_index}/genes.gff"
task:   working_dir=resource_dir
run:    docker_image='mdabioinfo/miso:latest'
    rm -rf ${hg19_gff_index}
    index_gff --index ${hg19_gff_file} ${hg19_gff_index}
```

## Jupyter notebook format (`.ipynb`)

SoS provides a Jupyter frontend in which you can execute sos steps and workflows interactively or in batch mode. The Jupyter notebook format `.ipynb` can contain **markdown cells** and **code cells** with statements in SoS and any other supported languages (e.g. `R`).

One or more SoS workflows can be defined in a `.ipynb` file from **code cells in SoS kernel that start with section headers**. For example, the same workflow defined in the `.sos` file could be defined in a Jupyter notebook, with cells starting with section headers as follows: 

In [None]:
[global]
local_resource = '~/Resource/'
data_dir       = '~/Data/bams/'
resource_dir   = "${local_resource}/resources/hg19/Ensembl/Genes"

# samples to be processed
parameter: samples = ['s312', 's315', 's312a', 's315a']

In [None]:
[gff_0]
# download gene models from the MISO website
output: "${resource_dir}/Homo_sapiens.GRCh37.65.gff.zip"
download: dest_dir=resource_dir, decompress=True
    http://genes.mit.edu/burgelab/miso/annotations/gene-models/Homo_sapiens.GRCh37.65.gff.zip

In [None]:
# This is step 1 of the gff workflow

[gff_1]
# Index gtf file using index_gff
output: "${resource_dir}/${hg19_gff_index}/genes.gff"
task:   working_dir=resource_dir
run:    docker_image='mdabioinfo/miso:latest'
    rm -rf ${hg19_gff_index}
    index_gff --index ${hg19_gff_file} ${hg19_gff_index}

Note that
1. The `[global]` section header is needed for the cell to be recognized as part of a workflow.
2. Section headers can be defined after comments, magics, and empty lines.
3. A cell can define one or more steps, even a complete workflow.

SoS commands such as `sos run` and `sos-runner` can execute workflows defined in `.ipynb` files directly.

## Basic Syntax

SoS is based on the Python 3 programming language. If you are unfamiliar with Python, you can learn some basics of Python, usually in less than half a day, by reading some Python tutorials (e.g. [the official python tutorial](https://docs.python.org/3/tutorial/)). This [short introduction](https://docs.python.org/3/tutorial/introduction.html) is good enough for you to get started.

SoS adds the following syntax to standard Python syntax:

### String Interpolation

On top of Python's `format` function, SoS uses string interpolation to format strings with expressions. Unlike Python format strings, SoS string interpolation **does not require any prefix** and is **applied only to double-quoted strings** (`" "`, `""" """`, `r" "`, and `r""" """`). Compared to Python's `format` function and the newer format strings (`f" "`), SoS string interpolation

1. Does not always use the default `__format__` function. For example, `${[1,2]}` would be interpolated as `1 2` instead of the `[1, 2]` produced by `'{}'.format([1,2])` or `f'{[1, 2]}'`.
2. Allows more string converters, mostly for the conversion of filenames. For example, `"${'filename.txt'!a}"` would return the absolute path name of `filename.txt`.
3. Uses backslash to disable interpolation, and
4. Allows the use of alternative sigils.

Although configurable, the default sigil for SoS string interpolation is `${ }`, which means that by default any expression between `${` and `}` is evaluated by SoS. In the following example, the expressions `resource_path`, `sample_names[0]`, and `sample_names` are replaced by their values in variables `ref_genome`, `title`, and `all_names`, but not in `single_quoted` because that string literal uses single quotes. For convenience, we use the `%preview` magic of the SoS kernel to display the values of variables after the cell is evaluated.

In [1]:
%preview ref_genome title all_names single_quoted

resource_path = '~/.sos/resources'
ref_genome    = "${resource_path}/hg19/refGenome.fasta"

sample_names  = ['A', 'B', 'C']
title         = "Sample ${sample_names[0]} results"
all_names     = "Samples ${sample_names}"

single_quoted = '${sample_names} is not interpolated'

'~/.sos/resources/hg19/refGenome.fasta'

'Sample A results'

'Samples A B C'

'${sample_names} is not interpolated'

Scripts of SoS actions specified in **script format** are treated as raw triple-quoted strings and are therefore interpolated. For example, variable `num` is passed from SoS to a shell script in the following example

In [2]:
import random
num = random.randint(1, 6)
run:
    echo "Random number is ${num}"

Random number is 2


because the code is equivalent to

In [3]:
import random
num = random.randint(1, 6)
run(r"""
echo "Random number is ${num}"
""")

Random number is 4


#### String representation of Objects

SoS evaluates an expression and returns the string representation of the value.

If the value is of a simple Python type such as string, boolean, or number, its standard string form is used: a string is inserted as is (without quotes), and other values use their standard Python representation (`repr(obj)`).

In [4]:
"${2**10}"

'1024'

In [5]:
user = 'Bob'
"${\"Hi, \" + user}"

'Hi, Bob'

For objects with an iterator interface (e.g. Python `list`, `tuple`, `dict`, and `set`), SoS joins the string representation of each item with a space (or a comma with the `,` conversion flag). For example,

  * A list of strings will be converted to a string by joining the strings with a space or comma.
  * A dictionary of strings will be converted to a string by joining the dictionary keys, with no guarantee on the order of the keys.

In [6]:
names = ['James', 'Bob', 'Kathy']
"${names}"

'James Bob Kathy'

In [7]:
salary = {'James': 20, 'Bob': 25, 'Kathy': 18}
"Employees: ${salary}"

'Employees: Bob James Kathy'
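
These rules can be approximated in a few lines of Python. The function `sos_repr` is a hypothetical name used only for this sketch, which assumes that strings are inserted as is, iterables are joined item by item, and everything else falls back to `repr`.

```python
def sos_repr(value, sep=' '):
    """Approximate the string that interpolation would produce for `value`."""
    if isinstance(value, str):
        return value                      # strings are inserted without quotes
    try:
        items = iter(value)               # list, tuple, set, dict (keys), ...
    except TypeError:
        return repr(value)                # numbers, booleans, None, ...
    return sep.join(sos_repr(item, sep) for item in items)


print(sos_repr(2 ** 10))                              # 1024
print(sos_repr(['James', 'Bob', 'Kathy']))            # James Bob Kathy
print(sos_repr({'James': 20, 'Bob': 25}, sep=','))    # James,Bob
```

Recent Python versions iterate over a dictionary in insertion order, so the order of keys printed by this sketch can differ from the output shown above.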

It is worth noting that the step input and output variables (`input`, `output`, `depends`, and their looped versions `_input`, `_output`, and `_depends`) are always lists of targets. However, if the list contains only one filename, `"${input}"` would be the same as `"${input[0]}"`.

SoS string interpolation supports all format and conversion specifications of the [Python string format specifier](https://docs.python.org/3/library/string.html#formatspec). That is to say, you can add `: specifier` at the end of the expression to control the format of the output. For example

In [8]:
"${1/3. :.2f}"

'0.33'

In [9]:
filename = 'test.sos'
"${filename:>20}"

'            test.sos'

SoS also extends the conversion operators of the standard Python string format string to give you more control on the string representation of objects, particularly file and directory names. The conversion operators should be specified after a `!` character.

SoS currently supports the following converters:


| converter | effect | input | output |
| :----------| :----- | :----- | :-------|
| `s`         | `str()`  | `${file1!s}` | `file 1.txt` |
| `r`         | `repr()`  | `${file1!r}` | `'file 1.txt'` |
| `q`         | `quoted()` | `${file1!q}` | `'file 1.txt'`|
| `e`         | `replace(' ', '\\ ')` | `${file1!e}` | `file\ 1.txt`|
| `a`         | `abspath(expanduser())` |  `${file2!a}` | `/path/to/user/SoS/test.sos` |
| `b`         | `basename()` |  `${file2!b}` | `test.sos` |
| `d`         | `dirname()` |  `${file2!d}` | `/path/to/user/SoS/` |
| `n`         | `splitext()[0]` | `${file2!n}` | `~/SoS/test` |
| `u`         | `expanduser()` | `${file2!u}` | `/path/to/user/SoS/test.sos`|
| `,`         | `','.join()` | `${files!,}` | `a.txt,b.txt`|


Here we assume

```
file1='file 1.txt'
file2='~/SoS/test.sos'
files=['a.txt', 'b.txt']
```

For example, if we need to create a file called `Bon Jovi.txt` and run

In [10]:
%sandbox --expect-error
filename = 'Bon Jovi.txt'
run:
    echo "test" > ${filename}
    cat ${filename}

test Jovi.txt


cat: Jovi.txt: No such file or directory
Failed to process statement run(r"""echo "test" > ${filena...name}""")\n (RuntimeError): Failed to execute script (ret=1).
Please use command
	``/bin/bash \
	  /var/folders/ys/gnzk0qbx5wbdgm531v82xxljv5yqy8/T/tmp0q6rm6bx/.sos/interactive_0_0_2b5a234c``
under "/private/var/folders/ys/gnzk0qbx5wbdgm531v82xxljv5yqy8/T/tmp0q6rm6bx" to test it.


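We would get a single file `Bon` that contains `test Jovi.txt`, and `cat` would fail to find `Jovi.txt`, because the shell splits the unquoted filename into two words. The command executed was actually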

```
    echo "test" > Bon Jovi.txt
    cat Bon Jovi.txt
```

To avoid such problems, you can quote the filename using the `q` (quoted) converter

In [11]:
run:
   echo "test" > ${filename!q}
   cat ${filename!q}

test
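
The effect of quoting can be checked without running a shell by using Python's standard `shlex` module. This is only an illustration of shell word splitting; treating `shlex.quote` as a stand-in for the `q` converter is an assumption.

```python
import shlex

filename = 'Bon Jovi.txt'

unquoted = 'cat ' + filename
quoted = 'cat ' + shlex.quote(filename)

# the unquoted filename is split into two separate arguments
print(shlex.split(unquoted))   # ['cat', 'Bon', 'Jovi.txt']
# the quoted filename stays together as one argument
print(shlex.split(quoted))     # ['cat', 'Bon Jovi.txt']
```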


Depending on how your scripts handle filenames, it can be handy to pass filenames to scripts in expanded format. For example, it would be perfectly OK to pass `~/a.txt` to a shell script, but a `u` converter should be added if you are passing the filename to a script that does not understand `~` in filenames. SoS makes it easy for you to pass filenames in different forms to underlying scripts. For example,

In [12]:
%preview name filename basefilename expanded parparname
file = '~/sos/examples/update_toc.sos'
name = "${file!n}"
filename = "${file!b}"
basefilename = "${file!bn}"
expanded = "${file!u}"
parparname = "${file!ddb}"

'~/sos/examples/update_toc'

'update_toc.sos'

'update_toc'

'/Users/bpeng1/sos/examples/update_toc.sos'

'sos'

The last example is interesting because it applies three converters and gets the name of the grandparent directory using the equivalent of `basename(dirname(dirname(file)))`.
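
Chained converters can be imitated in plain Python. The sketch below is not SoS's implementation: `CONVERTERS` and `convert` are hypothetical names, `q` is approximated with `shlex.quote`, paths are handled with `posixpath`, and applying the flags from left to right to each item is an assumption based on the examples in this section.

```python
import posixpath
import shlex

# one function per converter flag (the `,` flag is handled in convert())
CONVERTERS = {
    's': str,
    'r': repr,
    'q': shlex.quote,                      # assumption: shell-style quoting
    'e': lambda x: x.replace(' ', '\\ '),
    'a': lambda x: posixpath.abspath(posixpath.expanduser(x)),
    'b': posixpath.basename,
    'd': posixpath.dirname,
    'n': lambda x: posixpath.splitext(x)[0],
    'u': posixpath.expanduser,
}


def convert(value, flags):
    """Apply converter flags from left to right to each item of `value`."""
    sep = ',' if ',' in flags else ' '
    items = [value] if isinstance(value, str) else list(value)
    for flag in flags.replace(',', ''):
        items = [CONVERTERS[flag](item) for item in items]
    return sep.join(str(item) for item in items)


file = '~/sos/examples/update_toc.sos'
print(convert(file, 'bn'))                   # update_toc
print(convert(file, 'ddb'))                  # sos
print(convert(['a.txt', 'b.txt'], ','))      # a.txt,b.txt
```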

Finally, the `,` converter can be used to output Python sequences with items separated by commas instead of spaces. For example, if you are passing a Python sequence to R as literals, you can pass it as follows:

In [13]:
salary = {'James': 20, 'Bob': 25, 'Kathy': 18}
R:
    employee = c(${salary!,r})
    print(employee)

[1] "Bob"   "James" "Kathy"


Here the `r` converter quotes the strings, and the `,` converter joins the strings with `,`.

In [14]:
"${salary!,r}"

"'Bob','James','Kathy'"

Although the SoS format specifiers are convenient to use, you are not limited to these rules and can define your own ways to present objects. For example

In [15]:
def r_list(obj):
    return 'c(' + ','.join('{!r}'.format(x) for x in obj) + ')'

"${r_list(salary)}"

"c('James','Kathy','Bob')"

#### Inclusion of sigil in string

If you need to stop SoS from interpolating some expressions in a string, you can 

1. Use single quotes, or
2. Precede the SoS sigil with a backslash (`\`)

For example, the following script includes a shell script with shell variable `${file}` that is not interpolated by SoS:

In [16]:
[10]
title = "Sample ${sample_names[0]} results"
run:
    echo ${title}
    for file in a.txt b.txt c.txt
    do
        echo Processing \${file} ...
    done


Sample A results
Processing a.txt ...
Processing b.txt ...
Processing c.txt ...


#### Alternative sigil

If your SoS script contains long bash, perl, or other scripts in which `${ }` is frequently used, it can be tedious and error-prone to escape all sigils in these scripts with backslashes. In this case you can assign an alternative sigil to the step using the `sigil` section option.

For example, the example above could be written as 

In [17]:
[10: sigil='%( )']
title = "Sample %(sample_names[0]) results"
run:
    echo %(title)
    for file in a.txt b.txt c.txt
    do
        echo Processing ${file} ...
    done

Sample A results
Processing a.txt ...
Processing b.txt ...
Processing c.txt ...


Here the step option `sigil` sets an alternative sigil for this particular step only.

You can define any sigil as long as it has a left sigil and a right sigil separated by a space. You can even use sigils with identical left and right parts (e.g. `# #`), although these are more prone to errors.
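
How a configurable sigil and backslash escaping could work is sketched below with a regular expression. This is a toy model, not SoS's parser: the function `interpolate` is hypothetical, expressions are evaluated with `eval` (unsafe for untrusted input), values are converted with `str`, and expressions that themselves contain the right sigil are not supported.

```python
import re


def interpolate(text, namespace, sigil='${ }'):
    """Replace sigil-delimited expressions in `text` with their values."""
    left, right = sigil.split(' ')
    pattern = re.compile(
        r'(?P<escape>\\)?' + re.escape(left) + r'(?P<expr>.+?)' + re.escape(right))

    def substitute(match):
        if match.group('escape'):
            # a backslash before the left sigil disables interpolation;
            # only the backslash is removed
            return match.group(0)[1:]
        return str(eval(match.group('expr'), {}, namespace))

    return pattern.sub(substitute, text)


names = {'title': 'Sample A results'}
# default sigil, with the shell variable protected by a backslash
print(interpolate(r'echo ${title}; echo \${file}', names))
# alternative sigil, so the shell variable needs no escaping
print(interpolate('echo %(title); echo ${file}', names, sigil='%( )'))
```

Both calls print `echo Sample A results; echo ${file}`, which mirrors how the two versions of the step above hand the same shell script to `run`.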

### Script style function

SoS allows you to call a SoS `action` (basically a Python function) that accepts a script (string) as its first parameter in a special script format. For example,

```sos
R("""
pdf('${_input}')
plot(0, 0)
dev.off()
""", workdir='result')
```

can be written as

```sos
R:     workdir='result'
pdf('${_input}')
plot(0, 0)
dev.off()
```

**The script is a string without quotation marks**, and normal string interpolation takes place. You can also indent the script (add the same amount of leading white space to all lines) and write the action as

```sos
R:  workdir='result'
   pdf('${_input}')
   plot(0, 0)
   dev.off()
```

The latter is much preferred because it avoids trouble if your script contains strings such as `[1]` and `option:` (which would otherwise be treated as SoS directives) and, more importantly, allows a new statement to start from a non-indented line. For example, `print('Hello world')` would be considered part of an R script in

```sos
R:  workdir='result'
pdf('${_input}')
plot(0, 0)
dev.off()

print('Hello world')
```

but a separate statement in 

```sos
R:  workdir='result'
   pdf('${_input}')
   plot(0, 0)
   dev.off()

print('Hello world')
```
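
The rule for the indented form can be sketched in Python: the script ends at the first non-empty line that is not indented. The function `split_script` is a hypothetical helper that models only the indented format and is not how SoS actually parses scripts.

```python
import textwrap


def split_script(lines):
    """Split the lines after an action header into (script, remaining lines)."""
    body = []
    rest = []
    for i, line in enumerate(lines):
        if line.strip() and not line[0].isspace():
            # the first non-empty, non-indented line starts a new statement
            rest = lines[i:]
            break
        body.append(line)
    # remove the common indentation and surrounding empty lines
    script = textwrap.dedent('\n'.join(body)).strip('\n')
    return script, rest


lines = [
    "   pdf('${_input}')",
    "   plot(0, 0)",
    "   dev.off()",
    "",
    "print('Hello world')",
]
script, rest = split_script(lines)
print(script)   # the three lines of the R script, without indentation
print(rest)     # ["print('Hello world')"]
```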

Although the script format is more concise and easier to read, it is limited to actions that accept a string as their first parameter, and it cannot return a value or be used within `try/except` or `if/else` statements.

## Workflow Specification

### Forward-style workflows

A SoS forward-style workflow has a name and one or more numbered steps. The workflows are defined from sections in a SoS script. 

For example, the following sections specify a workflow with four steps `5`, `10`, `20`, and `100`. The workflow steps can be specified in any order and do not have to be consecutive.

In [3]:
[5]
[20]
[10]
[100]

A workflow specified in this way is the **`default`** workflow and is actually called `default` in SoS output. You can specify a workflow with name and give each step a short description as follows:

In [4]:
[mapping_5 (get data)]
[mapping_20 (align)]
[mapping_10 (quality control)]
[mapping_100 (generate report)]

A SoS script can define multiple workflows. For example, the following sections of a SoS script define two workflows named ``mouse`` and ``human``.

In [13]:
%run mouse
[mouse_10]
[mouse_20]
[mouse_30]
[human_10]
[human_20]
[human_30]

In this case, a command line option is needed to specify workflow name. This can be done by magic `%run` in Jupyter notebook, or a positional argument from the command line, e.g.

```
    % sos run myscript mouse
```

Note that the workflow argument is not needed if a `default` workflow is defined in the script like the following example

In [14]:
[10]
[20]
[30]
[test_10]
[test_20]
[test_30]

Multiple workflows can share a single section as follows

In [16]:
%run mouse
[mouse_10,human_10]
[mouse_20]
[human_20]
[mouse_30,human_30]

and wildcard steps can be used to define a step for multiple workflows:

In [17]:
%run mouse
[*_10]
[mouse_20]
[human_20]
[*_30]
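
How shared and wildcard sections expand into per-workflow steps can be sketched as follows. The function `build_workflows` is hypothetical, assumes every section name has the form `name_index`, and takes the set of workflow names from the names that contain no wildcard characters.

```python
from fnmatch import fnmatchcase


def build_workflows(section_names):
    """Map each workflow name to its ordered list of steps."""
    parsed = [(workflow, int(index)) for workflow, _, index in
              (name.rpartition('_') for name in section_names)]
    # workflows are named by section names without wildcard characters
    workflows = {workflow: set() for workflow, _ in parsed
                 if not any(ch in workflow for ch in '*?[')}
    for pattern, index in parsed:
        for workflow in workflows:
            if fnmatchcase(workflow, pattern):
                workflows[workflow].add(index)
    # steps are executed in the order of their indexes
    return {workflow: [f'{workflow}_{index}' for index in sorted(indexes)]
            for workflow, indexes in workflows.items()}


print(build_workflows(['*_10', 'mouse_20', 'human_20', '*_30']))
# {'mouse': ['mouse_10', 'mouse_20', 'mouse_30'],
#  'human': ['human_10', 'human_20', 'human_30']}
```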

If the steps defined in a shared section are similar but not identical, the section can use the step variable `step_name` (discussed elsewhere) to behave differently in different workflows. In the following example, the variable `step_name` will be `mouse_20` or `human_20` depending on the workflow being executed, and is used to determine the correct reference genome for different workflows.

In [16]:
%run mouse -v2
[mouse_20,human_20]
reference = "/path/to/mouse/reference" if \
  step_name.startswith('mouse') else "/path/to/human/reference"

print("Reference genome ${reference} is used")

INFO: Executing mouse_20: 
INFO: input:    []
INFO: output:   []


Reference genome /path/to/mouse/reference is used


### Sub- and combined workflows

Although workflows are defined separately with all their steps, they do not have to be executed in their entirety. A `subworkflow` refers to a workflow that is defined from one or more steps of an existing workflow. It is specified using the syntax `workflow:[from-to]`, where `from-to` can be `n` (step `n`), `-n` (up to `n`), `n-m` (step `n` to `m`), or `m-` (from `m`). For example

  ```python
  A              # complete workflow A
  A:5-10         # step 5 to 10 of A
  A:50-          # step 50 up
  A:-10          # up to step 10 of A
  A:10           # step 10 of workflow A
  ```
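
The subworkflow notation can be parsed with a few lines of Python. The functions `parse_subworkflow` and `select_steps` are hypothetical helpers that illustrate the notation; they are not taken from SoS.

```python
def parse_subworkflow(spec):
    """Return (name, first, last); None means the range is open on that side."""
    name, _, steps = spec.partition(':')
    name, steps = name.strip(), steps.strip()
    if not steps:
        return name, None, None              # complete workflow
    if '-' not in steps:
        return name, int(steps), int(steps)  # a single step
    first, _, last = steps.partition('-')
    return name, int(first) if first else None, int(last) if last else None


def select_steps(indexes, first, last):
    """Keep the step indexes that fall within [first, last]."""
    return [i for i in sorted(indexes)
            if (first is None or i >= first) and (last is None or i <= last)]


for spec in ['A', 'A:5-10', 'A:50-', 'A:-10', 'A:10']:
    print(spec, parse_subworkflow(spec))

name, first, last = parse_subworkflow('default:-20')
print(select_steps([10, 20, 30], first, last))   # [10, 20]
```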

In practice, the `-n` format is frequently used to execute part of the workflow for debugging purposes, for example:

In [17]:
%run default:-20
[10]
[20]
[30]

INFO: Executing default_10: 
INFO: input:    []
INFO: output:   []
INFO: Executing default_20: 
INFO: input:    []
INFO: output:   []


You can also combine subworkflows to execute multiple workflows one after another. For example,

```python
A + B          # workflow A, followed by B
A:0 + B        # step 0 of A, followed by B
A:-50 + B + C  # up to step 50 of workflow A, followed by B, and C
```

This syntax can be used from the command line, e.g.

```bash
sos-runner myscript align+call
```

or from the `%run` magic of Jupyter notebook

In [18]:
#local run
%run check+align+call
[check_10]
[align_10]
[align_20]
[call_10]
[call_20]

INFO: Executing check_10: 
INFO: input:    []
INFO: output:   []
INFO: Executing align_10: 
INFO: input:    []
INFO: output:   []
INFO: Executing align_20: 
INFO: input:    []
INFO: output:   []
INFO: Executing call_10: 
INFO: input:    []
INFO: output:   []
INFO: Executing call_20: 
INFO: input:    []
INFO: output:   []


It is worth noting that combined workflows might work differently from when they are executed separately (e.g. the default input of `B` changes from empty to the output of `A_0`), and it is up to the user to resolve conflicts between them.

### Nested workflow

SoS also supports nested workflows, in which a complete workflow is treated as part of a step process.
The workflow is executed by the SoS action `sos_run`, e.g.

```python
sos_run('A')            # execute workflow A
sos_run('A + B')        # execute workflow B after A
sos_run('D:-10 + C')    # execute up to step 10 of D and workflow C

# execute user-specified aligner and caller workflows
sos_run("${aligner} + ${caller}")
```

In its simplest form, nested workflow allows you to define another workflow from existing ones. For example,

In [8]:
%run -v2
[align_10]
[align_20]
[call_10]
[call_20]
[default]
sos_run('align+call')

INFO: Executing default_0: 
INFO: input:    []
INFO: Executing workflow align+call with input [] and no args
INFO: Executing align_10: 
INFO: input:    []
INFO: output:   []
INFO: Executing align_20: 
INFO: input:    []
INFO: output:   []
INFO: Executing call_10: 
INFO: input:    []
INFO: output:   []
INFO: Executing call_20: 
INFO: input:    []
INFO: output:   []
INFO: Workflow align+call (ID=bae5d90c45edd864) is executed successfully.
INFO: output:   []
INFO: Workflow default (ID=37dcbb4fce3d5473) is executed successfully.


defines a nested workflow that combines workflows `align` and `call`, so that the script executes both workflows by default but can also execute `align` or `call` as a separate workflow.

Nested workflow also allows you to define multiple mini-workflows and connect them freely. For example

```python
[a_1]
[a_2]
[b]
[c]
[d_1]
sos_run('a+b')
[d_2]
sos_run('a+c')
```

defines workflow `d` that will execute steps `d_1`, `a_1`, `a_2`, `b_0`, `d_2`, `a_1`, `a_2`, and `c_0`.
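
The execution order of these nested workflows can be reproduced with a small recursive sketch. The `SCRIPT` dictionary is a hand-written stand-in for the sections above (each step maps to the workflow its process passes to `sos_run`, if any), and `expand` is a hypothetical helper, not a SoS function.

```python
# workflow name -> {step index: workflow passed to sos_run, or None}
SCRIPT = {
    'a': {1: None, 2: None},
    'b': {0: None},
    'c': {0: None},
    'd': {1: 'a+b', 2: 'a+c'},
}


def expand(workflow):
    """List the steps executed by a (possibly combined) workflow, in order."""
    executed = []
    for name in workflow.split('+'):
        name = name.strip()
        for index in sorted(SCRIPT[name]):
            executed.append(f'{name}_{index}')
            nested = SCRIPT[name][index]
            if nested:
                # the nested workflow runs as part of this step's process
                executed.extend(expand(nested))
    return executed


print(expand('d'))
# ['d_1', 'a_1', 'a_2', 'b_0', 'd_2', 'a_1', 'a_2', 'c_0']
```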

Because `sos_run` is simply a SoS action and takes a string as its parameter, it allows very flexible ways to compose (e.g. determine workflow from command line arguments) and execute (e.g. repeated execution of workflows with different options or input files) complex workflows.

### Makefile-style workflow

Using [auxiliary steps](Auxiliary_Steps.html) that are only executed to provide desired output, a SoS workflow can consist of a graph with or without a "stem" with numbered forward-style steps. By specifying the targets of a workflow instead of which steps to execute, you essentially let SoS execute the required steps to generate the targets. For example,

In [18]:
%sandbox

!touch test.bam
%run -t test.vcf

# this step provides `{filename}.bam.bai` and defines variable `filename`
[index: provides='{filename}.bam.bai']
input: "${filename}.bam"
sh:
   echo "Generating ${output}"
   touch ${output}

[call: provides='{filename}.vcf']
input:   "${filename}.bam"
depends: "${input}.bai"
sh:
   echo "Calling variants from ${input} with ${depends} to ${output}"
   touch ${output}

Generating test.bam.bai
Calling variants from test.bam with test.bam.bai to test.vcf


In this example, instead of specifying a workflow, a target `test.vcf` is requested. SoS checks all auxiliary steps and finds that step `call` provides `test.vcf` but depends on `test.bam.bai`, so it first calls step `index` to generate `test.bam.bai`. After step `index` is completed, step `call` is executed to produce the final requested target `test.vcf`.

The `-t` option can specify more than one target and can be used in combination with a forward-style workflow. Please refer to [documentation on makefile-style workflows](Auxiliary_Steps.html) for more details.
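
The target-driven resolution described in this section can be sketched in Python. This is a simplified model rather than SoS's scheduler: `STEPS`, `match_pattern`, and `resolve` are hypothetical names, and each step's `input` and `depends` targets are flattened into a single `requires` list.

```python
import re

# simplified stand-ins for the auxiliary steps `index` and `call`
STEPS = {
    'index': {'provides': '{filename}.bam.bai',
              'requires': ['{filename}.bam']},
    'call': {'provides': '{filename}.vcf',
             'requires': ['{filename}.bam', '{filename}.bam.bai']},
}


def match_pattern(pattern, target):
    """Return the variables captured by a `provides` pattern, or None."""
    parts = re.split(r'\{(\w+)\}', pattern)   # literal, name, literal, ...
    regex = ''.join(re.escape(part) if i % 2 == 0 else f'(?P<{part}>.+)'
                    for i, part in enumerate(parts))
    match = re.fullmatch(regex, target)
    return match.groupdict() if match else None


def resolve(target, existing, plan=None):
    """Return the (step, target) pairs needed to produce `target`, in order."""
    plan = [] if plan is None else plan
    if target in existing:
        return plan
    for name, step in STEPS.items():
        variables = match_pattern(step['provides'], target)
        if variables is None:
            continue
        for requirement in step['requires']:
            resolve(requirement.format(**variables), existing, plan)
        plan.append((name, target))
        existing.add(target)
        return plan
    raise ValueError(f'No step provides {target}')


print(resolve('test.vcf', {'test.bam'}))
# [('index', 'test.bam.bai'), ('call', 'test.vcf')]
```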