# Executing external tasks

<font color='red'>This document is still under development.</font>

## External `task`

If a job is long and time-consuming, it is preferable to submit it as a separate task to be executed, for example, on a cluster system. Such jobs are specified using the `task` keyword, which marks the beginning of a task and accepts optional runtime options that control its execution. For example,

```
[10]
input: group_by='single'

task: concurrent=True

run('''
samtools index {_input}
''')
```

executes a shell script for each input file in parallel (with `concurrent=True`). A task can consist of arbitrary Python statements and execute multiple step actions. For example,

```python
task:
try:
   action1()
except RuntimeError:
   action2()
```

executes `action1`, and then `action2` only if `action1` raises a `RuntimeError`.

```python
task:
for par in ['-4', '-6']:
   run('command with ${par}')
```

executes commands in a loop. This is similar to

```
pars = ['-4', '-6']
input: for_each=pars
task:
run('command with ${_pars}')
```

but the `for` loop version cannot be executed in parallel. Note that SoS actions can be used outside of a task, but only statements specified after the `task` keyword can have runtime options and be executed in separate processes. That is to say,

```
pars = ['-4', '-6']
input: for_each=pars
run('command with ${_pars}')
```

is equivalent to

```
pars = ['-4', '-6']
input: for_each=pars
task:
run('command with ${_pars}')
```

but the latter can have additional runtime options, for example to run the commands in parallel:

```
pars = ['-4', '-6']
input: for_each=pars
task: concurrent=True
run('command with ${_pars}')
```

Because tasks are executed outside of SoS, variables assigned in a task are not accessible to SoS. For example,

```
[10: shared='res']
res = some_action()
```

executes `some_action()` in the step itself and returns its result as a shared variable `res`. The following script,

```
[10: shared='res']
task:
res = some_action()
```

however, does not work because `res` is assigned in the task and is not accessible from the step.
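
The isolation described above can be illustrated outside of SoS with plain Python. The following is a minimal sketch, not how SoS implements tasks: it assumes that a task behaves like a separate interpreter started with `subprocess`, and the names `child_code`, `parent_res_after` and `child_output` are made up for this illustration.

```python
import subprocess
import sys

res = None  # variable in the parent (think: the SoS step)

# Run the assignment in a separate interpreter (think: the external task).
child_code = "res = 6 * 7; print(res)"
completed = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    text=True,
    check=True,
)

# The assignment in the child did not touch the parent's variable.
parent_res_after = res

# Only what the child explicitly emits (here, via stdout) can be recovered.
child_output = completed.stdout.strip()
```

In this sketch the parent only learns the value because the child prints it; the assignment itself stays inside the child process.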

### Option `workdir`

Defaults to the current working directory.

Option `workdir` controls the working directory of the process. For example, the following step downloads a file to `resource_dir` using the command `wget`.

```python
[10]

run: workdir=resource_dir

  wget a_url -O filename

```
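
As a rough analogy in plain Python (an assumption about the observable behavior, not SoS's actual implementation), running a command with a working directory resembles passing `cwd` to `subprocess.run`. The temporary directory below is a hypothetical stand-in for `resource_dir`.

```python
import os
import subprocess
import sys
import tempfile

parent_cwd = os.getcwd()           # working directory of the caller
resource_dir = tempfile.mkdtemp()  # hypothetical stand-in for resource_dir

# Run a child command inside resource_dir; it reports its own working directory.
completed = subprocess.run(
    [sys.executable, "-c", "import os; print(os.getcwd())"],
    cwd=resource_dir,
    capture_output=True,
    text=True,
    check=True,
)
child_cwd = completed.stdout.strip()
```

Only the child runs in `resource_dir`; the caller's own working directory is left unchanged.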

### Option `concurrent`

Defaults to `False`.

If the task is repeated for different input files or parameters (using input options `group_by` or `for_each`), the repeated tasks can be executed in parallel, up to the maximum number of concurrent jobs specified by the command line option `-j`.
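
The idea of a capped number of concurrent jobs can be sketched in plain Python with a worker pool. This is only an illustration under stated assumptions: `max_jobs` plays the role of `-j`, `input_groups` and `run_task` are hypothetical, and SoS may schedule its tasks differently.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

max_jobs = 2  # plays the role of the -j limit (assumed value)
input_groups = ["a.bam", "b.bam", "c.bam", "d.bam"]  # hypothetical input groups

lock = threading.Lock()
state = {"running": 0, "peak": 0}  # tracks how many tasks run at once


def run_task(input_file):
    """Stand-in for the task body executed once per input group."""
    with lock:
        state["running"] += 1
        state["peak"] = max(state["peak"], state["running"])
    time.sleep(0.05)  # simulate some work
    with lock:
        state["running"] -= 1
    return f"done {input_file}"


# At most max_jobs tasks are active at any time; results keep input order.
with ThreadPoolExecutor(max_workers=max_jobs) as pool:
    results = list(pool.map(run_task, input_groups))
```

After the pool finishes, `state["peak"]` never exceeds `max_jobs`, which mirrors the cap described above.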

### Option `env`

The `env` option allows you to modify the runtime environment, similar to the `env` parameter of the `subprocess.Popen` function. For example, you can execute a command located in a specific directory by prepending that directory to `PATH`:

```
task:  env={'PATH': '/path/to/mycommand' + os.pathsep + os.environ['PATH']}
run:
   mycommand 
```
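
For comparison, here is how the same kind of `PATH` change looks with `subprocess.Popen` directly. This is a sketch of the Python side only and makes no claim about how SoS applies `env` internally; `extra_dir` reuses the placeholder path from the example. Note that entries in `PATH` are joined with `os.pathsep` (`:` on POSIX, `;` on Windows), not `os.sep`.

```python
import os
import subprocess
import sys

extra_dir = "/path/to/mycommand"  # placeholder directory, as in the example above

# Popen's env replaces the whole environment, so start from a copy of the
# current one and override only PATH.
env = dict(os.environ)
env["PATH"] = extra_dir + os.pathsep + os.environ.get("PATH", "")

# The child prints the PATH it actually received.
proc = subprocess.Popen(
    [sys.executable, "-c", "import os; print(os.environ['PATH'])"],
    env=env,
    stdout=subprocess.PIPE,
    text=True,
)
child_path, _ = proc.communicate()
first_entry = child_path.strip().split(os.pathsep)[0]
```

The child sees `extra_dir` as the first `PATH` entry, so a command stored there would be found before any other command of the same name.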

### Option `prepend_path`

Option `prepend_path` is a shortcut for option `env` that prepends one (a string) or more (a list of strings) paths to the system path. For example, the example above can be shortened to

```
task:  prepend_path='/path/to/mycommand'
run:
   mycommand 
```
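
The string-or-list convention can be sketched as a small helper. The function name `prepend_to_path` and its signature are hypothetical and only illustrate the idea; this is not SoS's own code.

```python
import os


def prepend_to_path(paths, base_path):
    """Return base_path with one path (str) or several (list of str) prepended."""
    if isinstance(paths, str):
        paths = [paths]  # a single string is treated as a one-element list
    # PATH entries are separated by os.pathsep (':' on POSIX, ';' on Windows).
    return os.pathsep.join(list(paths) + [base_path])


single = prepend_to_path("/path/to/mycommand", "/usr/bin")
several = prepend_to_path(["/opt/a/bin", "/opt/b/bin"], "/usr/bin")
```

The prepended paths keep their given order, so the first path in the list is searched first.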

### Option `walltime`