# Understanding SoS action

## SoS Action

Although arbitrary Python functions can be used in a SoS step process, SoS defines some **`actions`** that can be used in a SoS script. These functions accept a common set of **parameters** that determine when and how they are called.

For example, the call `time.sleep(1)` is executed in run mode,

In [1]:
[0]
import time
st = time.time()
time.sleep(1)
print('I just slept {:.2f} seconds'.format(time.time() - st))

I just slept 1.00 seconds


and also in dryrun mode (option `-n`),

In [2]:
%run -n
[0]
import time
st = time.time()
time.sleep(1)
print('I just slept {:.2f} seconds'.format(time.time() - st))

I just slept 1.00 seconds


because these are regular Python statements. However, if you put the statements in a `python` action, they are executed in run mode,

In [3]:
[0]
python:
    import time
    st = time.time()
    time.sleep(1)
    print('I just slept {:.2f} seconds'.format(time.time() - st))

I just slept 1.00 seconds


but not executed in dryrun mode (option `-n`)

In [4]:
%run -n
[0]
python:
    import time
    st = time.time()
    time.sleep(1)
    print('I just slept {:.2f} seconds'.format(time.time() - st))
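
The difference between plain statements and actions can be pictured with a small Python sketch. This is not how SoS implements actions; the decorator `action`, its `run_modes` argument, and the `mode` keyword are hypothetical names used only to illustrate a function that is skipped outside the modes it declares.

```python
import functools

def action(run_modes=("run",)):
    """Hypothetical decorator: call the function only in the listed modes."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, mode="run", **kwargs):
            if mode not in run_modes:
                # Skipped (for example in dryrun mode); nothing is executed
                return None
            return func(*args, **kwargs)
        return wrapper
    return decorator

@action(run_modes=("run",))
def python_action(script):
    # Stand-in for running a script; here we only report its length
    return len(script)

result_run = python_action("print('hi')", mode="run")
result_dryrun = python_action("print('hi')", mode="dryrun")
print(result_run, result_dryrun)
```

A plain Python statement has no such wrapper around it, which is why it runs in every mode.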

## Action options

Actions define their own parameters but their execution is controlled by a common set of options.

### `input`

Parameter `input` specifies the input files that an action needs before it can be executed. This parameter is processed by SoS before an action is called and SoS will try to generate the file if an input target does not exist.

For example,

In [5]:
%sandbox
[10]
output: 'a.txt'
bash:
    echo 'content of a.txt' > a.txt

[20]
report: input='a.txt'

content of a.txt



In [6]:
%sandbox
[a: provides='a.txt']
bash:
    echo 'content of a.txt' > a.txt

[20]
report: input='a.txt'

content of a.txt



Although both the `input` parameter and the `input` statement of a step can affect the execution of a workflow (by changing its DAG), they differ in that the `input` statement defines the variables `input` and `_input`, whereas the `input` parameter does not define any variables.
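
As a rough illustration of what "try to generate the file if an input target does not exist" could mean, the following Python sketch checks each input and falls back to a provider function. The names `ensure_inputs` and `providers` are assumptions made for this example; SoS resolves missing targets through its own workflow machinery, such as the `provides` section option shown above.

```python
import os
import tempfile

def ensure_inputs(input_files, providers):
    """Check each input; call a provider to create it if it is missing.

    `providers` (hypothetical) maps a filename to a zero-argument
    function that is expected to create that file.
    """
    for name in input_files:
        if os.path.exists(name):
            continue
        if name not in providers:
            raise FileNotFoundError(f"No way to generate input {name}")
        providers[name]()
        if not os.path.exists(name):
            raise FileNotFoundError(f"Provider did not create {name}")

def make_file(path):
    with open(path, "w") as f:
        f.write("content of a.txt\n")

workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "a.txt")

# The file does not exist yet, so the provider is called first
ensure_inputs([target], {target: lambda: make_file(target)})
with open(target) as f:
    content = f.read()
print(content)
```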

### `output`

Similar to `input`, parameter `output` defines the output of an action, which can be a single name (or target) or a list of files or targets. SoS would check the existence of output after the completion of the action. For example, 

In [7]:
%sandbox --expect-error
[10]
bash: output='a.txt'

Failed to process statement 'bash(r"""""", output=\'a.txt\')\n': Target a.txt does not exist after execution of action
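
The check behind this error can be sketched in a few lines of Python. The helper `run_with_output_check` is a hypothetical name, and the sketch only mirrors the behavior described here (run the action, then verify the declared targets), not the actual SoS code.

```python
import os
import tempfile

def run_with_output_check(func, output):
    """Run `func`, then verify that every declared output target exists."""
    targets = [output] if isinstance(output, str) else list(output)
    result = func()
    missing = [t for t in targets if not os.path.exists(t)]
    if missing:
        raise RuntimeError(
            f"Target {missing[0]} does not exist after execution of action")
    return result

workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "a.txt")

# An action that does nothing, like the empty `bash` action above
try:
    run_with_output_check(lambda: None, output=target)
    message = ""
except RuntimeError as e:
    message = str(e)
print(message)

# An action that actually creates the target passes the check
def write_target():
    with open(target, "w") as f:
        f.write("a\n")
    return 0

status = run_with_output_check(write_target, output=target)
```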


Parameter `output` does not change the step variable `output` and thus does not affect the default input file of the next step. The targets specified by this parameter are still tracked, so this is a convenient way to track additional files produced by a step. For example,

In [8]:
%sandbox
%preview summary.md
[10]
output: 'a.txt'
run:
    echo 'a' > a.txt

report: output='a.md'
  this is step 10

[20]
print("My input is ${input}")
report: input='a.md', output='summary.md'
   Summary report

My input is a.txt


As you can see, the `output` variable of step 10 (and thus input of step 20) is defined by the output statement, which is not affected by parameter `output='a.md'`.

### `active`

Action option `active` is used to activate or deactivate an action in an input loop. When a loop is defined by the `for_each` or `group_by` option of the `input:` statement, an action after the input statement is repeated for each input group. The `active` parameter accepts an integer index (a non-negative number, or a negative number that counts backward), a sequence of indexes, or a slice object, and the action is active only for the matching groups.

For example, for an input loop that loops through a sequence of numbers, the first `run` action is executed for all groups, the second action is executed for groups with even indexes, and the last action is executed only for the last group.

In [9]:
seq = range(5)
input: for_each='seq'
run:
   echo I am active at all groups ${_index}
run: active=slice(None, None, 2)
   echo I am active at even groups ${_index}
run: active=-1
   echo I am active at last group ${_index}

I am active at all groups 0
I am active at even groups 0
I am active at all groups 1
I am active at all groups 2
I am active at even groups 2
I am active at all groups 3
I am active at all groups 4
I am active at even groups 4
I am active at last group 4
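
The selection rule for `active` can be written down as a small Python function. This is a sketch of the semantics described in this section, with a hypothetical helper name `is_active`; it assumes `None` means "always active" and is not taken from the SoS source.

```python
from collections.abc import Sequence

def is_active(active, index, n_groups):
    """Return True if the action should run for input group `index`."""
    if active is None:
        # Assumed default: no restriction, active for every group
        return True
    if isinstance(active, int):
        # Negative numbers count backward from the last group
        return index == (active if active >= 0 else n_groups + active)
    if isinstance(active, slice):
        # Apply the slice to all group indexes and test membership
        return index in range(n_groups)[active]
    if isinstance(active, Sequence):
        return any(is_active(a, index, n_groups) for a in active)
    raise ValueError(f"Unsupported value for active: {active!r}")

n = 5
even = [i for i in range(n) if is_active(slice(None, None, 2), i, n)]
last = [i for i in range(n) if is_active(-1, i, n)]
picked = [i for i in range(n) if is_active([0, -2], i, n)]
print(even, last, picked)  # [0, 2, 4] [4] [0, 3]
```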


### `workdir`

Option `workdir` changes the current working directory for the action and changes it back once the action has been executed. The directory will be created if it does not exist.

In [10]:
bash: workdir='tmp'
   touch a.txt
bash:
    ls tmp
    rm tmp/a.txt
    rmdir tmp

a.txt
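
In plain Python, the same "create if needed, change, and change back" behavior is usually written as a context manager. The sketch below uses a hypothetical helper `working_directory` to show the idea; it is an illustration, not the SoS implementation.

```python
import contextlib
import os
import tempfile

@contextlib.contextmanager
def working_directory(path):
    """Temporarily switch to `path`, creating it if it does not exist."""
    os.makedirs(path, exist_ok=True)
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Always restore the original directory, even after an error
        os.chdir(previous)

base = tempfile.mkdtemp()
start = os.getcwd()
target = os.path.join(base, "tmp")

with working_directory(target):
    inside = os.getcwd()
    open("a.txt", "w").close()  # created inside `target`

after = os.getcwd()
created = os.path.exists(os.path.join(target, "a.txt"))
print(created, after == start)
```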


### `docker_image`

If a docker image is specified (either a name, an Id, or a file), the action is assumed to be executed in a container of the specified image. The image will be automatically downloaded (pulled) or loaded (if a `.tar` or `.tar.gz` file is specified) if it is not available locally.

For example, executing the following script 

```
[10]
python3: docker_image='python'
  set = {'a', 'b'}
  print(set)
```

under a docker terminal (that is connected to the docker daemon) will

1. Pull docker image `python`,  which is the official docker image for Python 2 and 3.
2. Create a python script with the specified content
3. Run the docker container `python` and make the script available inside the container
4. Use the `python3` command inside the container to execute the script.

Additional `docker_run` parameters can be passed to actions when the action
is executed in a docker image. These options include

* `name`: name of the container (option `--name`)
* `tty`: if a tty is attached (defaults to `True`, option `-t`)
* `stdin_open`: if stdin should be open (defaults to `False`, option `-i`)
* `user`: username (defaults to `root`, option `-u`)
* `environment`: can be a string, a list of strings, or a dictionary of environment variables for docker (option `-e`)
* `volumes`: a string or a list of strings, extra volumes that need to be linked, in addition to those mounted by SoS (`/tmp`, `/Users` (if mac), `/Volumes` (if [properly configured](http://vatlab.github.io/SOS/doc/tutorials/SoS_Docker_Guide.html) under mac) and the script file)
* `volumes_from`: container names or Ids to get volumes from
* `working_dir`: working directory (option `-w`), default working directory, or working directory set by runtime option `workdir`.
* `port`: port opened (option `-p`)
* `extra_args`: any extra arguments you would like to pass to the `docker run` process (after you check the actual `docker run` command used by SoS)
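
To make the mapping between these options and the `docker run` command line concrete, here is a Python sketch that assembles an argument list from them. The function `docker_run_args` and its helper `_as_list` are hypothetical; the flags are the standard `docker run` options named in the list above, but the order and exact command that SoS generates may differ.

```python
import shlex

def _as_list(value):
    """Normalize None, a string, or a sequence into a list."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)

def docker_run_args(image, name=None, tty=True, stdin_open=False, user="root",
                    environment=None, volumes=None, volumes_from=None,
                    working_dir=None, port=None, extra_args=""):
    """Build a `docker run` argument list (illustrative only)."""
    args = ["docker", "run"]
    if name:
        args += ["--name", name]
    if tty:
        args.append("-t")
    if stdin_open:
        args.append("-i")
    if user:
        args += ["-u", user]
    if isinstance(environment, dict):
        environment = [f"{k}={v}" for k, v in environment.items()]
    for env in _as_list(environment):
        args += ["-e", env]
    for vol in _as_list(volumes):
        args += ["-v", vol]
    for src in _as_list(volumes_from):
        args += ["--volumes-from", src]
    if working_dir:
        args += ["-w", working_dir]
    if port:
        args += ["-p", str(port)]
    args += shlex.split(extra_args)  # passed through unchanged
    args.append(image)               # the image comes after all options
    return args

cmd = docker_run_args("python", name="demo", environment={"LANG": "C"},
                      volumes=["/data:/data"], working_dir="/data")
print(" ".join(cmd))
```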

### `docker_file`

This option allows you to import a docker image from the specified `docker_file`, which can be an archive file (`.tar`, `.tar.gz`, `.tgz`, `.bzip`, `.tar.xz`, `.txz`) or a URL to an archive file (e.g. `http://example.com/exampleimage.tgz`). SoS will use the command `docker import` to import the `docker_file`. However, because SoS does not know the repository and tag names of the imported docker file, you will still need to use option `docker_image` to specify the image to use.

It is easy to define your own actions. All you need to do is to define a function and decorate it with a `SoS_Action` decorator. For example

```python
from pysos import SoS_Action

@SoS_Action(run_mode=('run', 'interactive'))
def my_action(parameters):
    # Placeholder body: do something with `parameters` here
    return 1
```

### `args`

All script-executing actions accept an option `args`, which changes how the script is executed.

By default, such an action has an `interpreter` (e.g. `bash`) and a default `args='${filename!q}'`, and the script is executed as `interpreter args`, which is
```
bash ${filename!q}
```
where `${filename!q}` would be replaced by the temporary script file.

If you would like to change the command line with additional parameters, or different format of filename, you can specify an alternative `args`, with variables `filename` (filename of temporary script) and `script` (actual content of the script).

For example, option `-n` can be added to command `bash` to execute script in dryrun mode

In [11]:
bash: args='-n ${filename!q}'
    echo "-n means running in dryrun mode (only check syntax)"

and you can actually execute a command without a filename, executing the script directly from the command line instead

In [12]:
python: args='-m timeit ${script}'
    '"-".join(str(n) for n in range(100))'

10000 loops, best of 3: 35 usec per loop
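
The way `args` is expanded into a command line can be approximated in Python as follows. This is a simplified sketch: the function `build_command` is hypothetical, it only understands the two variables `filename` and `script` with an optional `!q` (shell-quote) conversion, and it does not reproduce SoS's full string interpolation.

```python
import re
import shlex

def build_command(interpreter, args, filename, script):
    """Expand ${filename}, ${script} and their !q forms inside `args`."""
    values = {"filename": filename, "script": script}

    def substitute(match):
        value = values[match.group(1)]
        # `!q` requests shell quoting of the substituted value
        return shlex.quote(value) if match.group(2) == "q" else value

    pattern = r"\$\{(filename|script)(?:!(q))?\}"
    return interpreter + " " + re.sub(pattern, substitute, args)

default_cmd = build_command("bash", "${filename!q}", "/tmp/script.sh", "echo hi")
dryrun_cmd = build_command("bash", "-n ${filename!q}", "/tmp/my script.sh", "echo hi")
timeit_cmd = build_command("python", "-m timeit ${script}", "/tmp/script.py", "'1 + 1'")
print(default_cmd)
print(dryrun_cmd)
print(timeit_cmd)
```

Quoting matters once the temporary filename contains spaces or other shell metacharacters, which is why the default uses the `!q` form.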


### `allow_error`

Option `allow_error` tells SoS that the action might fail but that this should not stop the workflow from executing. This option essentially turns an error into a warning message and changes the return value of the action to `None`.

For example, in the following script, the invalid shell script stops the execution of the step, so the statement after the action is not executed.

In [15]:
%sandbox --expect-error
run: 
    This is not shell
print('Step after run')

/var/folders/ys/gnzk0qbx5wbdgm531v82xxljv5yqy8/T/tmpi62wxece: line 1: This: command not found
Failed to process statement run(r"""This is not shell\n""")...fter run'): Failed to execute script (ret=127). 
Please use command
    /bin/bash /var/folders/ys/gnzk0qbx5wbdgm531v82xxljv5yqy8/T/tmpzn3zpjx3/.sos/interactive_0_0
under /private/var/folders/ys/gnzk0qbx5wbdgm531v82xxljv5yqy8/T/tmpzn3zpjx3 to test it.


but in this example, the error of the `run` action is turned into a warning message and the later statement is still executed.

In [16]:
run: allow_error=True
    This is not shell
print('Step after run')

/var/folders/ys/gnzk0qbx5wbdgm531v82xxljv5yqy8/T/tmpag5371q4: line 1: This: command not found
Please use command
    /bin/bash /var/folders/ys/gnzk0qbx5wbdgm531v82xxljv5yqy8/T/tmpzn3zpjx3/.sos/interactive_0_0
under /Users/bpeng1/SOS/docs/src/documentation to test it.


Step after run
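
The effect of `allow_error` resembles the following Python pattern, in which an exception is downgraded to a warning and the call returns `None`. The wrapper `call_action` is a hypothetical name used for illustration and is not part of SoS.

```python
import warnings

def call_action(func, allow_error=False):
    """Call `func`; with allow_error, turn a failure into a warning."""
    try:
        return func()
    except Exception as e:
        if not allow_error:
            raise
        warnings.warn(f"Action failed: {e}")
        return None

def broken_action():
    raise RuntimeError("Failed to execute script (ret=127)")

# With allow_error=True the failure is reported but execution continues
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = call_action(broken_action, allow_error=True)

print(result, len(caught))
print("Step after run")
```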
