# How to execute a workflow to generate specific output

* **Difficulty level**: intermediate
* **Time needed to learn**: 20 minutes or less
* **Key points**:
  * Instead of executing workflows, you can use option `-t` to specify targets to generate
  * Targets can be filenames or the names of named outputs
  * Steps that provide targets can be process-oriented steps with static or named outputs, or outcome-oriented steps with pattern matching
  

## Outcome-oriented workflows

 <div class="bs-callout bs-callout-primary" role="alert">
    <b>Outcome-oriented workflows</b> aim at generating specified outcomes. For example, <code>sos run script -t result.html</code> would execute any step needed to generate <code>result.html</code>.
 </div>

**Outcome-oriented** workflows aim at generating particular outcomes. Their essential features are:

1. The workflow consists of steps that provide "outcomes" for other steps
2. The workflow is triggered by a request to generate a particular outcome

## Steps that `provides` targets through pattern matching

A typical makefile-style [auxiliary step](auxiliary_step.html) has a section header with a `provides` option, which usually contains a pattern. The pattern serves three purposes:

1. It triggers the step through pattern matching. For example, `provides='{filename}-{idx}.txt'` matches `a-1.txt`, `filename-23.txt`, but not `filename.txt`.
2. It defines variables specified in the pattern. For the `{filename}-{idx}.txt` example, `a-1.txt` would generate variables `filename='a'` and `idx='1'`.
3. It defines a default output for the step, which would be `output: 'a-1.txt'` if the step is matched to filename `a-1.txt`. Note that the default output can be overridden by an explicit `output` statement.
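
The matching and variable extraction described in the list above can be illustrated with a small Python sketch. This is not SoS's own implementation: the helper names `pattern_to_regex` and `match_target` are hypothetical, and the sketch assumes that each `{name}` placeholder matches one or more arbitrary characters.

```python
import re


def pattern_to_regex(pattern):
    """Translate a pattern such as '{filename}-{idx}.txt' into a regex.

    Assumption: every {name} placeholder matches one or more characters.
    """
    # re.split with a capturing group returns literal text and
    # placeholder names alternately: ['', 'filename', '-', 'idx', '.txt']
    parts = re.split(r'\{(\w+)\}', pattern)
    regex = ''
    for i, part in enumerate(parts):
        if i % 2 == 0:
            regex += re.escape(part)       # literal text, matched verbatim
        else:
            regex += f'(?P<{part}>.+)'     # placeholder becomes a named group
    return re.compile(regex)


def match_target(pattern, target):
    """Return the variables defined by the match, or None if no match."""
    m = pattern_to_regex(pattern).fullmatch(target)
    return None if m is None else m.groupdict()


print(match_target('{filename}-{idx}.txt', 'a-1.txt'))       # {'filename': 'a', 'idx': '1'}
print(match_target('{filename}-{idx}.txt', 'filename.txt'))  # None
print(match_target('{filename}.csv', 'data/DEG.csv'))        # {'filename': 'data/DEG'}
```

Note that all extracted values are strings, which is why `idx` is `'1'` rather than the integer `1`.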

For example, the following workflow consists of one [auxiliary step](auxiliary_step.html) that generates a `.csv` file from an `.xlsx` file. The workflow is triggered by option `-t data/DEG.csv` (the target), so the command

```
xlsx2csv data/DEG.xlsx > data/DEG.csv
```
is executed to generate it. The `data/DEG` part is determined by pattern matching, which assigns `filename='data/DEG'`.

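```
!rm -f data/DEG.csv
%run -v1 -t data/DEG.csv

[convert: provides='{filename}.csv']
input: f'{filename}.xlsx'
run: expand=True
    xlsx2csv {_input} > {_output}
```

```
ERROR: No step to generate target data/DEG.xlsx, needed by 'convert'
```

The error shown here indicates that `data/DEG.xlsx` was not available when the example was run; the `convert` step requires that file to exist.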


## Steps with simple output

If the output of a step is "simple", in the sense that it can be determined by itself without referring to any definition in the global section or the `input` of the step, it automatically `provides` the output to other steps. 

For example, the `plot` step added to the previous workflow does not have a `provides` statement. It has a simple `output` statement with `figure.pdf`. Then, when the workflow is triggered by `-t figure.pdf`, the `plot` step will be triggered, which, in this case, also triggers `convert` because of the missing `data/DEG.csv`.

```
!rm -f figure.pdf data/DEG.csv

%run -t figure.pdf
[convert: provides='{filename}.csv']
input: f'{filename}.xlsx'
run: expand=True
    xlsx2csv {_input} > {_output}

[plot]
input: 'data/DEG.csv'
output: 'figure.pdf'
R: expand=True
    data <- read.csv('{_input}')
    pdf('{_output}')
    plot(data$log2FoldChange, data$stat)
    dev.off()
```

```
ERROR: No step to generate target data/DEG.xlsx, needed by 'convert'
```

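The way a requested target pulls in the steps it depends on can be modeled as a backward search from the target. The following Python sketch is a toy model, not SoS's scheduler: the `STEPS` table, `match`, and `plan` are hypothetical names, the step table only mirrors the `convert` and `plot` steps of this section, and placeholders are assumed to match one or more characters.

```python
import re

# Hypothetical, simplified description of the two steps in this section.
STEPS = {
    'convert': {'provides': '{filename}.csv', 'input': '{filename}.xlsx'},
    'plot': {'output': 'figure.pdf', 'input': 'data/DEG.csv'},
}


def match(pattern, target):
    """Match target against a '{name}' pattern; return variables or None."""
    parts = re.split(r'\{(\w+)\}', pattern)
    regex = ''.join(
        f'(?P<{p}>.+)' if i % 2 else re.escape(p) for i, p in enumerate(parts)
    )
    m = re.fullmatch(regex, target)
    return None if m is None else m.groupdict()


def plan(target, steps, existing, order=None):
    """Return the names of the steps to run, dependencies first."""
    order = [] if order is None else order
    if target in existing:
        return order                      # nothing to do for existing files
    for name, step in steps.items():
        if step.get('output') == target:
            variables = {}                # simple, static output
        elif 'provides' in step:
            variables = match(step['provides'], target)
            if variables is None:
                continue                  # pattern does not match this target
        else:
            continue
        # Resolve the input of the chosen step before scheduling the step.
        plan(step['input'].format(**variables), steps, existing, order)
        order.append(name)
        return order
    raise RuntimeError(f'No step to generate target {target}')


print(plan('figure.pdf', STEPS, existing={'data/DEG.xlsx'}))  # ['convert', 'plot']

try:
    plan('figure.pdf', STEPS, existing=set())
except RuntimeError as e:
    print(e)  # No step to generate target data/DEG.xlsx
```

In this model, requesting `figure.pdf` selects `plot` through its static output, which in turn selects `convert` through pattern matching. When `data/DEG.xlsx` is absent and no step provides it, the search fails, which is analogous to the error message reported above.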

## Generating outputs with named output

If the output is not simple (e.g. it involves global definitions or parameters), or it contains multiple targets, you can give it a name through a keyword argument (see [named output](named_output.html) for details). You can then pass the name to option `-t` instead of the actual filename(s).

For example, by assigning the output the name `figure`, the following workflow can be triggered with `-t figure`.

```
!rm -f figure.pdf data/DEG.csv

%run -t figure

plot_input = 'data/DEG.csv'
plot_output = 'figure.pdf'

[convert: provides='{filename}.csv']
input: f'{filename}.xlsx'
run: expand=True
    xlsx2csv {_input} > {_output}

[plot]
input: plot_input
output: figure=plot_output
R: expand=True
    data <- read.csv('{_input}')
    pdf('{_output}')
    plot(data$log2FoldChange, data$stat)
    dev.off()
```

```
ERROR: No step to generate target data/DEG.xlsx, needed by 'convert'
```


## Further reading

* [How to define named inputs and outputs](doc/user_guide/target_label.html)