# Generating Reports using SoS

There are multiple ways to generate reports from the execution of SoS workflows. This tutorial introduces some basic methods, but you can certainly be more creative.

## A summarization step

You can write a report at the end of the workflow that summarizes the results of previous steps. In the following example, the `report` action summarizes previous steps and writes a report to the standard output.

In [1]:
%sandbox
[10]
output: 'a.jpg'
R:
    jpeg(${output!r})
    cars <- c(1, 3, 6, 4, 9)
    barplot(cars)

[20]
output: 'a.txt'
run:
    echo "100" > a.txt

[100]
input: 'a.jpg', 'a.txt'
with open('a.txt') as a:
    res = a.read()

report:
    * Figure
    ![figure](a.jpg)
    * result
    ${res}
   

* Figure
![figure](a.jpg)
* result
100



Writing a report to standard output is a bad idea because other actions can also write to it. You can use either the `output` option of the `report` action or the command line option `-r` to specify an output file for the `report` action. Moreover, if extracting the summary information requires a lot of processing, you can separate the processing into different steps. In the following example, an auxiliary step `counts` extracts information from the output of step `20`. As you can see, the `%preview` magic automatically renders the `.md` file.

In [2]:
%sandbox
%preview summary.md
[10]
output: 'a.jpg'
R:
    jpeg(${output!r})
    cars <- c(1, 3, 6, 4, 9)
    barplot(cars)

[20]
output: 'a.txt'
run:
    echo "100" > a.txt

[counts: shared='counts']
input: 'a.txt'
with open('a.txt') as a:
    counts = a.read()

[100]
input: 'a.jpg'
depends: sos_variable('counts')

report: output='summary.md'
    * Figure
    ![figure](a.jpg)
    * result
    ${counts}
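
The `${counts}` placeholder in the report template is filled with the value of the shared variable. As a rough analogy outside SoS, Python's standard `string.Template` happens to use the same `${name}` syntax. The sketch below only illustrates the idea of filling a template; it is not how SoS implements interpolation, and the `fill_report` helper and `REPORT_TEMPLATE` are hypothetical names.

```python
from string import Template

# Hypothetical template that mirrors the body of the report action above.
REPORT_TEMPLATE = Template(
    "* Figure\n"
    "![figure](${figure})\n"
    "* result\n"
    "${counts}\n"
)


def fill_report(figure: str, counts: str) -> str:
    """Return the report text with both placeholders substituted."""
    # strip() removes the trailing newline that `echo` writes into a.txt
    return REPORT_TEMPLATE.substitute(figure=figure, counts=counts.strip())


if __name__ == "__main__":
    print(fill_report("a.jpg", "100\n"))
```

Stripping the value before substitution keeps the stray newline from `a.txt` out of the rendered markdown.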
   

Instead of writing reports in `.md` format and rendering them outside of SoS, you can also render them inside SoS using the `pandoc` or `Rmarkdown` action. For example,

In [3]:
%sandbox
%preview summary.html
[10]
output: 'a.jpg'
R:
    jpeg(${output!r})
    cars <- c(1, 3, 6, 4, 9)
    barplot(cars)

[20]
output: 'a.txt'
run:
    echo "100" > a.txt

[counts: shared='counts']
input: 'a.txt'
with open('a.txt') as a:
    counts = a.read()

[100]
input: 'a.jpg'
depends: sos_variable('counts')

pandoc: output='summary.html'
    * Figure
    ![figure](a.jpg)
    * result
    ${counts}

## Reporting to report output

Writing a single reporting step is convenient in that you only need to write one (large) template and feed it with all the required information. If your workflow is long, if some of the steps are optional, or if you prefer writing reports immediately after a step is completed (so that your steps and reports stay close to each other), you can report results piece by piece.

The simplest method to write such a report is as follows:

In [4]:
%sandbox
[10]
output: 'a.jpg'
R:
    jpeg(${output!r})
    cars <- c(1, 3, 6, 4, 9)
    barplot(cars)

report:
    * Figure
    ![figure](${output})    

[20]
output: 'a.txt'
counts = 100
run:
    echo "${counts}" > a.txt

report:
    * result
    ${counts}



* Figure
![figure](a.jpg)

* result
100



100


As you can see, you can report the result of a step as soon as it is available, and no summary step is needed because all reports are written to standard output. To write the report to a file instead of standard output, you can use the `-r` option.

In [5]:
%sandbox
%preview summary.md
%run -r summary.md
[10]
output: 'a.jpg'
R:
    jpeg(${output!r})
    cars <- c(1, 3, 6, 4, 9)
    barplot(cars)

report:
    * Figure
    ![figure](${output})    

[20]
output: 'a.txt'
counts = 100
run:
    echo "${counts}" > a.txt

report:
    * result
    ${counts}


100


You cannot use the `output` option of each `report` action to write to the same file, because each action would overwrite the file.
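
The overwriting problem can be reproduced in plain Python. The sketch below assumes that each `report` action with an `output` option opens its file afresh in write mode; that matches the behavior described above but is an assumption about the implementation, and `write_fragment` and `demo_overwrite` are hypothetical helpers.

```python
import os
import tempfile


def write_fragment(path: str, text: str) -> None:
    """Write one report fragment, truncating the file first (mode 'w')."""
    with open(path, "w") as report:
        report.write(text)


def demo_overwrite() -> str:
    """Write two fragments to the same file and return what is left."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "summary.md")
        write_fragment(path, "* Figure\n![figure](a.jpg)\n")
        write_fragment(path, "* result\n100\n")  # replaces the first fragment
        with open(path) as report:
            return report.read()


if __name__ == "__main__":
    print(demo_overwrite())
```

Only the fragment written last survives, which is why the figure would be missing from a shared `summary.md`.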

## Reporting to multiple output files

The previous approach assumes that all steps are executed, and that they are executed sequentially. However,

1. As you have learned in other sections, SoS steps can be executed in parallel, and do not have to be executed in order because a later step can be executed before a previous one if it does not depend on it. 
2. Some steps of a workflow might be skipped if they have been executed before, so re-executing a workflow might result in an incomplete report.
3. This method makes it impossible to capture and process the report inside SoS, so you cannot use the `pandoc` or `Rmarkdown` action to generate nice-looking reports.
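
The first two problems can be illustrated with a small simulation in plain Python. It does not use SoS at all; the step names, the `run_step` and `shared_report` helpers, and the `already_done` set are hypothetical stand-ins for steps that append to one shared report.

```python
def run_step(name, fragment, report, already_done):
    """Append a step's fragment to the shared report unless the step is skipped."""
    if name in already_done:
        return  # a skipped step writes nothing
    report.append(fragment)


def shared_report(completion_order, already_done=frozenset()):
    """Return the shared report built in the order the steps happen to finish."""
    fragments = {"10": "* Figure", "20": "* result"}
    report = []
    for name in completion_order:
        run_step(name, fragments[name], report, already_done)
    return report


if __name__ == "__main__":
    print(shared_report(["10", "20"]))                       # intended order
    print(shared_report(["20", "10"]))                       # step 20 finished first
    print(shared_report(["10", "20"], already_done={"10"}))  # step 10 skipped
```

In the second call the fragments arrive in the wrong order, and in the third call the figure section is missing entirely.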

Because of these problems, it makes more sense to write reports to separate files during execution and collect them at the end. In the following example, an output file is specified for each `report` action, and a summary step at the end collects and processes them.

In [6]:
%sandbox --dir ~/tmp
%preview summary.html
[10]
output: 'a.jpg'
R:
    jpeg(${output!r})
    cars <- c(1, 3, 6, 4, 9)
    barplot(cars)

report: output='figure.md'
    * Figure
    ![figure](${output})    

[20]
output: 'a.txt'
counts = 100
run:
    echo "${counts}" > a.txt

report: output='result.md'
    * result
    ${counts}

[100]
pandoc: input=['figure.md', 'result.md'], output='summary.html'
    Final report

You will notice that both a `script` and an `input` are passed to the `pandoc` action. In this case, the `script` is placed before the content of the `input` files, making it a perfect place to write headers and summaries.
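
The assembly described above can be sketched in plain Python. This illustrates the ordering only, under the assumption that the script text comes first and the input files follow in the order listed. It is not SoS's implementation, it does not call pandoc, and `assemble_report` and `demo` are hypothetical helpers.

```python
import os
import tempfile
from typing import Iterable


def assemble_report(script: str, inputs: Iterable[str]) -> str:
    """Return the script text followed by the content of each input file."""
    parts = [script.rstrip("\n")]
    for path in inputs:
        with open(path) as fragment:
            parts.append(fragment.read().rstrip("\n"))
    # Blank lines between parts keep the markdown blocks separate
    return "\n\n".join(parts) + "\n"


def demo() -> str:
    """Create the two fragments from the example and assemble them."""
    with tempfile.TemporaryDirectory() as workdir:
        figure_md = os.path.join(workdir, "figure.md")
        result_md = os.path.join(workdir, "result.md")
        with open(figure_md, "w") as out:
            out.write("* Figure\n![figure](a.jpg)\n")
        with open(result_md, "w") as out:
            out.write("* result\n100\n")
        return assemble_report("Final report\n", [figure_md, result_md])


if __name__ == "__main__":
    print(demo())
```

The assembled markdown is the kind of text that would then be handed to a converter to produce `summary.html`.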