# Variables and parameters

* **Difficulty level**: easy
* **Time needed to learn**: 10 minutes or less
* **Key points**:
  * SoS (Python) variables can be interpolated into scripts written in other languages, using the same syntax as Python f-strings
  * A `parameter` statement defines a parameter whose value can be passed from the command line

## Global and local variables

Now let us look at the example workflow from [our SoS overview](sos_overview.html) in more detail.

In [1]:
%run

[global]
excel_file = 'data/DEG.xlsx'
csv_file = 'DEG.csv'
figure_file = 'output.pdf'

[plot_10]
run: expand=True
    xlsx2csv {excel_file} > {csv_file}

[plot_20]
R: expand=True
    data <- read.csv('{csv_file}')
    pdf('{figure_file}')
    plot(data$log2FoldChange, data$stat)
    dev.off()

xlsx2csv data/DEG.xlsx > DEG.csv

null device 
          1 


This workflow has a `global` section, which defines variables that are visible to all workflow steps. The three variables are therefore available in `plot_10` and `plot_20`, where the actions `run` and `R` interpolate them into their scripts because of the option `expand=True`. More explicitly, step `plot_10` can be thought of as the following Python script:

```python
excel_file = 'data/DEG.xlsx'
csv_file = 'DEG.csv'
figure_file = 'output.pdf'

run(f'''\
xlsx2csv {excel_file} > {csv_file}
''')
```
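
The same reading applies to step `plot_20`. As a rough sketch in plain Python (the variable name `r_script` is ours for illustration, and SoS's own expansion machinery may differ in detail), the R script is composed by f-string interpolation before R ever sees it:

```python
csv_file = 'DEG.csv'
figure_file = 'output.pdf'

# Compose the R script the way an f-string would: only {name} is replaced.
r_script = f'''\
data <- read.csv('{csv_file}')
pdf('{figure_file}')
plot(data$log2FoldChange, data$stat)
dev.off()
'''

print(r_script)
```

Only the text inside braces is replaced. R's `$` operator passes through unchanged because it has no special meaning in a Python f-string.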

<div class="bs-callout bs-callout-primary" role="alert">
  <b>The global section</b><br>
  <p>The content of the global section can be considered as part of all workflow steps.</p>
</div>

In contrast, **variables defined in individual steps are not available to other steps**. For example, the following workflow fails in step `plot_20` because `csv_file` is defined locally in step `plot_10`.

In [2]:
%env --expect-error

%run

[plot_10]
excel_file = 'data/DEG.xlsx'
csv_file = 'DEG.csv'

run: expand=True
    xlsx2csv {excel_file} > {csv_file}

[plot_20]
figure_file = 'output.pdf'

R: expand=True
    data <- read.csv('{csv_file}')
    pdf('{figure_file}')
    plot(data$log2FoldChange, data$stat)
    dev.off()

xlsx2csv data/DEG.xlsx > DEG.csv

ERROR: [plot_20]: [0]: 
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
script_6218156210693059555 in <module>
      plot(data$log2FoldChange, data$stat)
      dev.off()
----> """)

NameError: name 'csv_file' is not defined

RuntimeError: Workflow exited with code 1

<div class="bs-callout bs-callout-warning" role="alert">
  <b>Local (step-level) variables</b><br>
  <p>Variables defined at the step level are local to that step and are not accessible from other SoS steps.</p>
</div>
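
One way to picture this scoping rule, as a loose analogy in plain Python rather than a description of how SoS is implemented, is to treat each step as a function whose local variables disappear when it returns. The function names below are hypothetical stand-ins for the two steps:

```python
def plot_10():
    # csv_file is local to this function, like a step-level variable.
    csv_file = 'DEG.csv'
    return f"xlsx2csv data/DEG.xlsx > {csv_file}"


def plot_20():
    # csv_file was never defined in this scope, so the lookup fails.
    return f"data <- read.csv('{csv_file}')"


print(plot_10())

try:
    plot_20()
except NameError as err:
    print(f"NameError: {err}")
```

Moving the assignment of `csv_file` to module level, which plays the role of the `global` section in this analogy, would make it visible to both functions.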

If you really need to pass locally defined variables to other steps, you have to either return them as part of the step's output or explicitly share them with other steps. Please refer to the [Further reading](#further_reading) section of this tutorial for details.

## Workflow parameters <a id="parameter"></a>

SoS allows you to define parameters that accept values from command-line options.

In [3]:
%run --excel-file data/DEG.xlsx

[global]
parameter: excel_file = str
parameter: csv_file = 'DEG.csv'
parameter: figure_file = 'output.pdf'

[plot_10]
run: expand=True
    xlsx2csv {excel_file} > {csv_file}

[plot_20]
R: expand=True
    data <- read.csv('{csv_file}')
    pdf('{figure_file}')
    plot(data$log2FoldChange, data$stat)
    dev.off()

xlsx2csv data/DEG.xlsx > DEG.csv

null device 
          1 


In the above example, three parameters, `excel_file`, `csv_file`, and `figure_file`, are defined. Parameter `excel_file` has no default value (only a type, `str`), so it is required and is specified here as a command-line option of the `%run` magic. The other two parameters have default values. Note that parameter `excel_file` can be specified as either `--excel_file` or `--excel-file` from the command line.
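
To make the command-line behavior concrete, the sketch below imitates the three parameters with Python's standard `argparse` module. It is only an analogy under our own assumptions (SoS parses `parameter` statements itself, and the helper `build_parser` is hypothetical): a parameter declared with a type but no default becomes a required option, parameters with defaults become optional, and both spellings of the option name map to the same variable.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    # Type only, no default: the option must be supplied.
    parser.add_argument('--excel-file', '--excel_file', dest='excel_file',
                        type=str, required=True)
    # Defaults given: these options may be omitted.
    parser.add_argument('--csv-file', '--csv_file', dest='csv_file',
                        default='DEG.csv')
    parser.add_argument('--figure-file', '--figure_file', dest='figure_file',
                        default='output.pdf')
    return parser


# Equivalent in spirit to: %run --excel-file data/DEG.xlsx
args = build_parser().parse_args(['--excel-file', 'data/DEG.xlsx'])
print(args.excel_file, args.csv_file, args.figure_file)
```

Omitting `--excel-file` in this sketch makes `argparse` exit with a usage error, which mirrors the idea that a parameter without a default must be provided.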