# DSC Basics, Part II

This is the second part of the "DSC Basics" tutorial. Before working through this tutorial, you should have already read [DSC Basics, Part I](Intro_Syntax_I.html). Here we build on the mean estimation example from the previous part to illustrate new concepts and syntax in DSC, with an emphasis on the use of *module parameters*.

Materials used in this tutorial can be found in the [DSC vignettes repository](https://github.com/stephenslab/dsc/tree/master/vignettes/one_sample_location). As before, you may choose to run this example DSC program as you read through the tutorial, but this is not required.

## Defining a module parameter

In our example DSC, recall we defined the `normal` module as follows:

```
normal: R(x <- rnorm(n = 100,mean = 0,sd = 1))
  $data: x
  $true_mean: 0
```

In defining this module, we made an unfortunate design decision: the mean used to simulate the data is defined twice, once inside the call to `rnorm`, where we set `mean = 0`, and once when we set the module output `$true_mean` to zero. If we decided to use a different mean to simulate the data, we would have to be careful to change the code in both places; changing only one of them would lead to downstream errors. It would be preferable to have a single variable that defines the mean of the data.

In DSC, this can be done with a *module parameter*:

```
normal: R(x <- rnorm(n = 100,mean = mu,sd = 1))
  mu: 0
  $data: x
  $true_mean: mu
```

Here, we defined a module parameter `mu`, and set it to the value zero. Once we have defined module parameter `mu`, any R code and module outputs within the `normal` module may refer to this module parameter. For example, here the second argument of `rnorm` is set to the value of `mu`.

Now modifying the mean used to simulate the data only requires changing one line of code.

Similarly, we can use a module parameter to specify the mean of the data that are simulated from a $t$ distribution:

```
t: R(x <- mu + rt(n = 100,df = 2))
  mu: 3
  $data: x
  $true_mean: mu
```

Note that there is no requirement that the module parameters for the `normal` and `t` modules have the same name, `mu`.

## The order of evaluation inside a module

Below, we will give some more elaborate examples using module parameters. Before continuing, please keep in mind these two points:

+ A module parameter cannot depend on any of the module inputs or any other module parameters. In other words, it must be possible to evaluate the module parameter without knowing the values of the module inputs or the values of the other module parameters.

+ Module parameters are evaluated *before* the module script. The exact procedure for evaluating a module is as follows:
 
    1. Evaluate any R code used to determine the values of the module parameters.
    2. Set the values of the module parameters.
    3. Initialize the module inputs according to the current stored values of the pipeline variables.
    4. For each module parameter and module input, define a script variable with the same name and same value. 
    5. Evaluate the module script or inline source code. Any variable defined within the global environment in which the script is evaluated is retained for resolving any module outputs.
    6. Set each module output to the stored value of the selected script variable.

This evaluation procedure will become clearer as we illustrate it in some of the examples below.
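
The six steps can be mimicked in plain R for the `normal` module defined earlier. This is only a sketch of the order of operations, not DSC's actual implementation; the names `params`, `inputs`, `env` and `outputs` are hypothetical and chosen for illustration.

```r
# Illustrative sketch only: mimics the evaluation order for the `normal`
# module in plain R. `params`, `inputs`, `env` and `outputs` are
# hypothetical names, not part of DSC.

# Steps 1-2: evaluate and set the module parameters.
params <- list(mu = 0)

# Step 3: initialize the module inputs (the `normal` module has none).
inputs <- list()

# Step 4: define a script variable for each parameter and input.
env <- new.env()
vars <- c(params, inputs)
for (name in names(vars)) {
  assign(name, vars[[name]], envir = env)
}

# Step 5: evaluate the module script in that environment.
eval(quote(x <- rnorm(n = 100, mean = mu, sd = 1)), envir = env)

# Step 6: set each module output from the selected script variable.
outputs <- list(data      = get("x", envir = env),
                true_mean = get("mu", envir = env))
```

Because `mu` is set in step 2, before the script runs in step 5, both the call to `rnorm` and the output `true_mean` see the same value.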

## A single module parameter with multiple *alternative* values

In the example above, we showed that module parameters are useful for simplifying a module definition by defining a variable that is used in multiple places inside the module. Here we will see that module parameters have another powerful feature: they can be used to define multiple similar modules.

Currently, in the `normal` module, we simulate 100 random samples from a normal distribution. Suppose we would like to define a second module that simulates 1,000 random samples from the same normal distribution. This is easily done by defining a new module parameter `n` that takes on two different values:

```
normal: R(x <- rnorm(n,mean = mu,sd = 1))
  mu: 0
  n: 100, 1000
  $data: x
  $true_mean: mu
```

The comma delimits the two possible values of module parameter `n`.

Now that we have defined `n` inside this module, we can refer to this module parameter inside the R code that simulates random draws from a normal distribution.

This is equivalent to defining two modules, `normal_100` and `normal_1000`, that are identical in every way except that the first module includes parameter definition `n: 100` and the second defines `n: 1000`. The code above is of course much more succinct.

The values 100 and 1000 are *alternative values* for module parameter `n`; that is, the definition of `n` should not be interpreted as a vector or sequence with two entries, 100 and 1000. Instead, it should be understood as defining a *set of alternative values*, in which `n` within any given module is set to 100 or 1000. To put it another way, `n: 100, 1000` defines two *alternative values* for module parameter `n`, and likewise defines two *alternative modules* that are the same in every way except for the setting of `n`.
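
The difference between alternative values and a vector can be seen in plain R. The sketch below is an assumption-laden illustration rather than what DSC runs internally: the vector `alternatives` and the list `datasets` are hypothetical names, used only to run the same script once per value of `n`.

```r
# Illustrative sketch only: `n: 100, 1000` behaves like two separate runs
# of the same script, one per alternative value, not like n <- c(100, 1000).
mu <- 0
alternatives <- c(100, 1000)   # used here only to loop over the alternatives

# One simulated data set per alternative value of n.
datasets <- lapply(alternatives, function(n) rnorm(n, mean = mu, sd = 1))
names(datasets) <- paste0("n=", alternatives)

# Number of samples in each data set.
sapply(datasets, length)
```

By contrast, passing the vector directly, as in `rnorm(c(100, 1000))`, would return only two values, because `rnorm` treats the length of a vector-valued `n` as the number of draws.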

A distinguishing property of module parameters with multiple alternative values is that *their order does not matter*. For example, if we instead wrote `n: 1000, 100`, *the DSC results will be exactly the same*. (The only thing that will change is the order in which the results will appear in the tables, and the way in which the results are stored in files.)

Now that we have defined two `normal` modules, there is also the question of how these modules can be distinguished in the results; for example, you may want to compare the accuracy of the estimates in the larger and smaller simulated data sets. Next we will see that the results from these two modules can be selected according to the stored value of the module parameter `n`.

## Executing the DSC with two alternative `normal` and `t` modules

Here we examine how the new DSC works in practice.

+ Run the DSC.

+ The two modules have the same name, but they can be identified in the results by different values for `n`.

## Two module parameters with multiple alternatives

If you provide more than one value for multiple parameters, DSC considers all combinations. For example:

```
t: R( x = rt(n, df) )
  n: 100, 1000
  df: 2, 4, 10
  $data: x
  $true_mean: 0
```

defines a group of six modules, which simulate 100 or 1000 observations from a $t$ distribution with 2, 4 or 10 degrees of freedom.
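
The six combinations can be enumerated in plain R with `expand.grid`. This is a sketch of the idea of "all combinations" and not a description of how DSC generates them; `combos` and `datasets` are hypothetical names.

```r
# Illustrative sketch only: enumerate every combination of the alternative
# values of n and df, one row per module instance.
combos <- expand.grid(n = c(100, 1000), df = c(2, 4, 10))
nrow(combos)

# Simulate one data set per combination, as the module script would.
datasets <- Map(function(n, df) rt(n, df), combos$n, combos$df)
```

Each row of `combos` corresponds to one of the alternative `t` modules, and each element of `datasets` holds the data simulated with that row's settings.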

TO DO: Show the results from running this DSC and querying the results.

## Executing and querying the DSC when module alternatives are defined using two module parameters

Run the DSC program and show the results here.

## Setting the seed with a module parameter

A common use of module parameters is to initialize the state, or "seed", for generating a sequence of pseudorandom numbers. (This section is not essential, and may be skipped if you are pressed for time.)
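
The underlying idea can be sketched in plain R, outside of DSC syntax. The function `simulate_normal` and its `seed` argument are hypothetical names introduced here for illustration; how DSC itself manages seeds is not covered by this sketch.

```r
# Illustrative sketch only, in plain R rather than DSC syntax.
# `simulate_normal` and its `seed` argument are hypothetical names.
simulate_normal <- function(seed, n = 100, mu = 0) {
  set.seed(seed)                  # fix the pseudorandom number generator state
  rnorm(n, mean = mu, sd = 1)
}

x1 <- simulate_normal(seed = 1)
x2 <- simulate_normal(seed = 1)   # same seed, same draws
x3 <- simulate_normal(seed = 2)   # different seed, different draws
identical(x1, x2)
```

Treating the seed as a parameter makes each simulated data set reproducible, and different seed values yield different data sets from the same distribution.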

## Combining module parameters with module inputs

NOTE: trim is a good example to illustrate how it cannot be determined based on the data input.

Now consider the following DSC syntax:

```
trim_mean: R( y = mean(x, trim) )
  trim: 0, 0.2
  x: $data
  $est_mean: y
```

This syntax defines two modules that compute the trimmed mean of the pipeline
variable `$data`, with different levels of trimming (0 or 0.2), and output
the result as the pipeline variable `$est_mean`.

Schematically we might represent this as:

```
# trim_mean(trim=0): $data -> $est_mean
# trim_mean(trim=0.2): $data -> $est_mean
```

to indicate that we have two different modules, each of which takes
`$data` as input and produces `$est_mean` as output.
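
To see what the two alternative values of `trim` do, here is a small sketch in plain R. The vector `x` is made-up data standing in for the pipeline variable `$data`, and `y_untrimmed` and `y_trimmed` are hypothetical names.

```r
# Illustrative sketch only: the two alternative values of `trim` applied to
# the same made-up input. `x` stands in for the pipeline variable $data.
x <- c(1, 2, 3, 4, 100)

y_untrimmed <- mean(x, trim = 0)    # ordinary mean: 22
y_trimmed   <- mean(x, trim = 0.2)  # drops one value from each end: 3
```

With `trim = 0.2`, R discards the lowest and highest 20% of the observations before averaging, so the outlying value 100 no longer affects the estimate.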

## Defining more complex module parameters

Example goes here.

## Defining module parameters with many alternative values

Example for this section: 

Start with an example in which the different values of n are 10^1, 10^1.5, 10^2, etc, all defined within R().

Then generate many simulated data sets with different values of n, from very small (10) to very large (1e6) using R{}. 

Finally, run the code and show that it works.
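
As a starting point, the sequence of sample sizes described above can be generated in plain R. This is a sketch only: the step of 0.5 in the exponent follows the pattern 10^1, 10^1.5, 10^2, the upper limit of 1e6 is taken from the text, and rounding to whole numbers is an assumption made so that every value is a valid sample size. The name `n_values` is hypothetical.

```r
# Illustrative sketch only: sample sizes 10^1, 10^1.5, 10^2, ..., 10^6.
# Rounding is an assumption made here so each value is a whole number.
n_values <- round(10^seq(1, 6, by = 0.5))
n_values
```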

## Recap

Add recap here.

## Exploring further

In this tutorial, we introduced some of the most essential features of DSC. There are many other features of DSC that we did not have a chance to mention in these introductory tutorials.