# Quick start tutorial

In this tutorial we will use DSC to compare methods implemented in [R](https://cran.r-project.org/) for location parameter estimation, based on this [DSCR example](https://github.com/stephens999/dscr/blob/master/vignettes/one_sample_location.rmd). The material used in this document can be found in the [DSC2 vignettes repo](https://github.com/stephenslab/dsc2/tree/master/vignettes/one_sample_location).


## DSC Specification
The DSC problem is to assess location parameter estimation methods via simulation studies. We will simulate data under a normal distribution and a *t* distribution with 2 degrees of freedom, estimate the location parameter using the mean and the median, and finally compare the performance of the estimators by computing the squared difference between the estimate and the underlying parameter. The problem is fully specified in the DSC2 language below:

```yaml
  simulate:
      exec: rnorm.R, rt.R
      seed: R(1:10)
      params:
          n: 1000
          true_mean: 0, 1
      return: x, true_mean
  
  estimate:
      exec: mean.R, median.R
      params:
          x: $x
      return: mean
  
  mse:
      exec: MSE.R
      params:
          mean_est: $mean
          true_mean: $true_mean
      return: mse
  
  DSC:
      run: simulate *
           estimate *
           mse
      exec_path: R/scenarios, R/methods, R/scores
      output: dsc_result
```

All computational routines in this DSC are R scripts (each with one line of code!), located in the directories specified in the `DSC::exec_path` entry of the configuration file. The contents of these R scripts are:

```r
  # ../vignettes/one_sample_location/R/methods/mean.R
  mean = mean(x)
  
  # ../vignettes/one_sample_location/R/methods/median.R
  mean = median(x)
  
  # ../vignettes/one_sample_location/R/scores/MSE.R
  mse = (mean_est-true_mean)^2
  
  # ../vignettes/one_sample_location/R/scenarios/rt.R
  # produces n random numbers from t with df=2 and with specified mean
  x=true_mean+rt(n,df=2)
  
  # ../vignettes/one_sample_location/R/scenarios/rnorm.R
  # produces n random numbers from normal with specified mean
  x=rnorm(n,mean=true_mean)
  
```

It is important to ensure that parameter names and return variable names match between the R scripts and the DSC file. For example, the `simulate` block involves the computational routines `rnorm.R` and `rt.R`, both of which take parameters `n` and `true_mean` and generate a new variable `x` as the return value (the other return variable, `true_mean`, already exists as an input parameter). The R script `rnorm.R` is `x = rnorm(n, mean = true_mean)`, which uses the input parameters `n` and `true_mean` on the right-hand side to produce `x` on the left-hand side as the return value of the `simulate` block. The same holds for `rt.R`, which is `x = true_mean + rt(n, df = 2)`.
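The name matching described above can be mimicked in plain R. The sketch below is only an illustration: it assumes that DSC makes each parameter available as a variable before a script runs and collects the named return variables afterwards, and that the seed is applied in the manner of `set.seed`. The chaining between blocks is done by hand here, using one combination of values from the specification.

```r
# One hand-rolled pass through simulate -> estimate -> mse.
# Illustration only: DSC itself handles the variable passing between blocks.
set.seed(1)            # assumption: one of the seeds from `seed: R(1:10)`
n <- 1000              # simulate parameter
true_mean <- 1         # simulate parameter (one of 0, 1)

# simulate: body of rnorm.R, producing the return variable x
x <- rnorm(n, mean = true_mean)

# estimate: body of mean.R; the block returns a variable named `mean`
mean <- mean(x)

# mse: the block binds $mean to mean_est, then runs the body of MSE.R
mean_est <- mean
mse <- (mean_est - true_mean)^2
```

If a script wrote its result to a differently named variable (say `m` instead of `mean`), the `return: mean` entry would have nothing to pick up, which is why the names must agree.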

The `DSC::run` entry reflects a typical DSC setup where `simulate` creates *scenarios* under various settings, `estimate` applies various *methods* to analyze the data, and `mse` is a *score* that measures the performance of the different methods. The blocks are connected by `*` to allow for all possible combinations of computational steps from these 3 blocks.
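To get a feel for what "all possible combinations" means for this specification, the combinations can be enumerated in plain R. This is a rough sketch, assuming that every listed executable, seed, and parameter value is crossed with every other; that is an assumption about how DSC expands the specification, and the column names are chosen for illustration.

```r
# Enumerate the combinations implied by `simulate * estimate * mse`,
# assuming a full cross of all listed values (an assumption, see above).
combos <- expand.grid(
  scenario  = c("rnorm.R", "rt.R"),      # simulate: exec
  seed      = 1:10,                      # simulate: seed
  true_mean = c(0, 1),                   # simulate: params
  method    = c("mean.R", "median.R"),   # estimate: exec
  score     = "MSE.R",                   # mse: exec
  stringsAsFactors = FALSE
)
n_pipelines <- nrow(combos)
head(combos)
```

Under that assumption the grid has 2 × 10 × 2 × 2 × 1 = 80 rows, one per scenario, seed, parameter, method, and score combination.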

## Run DSC
To execute the DSC on a computer using 8 CPU threads, run:


In [1]:
! dsc settings.dsc -j8

INFO: DSC script exported to settings.html
INFO: Constructing DSC from settings.dsc ...
INFO: Building output database dsc_result.rds ...
INFO: DSC complete!
INFO: Elapsed time 13.047 seconds.


In this example the results are stored in `dsc_result.rds`. These files contain the same information, differing only in format. We will discuss the output in detail in [a separate tutorial]().


## Re-run DSC
DSC keeps track of commands that have been executed before, so if you re-run the same DSC command it will skip computational steps when the DSC configuration file has not changed. For example, if you rerun this command it will finish quickly:

In [2]:
! dsc settings.dsc -j8

INFO: DSC script exported to settings.html
INFO: Constructing DSC from settings.dsc ...
INFO: Building output database dsc_result.rds ...
INFO: DSC complete!
INFO: Elapsed time 2.184 seconds.


Notice that the last line of output records an elapsed time of 2.18 seconds, compared to 13.05 seconds in the first run. If you want to ignore the existing cache, you can use the `-f` flag to force DSC to start afresh:


In [3]:
! dsc settings.dsc -j8 -f

INFO: DSC script exported to settings.html
INFO: Constructing DSC from settings.dsc ...
INFO: Building output database dsc_result.rds ...
INFO: DSC complete!
INFO: Elapsed time 12.250 seconds.
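The pattern in the three runs above (compute, reuse, forced recompute) can be pictured with a toy cache in R. This is a conceptual sketch only, not DSC's actual mechanism, which this tutorial does not describe; `cache`, `run_step`, `estimate`, and the `force` argument are hypothetical names introduced for illustration.

```r
# Toy cache keyed on the step name and its parameter values.
# All names here are hypothetical; this is not how DSC is implemented.
cache <- new.env()

run_step <- function(step, params, compute, force = FALSE) {
  # Build a key such as "estimate|n=1000;true_mean=1"
  key <- paste(step,
               paste(names(params), unlist(params), sep = "=", collapse = ";"),
               sep = "|")
  if (!force && exists(key, envir = cache, inherits = FALSE)) {
    return(get(key, envir = cache))   # unchanged step: reuse the stored result
  }
  result <- compute(params)           # new or forced step: recompute
  assign(key, result, envir = cache)
  result
}

calls <- 0                            # counts how often the step really runs
estimate <- function(p) {
  calls <<- calls + 1
  p$true_mean + 0.01                  # stand-in for a real computation
}

params <- list(n = 1000, true_mean = 1)
first  <- run_step("estimate", params, estimate)                # computed
second <- run_step("estimate", params, estimate)                # reused
third  <- run_step("estimate", params, estimate, force = TRUE)  # recomputed, like -f
```

Here the second call returns without invoking `estimate`, mirroring the quick re-run, while `force = TRUE` plays the role of the `-f` flag.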


The results of this DSC are stored in the folder `dsc_result`. It contains numerous intermediate and final data from all steps involved in the DSC procedure. Please continue to the [next tutorial](Explore_Output.html) to extract and analyze the benchmark results.