# Multiple DSC pipelines

In this tutorial we extend the benchmark of the DSC problem described in [the Quick Start tutorial](Quick_Start.html) to demonstrate the use of multiple pipelines (pipeline ensembles) in DSC. The material used in this document can be found in the [DSC2 vignettes repo](https://github.com/stephenslab/dsc2/tree/master/vignettes/one_sample_location_winsor).

## Configuration
The DSC problem is similar to the one we worked on previously: a comparison of methods for estimating a location parameter. This time we simulate data from a *t* distribution (df = 2) and from a Cauchy distribution. Before estimating the location parameter with the mean or the median, there is an **optional** `transform` step, for which we provide two *Winsorization* methods. This results in two DSC pipelines:

*  simulate -> estimate -> score
*  simulate -> transform -> estimate -> score
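
To make the two pipelines concrete, here is a minimal sketch in plain R, outside DSC. The functions `simulate_step`, `transform_step`, `estimate_step` and `score_step` are hypothetical stand-ins introduced for illustration only; they are not the module scripts that DSC runs, and the seed and `true_loc` value are arbitrary.

```r
set.seed(1)  # arbitrary seed, for reproducibility of this sketch only

# Hypothetical stand-ins for the four steps (assumptions, not the DSC modules)
simulate_step  <- function(n, true_loc) true_loc + rt(n, df = 2)
transform_step <- function(x, fraction = 0.05) {
  # Clamp values outside the [fraction, 1 - fraction] quantile range
  lim <- quantile(x, probs = c(fraction, 1 - fraction))
  pmin(pmax(x, lim[1]), lim[2])
}
estimate_step  <- function(x) mean(x)
score_step     <- function(est, true_loc) (est - true_loc)^2

true_loc <- 1
x <- simulate_step(1000, true_loc)

# Pipeline 1: simulate -> estimate -> score
score_plain <- score_step(estimate_step(x), true_loc)

# Pipeline 2: simulate -> transform -> estimate -> score
score_winsor <- score_step(estimate_step(transform_step(x)), true_loc)

print(c(plain = score_plain, winsorized = score_winsor))
```

Both pipelines start from the same simulated `x`, so the two scores differ only because of the optional transform step.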

The DSC problem is fully specified as:

```
rt, rcauchy: rt.R, rcauchy.R
    seed: R(1:5)
    n: 1000
    true_loc: 0, 1
    $x: x
    $true_loc: true_loc

winsor1, winsor2: winsor1.R, winsor2.R
    x: $x
    @winsor1:
      fraction: 0.05
    @winsor2:
      multiple: 3
    $x: x

mean, median: mean.R, median.R
    x: $x
    $loc: loc

mse: MSE.R
    mean_est: $loc
    true_mean: $true_loc
    $mse: mse

DSC:
    define:
      simulate: rt, rcauchy
      transform: winsor1, winsor2
      estimate: mean, median
    run: simulate *
         (transform * estimate, estimate) *
         mse
    exec_path: R
    output: dsc_result
```

where the `transform` module ensemble contains:

```r
  ==> ../vignettes/one_sample_location_winsor/R/winsor1.R <==
  ##  replace the extreme values with limits
  winsor1 <- function (x, fraction=.05)
  {
     if(length(fraction) != 1 || fraction < 0 ||
           fraction > 0.5) {
        stop("bad value for 'fraction'")
     }
     lim <- quantile(x, probs=c(fraction, 1-fraction))
     x[ x < lim[1] ] <- lim[1]
     x[ x > lim[2] ] <- lim[2]
     return(x)
  }
  x = winsor1(x, fraction)

  ==> ../vignettes/one_sample_location_winsor/R/winsor2.R <==
  ## clamp data points more than `multiple` MADs (median absolute deviations) away from the median
  winsor2 <- function (x, multiple=3)
  {
     if(length(multiple) != 1 || multiple <= 0) {
        stop("bad value for 'multiple'")
     }
     med <- median(x)
     y <- x - med
     sc <- mad(y, center=0) * multiple
     y[ y > sc ] <- sc
     y[ y < -sc ] <- -sc
     return(y + med)
  }
  x = winsor2(x, multiple)

```
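
The effect of the two methods can be checked on a heavy-tailed sample outside DSC. The sketch below restates the core of `winsor1` and `winsor2` without the argument checks so that it runs on its own; the seed and the Cauchy sample are assumptions made for illustration.

```r
# Compact restatements of the two modules' core logic (argument checks omitted)
winsor1 <- function(x, fraction = 0.05) {
  lim <- quantile(x, probs = c(fraction, 1 - fraction))
  x[x < lim[1]] <- lim[1]
  x[x > lim[2]] <- lim[2]
  x
}
winsor2 <- function(x, multiple = 3) {
  med <- median(x)
  y <- x - med
  sc <- mad(y, center = 0) * multiple
  y[y > sc] <- sc
  y[y < -sc] <- -sc
  y + med
}

set.seed(1)         # assumed seed, for reproducibility only
x <- rcauchy(1000)  # heavy-tailed sample with true location 0

x1 <- winsor1(x, fraction = 0.05)
x2 <- winsor2(x, multiple = 3)

# Extreme values are pulled in, so the range shrinks
print(rbind(raw = range(x), winsor1 = range(x1), winsor2 = range(x2)))

# Clamping the tails changes the mean but leaves the median essentially unchanged
print(c(mean_raw = mean(x), mean_w1 = mean(x1), mean_w2 = mean(x2)))
print(c(median_raw = median(x), median_w1 = median(x1), median_w2 = median(x2)))
```

Because both methods only move points in the tails, the transform step mainly matters for the `mean` estimator.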

As a result, what was previously a single `estimate` step is now the pipeline ensemble `(transform * estimate, estimate)`: estimation runs both on the winsorized data and on the raw simulated data.
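
The module-level pipelines implied by the `run` expression can be enumerated by hand. The sketch below is plain R rather than anything DSC executes, and the variable names are chosen here for illustration. The final count additionally assumes that every `seed` and `true_loc` value is crossed with every pipeline.

```r
# Module ensembles from the `define` block of the DSC file
simulate  <- c("rt", "rcauchy")
transform <- c("winsor1", "winsor2")
estimate  <- c("mean", "median")
score     <- "mse"

# Branch 1: simulate * estimate * mse (no transform step)
direct <- expand.grid(simulate = simulate, transform = NA_character_,
                      estimate = estimate, score = score,
                      stringsAsFactors = FALSE)

# Branch 2: simulate * transform * estimate * mse
winsorized <- expand.grid(simulate = simulate, transform = transform,
                          estimate = estimate, score = score,
                          stringsAsFactors = FALSE)

pipelines <- rbind(direct, winsorized)
print(pipelines)

# 2 * (2 + 2 * 2) * 1 = 12 module-level pipelines
n_pipelines <- nrow(pipelines)

# Assumption: 5 seeds and 2 true_loc values are fully crossed with each pipeline
n_param_sets <- length(1:5) * length(c(0, 1))
print(c(pipelines = n_pipelines, scored_results = n_pipelines * n_param_sets))
```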

## Execution

To run the benchmark:

In [1]:
%cd ~/GIT/dsc2/vignettes/one_sample_location_winsor

/home/gaow/GIT/dsc2/vignettes/one_sample_location_winsor

In [2]:
! dsc settings.dsc -c 30

INFO: Checking R library dscrutils@stephenslab/dsc2/dscrutils ...
INFO: DSC script exported to dsc_result.html
INFO: Constructing DSC from settings.dsc ...
INFO: Building execution graph & running DSC ...
DSC: 100%|██████████████████████████████████████| 31/31 [00:15<00:00,  1.51it/s]
INFO: Building DSC database ...
INFO: DSC complete!
INFO: Elapsed time 19.693 seconds.
