# Same example, different style: ashr benchmark

This is another implementation of the [ashr benchmark](Intermediate_R_1.html). In the previous implementation, we had to prepare a separate annotation file to tag simulation scenarios and analysis methods. That reproduces the approach of the [ashr DSCR example](https://github.com/stephens999/dscr/blob/master/vignettes/dsc_shrink.rmd), in which steps are tagged separately, but it has obvious drawbacks under the DSC2 design:
* It requires a separate annotation file
* It requires using overlapping tags (`--tags An && ash_n`) to extract results.

In this tutorial we demonstrate another style of executing the `ashr` benchmark: the configuration is largely the same, but we use named run sequences and automatic annotation in DSC2.

## DSC2 Specification
```
simulate:
    exec: datamaker.R
    .alias: An, Bn, Cn
    seed: R(1:5)
    params:
        min_pi0: 0
        max_pi0: 1
        nsamp: 1000
        betahatsd: 1
        exec[1]:
          g: Asis(ashr::normalmix(c(2/3,1/3),c(0,0),c(1,2))),
        exec[2]:
          g: Asis(ashr::normalmix(rep(1/7,7),
          c(-1.5,-1,-0.5,0,0.5,1,1.5),rep(0.5,7))),
        exec[3]:
          g: Asis(ashr::normalmix(c(1/4,1/4,1/3,1/6),
          c(-2,-1,0,1),c(2,1.5,1,1)))
        .alias: args = List()
    return: data, true_beta = R(data$meta$beta), 
            true_pi0 = R(data$meta$pi0)

shrink:
    exec: runash.R
    .alias: ash_n, ash_hu
    params:
        input: $data
        exec[1]:
          mixcompdist: normal
        exec[2]:
          mixcompdist: halfuniform
    return: ash_data, beta_est = R(ashr::get_pm(ash_data)),
            pi0_est = R(ashr::get_pi0(ash_data))

beta_score:
    exec: score.R
    .alias: score_beta
    params:
        beta_true: $true_beta
        beta_est: $beta_est
        .alias: est = beta_est, truth = beta_true
    return: result

pi0_score(beta_score):
    .alias: score_pi0
    params:
        pi0_est: $pi0_est
        pi0: $true_pi0
        .alias: est = pi0_est, truth = pi0

DSC:
    run:
      An_ash_n: simulate[1] * shrink[1] * (beta_score, pi0_score)
      An_ash_hu: simulate[1] * shrink[2] * (beta_score, pi0_score)
      Bn_ash_n: simulate[2] * shrink[1] * (beta_score, pi0_score)
      Bn_ash_hu: simulate[2] * shrink[2] * (beta_score, pi0_score)
      Cn_ash_n: simulate[3] * shrink[1] * (beta_score, pi0_score)
      Cn_ash_hu: simulate[3] * shrink[2] * (beta_score, pi0_score)
    R_libs: stephens999/ashr (2.0.0+)
    exec_path: bin
    output: dsc_result_auto
```

Compared to the previous example, notice that:
1. The computational routine `datamaker.R` in the `simulate` block is split into 3 routines: `.alias` renames them to distinguish them from each other, and each takes a different `g` value. This essentially makes them 3 different methods.
2. Similarly, there are now two methods in `shrink`.
3. In `DSC::run`, instead of one sequence there are now multiple sub-sequences, each with a name. Combined, they are equivalent to the single-sequence setup `simulate * shrink * (beta_score, pi0_score)`.
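The equivalence in the last point can be checked by enumerating every pairing of a `simulate` alias with a `shrink` alias. The following is a minimal sketch in plain R, not part of DSC itself; the variable names `simulate_alias`, `shrink_alias` and `sequence_names` are chosen here for illustration only.

```r
# Method aliases as declared via `.alias` in the `simulate` and `shrink` blocks.
simulate_alias <- c("An", "Bn", "Cn")
shrink_alias <- c("ash_n", "ash_hu")

# expand.grid() varies its first argument fastest, so `shrink` is listed
# first to reproduce the order in which the sequences appear in DSC::run.
grid <- expand.grid(shrink = shrink_alias, simulate = simulate_alias,
                    stringsAsFactors = FALSE)

# Join each pair into a name of the form <simulate alias>_<shrink alias>.
sequence_names <- paste(grid$simulate, grid$shrink, sep = "_")
print(sequence_names)
```

The six names produced match the six named sub-sequences listed under `DSC::run`, each of which is then followed by both scoring blocks.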

## DSC annotation
The DSC annotation is now very simple:
```
DSC:
  configuration: settings_autotag.dsc
```
It has only one required section, `DSC`, which specifies the configuration file name. When there are no other tags in an annotation file, DSC will attempt to annotate the benchmark automatically based on how it is executed, i.e., the `DSC::run` logic.

## Reproducing [previous](Intermediate_R_1.html) results using simpler commands

In [1]:
! dsc -x settings_autotag.dsc -j 8 --seeds "R(1:50)" -a settings_autotag.ann

INFO: Checking R library stephens999/ashr ...
INFO: DSC script exported to dsc_result_auto.html
INFO: Constructing DSC from settings_autotag.dsc ...
INFO: Building execution graph ...

Here we execute and annotate the benchmark with a single command. From the annotation summary table we see that the tags created correspond to the DSC sequences specified in `DSC::run` of the DSC configuration file. Each sequence is annotated with 50 results in each block, corresponding to our seed setting `--seeds "R(1:50)"`.

## Reproducing results exploration

In [2]:
! dsc -e pi0_score:result --target pi0_score -o ashr_pi0_2.rds \
    --tags "case1 = An_ash_n" "case2 = An_ash_hu" -b dsc_result_auto
! dsc -e beta_score:result shrink:beta_est --target beta_score -o ashr_beta_2.rds \
    --tags "case1 = An_ash_n" "case2 = An_ash_hu" -b dsc_result_auto

Extracting: 100%|██████████| 3/3 [00:00<00:00,  5.62it/s]
INFO: Data extracted to ashr_pi0_2.rds for DSC result pi0_score via annotations:
	case1 = An_ash_n
	case2 = An_ash_hu
INFO: Elapsed time 0.948 seconds.
Extracting: 100%|██████████| 5/5 [00:00<00:00, 12.16it/s]
INFO: Data extracted to ashr_beta_2.rds for DSC result beta_score via annotations:
	case1 = An_ash_n
	case2 = An_ash_hu
INFO: Elapsed time 1.292 seconds.


The plots are then generated as follows:

In [4]:
%use ir
options(warn=-1)
dat = readRDS('ashr_pi0_2.rds')
case1 = unlist(dat$case1_pi0_score_result)
case2 = unlist(dat$case2_pi0_score_result)
suppressMessages(library(plotly))
p = plot_ly(y = case1, name = 'case 1', type = "box") %>%
  add_trace(y = case2, name = 'case 2', type = "box")  %>% 
  layout(title = "MSE for pi_0 estimate")
htmlwidgets::saveWidget(as.widget(p), "pi0_score_2.html")
# Load the beta scores and the posterior mean estimates for both cases
dat = readRDS('ashr_beta_2.rds')
case1 = unlist(dat$case1_beta_score_result)
case2 = unlist(dat$case2_beta_score_result)
case1_beta = rowMeans(data.frame(dat$case1_shrink_beta_est))
case2_beta = rowMeans(data.frame(dat$case2_shrink_beta_est))
# Box plot of the beta scores
p = plot_ly(y = case1, name = 'case 1', type = "box") %>%
  add_trace(y = case2, name = 'case 2', type = "box")  %>% 
  layout(title = "MSE for beta estimate")
htmlwidgets::saveWidget(as.widget(p), "beta_score_2.html")
# Histogram of the posterior means (row means across the extracted columns)
p = plot_ly(x = case1_beta, name = 'case 1', opacity = 0.9, type = "histogram") %>%
  add_trace(x = case2_beta, name = 'case 2', opacity = 0.9, type = "histogram") %>%
  layout(title = "Posterior mean distribution")
htmlwidgets::saveWidget(as.widget(p), "beta_2.html")
# Elapsed time stored under DSC_TIMER for the posterior mean estimates
case1 = unlist(dat$DSC_TIMER$case1_shrink_beta_est)
case2 = unlist(dat$DSC_TIMER$case2_shrink_beta_est)
#
p = plot_ly(y = case1, name = 'case 1', type = "box") %>%
  add_trace(y = case2, name = 'case 2', type = "box")  %>% 
  layout(title = "Time elapsed for posterior mean estimation")
htmlwidgets::saveWidget(as.widget(p), "beta_time_2.html")

The results are:
* [MSE for $\pi_0$ estimate](pi0_score_2.html)
* [MSE for $\beta$ estimate](beta_score_2.html)
* [Posterior mean distribution](beta_2.html)
* [Time elapsed for posterior mean estimation](beta_time_2.html)
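
As a numerical complement to the box plots, the scores of each case can also be summarized in a small table. The sketch below assumes the extracted object is a list whose elements follow the naming used above (for example `case1_pi0_score_result`). The helper `summarize_scores` is hypothetical, and the `mock` list holds made-up placeholder values that stand in for `readRDS('ashr_pi0_2.rds')` so that the snippet runs on its own.

```r
# Hypothetical helper: summarize the per-replicate scores of each case.
# `dat` is a list as returned by readRDS(); `suffix` is the part of the
# element name that follows "<case>_".
summarize_scores <- function(dat, cases, suffix) {
  rows <- lapply(cases, function(case) {
    scores <- unlist(dat[[paste0(case, "_", suffix)]])
    data.frame(case = case, n = length(scores),
               mean = mean(scores), median = median(scores))
  })
  do.call(rbind, rows)
}

# Placeholder data for illustration only; these are NOT benchmark results.
mock <- list(case1_pi0_score_result = list(0.10, 0.20, 0.30),
             case2_pi0_score_result = list(0.20, 0.40, 0.90))

score_summary <- summarize_scores(mock, c("case1", "case2"), "pi0_score_result")
print(score_summary)
```

With the real data, `mock` would be replaced by the object read from `ashr_pi0_2.rds` or `ashr_beta_2.rds`, with `suffix` set to `pi0_score_result` or `beta_score_result` respectively.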