# Power comparison: susie vs DAP

Here we compare the power of susie and DAP across different numbers of simulated signals, at a fixed PVE.

We vary `n_causal` from 1 to 5, simulating 100 data sets from 50 genes for each setting. For each simulated data set, we compute the susie 95% CS and the DAP 95% clusters, and count how many of the reported CS or clusters capture **at least one signal**.

Power is defined as the proportion of signals captured by susie CS or DAP clusters; the false discovery proportion (fdp) is defined as the proportion of susie CS or DAP clusters that do not contain any signal.
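The two definitions above can be sketched in a few lines of R. This is a minimal illustration and not the code used by the `power` workflow: the helper name `count_discoveries`, its arguments, and the toy inputs are all hypothetical, and pooling the counts across data sets before dividing is an assumption about how the summary tables are aggregated.

```r
# Hypothetical helper (not part of the `power` workflow).
# sets:  list of integer vectors, one per reported CS or cluster
# truth: integer vector of causal variable indices for this data set
count_discoveries <- function(sets, truth) {
  # a set is a true discovery if it contains at least one signal
  has_signal <- vapply(sets, function(s) any(s %in% truth), logical(1))
  # a signal is captured if it appears in at least one reported set
  captured <- truth %in% unlist(sets)
  c(n_signal = length(truth),
    n_discoveries = length(sets),
    n_captured = sum(captured),
    n_false = sum(!has_signal))
}

# Two toy data sets (made-up indices, for illustration only)
counts <- rbind(
  count_discoveries(list(c(3, 4, 5)), truth = 4),
  count_discoveries(list(c(10, 11), c(40, 41)), truth = c(10, 99))
)

# Assumption: counts are pooled across data sets before taking ratios
total <- colSums(counts)
power <- unname(total["n_captured"] / total["n_signal"])
fdp <- unname(total["n_false"] / total["n_discoveries"])
```

Because power counts signals while fdp counts sets, the two need not add up in any simple way: one set can capture several signals, and one signal can appear in several sets.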

The `power` workflow in [this notebook](20180527_PIP_Workflow.html) does the computation.

In [1]:
%cd ~/GIT/github/mvarbvs/dsc

/home/gaow/GIT/github/mvarbvs/dsc

## susie `var(Y)` vs DAP

In [7]:
readRDS('susie_comparison/Power_comparison_0528_cluster_prob_estvar_false.rds')

| n_signal | expected_discoveries | susie_discoveries | dap_discoveries | susie_power | dap_power | susie_fdp | dap_fdp |
|---|---|---|---|---|---|---|---|
| 1 | 100 | 100 | 106 | 0.99 | 0.95 | 0.01 | 0.10377358 |
| 2 | 200 | 145 | 147 | 0.7 | 0.7 | 0.04137931 | 0.04761905 |
| 3 | 300 | 157 | 160 | 0.5166667 | 0.5066667 | 0.02547771 | 0.0625 |
| 4 | 400 | 154 | 148 | 0.36 | 0.325 | 0.08441558 | 0.12837838 |
| 5 | 500 | 151 | 144 | 0.3 | 0.262 | 0.0397351 | 0.125 |


## susie `est_var` vs DAP

In [8]:
readRDS('susie_comparison/Power_comparison_0528_cluster_prob_estvar_true.rds')

| n_signal | expected_discoveries | susie_discoveries | dap_discoveries | susie_power | dap_power | susie_fdp | dap_fdp |
|---|---|---|---|---|---|---|---|
| 1 | 100 | 100 | 106 | 0.99 | 0.95 | 0.01 | 0.10377358 |
| 2 | 200 | 155 | 147 | 0.725 | 0.7 | 0.06451613 | 0.04761905 |
| 3 | 300 | 170 | 160 | 0.5466667 | 0.5066667 | 0.03529412 | 0.0625 |
| 4 | 400 | 172 | 148 | 0.3925 | 0.325 | 0.10465116 | 0.12837838 |
| 5 | 500 | 169 | 144 | 0.316 | 0.262 | 0.08284024 | 0.125 |
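To read the two susie settings side by side, the sketch below puts the `susie_power` and `susie_fdp` columns of both tables into one data frame and takes differences. The numbers are copied by hand from the tables above; the variable names are illustrative only.

```r
# susie columns copied from the two tables above
n_signal <- 1:5
power_vary <- c(0.99, 0.7, 0.5166667, 0.36, 0.3)        # susie `var(Y)`
power_estvar <- c(0.99, 0.725, 0.5466667, 0.3925, 0.316) # susie `est_var`
fdp_vary <- c(0.01, 0.04137931, 0.02547771, 0.08441558, 0.0397351)
fdp_estvar <- c(0.01, 0.06451613, 0.03529412, 0.10465116, 0.08284024)

# positive differences mean `est_var` is higher than `var(Y)`
cmp <- data.frame(
  n_signal = n_signal,
  power_diff = power_estvar - power_vary,
  fdp_diff = fdp_estvar - fdp_vary
)
print(cmp)
```

In these numbers, neither the power nor the fdp of the `est_var` setting is ever lower than that of the `var(Y)` setting, so the gain in power comes together with an increase in fdp for `n_signal` of 2 and above.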


## Signals that susie captures but DAP misses, for the n = 1 case

This is the result of running the `power` workflow in [this notebook](20180527_PIP_Workflow.html). A typical case is one where the set is large yet highly correlated, e.g.:

```
susie:
  [1] 335 531 532 533 536 538 539 540 541 542 543 544 546 547 548 549 550 551
 [19] 552 553 554 556 558 559 560 562 563 564 565 566 567 568 569 570 571 574
 [37] 575 576 577 578 579 581 585 586 587 588 591 592 593 594 597 598 601 602
 [55] 603 608 609 610 611 612 613 614 616 617 618 620 621 623 626 628 629 630
 [73] 633 634 635 639 641 642 645 646 647 648 650 651 652 653 654 655 656 657
 [91] 659 660 661 662 663 673 675 676 678 679 681 685 686 687 688 689 690 691
[109] 692 696 698 699 700 701 702 703 704 705 706 707 708 709 710 714 715 719
[127] 720 721 722 723 726 730 732 733 735 736 738 744 745 746

DAP:
list()
  cluster cluster_prob cluster_avg_r2
0       1     0.398200          0.992
1       2     0.006435          0.749
                                                                                                                                                                      snp
0 700,648,646,543,673,581,721,738,560,586,714,701,662,735,618,696,736,621,702,623,587,733,679,732,726,710,708,678,692,691,690,689,688,652,635,639,641,633,629,617,616,620
1                                                                                                                                                             152,361,515
[1] "~/Documents/GTExV8/Toys/Thyroid.ENSG00000083937.RDS"
```
So susie's CS is quite large (140 variables). The first cluster DAP comes up with has a high average r2 (0.992) and overlaps with the susie CS, but the `cluster_prob` it computes is low (0.398), and DAP ends up reporting no 95% cluster (the empty `list()` above). Yet that cluster does contain the causal variable.
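The overlap can be checked directly from the indices printed above. The sketch below copies the susie CS and the first DAP cluster by hand from that output; the variable names are illustrative.

```r
# susie CS, copied from the output above
susie_cs <- c(
  335, 531, 532, 533, 536, 538, 539, 540, 541, 542, 543, 544, 546, 547, 548, 549, 550, 551,
  552, 553, 554, 556, 558, 559, 560, 562, 563, 564, 565, 566, 567, 568, 569, 570, 571, 574,
  575, 576, 577, 578, 579, 581, 585, 586, 587, 588, 591, 592, 593, 594, 597, 598, 601, 602,
  603, 608, 609, 610, 611, 612, 613, 614, 616, 617, 618, 620, 621, 623, 626, 628, 629, 630,
  633, 634, 635, 639, 641, 642, 645, 646, 647, 648, 650, 651, 652, 653, 654, 655, 656, 657,
  659, 660, 661, 662, 663, 673, 675, 676, 678, 679, 681, 685, 686, 687, 688, 689, 690, 691,
  692, 696, 698, 699, 700, 701, 702, 703, 704, 705, 706, 707, 708, 709, 710, 714, 715, 719,
  720, 721, 722, 723, 726, 730, 732, 733, 735, 736, 738, 744, 745, 746
)

# first DAP cluster (cluster_prob 0.3982), copied from the `snp` column above
dap_cluster1 <- c(
  700, 648, 646, 543, 673, 581, 721, 738, 560, 586, 714, 701, 662, 735, 618, 696, 736, 621,
  702, 623, 587, 733, 679, 732, 726, 710, 708, 678, 692, 691, 690, 689, 688, 652, 635, 639,
  641, 633, 629, 617, 616, 620
)

shared <- intersect(dap_cluster1, susie_cs)
cat(length(susie_cs), "variables in susie CS;",
    length(dap_cluster1), "in DAP cluster 1;",
    length(shared), "shared\n")
# DAP cluster members that fall outside the susie CS, if any
print(setdiff(dap_cluster1, susie_cs))
```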

## DAP false discovery, for n = 1 case

What I mostly see are physically close variables falling into different DAP clusters, e.g.:

```
susie (and the truth)
[1] 765

DAP
[[1]]
[1] 765

[[2]]
[1] 786

  cluster cluster_prob cluster_avg_r2
0       1    0.9998000          1.000
1       2    0.9995000          1.000
...
[1] "~/Documents/GTExV8/Toys/Thyroid.ENSG00000031823.RDS"
```

and a more extreme case:

```
susie (and the truth)
[1] 669

DAP
[[1]]
[1] 669

[[2]]
 [1] 452 724 717 480 666 665 649 638 632 622 499 524 747 742 734 712 711 693 670
[20] 671 643 584 514 495 504 520 516 517 511 507 501 498 485 486 490 530 453

  cluster cluster_prob cluster_avg_r2
0       1    1.0000000          1.000
1       2    0.9973000          0.984
...
[1] "~/Documents/GTExV8/Toys/Thyroid.ENSG00000083937.RDS"
```

Here the 2nd cluster has some variables (e.g. 665 and 666) very near the first cluster's only variable (669), yet they ended up in a different cluster.
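One rough way to flag such cases is to measure how close two clusters are in terms of variable index. The sketch below uses the two clusters from the more extreme example, copied by hand from the output above. It assumes that the variable index is a reasonable proxy for physical position, and the window of 5 is an arbitrary choice for illustration.

```r
# clusters copied from the more extreme example above
cluster1 <- 669
cluster2 <- c(452, 724, 717, 480, 666, 665, 649, 638, 632, 622, 499, 524, 747, 742,
              734, 712, 711, 693, 670, 671, 643, 584, 514, 495, 504, 520, 516, 517,
              511, 507, 501, 498, 485, 486, 490, 530, 453)

# all pairwise index distances between the two clusters
# (rows: cluster1 members, columns: cluster2 members)
d <- abs(outer(cluster1, cluster2, "-"))

# smallest index gap between the clusters
min_gap <- min(d)

# cluster 2 members within an (arbitrary) window of 5 indices of cluster 1
window <- 5
near <- sort(cluster2[apply(d, 2, min) <= window])

print(min_gap)
print(near)
```

A small `min_gap` between two separately reported clusters does not by itself make one of them a false discovery, but it is a cheap filter for picking out cases worth inspecting by hand.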