# How does power vary with the number of non-zero effects?

Because the simulated effects are correlated to some degree, we report `mean_corX` to quantify that correlation. We assume the prior is 0.2. For each fixed number of non-zero effects $T \in \{2,3,5,10,20\}$, we investigate how SuSiE power changes with PVE.

## Results

**Summary: Correlations in the GSEA data are similar to those in the single-cell data. SuSiE power decreases and FDR increases as the number of non-zero effects grows, regardless of the correlations among columns of X.**


* As the number of effects increases, it becomes harder to achieve high power even when PVE is large: at PVE = 0.9, power drops from 0.89 with 2 effects to 0.582 with 20.

* `mean_corX` stays small (roughly 0.016 to 0.030) in every setting and does not appear to drive the change in power. The decline in power instead tracks the increase in the number of non-zero effects.

* FDR also tends to grow with the number of non-zero effects: at PVE = 0.9 it rises from 0 with 2 effects to 0.1168 with 20.
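The observations above can be checked against the summary tables in this Results section. The sketch below copies the PVE = 0.9 row of each table into a small data frame (the name `pve09` and the three flag variables are illustrative, not objects from the original analysis) and tests whether power, FDR and `mean_corX` change monotonically with the number of effects.

```r
# PVE = 0.9 rows copied from the summary tables in the Results section.
# `pve09` is an illustrative name, not an object from the original analysis.
pve09 <- data.frame(
  effect_num = c(2, 3, 5, 10, 20),
  mean_corX  = c(0.01603345, 0.0265014, 0.0230124, 0.02950683, 0.02397695),
  power      = c(0.89, 0.8467, 0.844, 0.686, 0.582),
  fdr        = c(0, 0.0155, 0.0365, 0.0926, 0.1168)
)

# Does power decrease at every step as effect_num grows?
power_decreasing <- all(diff(pve09$power) < 0)

# Does FDR increase at every step as effect_num grows?
fdr_increasing <- all(diff(pve09$fdr) > 0)

# Is mean_corX monotone in effect_num (in either direction)?
corX_monotone <- all(diff(pve09$mean_corX) > 0) || all(diff(pve09$mean_corX) < 0)

print(c(power_decreasing = power_decreasing,
        fdr_increasing   = fdr_increasing,
        corX_monotone    = corX_monotone))
```

This check only covers the largest PVE. At small PVE very few credible sets are reported, so the FDR estimates are noisy and the same ordering need not hold row by row.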

In [8]:
dscout.summary[dscout.summary$effect_num==2,]

| row | effect_num | pve | mean_corX | power | fdr | cs_num |
|---|---|---|---|---|---|---|
| 2 | 2 | 0.01 | 0.01603345 | 0.03 | 0.0 | 1.0 |
| 8 | 2 | 0.02 | 0.01603345 | 0.17 | 0.0 | 1.0 |
| 14 | 2 | 0.03 | 0.01603345 | 0.33 | 0.0 | 1.137931 |
| 20 | 2 | 0.05 | 0.01603345 | 0.5 | 0.0385 | 1.181818 |
| 26 | 2 | 0.1 | 0.01603345 | 0.62 | 0.0312 | 1.333333 |
| 32 | 2 | 0.2 | 0.01603345 | 0.69 | 0.0 | 1.408163 |
| 38 | 2 | 0.4 | 0.01603345 | 0.77 | 0.0 | 1.571429 |
| 44 | 2 | 0.5 | 0.01603345 | 0.78 | 0.0 | 1.591837 |
| 50 | 2 | 0.7 | 0.01603345 | 0.8 | 0.0 | 1.632653 |
| 56 | 2 | 0.9 | 0.01603345 | 0.89 | 0.0 | 1.78 |


In [9]:
dscout.summary[dscout.summary$effect_num==3,]

| row | effect_num | pve | mean_corX | power | fdr | cs_num |
|---|---|---|---|---|---|---|
| 3 | 3 | 0.01 | 0.0265014 | 0.0133 | 0.5 | 1.0 |
| 9 | 3 | 0.02 | 0.0265014 | 0.0733 | 0.1538 | 1.0 |
| 15 | 3 | 0.03 | 0.0265014 | 0.18 | 0.129 | 1.033333 |
| 21 | 3 | 0.05 | 0.0265014 | 0.3333 | 0.0741 | 1.173913 |
| 27 | 3 | 0.1 | 0.0265014 | 0.4867 | 0.0267 | 1.5 |
| 33 | 3 | 0.2 | 0.0265014 | 0.5867 | 0.033 | 1.82 |
| 39 | 3 | 0.4 | 0.0265014 | 0.72 | 0.0182 | 2.2 |
| 45 | 3 | 0.5 | 0.0265014 | 0.76 | 0.0 | 2.28 |
| 51 | 3 | 0.7 | 0.0265014 | 0.82 | 0.0 | 2.46 |
| 57 | 3 | 0.9 | 0.0265014 | 0.8467 | 0.0155 | 2.58 |


In [10]:
dscout.summary[dscout.summary$effect_num==5,]

| row | effect_num | pve | mean_corX | power | fdr | cs_num |
|---|---|---|---|---|---|---|
| 4 | 5 | 0.01 | 0.0230124 | 0.008 | 0.0 | 1.0 |
| 10 | 5 | 0.02 | 0.0230124 | 0.064 | 0.1111 | 1.0 |
| 16 | 5 | 0.03 | 0.0230124 | 0.084 | 0.087 | 1.045455 |
| 22 | 5 | 0.05 | 0.0230124 | 0.188 | 0.0408 | 1.225 |
| 28 | 5 | 0.1 | 0.0230124 | 0.356 | 0.0632 | 1.9 |
| 34 | 5 | 0.2 | 0.0230124 | 0.532 | 0.0432 | 2.78 |
| 40 | 5 | 0.4 | 0.0230124 | 0.596 | 0.0745 | 3.22 |
| 46 | 5 | 0.5 | 0.0230124 | 0.656 | 0.0838 | 3.58 |
| 52 | 5 | 0.7 | 0.0230124 | 0.736 | 0.0366 | 3.82 |
| 58 | 5 | 0.9 | 0.0230124 | 0.844 | 0.0365 | 4.38 |


In [11]:
dscout.summary[dscout.summary$effect_num==10,]

| row | effect_num | pve | mean_corX | power | fdr | cs_num |
|---|---|---|---|---|---|---|
| 5 | 10 | 0.01 | 0.02950683 | 0.006 | 0.25 | 1.0 |
| 11 | 10 | 0.02 | 0.02950683 | 0.02 | 0.0909 | 1.0 |
| 17 | 10 | 0.03 | 0.02950683 | 0.042 | 0.125 | 1.0 |
| 23 | 10 | 0.05 | 0.02950683 | 0.08 | 0.1111 | 1.153846 |
| 29 | 10 | 0.1 | 0.02950683 | 0.158 | 0.0814 | 1.755102 |
| 35 | 10 | 0.2 | 0.02950683 | 0.254 | 0.0593 | 2.7 |
| 41 | 10 | 0.4 | 0.02950683 | 0.386 | 0.0721 | 4.16 |
| 47 | 10 | 0.5 | 0.02950683 | 0.434 | 0.0921 | 4.78 |
| 53 | 10 | 0.7 | 0.02950683 | 0.552 | 0.0738 | 5.96 |
| 59 | 10 | 0.9 | 0.02950683 | 0.686 | 0.0926 | 7.56 |


In [12]:
dscout.summary[dscout.summary$effect_num==20,]

| row | effect_num | pve | mean_corX | power | fdr | cs_num |
|---|---|---|---|---|---|---|
| 6 | 20 | 0.01 | 0.02397695 | 0.001 | 0.0 | 1.0 |
| 12 | 20 | 0.02 | 0.02397695 | 0.003 | 0.0 | 1.0 |
| 18 | 20 | 0.03 | 0.02397695 | 0.009 | 0.1 | 1.0 |
| 24 | 20 | 0.05 | 0.02397695 | 0.026 | 0.037 | 1.125 |
| 30 | 20 | 0.1 | 0.02397695 | 0.076 | 0.038 | 1.612245 |
| 36 | 20 | 0.2 | 0.02397695 | 0.158 | 0.0482 | 3.32 |
| 42 | 20 | 0.4 | 0.02397695 | 0.263 | 0.0772 | 5.7 |
| 48 | 20 | 0.5 | 0.02397695 | 0.303 | 0.0901 | 6.66 |
| 54 | 20 | 0.7 | 0.02397695 | 0.427 | 0.1011 | 9.5 |
| 60 | 20 | 0.9 | 0.02397695 | 0.582 | 0.1168 | 13.18 |


## Code details

In [1]:
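# Load the query results and keep only runs where both the simulation
# step and the SuSiE step have an output file.
dscout_Q1 = readRDS('gsea_Q1.rds')
dscout_Q1 = dscout_Q1[!is.na(dscout_Q1$sim_gaussian.output.file),]
dscout_Q1 = dscout_Q1[!is.na(dscout_Q1$susie.output.file),]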

In [2]:
# Keep only the columns needed for the summary and give them short names.
dscout_df = data.frame(effect_num = dscout_Q1$sim_gaussian.effect_num,
                       pve        = dscout_Q1$sim_gaussian.pve,
                       mean_corX  = dscout_Q1$sim_gaussian.mean_corX,
                       hit        = dscout_Q1$score.hit,
                       cs_num     = dscout_Q1$score.signal_num)

In [3]:
# Average mean_corX within each (effect_num, pve) setting; this table seeds the summary.
corX.summary = aggregate(mean_corX ~ effect_num + pve, dscout_df, mean)
dscout.summary = corX.summary

In [4]:
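# Mean of the non-zero entries only (used later for cs_num).
meannonzero = function(x){mean(x[x!=0])}
# Power: average `hit` per replicate divided by the number of non-zero effects.
hitmean.summary = aggregate(hit ~ effect_num + pve, dscout_df, mean)
dscout.summary$power = round(hitmean.summary$hit / dscout.summary$effect_num, 4)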

In [5]:
# FDR: one minus total `hit` over total `cs_num`, both pooled across replicates.
hitsum.summary = aggregate(hit ~ effect_num + pve, dscout_df, sum)
cs_numsum.summary = aggregate(cs_num ~ effect_num + pve, dscout_df, sum)
dscout.summary$fdr = round(1 - hitsum.summary$hit / cs_numsum.summary$cs_num, 4)
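
Written as formulas, the two cells above compute the following for each (`effect_num`, `pve`) setting. The symbols are our own notation rather than the original analysis's: $r = 1, \dots, R$ indexes the replicates in a setting, $T$ is `effect_num`, $h_r$ is `hit` and $c_r$ is `cs_num` in replicate $r$.

```latex
\mathrm{power} = \frac{1}{T} \cdot \frac{1}{R} \sum_{r=1}^{R} h_r,
\qquad
\mathrm{FDR} = 1 - \frac{\sum_{r=1}^{R} h_r}{\sum_{r=1}^{R} c_r}
```

Power averages over replicates before dividing by $T$, whereas FDR pools the counts first, so replicates that report more credible sets carry more weight in the FDR. When $\sum_r c_r = 0$ the ratio is $0/0$, which R evaluates to `NaN`; the last cell of this section replaces such entries with 0.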

In [6]:
# Mean cs_num per setting, averaged only over replicates where cs_num is non-zero.
cs_num.summary = aggregate(cs_num ~ effect_num + pve, dscout_df, meannonzero)
dscout.summary$cs_num = cs_num.summary$cs_num

In [7]:
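# is.nan() has no data-frame method by default; define one so that NaN
# entries (for example, 0/0 in the FDR column) can be replaced with 0.
is.nan.data.frame <- function(x) {
  do.call(cbind, lapply(x, is.nan))
}
dscout.summary[is.nan(dscout.summary)] = 0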
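
To see the summary steps end to end on something small, the sketch below repeats them on a toy data frame. The object names (`toy`, `toy_summary`, `hit_sum`, `cs_sum`) and all values are invented purely for illustration and are not results from the analysis. The second setting reports no credible sets at all, which shows where the `NaN` values come from.

```r
# Toy input: two settings with two replicates each.
# All values are invented for illustration only.
toy <- data.frame(
  effect_num = c(2, 2, 5, 5),
  pve        = c(0.1, 0.1, 0.1, 0.1),
  hit        = c(1, 2, 0, 0),
  cs_num     = c(2, 2, 0, 0)
)

# Power: mean hit per setting divided by the number of non-zero effects.
toy_summary <- aggregate(hit ~ effect_num + pve, toy, mean)
toy_summary$power <- toy_summary$hit / toy_summary$effect_num

# FDR: pool hit and cs_num across replicates before taking the ratio.
hit_sum <- aggregate(hit ~ effect_num + pve, toy, sum)
cs_sum  <- aggregate(cs_num ~ effect_num + pve, toy, sum)
toy_summary$fdr <- 1 - hit_sum$hit / cs_sum$cs_num

# The second setting has no credible sets, so 0/0 gives NaN; set it to 0.
toy_summary$fdr[is.nan(toy_summary$fdr)] <- 0

print(toy_summary[, c("effect_num", "pve", "power", "fdr")])
```

Here the NaN replacement is applied to a single column with the built-in vector method of `is.nan`, which avoids defining a data-frame method. Note that setting an undefined FDR to 0 makes "no discoveries" indistinguishable from "no false discoveries" in the summary table.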