# How does power change with the number of effects?

Because the simulated effects are correlated to some degree, we report `mean_corX` to quantify that correlation. For each fixed number of nonzero effects $T \in \{2,3,5,10,20\}$, we investigate how SuSiE power changes with pve.

## Results

We observe that in the single cell data: 

* As the number of effects increases, it becomes more difficult to achieve high power, even when pve is large. However, power changes less with the number of nonzero effects than it does in the GTEx data.

* `mean_corX` does not seem to drive the change in power. The decline in power tracks the increase in the number of nonzero effects.

* `mean_corX` is much smaller in the single cell data than in the GTEx data. Is this related to the smaller change in power when varying the number of nonzero effects?

* Furthermore, fdr is not necessarily larger when there are more nonzero effects.
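
One way to check the first observation is to lay the power values out as a pve-by-effect-number grid. The sketch below copies a few values from the tables in this section (pve of 0.1 and 0.9 only) into a small data frame; `power_subset`, `power_grid` and `power_decreases` are names introduced here for illustration and are not part of the analysis code.

```r
# Power values copied from the summary tables in this section (pve = 0.1 and 0.9 only).
power_subset <- data.frame(
  effect_num = rep(c(2, 3, 5, 10, 20), times = 2),
  pve        = rep(c(0.1, 0.9), each = 5),
  power      = c(0.58, 0.46, 0.308, 0.144, 0.072,
                 0.86, 0.8467, 0.816, 0.636, 0.53)
)

# Rows are pve levels, columns are numbers of nonzero effects.
power_grid <- xtabs(power ~ pve + effect_num, data = power_subset)
print(power_grid)

# Within each pve level, check whether power never increases as effect_num grows.
power_decreases <- apply(power_grid, 1, function(p) all(diff(p) <= 0))
print(power_decreases)
```

The same grid could be built from the full `singlecell.summary` data frame once it has been computed, which would cover every pve level rather than the two shown here.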

In [48]:
singlecell.summary[singlecell.summary$effect_num==2,]

row,effect_num,pve,mean_corX,power,fdr,cs_num
2,2,0.01,0.02261397,0.05,0.1667,1.0
8,2,0.02,0.02261397,0.18,0.0526,1.0
14,2,0.03,0.02261397,0.34,0.0286,1.029412
20,2,0.05,0.02261397,0.53,0.0364,1.170213
26,2,0.1,0.02261397,0.58,0.0333,1.22449
32,2,0.2,0.02261397,0.66,0.0149,1.367347
38,2,0.4,0.02261397,0.73,0.0135,1.510204
44,2,0.5,0.02261397,0.74,0.0133,1.530612
50,2,0.7,0.02261397,0.78,0.0127,1.612245
56,2,0.9,0.02261397,0.86,0.0115,1.77551


In [49]:
singlecell.summary[singlecell.summary$effect_num==3,]

row,effect_num,pve,mean_corX,power,fdr,cs_num
3,3,0.01,0.01847963,0.02,0.25,1.0
9,3,0.02,0.01847963,0.1467,0.0435,1.045455
15,3,0.03,0.01847963,0.22,0.0294,1.030303
21,3,0.05,0.01847963,0.3533,0.0,1.104167
27,3,0.1,0.01847963,0.46,0.0143,1.4
33,3,0.2,0.01847963,0.5467,0.012,1.66
39,3,0.4,0.01847963,0.64,0.0,1.92
45,3,0.5,0.01847963,0.6533,0.0,1.96
51,3,0.7,0.01847963,0.7133,0.0,2.14
57,3,0.9,0.01847963,0.8467,0.0,2.54


In [50]:
singlecell.summary[singlecell.summary$effect_num==5,]

row,effect_num,pve,mean_corX,power,fdr,cs_num
4,5,0.01,0.01622105,0.008,0.0,1.0
10,5,0.02,0.01622105,0.056,0.0,1.0
16,5,0.03,0.01622105,0.124,0.0312,1.066667
22,5,0.05,0.01622105,0.216,0.0,1.2
28,5,0.1,0.01622105,0.308,0.0,1.54
34,5,0.2,0.01622105,0.4,0.0,2.0
40,5,0.4,0.01622105,0.528,0.0,2.64
46,5,0.5,0.01622105,0.564,0.0,2.82
52,5,0.7,0.01622105,0.648,0.0,3.24
58,5,0.9,0.01622105,0.816,0.0,4.08


In [51]:
singlecell.summary[singlecell.summary$effect_num==10,]

row,effect_num,pve,mean_corX,power,fdr,cs_num
5,10,0.01,0.01819252,0.006,0,1.0
11,10,0.02,0.01819252,0.02,0,1.0
17,10,0.03,0.01819252,0.058,0,1.035714
23,10,0.05,0.01819252,0.092,0,1.045455
29,10,0.1,0.01819252,0.144,0,1.44
35,10,0.2,0.01819252,0.226,0,2.26
41,10,0.4,0.01819252,0.312,0,3.12
47,10,0.5,0.01819252,0.356,0,3.56
53,10,0.7,0.01819252,0.468,0,4.68
59,10,0.9,0.01819252,0.636,0,6.36


In [52]:
singlecell.summary[singlecell.summary$effect_num==20,]

row,effect_num,pve,mean_corX,power,fdr,cs_num
6,20,0.01,0.01834542,0.002,0.3333,1.5
12,20,0.02,0.01834542,0.007,0.125,1.142857
18,20,0.03,0.01834542,0.018,0.0526,1.117647
24,20,0.05,0.01834542,0.039,0.025,1.111111
30,20,0.1,0.01834542,0.072,0.0137,1.520833
36,20,0.2,0.01834542,0.135,0.0074,2.72
42,20,0.4,0.01834542,0.219,0.0,4.38
48,20,0.5,0.01834542,0.246,0.004,4.94
54,20,0.7,0.01834542,0.353,0.0028,7.08
60,20,0.9,0.01834542,0.53,0.0019,10.62


## Code details

In [41]:
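# Load the results and keep only rows that have both a simulation output file and a SuSiE output file.
singlecell_Q1 = readRDS('singlecell_Q1.rds')
singlecell_Q1 = singlecell_Q1[!is.na(singlecell_Q1$sim_gaussian.output.file),]
singlecell_Q1 = singlecell_Q1[!is.na(singlecell_Q1$susie.output.file),]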

In [42]:
# Collect the columns needed for the summary under shorter names.
singlecell_df = data.frame(effect_num = singlecell_Q1$sim_gaussian.effect_num,
                           pve = singlecell_Q1$sim_gaussian.pve,
                           mean_corX = singlecell_Q1$sim_gaussian.mean_corX,
                           hit = singlecell_Q1$score.hit,
                           cs_num = singlecell_Q1$score.signal_num)

In [43]:
corX.summary = aggregate(mean_corX ~ effect_num + pve, singlecell_df, mean)
singlecell.summary = corX.summary
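
The cells that follow attach columns from separate `aggregate` calls to `singlecell.summary` by position, which relies on every call returning its groups in the same order. A toy check of that assumption is sketched below, using a hypothetical data frame `toy` with made-up values rather than the real results.

```r
# Hypothetical replicate-level data: 2 effect numbers x 2 pve levels x 3 replicates.
toy <- data.frame(
  effect_num = rep(c(2, 3), each = 6),
  pve        = rep(rep(c(0.1, 0.2), each = 3), times = 2),
  mean_corX  = (1:12) / 100,
  hit        = c(1, 0, 2, 2, 1, 2, 0, 1, 1, 3, 2, 1)
)

corX_by_group <- aggregate(mean_corX ~ effect_num + pve, toy, mean)
hit_by_group  <- aggregate(hit ~ effect_num + pve, toy, mean)

# Both results list the groups with effect_num varying fastest within each pve level.
print(corX_by_group)

# Position-based column assignment is safe only when the grouping keys match row for row.
keys_match <- identical(corX_by_group[, c("effect_num", "pve")],
                        hit_by_group[, c("effect_num", "pve")])
print(keys_match)
```

If missing values could remove a group from one call but not from another, merging the results on `effect_num` and `pve` would be a safer alternative to assigning by position.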

In [44]:
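# Mean of the nonzero entries of x (NaN when every entry is zero); used for cs_num in a later cell.
meannonzero = function(x){mean(x[x!=0])}
# Power: mean number of hits per replicate divided by the number of nonzero effects.
hitmean.summary = aggregate(hit ~ effect_num + pve, singlecell_df, mean)
singlecell.summary$power = round(hitmean.summary$hit / singlecell.summary$effect_num, 4)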

In [45]:
# FDR: one minus total hits over the total number of reported credible sets in each group.
hitsum.summary = aggregate(hit ~ effect_num + pve, singlecell_df, sum)
cs_numsum.summary = aggregate(cs_num ~ effect_num + pve, singlecell_df, sum)
singlecell.summary$fdr = round(1 - hitsum.summary$hit / cs_numsum.summary$cs_num, 4)
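
To make the two ratios concrete, here is a small worked example with made-up replicate counts. It assumes, as the cells above appear to, that `hit` counts the credible sets that capture a true effect and `cs_num` counts all credible sets reported in a replicate. The numbers are hypothetical.

```r
# Hypothetical results for one (effect_num, pve) setting: 4 replicates, 3 true effects each.
effect_num <- 3
hit    <- c(2, 1, 3, 2)   # credible sets that capture a true effect, per replicate
cs_num <- c(2, 2, 3, 3)   # credible sets reported, per replicate

# Power: average fraction of the true effects that were captured.
power <- round(mean(hit) / effect_num, 4)

# FDR: fraction of all reported credible sets that captured no true effect.
fdr <- round(1 - sum(hit) / sum(cs_num), 4)

print(c(power = power, fdr = fdr))
```

Pooling the sums across replicates before taking the ratio, as done for fdr, weights each reported credible set equally rather than each replicate.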

In [46]:
# Average number of credible sets per replicate, over replicates that reported at least one.
cs_num.summary = aggregate(cs_num ~ effect_num + pve, singlecell_df, meannonzero)
singlecell.summary$cs_num = cs_num.summary$cs_num

In [47]:
# Data frame method for is.nan, so that NaN entries (e.g. from 0/0) can be replaced by 0.
is.nan.data.frame <- function(x)
  do.call(cbind, lapply(x, is.nan))
singlecell.summary[is.nan(singlecell.summary)] = 0
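
The NaN values cleaned up here can arise in groups where no credible set was reported: `meannonzero` then averages an empty vector, and the fdr ratio becomes 0/0. A minimal illustration on a hypothetical two-row summary follows; `toy_summary` and the per-group vectors are made up for this sketch.

```r
# Same helpers as in the cells above.
meannonzero <- function(x) mean(x[x != 0])
is.nan.data.frame <- function(x) do.call(cbind, lapply(x, is.nan))

# Hypothetical groups: the first reports no credible sets at all.
cs_group1  <- c(0, 0, 0)
cs_group2  <- c(1, 0, 2)
hit_group1 <- c(0, 0, 0)
hit_group2 <- c(1, 0, 1)

toy_summary <- data.frame(
  fdr    = c(1 - sum(hit_group1) / sum(cs_group1),   # 0/0 gives NaN
             1 - sum(hit_group2) / sum(cs_group2)),
  cs_num = c(meannonzero(cs_group1),                 # mean of an empty vector gives NaN
             meannonzero(cs_group2))
)
print(toy_summary)

# is.nan() has no built-in data frame method, so the S3 method above supplies one.
toy_summary[is.nan(toy_summary)] <- 0
print(toy_summary)
```

Replacing NaN by 0 treats a group with no reported credible sets as having fdr 0, which is a convention rather than an estimate.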