## SuSiE performance when the number of nonzero effects T = 1, 5, 10, 20
In this vignette, we investigate how SuSiE behaves when the number of nonzero effects in the ground truth, `T`, is 1, 5, 10, or 20.

We fix pve (proportion of variance explained) at its default value of 0.4 and run 10 replicates for each combination of settings. The simulated response `y` is quantitative and is generated from the GTEx data `X`.

In [92]:
dscout_Q1 = readRDS('dscout_Q1.rds')
dscout_Q1 = dscout_Q1[!is.na(dscout_Q1$sim_gaussian.output.file),]
dscout_Q1 = dscout_Q1[!is.na(dscout_Q1$susie.output.file),]

In [93]:
dscout_df = data.frame(dscout_Q1$sim_gaussian.effect_num, dscout_Q1$sim_gaussian.pve, dscout_Q1$score.hit, dscout_Q1$score.signal_num, dscout_Q1$score.cs_medianSize, dscout_Q1$score.top_hit)
names(dscout_df) = c('effect_num', 'pve', 'hit', 'signal_num', 'cs_medianSize', 'top_hit')

In [94]:
dscout_df = dscout_df[dscout_df$pve == 0.4,]

* SuSiE's power decreases as the number of nonzero effects grows, from 1.0 at `T = 1` to 0.1 at `T = 20`.

In [95]:
power.summary = aggregate(hit ~ effect_num + pve, dscout_df, sum)
# power = total hits / (effects per replicate * 10 replicates)
power.summary$power = power.summary$hit / (power.summary$effect_num*10)
power.summary

effect_num,pve,hit,power
1,0.4,10,1.0
5,0.4,14,0.28
10,0.4,17,0.17
20,0.4,20,0.1
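
The power calculation above hard-codes 10 replicates in the denominator, which would be off if the `NA` filter at the top dropped any replicate. A minimal sketch of counting the replicates from the data instead, using a hypothetical `toy` data frame whose values are made up for illustration and are not results from this study:

```r
# Toy per-replicate scores; values are illustrative only, not from dscout_Q1.
toy = data.frame(
  effect_num = c(1, 1, 5, 5),
  pve        = 0.4,
  hit        = c(1, 1, 2, 1)
)

# Total number of true effects recovered per setting.
hits = aggregate(hit ~ effect_num + pve, toy, sum)

# Number of replicates per setting, counted from the rows rather than hard-coded.
reps = aggregate(hit ~ effect_num + pve, toy, length)
names(reps)[names(reps) == 'hit'] = 'n_rep'

# Join on the grouping keys, then divide hits by the total number of simulated effects.
toy.power = merge(hits, reps, by = c('effect_num', 'pve'))
toy.power$power = toy.power$hit / (toy.power$effect_num * toy.power$n_rep)
toy.power
```

With all 10 replicates present per setting, this reduces to the same formula as above.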


* SuSiE keeps a fairly low FDR (below 0.1 in every setting here), regardless of how many nonzero effects there actually are.

In [96]:
signal_num.summary = aggregate(signal_num ~ effect_num + pve, dscout_df, sum)
fdr.summary = signal_num.summary
fdr.summary$hit = power.summary$hit
fdr.summary$fdr = round(1 - fdr.summary$hit / fdr.summary$signal_num, 4)
fdr.summary

effect_num,pve,signal_num,hit,fdr
1,0.4,10,10,0.0
5,0.4,15,14,0.0667
10,0.4,18,17,0.0556
20,0.4,22,20,0.0909
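
The FDR step above copies `hit` from `power.summary` by row position, which works because both `aggregate` calls group by the same variables. As a sketch of an alternative that does not depend on row order, both columns can be summed in a single `aggregate` call. The `toy` data frame below is hypothetical, and the guard against zero reported signals is an added assumption, not part of the original analysis:

```r
# Toy per-replicate scores; values are illustrative only, not from dscout_Q1.
toy = data.frame(
  effect_num = c(1, 1, 5, 5),
  pve        = 0.4,
  hit        = c(1, 1, 2, 1),
  signal_num = c(1, 1, 2, 2)
)

# Sum hits and reported signals together so they stay aligned by group.
toy.fdr = aggregate(cbind(hit, signal_num) ~ effect_num + pve, toy, sum)

# FDR = 1 - true discoveries / all discoveries; NA when nothing was reported.
toy.fdr$fdr = ifelse(toy.fdr$signal_num > 0,
                     round(1 - toy.fdr$hit / toy.fdr$signal_num, 4),
                     NA_real_)
toy.fdr
```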


* For a dataset such as GTEx, the confidence sets produced by SuSiE generally contain a single element, i.e., each confidence set selects only one gene. The exception is `T = 10`, where the average median set size rises to 1.85.

In [97]:
setsize.summary = aggregate(cs_medianSize ~ effect_num + pve, dscout_df, mean)
setsize.summary

effect_num,pve,cs_medianSize
1,0.4,1.0
5,0.4,1.05
10,0.4,1.85
20,0.4,1.0


* SuSiE does not produce many top hits on average (the mean ranges from 1.0 to 2.0 across settings), because its power is not high enough.

In [98]:
tophit.summary = aggregate(top_hit ~ effect_num + pve, dscout_df, mean)
tophit.summary

effect_num,pve,top_hit
1,0.4,1.0
5,0.4,1.4
10,0.4,1.7
20,0.4,2.0
