# How to set a prior given T = 2 nonzero effects?

Assume there are $T = 2$ nonzero effects. We set $pve = 0.7$ so that SuSiE's power is around $0.6$, and we compare SuSiE's performance across different priors.
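For readers unfamiliar with the setup, here is a minimal sketch of what "two nonzero effects at $pve = 0.7$" can look like in a Gaussian simulation. It is not the `sim_gaussian` module used in this analysis (that code is not shown here). The sample size, number of variables, effect positions, and effect sizes are all assumptions made for illustration. The noise variance is chosen so that the signal explains a fraction `pve` of the variance of `y`.

```r
set.seed(1)

n <- 5000   # sample size (assumed for illustration)
p <- 50     # number of variables (assumed for illustration)
X <- matrix(rnorm(n * p), n, p)

# T = 2 nonzero effects; the positions and sizes are arbitrary choices
beta <- rep(0, p)
beta[c(3, 17)] <- 1

pve <- 0.7
signal <- drop(X %*% beta)

# Solve pve = var(signal) / (var(signal) + sigma2) for the noise variance
sigma2 <- var(signal) * (1 - pve) / pve
y <- signal + rnorm(n, sd = sqrt(sigma2))

# Realized proportion of variance explained; close to 0.7 up to sampling noise
realized_pve <- var(signal) / var(y)
```

The realized value differs slightly from the target because the sample covariance between signal and noise is not exactly zero.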

## Results

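* If we set a large prior (0.2 and above), SuSiE has slightly higher power (0.65) with 0 fdr. But the credible set (cs) size is larger, about 3.2 to 3.7.

* If we set a small prior (0.05 and below), SuSiE has slightly lower power (0.6) with 0.0769 fdr, and the cs size is close to one (1.2 to 1.6). At the intermediate prior 0.1, power is 0.6 with 0 fdr and a cs size of 1.0. The number of credible sets is about the same across priors (1.2 to 1.3).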

To be conservative, we would rather set a large prior to control the fdr.

In [56]:
dscout.summary[dscout.summary$pve==0.7,]

| | effect_num | pve | prior | mean_corX | power | fdr | cs_size | cs_num | top_hit_rate |
|---|---|---|---|---|---|---|---|---|---|
| 50 | 2 | 0.7 | 0.01 | 0.3371353 | 0.6 | 0.0769 | 1.2 | 1.3 | 0.9231 |
| 116 | 2 | 0.7 | 0.02 | 0.3371353 | 0.6 | 0.0769 | 1.35 | 1.3 | 0.9231 |
| 182 | 2 | 0.7 | 0.03 | 0.3371353 | 0.6 | 0.0769 | 1.45 | 1.3 | 0.9231 |
| 248 | 2 | 0.7 | 0.05 | 0.3371353 | 0.6 | 0.0769 | 1.6 | 1.3 | 0.9231 |
| 314 | 2 | 0.7 | 0.1 | 0.3371353 | 0.6 | 0.0 | 1.0 | 1.2 | 1.0 |
| 380 | 2 | 0.7 | 0.2 | 0.3371353 | 0.65 | 0.0 | 3.7 | 1.3 | 0.9231 |
| 446 | 2 | 0.7 | 0.4 | 0.3371353 | 0.65 | 0.0 | 3.3 | 1.3 | 0.9231 |
| 512 | 2 | 0.7 | 0.5 | 0.3371353 | 0.65 | 0.0 | 3.25 | 1.3 | 0.9231 |
| 578 | 2 | 0.7 | 0.7 | 0.3371353 | 0.65 | 0.0 | 3.2 | 1.3 | 0.9231 |
| 644 | 2 | 0.7 | 0.9 | 0.3371353 | 0.65 | 0.0 | 3.2 | 1.3 | 0.9231 |
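The power, fdr, and cs_num values in the table are consistent with simple counts. The sketch below back-calculates them, assuming the `10` in the power denominator of the summary code is the number of replicates per setting. The totals `hit` and `cs_total` are inferred from the table, not read from `dscout_Q2.rds`, so treat them as an illustration of the arithmetic rather than as reported data.

```r
n_rep <- 10        # assumed number of replicates per setting
effect_num <- 2    # T = 2 nonzero effects

# Inferred totals over all replicates (hypothetical, back-calculated from the table)
small_prior <- list(hit = 12, cs_total = 13)   # prior <= 0.05
large_prior <- list(hit = 13, cs_total = 13)   # prior >= 0.2

# Same formulas as the summary code: power, fdr, and mean number of sets
metrics <- function(x) {
  c(power  = x$hit / (effect_num * n_rep),
    fdr    = round(1 - x$hit / x$cs_total, 4),
    cs_num = x$cs_total / n_rep)
}

small_m <- metrics(small_prior)   # power 0.6,  fdr 0.0769, cs_num 1.3
large_m <- metrics(large_prior)   # power 0.65, fdr 0,      cs_num 1.3
```

Under this reading, the difference between the two regimes is a single credible set out of 13 that contains no true effect, so the fdr gap rests on very few events.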


## Code details

In [57]:
dscout_Q2 = readRDS('dscout_Q2.rds')
dscout_Q2 = dscout_Q2[!is.na(dscout_Q2$sim_gaussian.output.file),]
dscout_Q2 = dscout_Q2[!is.na(dscout_Q2$susie_prior.output.file),]

In [58]:
dscout_df = data.frame(dscout_Q2$sim_gaussian.effect_num, dscout_Q2$sim_gaussian.pve, dscout_Q2$susie_prior.prior,
                       dscout_Q2$score.hit, dscout_Q2$score.signal_num, dscout_Q2$score.cs_medianSize,
                       dscout_Q2$score.top_hit, dscout_Q2$sim_gaussian.mean_corX, dscout_Q2$susie_prior.avg_purity)
names(dscout_df) = c('effect_num', 'pve', 'prior','hit', 'cs_num', 'cs_size', 'top_hit', 'corX', 'avg_purity')

In [59]:
# Power: true effects captured, out of effect_num effects in each of 10 replicates
power.summary = aggregate(hit ~ effect_num + pve + prior, dscout_df, sum)
power.summary$power = power.summary$hit / (power.summary$effect_num*10)
# FDR: share of reported sets that do not contain a true effect
fdr.summary = aggregate(cs_num ~ effect_num + pve + prior, dscout_df, sum)
fdr.summary$fdr = round(1 - power.summary$hit / fdr.summary$cs_num, 4)
# Average set size, ignoring replicates where the size is 0
meannonzero = function(x){mean(x[x!=0])}
setsize.summary = aggregate(cs_size ~ effect_num + pve + prior, dscout_df, meannonzero)
# Top-hit rate: top hits as a share of all reported sets
tophit.summary = aggregate(top_hit ~ effect_num + pve + prior, dscout_df, sum)
tophit.summary$tophit_rate = round(tophit.summary$top_hit / fdr.summary$cs_num , 4)
# Mean correlation in X and mean number of sets per replicate
corX.summary = aggregate(corX ~ effect_num + pve + prior, dscout_df, mean)
cs_num.summary = aggregate(cs_num ~ effect_num + pve + prior, dscout_df, mean)
#purity.summary = aggregate(avg_purity ~ effect_num + pve + prior, dscout_df, mean)
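The summaries above are combined by position, which works when every `aggregate()` call returns the same groups in the same order. A possible alternative, sketched here on a toy table, is to join on the grouping columns with `merge()`. The toy values, `toy`, `n_rep`, and `toy_summary` are hypothetical; only the column names and formulas mirror the code above.

```r
# Toy per-replicate scores (hypothetical values, 3 replicates per prior)
toy <- data.frame(
  effect_num = 2,
  pve        = 0.7,
  prior      = rep(c(0.05, 0.2), each = 3),
  hit        = c(1, 2, 1, 2, 1, 1),
  cs_num     = c(1, 2, 2, 2, 1, 1),
  cs_size    = c(1, 2, 1, 4, 3, 0)
)
n_rep <- 3
keys <- c("effect_num", "pve", "prior")

# Summing hit and cs_num in one call keeps them on the same row
sums <- aggregate(cbind(hit, cs_num) ~ effect_num + pve + prior, toy, sum)
sums$power <- sums$hit / (sums$effect_num * n_rep)
sums$fdr   <- round(1 - sums$hit / sums$cs_num, 4)

# Average set size over replicates with a nonzero size
meannonzero <- function(x) mean(x[x != 0])
sizes <- aggregate(cs_size ~ effect_num + pve + prior, toy, meannonzero)

# Join on the grouping columns instead of relying on matching row order
toy_summary <- merge(sums[, c(keys, "power", "fdr")], sizes, by = keys)
```

Joining by key costs one extra call, but a group missing from one summary shows up as a dropped row instead of a silent misalignment.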

In [60]:
dscout.summary = data.frame(power.summary$effect_num, power.summary$pve, power.summary$prior, corX.summary$corX,
                            power.summary$power, fdr.summary$fdr, setsize.summary$cs_size, 
                            cs_num.summary$cs_num, tophit.summary$tophit_rate)
names(dscout.summary) = c('effect_num', 'pve', 'prior', 'mean_corX','power', 
                          'fdr', 'cs_size', 'cs_num','top_hit_rate')

In [61]:
dscout.summary = dscout.summary[dscout.summary$effect_num==2, ]