# Running a "degenerate" MASH computation

The goal is to verify that the MASH computation is implemented correctly by comparing it with the univariate case, where $Y$ has a single column and the prior covariance matrix is fixed. This comparison should later become part of a unit test.
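For reference, the univariate case can be written out using the textbook conjugate-normal result. This is a sketch under stated assumptions rather than a description of the mmbr internals: each column $x_j$ of $X$ is regressed on $y$ separately, the residual variance $\sigma^2$ is treated as known, and the prior is $b_j \sim N(0, V)$. The symbols $\hat{b}_j$, $s_j^2$, $\sigma^2$ and $b_j$ are introduced here for illustration.

$$
\hat{b}_j = \frac{x_j^\top y}{x_j^\top x_j}, \qquad
s_j^2 = \frac{\sigma^2}{x_j^\top x_j}
$$

$$
\mathrm{Var}(b_j \mid \hat{b}_j) = \left(\frac{1}{s_j^2} + \frac{1}{V}\right)^{-1}, \qquad
E(b_j \mid \hat{b}_j) = \frac{V}{V + s_j^2}\,\hat{b}_j, \qquad
E(b_j^2 \mid \hat{b}_j) = \mathrm{Var}(b_j \mid \hat{b}_j) + E(b_j \mid \hat{b}_j)^2
$$

Under these assumptions the posterior mean is the least-squares estimate shrunk by the factor $V / (V + s_j^2)$, so it can never exceed $\hat{b}_j$ in absolute value.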

## Simulate data

In [1]:
set.seed(1)
n = 1000
p = 1000
beta = rep(0,p)
beta[1:4] = 1
X = matrix(rnorm(n*p),nrow=n,ncol=p)
y = X %*% beta + rnorm(n)
#' res =susie(X,y,L=10)
library(mmbr)

## Run univariate computation

In [2]:
V = 0.2 * as.numeric(var(y))
residual_var = as.numeric(var(y))
data = mmbr:::DenseData$new(X,y)

In [3]:
m1 = mmbr:::BayesianMultipleRegression$new(ncol(X), residual_var, V)
m1$fit(data, save_summary_stats = TRUE)

In [4]:
head(m1$posterior_b1)

0
1.17582479
1.10571117
1.11985691
1.0762921
-0.12724319
-0.07497652
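
As a cross-check that does not depend on mmbr, the same kind of posterior can be computed by hand in base R. This is a minimal sketch assuming a conjugate normal prior, a known residual variance, and no intercept or centering; whether mmbr centers the data is not stated here, so small differences from `m1$posterior_b1` are possible. The names `xtx`, `s2`, `post_var`, `post_b1` and `post_b2` are illustrative, and the data are a smaller simulation than the one in the notebook.

```r
# Base-R sketch (no mmbr): conjugate normal posterior for each column of X
# taken one at a time. Assumes no intercept and no centering.
set.seed(1)
n <- 200
p <- 10
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
beta <- c(rep(1, 4), rep(0, p - 4))
y <- as.numeric(X %*% beta + rnorm(n))

residual_var <- var(y)  # same choice of residual variance as in the notebook
V <- 0.2 * var(y)       # fixed prior variance, as in the notebook

xtx <- colSums(X^2)                        # x_j' x_j for every column
bhat <- as.numeric(crossprod(X, y)) / xtx  # per-column least-squares estimates
s2 <- residual_var / xtx                   # squared standard errors

post_var <- 1 / (1 / s2 + 1 / V)  # posterior variance
post_b1 <- post_var / s2 * bhat   # posterior mean
post_b2 <- post_var + post_b1^2   # posterior second moment

print(head(cbind(bhat, post_b1, post_b2)))
```

Swapping in the notebook's `X`, `y`, `residual_var` and `V` gives values that can be compared against `m1$posterior_b1` and `m1$posterior_b2`.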


In [5]:
m1

<BayesianMultipleRegression>
  Public:
    bhat: active binding
    clone: function (deep = FALSE) 
    compute_loglik_null: function (d) 
    fit: function (d, prior_weights = NULL, use_residual = FALSE, save_summary_stats = FALSE) 
    initialize: function (J, residual_variance, prior_variance, estimate_prior_variance = FALSE) 
    loglik_null: active binding
    posterior_b1: active binding
    posterior_b2: active binding
    prior_variance: active binding
    residual_variance: active binding
    sbhat: active binding
  Private:
    .bhat: 1.18170980260002 1.11124526041777 1.12546180389929 1.081 ...
    denied: function (v) 
    estimate_prior_variance: FALSE
    J: 1000
    .lbf: 118.091157635422 104.120913660333 106.870332723677 98.51 ...
    .loglik_null: NULL
    .posterior_b1: 1.17582479362293 1.10571117047545 1.11985691443764 1.076 ...
    .posterior_b2: 1.38828921226831 1.2283224594841 1.25980477578369 1.1641 ...
    .prior_variance: 1.14963360755715
    .residual_variance:

## Run multivariate computation

Assuming 1 out of $J$ effects is causal, we place a prior null weight of $1 - 1/J$.
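
To make the role of the null weight concrete, the sketch below computes the posterior mean under a one-dimensional point-mass-plus-normal prior. It is an illustration of the general idea only, not the code used by mmbr or mashr: the helper `spike_slab_posterior_mean` and the input values are hypothetical, and the squared standard error `s2` is assumed known.

```r
# Hypothetical helper, not part of mmbr or mashr: posterior mean of b under
# the prior  null_weight * delta_0 + (1 - null_weight) * N(0, V),
# given an estimate bhat with squared standard error s2.
spike_slab_posterior_mean <- function(bhat, s2, V, null_weight) {
  # Log marginal likelihood of bhat under the null and the normal component
  ll_null <- dnorm(bhat, mean = 0, sd = sqrt(s2), log = TRUE)
  ll_alt <- dnorm(bhat, mean = 0, sd = sqrt(s2 + V), log = TRUE)
  # Posterior probability of the non-null component, on the log-odds scale
  log_odds <- log(1 - null_weight) - log(null_weight) + ll_alt - ll_null
  w_alt <- plogis(log_odds)
  # Under the non-null component the mean is the usual shrinkage estimate
  w_alt * V / (V + s2) * bhat
}

J <- 1000
null_weight <- 1 - 1 / J
# Illustrative inputs only: one small and one large estimate
small <- spike_slab_posterior_mean(0.1, s2 = 0.01, V = 1, null_weight = null_weight)
large <- spike_slab_posterior_mean(1.0, s2 = 0.01, V = 1, null_weight = null_weight)
print(c(small = small, large = large))
```

With a null weight this close to 1, an estimate near zero is pulled almost entirely to zero, while an estimate many standard errors from zero keeps nearly all of its conjugate-normal posterior mean.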

In [6]:
V = 0.2 * cov(y)
null_weight = 1 - 1 / ncol(X)
V = mmbr:::MashPrior$new(list(V), 1, 1 - null_weight, null_weight)
residual_covar = cov(y)

In [7]:
V$dump()

0
1.149634


In [8]:
m2 = mmbr:::MashMultipleRegression$new(ncol(X), residual_covar, V)

In [9]:
m2$fit(data, save_summary_stats = TRUE)

In [10]:
head(m2$posterior_b1)

0
39.5700385486
37.2078510316
37.6849699315
36.2133463415
-0.0009113286
-0.0004590404


In [11]:
m2

<MashMultipleRegression>
  Inherits from: <BayesianMultipleRegression>
  Public:
    bhat: active binding
    clone: function (deep = FALSE) 
    compute_loglik_null: function (d) 
    fit: function (d, prior_weights = NULL, use_residual = FALSE, save_summary_stats = FALSE) 
    initialize: function (J, residual_variance, prior_variance, estimate_prior_variance = FALSE) 
    loglik_null: active binding
    posterior_b1: active binding
    posterior_b2: active binding
    prior_variance: active binding
    residual_variance: active binding
    sbhat: active binding
  Private:
    .bhat: 1.18170980260002 1.11124526041777 1.12546180389929 1.081 ...
    denied: function (v) 
    estimate_prior_variance: FALSE
    J: 1000
    .lbf: 8.13508763081477 7.10396826283781 7.30689809369508 6.690 ...
    .loglik_null: -20.3249296191687 -17.8823835725305 -18.363089756023 -16 ...
    .posterior_b1: 39.5700385486009 37.2078510316375 37.6849699314555 36.21 ...
    .posterior_b2: 1566.90692500336 1385.64

## Run with a fixed prior directly from MASH

In [12]:
library(mashr)
data = mash_set_data(m2$bhat, m2$sbhat, V=residual_covar)
m.c = mash(data, g = V$dump())

Loading required package: ashr


 - Computing 1000 x 2 likelihood matrix.
 - Likelihood calculations took 0.02 seconds.
 - Fitting model with 2 mixture components.
 - Model fitting took 0.07 seconds.
 - Computing posterior matrices.
 - Computation allocated took 0.03 seconds.


In [13]:
m.c$fitted_g

0
1.149634


In [14]:
head(m.c$result$PosteriorMean)

0
39.570268603
37.210174898
37.686445018
36.219210535
-0.004426808
-0.002230069
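
Since this comparison is meant to become a unit test, one possible shape for the check is sketched below. It uses only the six values printed by `head(m2$posterior_b1)` and `head(m.c$result$PosteriorMean)`; the variable names and the tolerance are illustrative assumptions, not values taken from mmbr.

```r
# First six posterior means, copied from the printed outputs
mmbr_b1 <- c(39.5700385486, 37.2078510316, 37.6849699315,
             36.2133463415, -0.0009113286, -0.0004590404)
mash_b1 <- c(39.570268603, 37.210174898, 37.686445018,
             36.219210535, -0.004426808, -0.002230069)

# Largest absolute disagreement between the two implementations
max_abs_diff <- max(abs(mmbr_b1 - mash_b1))
print(max_abs_diff)

# all.equal() compares by mean relative difference; the tolerance is an
# illustrative choice and would need tuning in a real test.
agree <- isTRUE(all.equal(mmbr_b1, mash_b1, tolerance = 1e-3))
print(agree)
```

The two sets of values differ by less than 0.01 in absolute terms, so a test would need an explicit tolerance: the default tolerance of `all.equal` is far tighter than this level of agreement.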


## Run MASH with canonical priors and weights learned from data

In [15]:
U.c = cov_canonical(data)
print(names(U.c))

[1] "identity"      "singletons_1"  "equal_effects" "simple_het_1" 
[5] "simple_het_2"  "simple_het_3" 


In [16]:
m.c = mash(data, U.c)

 - Computing 1000 x 109 likelihood matrix.
 - Likelihood calculations took 0.01 seconds.
 - Fitting model with 109 mixture components.
 - Model fitting took 0.09 seconds.
 - Computing posterior matrices.
 - Computation allocated took 0.00 seconds.


In [17]:
head(m.c$result$PosteriorMean)

0
47.502352829
44.669148259
45.240896883
43.479504203
-0.004756322
-0.0023942


In [18]:
m.c$fitted_g

0
1

0
1

0
1

0
1

0
1

0
1
