# Running a "degenerate" MASH computation

The goal is to verify that the implementation of the MASH computation is correct by comparing it to the univariate case, where $Y$ has one column and the prior covariance matrix is fixed. It should later become part of a unit test.
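
To make explicit what the two computations are expected to agree on, here is the textbook normal-normal model for a single effect. The symbols $\hat b_j$, $s_j$ and $V$ are notation introduced here for `bhat`, `sbhat` and the prior variance; that `mmbr` uses exactly this parameterization is an assumption, not something read from its source.

$$
\hat b_j \mid b_j \sim N(b_j,\, s_j^2), \qquad b_j \sim N(0,\, V)
$$

$$
E[b_j \mid \hat b_j] = \frac{V}{V + s_j^2}\,\hat b_j, \qquad
\mathrm{Var}(b_j \mid \hat b_j) = \frac{V s_j^2}{V + s_j^2}
$$

$$
\log \mathrm{BF}_j = \frac{1}{2}\log\frac{s_j^2}{V + s_j^2} + \frac{\hat b_j^2}{2 s_j^2}\cdot\frac{V}{V + s_j^2}
$$

With one column in $Y$ and a single fixed $1 \times 1$ prior covariance, a multivariate normal prior collapses to the scalar $V$, so both fits should return the same posterior moments and log Bayes factors.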

## Simulate data

In [1]:
set.seed(1)
n = 1000
p = 1000
beta = rep(0,p)
beta[1:4] = 1
X = matrix(rnorm(n*p),nrow=n,ncol=p)
y = X %*% beta + rnorm(n)
#' res =susie(X,y,L=10)
library(mmbr)

Loading required package: mashr
Loading required package: ashr


## Run univariate computation

In [2]:
prior_var = 0.2 * as.numeric(var(y))
residual_var = as.numeric(var(y))
data = mmbr:::DenseData$new(X,y)

In [3]:
residual_var

In [4]:
prior_var

In [5]:
m1 = mmbr:::BayesianMultipleRegression$new(ncol(X), residual_var, prior_var)
m1$fit(data, save_summary_stats = TRUE)

In [6]:
head(m1$posterior_b1)

0
1.17582479
1.10571117
1.11985691
1.0762921
-0.12724319
-0.07497652


In [7]:
m1

<BayesianMultipleRegression>
  Public:
    bhat: active binding
    clone: function (deep = FALSE) 
    compute_loglik_null: function (d) 
    fit: function (d, prior_weights = NULL, use_residual = FALSE, save_summary_stats = FALSE) 
    initialize: function (J, residual_variance, prior_variance, estimate_prior_variance = FALSE) 
    lbf: active binding
    loglik_null: active binding
    posterior_b1: active binding
    posterior_b2: active binding
    prior_variance: active binding
    residual_variance: active binding
    sbhat: active binding
  Private:
    .bhat: 1.18170980260002 1.11124526041777 1.12546180389929 1.081 ...
    .lbf: 118.091157635422 104.120913660333 106.870332723677 98.51 ...
    .loglik_null: NULL
    .posterior_b1: 1.17582479362293 1.10571117047545 1.11985691443764 1.076 ...
    .posterior_b2: 1.38828921226831 1.2283224594841 1.25980477578369 1.1641 ...
    .prior_variance: 1.14963360755715
    .residual_variance: 5.74816803778574
    .sbhat: 0.0758546106689995 
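
The printed fields can be cross-checked by hand for the first variable. The sketch below plugs the printed `.bhat`, `.sbhat` and `.prior_variance` into the usual conjugate normal updates; that these are the formulas `BayesianMultipleRegression` uses internally is an assumption, although the numbers it yields can be compared directly against `.posterior_b1`, `.posterior_b2` and `.lbf` above.

```r
# Inputs copied from the printed m1 object above (first variable only).
bhat  = 1.18170980260002    # .bhat[1]
sbhat = 0.0758546106689995  # .sbhat
V     = 1.14963360755715    # .prior_variance

# Conjugate normal updates (assumed; not taken from the mmbr source).
shrink  = V / (V + sbhat^2)               # shrinkage factor
post_b1 = shrink * bhat                   # posterior mean
post_b2 = shrink * sbhat^2 + post_b1^2    # posterior variance + mean^2
lbf     = 0.5 * log(sbhat^2 / (V + sbhat^2)) +
          0.5 * (bhat / sbhat)^2 * shrink # log Bayes factor against the null

print(c(post_b1 = post_b1, post_b2 = post_b2, lbf = lbf))
```

If the assumption holds, the three printed values should line up with the first entries of the corresponding private fields.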

## Run multivariate computation

In [8]:
# Assuming 1 out of $J$ effects is causal, we would place a null weight $1-1/J$ a priori.
# This would lead to some shrinkage, so it is turned off below.
# null_weight = 1 - 1 / ncol(X)
null_weight = 0
prior_covar = mmbr:::MashInitializer$new(list(0.2 * cov(y)), 1, 1 - null_weight, null_weight)
residual_covar = cov(y)

In [9]:
prior_covar$prior_covariance

In [10]:
residual_covar

0
5.748168


In [11]:
m2 = mmbr:::MashMultipleRegression$new(ncol(X), residual_covar, prior_covar)

In [12]:
m2$fit(data, save_summary_stats = TRUE)

In [13]:
head(m2$posterior_b1)

0
1.17582479
1.10571117
1.11985691
1.0762921
-0.12724319
-0.07497652


In [14]:
m2

<MashMultipleRegression>
  Inherits from: <BayesianMultipleRegression>
  Public:
    bhat: active binding
    clone: function (deep = FALSE) 
    compute_loglik_null: function (d) 
    fit: function (d, prior_weights = NULL, use_residual = FALSE, save_summary_stats = FALSE) 
    initialize: function (J, residual_variance, mash_initializer, estimate_prior_variance = FALSE) 
    lbf: active binding
    loglik_null: active binding
    posterior_b1: active binding
    posterior_b2: active binding
    prior_variance: active binding
    residual_variance: active binding
    sbhat: active binding
  Private:
    .bhat: 1.18170980260002 1.11124526041777 1.12546180389929 1.081 ...
    .lbf: 118.091157635422 104.120913660333 106.870332723677 98.51 ...
    .loglik_null: -119.686629951695 -105.64646483559 -108.409644755107 -10 ...
    .posterior_b1: 1.17582479362293 1.10571117047545 1.11985691443764 1.076 ...
    .posterior_b2: 1.38828921226831 1.2283224594841 1.25980477578369 1.1641 ...
    .prior

The quantities printed for both fits (`bhat`, `lbf`, `posterior_b1`, `posterior_b2`) agree to the digits shown.
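
Since this comparison is meant to become a unit test, one possible shape for it is a field-by-field check with `all.equal`. The helper name `fields_agree` and the stand-in lists `uni` and `multi` are hypothetical; the numbers are the leading values printed for `m1` and `m2` above. In the notebook the two fitted objects would be passed instead, which should work because R6 objects are environments and support `[[`, but that has not been run here.

```r
# Hypothetical helper: TRUE for each named field that matches numerically.
fields_agree = function(a, b, fields, tol = 1e-8) {
  sapply(fields, function(f) {
    isTRUE(all.equal(as.vector(a[[f]]), as.vector(b[[f]]), tolerance = tol))
  })
}

# Stand-ins for m1 and m2, using the leading values printed above.
uni = list(
  bhat = c(1.18170980260002, 1.11124526041777, 1.12546180389929),
  lbf  = c(118.091157635422, 104.120913660333, 106.870332723677)
)
multi = list(
  bhat = c(1.18170980260002, 1.11124526041777, 1.12546180389929),
  lbf  = c(118.091157635422, 104.120913660333, 106.870332723677)
)

print(fields_agree(uni, multi, c("bhat", "lbf")))
```

A unit test would then assert that every entry of the returned vector is `TRUE`.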

## Run ASH to confirm

ASH, run on the same `bhat` and `sbhat` with a normal mixture prior, works well as an independent check.

In [15]:
library(ashr)

In [16]:
a.out = ash(as.vector(m1$bhat), as.vector(m1$sbhat), mixcompdist = 'normal')
head(get_pm(a.out))

## Run with fixed prior directly from MASH

In [17]:
prior_covar$mash_prior

0
1.149634


In [18]:
library(mashr)

In [19]:
data = mash_set_data(m2$bhat, m2$sbhat)
m.c = mash(data, g = prior_covar$mash_prior, fixg = TRUE, algorithm ='Rcpp')

 - Computing 1000 x 2 likelihood matrix.
 - Likelihood calculations took 0.00 seconds.
 - Computing posterior matrices.
 - Computation allocated took 0.00 seconds.


In [20]:
m.c$fitted_g

0
1.149634


In [21]:
head(get_pm(m.c))

0
1.17582479
1.10571117
1.11985691
1.0762921
-0.12724319
-0.07497652


## Run MASH with canonical priors and weights learned from data

The results are very similar to those obtained with the fixed `g` earlier.

In [22]:
U.c = cov_canonical(data)
print(names(U.c))

[1] "identity"      "singletons_1"  "equal_effects" "simple_het_1" 
[5] "simple_het_2"  "simple_het_3" 


In [23]:
m.c = mash(data, U.c, algorithm = 'Rcpp')

 - Computing 1000 x 109 likelihood matrix.
 - Likelihood calculations took 0.01 seconds.
 - Fitting model with 109 mixture components.
 - Model fitting took 0.05 seconds.
 - Computing posterior matrices.
 - Computation allocated took 0.00 seconds.


In [24]:
head(get_pm(m.c))

0
1.176841
1.106666
1.120824
1.077222
-0.0001619748
-3.791859e-05
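
To quantify how close the two sets of posterior means are, the printed values can be compared directly. The vector names `fixed_g` and `learned_g` are hypothetical labels for the `head(get_pm(m.c))` output under the fixed prior and under the learned mixture weights, respectively.

```r
# Posterior means copied from the two head(get_pm(m.c)) outputs above.
fixed_g   = c(1.17582479, 1.10571117, 1.11985691, 1.0762921,
              -0.12724319, -0.07497652)
learned_g = c(1.176841, 1.106666, 1.120824, 1.077222,
              -0.0001619748, -3.791859e-05)

# Absolute change in posterior mean when the weights are learned from data.
abs_diff = abs(learned_g - fixed_g)
print(round(abs_diff, 4))
```

The first four entries (the simulated non-zero effects) barely move, while the last two (true nulls in the simulation) are pulled much closer to zero once the mixture weights are estimated.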


In [25]:
m.c$fitted_g

0
1

0
1

0
1

0
1

0
1

0
1
