-
Notifications
You must be signed in to change notification settings - Fork 2
/
ebmr_illustration.Rmd
179 lines (134 loc) · 6.41 KB
/
ebmr_illustration.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
---
title: "ebmr_illustration"
author: "Matthew Stephens"
date: "2021-02-16"
output: workflowr::wflow_html
editor_options:
chunk_output_type: console
---
```{r}
library(ebmr.alpha)
library(glmnet)
```
## Introduction
The goal here is to illustrate the `ebmr.alpha` package on some examples.
## Change point example
First I will use the changepoint example with linear trend-filtering
basis. I use this as a challenging example, particularly for non-convex methods: the basis is not entirely natural for the changepoint problem
and as a result the likelihood surface is very ridged.
```{r}
set.seed(100)
n = 100
p = n
X = matrix(0,nrow=n,ncol=n)
for(i in 1:n){
X[i:n,i] = 1:(n-i+1)
}
btrue = rep(0,n)
btrue[40] = 8
btrue[41] = -8
s2 = .4
y = X %*% btrue + s2*rnorm(n)
plot(y,main="true mean (black)")
lines(X %*% btrue)
```
In the `ebmr.alpha` package the prior on the regression coefficients
is determined by the function used to fit the "Empirical Bayes Normal Variances" model (`ebnv_fn`). We have the following cases implemented:
- Normal prior (Ridge regression): `ebnv_fn = ebnv.pm` (point mass prior on w; normal prior on b)
- Laplace prior (Bayesian Lasso): `ebnv_fn = ebnv.exp` (exponential prior on w; Laplace prior on b)
- Mixture of normals (Ash) prior: `ebnv_fn = ebnv.np` (non-parametric prior on w; mixture of normals prior on b)
- Mixture of Laplaces prior: `ebnv_fn = ebnv.exp_mix` (mixture of exponentials prior on w; mixture of Laplaces prior on b)
For the mixture priors you can either fix the grid or update the grid each iteration using em updates. (Note: The EM versions do not really "solve" the EBNV problem because they do not find the maximum likelihood solution for the prior: they simply do a single EM iteration to update the prior grid and mixture proportions before computing posteriors.) To make the interface simpler there are helper functions defined for this: `ebnv.np.em`, `ebnv.np.fixgrid`, `ebnv.exp_mix.em` and `ebnv.exp_mix.fixgrid`.
In addition the "regular lasso" can be obtained by using `compute_mode=TRUE` with the Laplace prior.
### EB Ridge and Lasso
Here are the simplest three methods (Ridge, Blasso and Lasso) with non-mixture priors. You can see that the Lasso overfits, suggesting that the EB approach to selecting the hyperparameters for Lasso maybe does not do so well here.
```{r}
y.fit.ebr = ebmr(X,y, maxiter = 200, ebnv_fn = ebnv.pm)
y.fit.eblasso = ebmr.update(y.fit.ebr, maxiter = 200, ebnv_fn = ebnv.exp)
y.fit.eblasso.mode = ebmr.update(y.fit.eblasso, maxiter = 200, ebnv_fn = ebnv.exp, compute_mode=TRUE)
plot(y,main="true (black), ridge (red), blasso (green) and lasso (blue)")
lines(X %*% btrue)
lines(X %*% coef(y.fit.ebr), col=2)
lines(X %*% coef(y.fit.eblasso), col=3)
lines(X %*% coef(y.fit.eblasso.mode), col=4)
```
### Ash priors (normal mixture)
To get a mixture prior we have to add a prior (to set the grid)
before running the methods.
This interface may change, but here it is for now.
I compare the result with fixed grid vs estimating the grid using EM.
You can see the em update finds much better elbo, and also essentially recovers the true solution.
```{r}
y.fit.ebash.init = ebmr.set.prior(y.fit.eblasso,ebmr.alpha:::exp2np(y.fit.eblasso$g))
y.fit.ebash.em = ebmr.update(y.fit.ebash.init, maxiter = 200, ebnv_fn = ebnv.np.em)
y.fit.ebash.fix = ebmr.update(y.fit.ebash.init, maxiter = 200, ebnv_fn = ebnv.np.fixgrid)
y.fit.ebash.fix$elbo
y.fit.ebash.em$elbo
plot(y,main="true (black), ridge (red), ash fixed grid (cyan) and em (magenta)")
lines(X %*% btrue)
lines(X %*% coef(y.fit.ebr), col=2)
lines(X %*% coef(y.fit.ebash.fix), col=5,lwd=2)
lines(X %*% coef(y.fit.ebash.em), col=6, lwd=2)
```
### Lash priors (laplace mixture)
Similarly we can use a mixture of laplaces for the prior, with
either EM update or fixed grid. Again the em update finds much better elbo, and also essentially recovers the true solution.
```{r}
y.fit.eblash.init = ebmr.set.prior(y.fit.eblasso,ebmr.alpha:::exp2np(y.fit.eblasso$g))
y.fit.eblash.em = ebmr.update(y.fit.eblash.init, maxiter = 200, ebnv_fn = ebnv.exp_mix.em)
y.fit.eblash.fix = ebmr.update(y.fit.eblash.init, maxiter = 200, ebnv_fn = ebnv.exp_mix.fixgrid)
y.fit.eblash.fix$elbo
y.fit.eblash.em$elbo
plot(y,main="true (black), ridge (red), lash fixed grid (cyan) and em (magenta)")
lines(X %*% btrue)
lines(X %*% coef(y.fit.ebr), col=2)
lines(X %*% coef(y.fit.eblash.fix), col=5,lwd=2)
lines(X %*% coef(y.fit.eblash.em), col=6, lwd=2)
```
## Half-dense scenario
This is another situation where we have seen mr.ash does poorly,
[here](mr_ash_vs_lasso.html).
Here I run the `ebmr` methods for max 20 iterations to keep compute time down.
```{r}
set.seed(123)
n <- 500
p <- 1000
p_causal <- 500 # number of causal variables (simulated effects N(0,1))
pve <- 0.95
nrep = 10
rmse_ebr = rep(0,nrep)
rmse_glmnet = rep(0,nrep)
rmse_eblasso= rep(0,nrep)
rmse_ebash.fix = rep(0,nrep)
rmse_ebash.em = rep(0,nrep)
for(i in 1:nrep){
sim=list()
sim$X = matrix(rnorm(n*p,sd=1),nrow=n)
B <- rep(0,p)
causal_variables <- sample(x=(1:p), size=p_causal)
B[causal_variables] <- rnorm(n=p_causal, mean=0, sd=1)
sim$B = B
sim$Y = sim$X %*% sim$B
sigma2 = ((1-pve)/(pve))*sd(sim$Y)^2
E = rnorm(n,sd = sqrt(sigma2))
sim$Y = sim$Y + E
fit.glmnet <- cv.glmnet(x=sim$X, y=sim$Y, family="gaussian", alpha=1, standardize=FALSE)
fit.ebr = ebmr(sim$X,sim$Y, maxiter = 20, ebnv_fn = ebnv.pm)
fit.eblasso = ebmr.update(fit.ebr, maxiter = 20, ebnv_fn = ebnv.exp)
fit.ebash.init = ebmr.set.prior(fit.eblasso,ebmr.alpha:::exp2np(y.fit.eblasso$g))
fit.ebash.em = ebmr.update(fit.ebash.init, maxiter = 20, ebnv_fn = ebnv.np.em)
fit.ebash.fix = ebmr.update(fit.ebash.init, maxiter = 20, ebnv_fn = ebnv.np.fixgrid)
#rmse_mrash[i] = sqrt(mean((sim$B-fit_mrash$beta)^2))
#rmse_mrash_fixprior[i] = sqrt(mean((sim$B-fit_mrash_fixprior$beta)^2))
rmse_glmnet[i] = sqrt(mean((sim$B-coef(fit.glmnet)[-1])^2))
rmse_ebr[i] = sqrt(mean((sim$B-coef(fit.ebr))^2))
rmse_eblasso[i] = sqrt(mean((sim$B-coef(fit.eblasso))^2))
rmse_ebash.fix[i] = sqrt(mean((sim$B-coef(fit.ebash.fix))^2))
rmse_ebash.em[i] = sqrt(mean((sim$B-coef(fit.ebash.em))^2))
}
plot(rmse_glmnet, rmse_ebr, xlim=c(0.5,0.7), ylim=c(0.5,0.7), main="red=ridge, green=eblasso")
points(rmse_glmnet,rmse_eblasso,col=3)
points(rmse_glmnet,rmse_ebash.fix,col=5)
points(rmse_glmnet,rmse_ebash.em,col=6)
abline(a=0,b=1)
```