---
title: Comparing two different ways of estimating the mixture weights in mr-ash
author: Peter Carbonetto
date: February 20, 2020
site: workflowr::wflow_site
output: workflowr::wflow_html
---

Here we compare two different ways of implementing the mixture weight
updates in mr-ash: EM and mix-SQP. In this example, we will see that
the mix-SQP updates provide a much better fit to the data, as measured
by the ELBO.

```{r knitr, echo=FALSE}
knitr::opts_chunk$set(comment = "#",results = "hold",collapse = TRUE,
                      fig.align = "center")
```

## Script parameters

These are the data simulation settings.

```{r datasim-params}
n  <- 100
p  <- 400
sd <- c(0,   1,    2)
w  <- c(0.9, 0.05, 0.05)
s  <- 0.1
```

This specifies the variances for the mixture-of-normals prior on the
regression coefficients.

```{r set-s0}
s0 <- 10^seq(-4,0,length.out = 12)
```
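
Written out, a mixture-of-normals prior on this grid takes the general
form below, where $K = 12$ is the length of `s0`, $w_k$ are the mixture
weights, and $\sigma_k^2$ is the $k$th entry of `s0`. This is the
generic form only; whether `mr_ash.R` additionally scales these
variances by the residual variance is not shown here, so treat that
detail as an open assumption.

$$
\beta_j \sim \sum_{k=1}^{K} w_k \, N(0, \sigma_k^2),
\qquad w_k \geq 0, \quad \sum_{k=1}^{K} w_k = 1.
$$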

## Load functions

These are the packages used in this analysis.

```{r load-pkgs, message=FALSE, warning=FALSE}
library(ggplot2)
library(cowplot)
library(MASS)
library(mixsqp)
```

This R code provides a simple implementation of the mr-ash algorithm.

```{r load-functions}
source("../code/misc.R")
source("../code/mr_ash.R")
source("../code/mr_ash_with_mixsqp.R")
```

## Simulate data

The predictors are drawn from the multivariate normal with zero mean
and covariance matrix `S`, in which all diagonal entries are 1, and
all off-diagonal entries are `s` (here, `s = 0.1`). Setting `s = 0.5`
reproduces the simulation of the predictors used in Example 3 of
Zou & Hastie (2005).

```{r simulate-data}
set.seed(2)
S       <- matrix(s,p,p)
diag(S) <- 1
X       <- mvrnorm(n,rep(0,p),S)
k       <- sample(length(w),p,replace = TRUE,prob = w)
beta    <- sd[k] * rnorm(p)
y       <- drop(X %*% beta + rnorm(n))
```
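
As a quick check on this covariance structure, the sketch below builds
a small matrix of the same form and confirms that it is a valid
(positive definite) covariance matrix. The helper `demo_equicor` and
the toy dimensions are hypothetical and are not part of the analysis
code.

```{r demo-equicor}
# Hypothetical helper (not from ../code): p x p matrix with 1 on the
# diagonal and s everywhere else, built the same way as S above.
demo_equicor <- function (p, s) {
  S <- matrix(s,p,p)
  diag(S) <- 1
  return(S)
}

# For a matrix of this form, the eigenvalues are 1 + (p - 1)*s (once)
# and 1 - s (p - 1 times), so it is positive definite for 0 <= s < 1.
demo_ev <- eigen(demo_equicor(5,0.1),symmetric = TRUE)$values
print(demo_ev)
```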

## Fit model

These are the initial estimates of residual variance (`se`), mixture
weights (`w0`), and posterior mean estimates of the regression
coefficients (`b`).

```{r init-model}
k  <- length(s0)
se <- 1
w0 <- rep(1/k,k)
b  <- rep(0,p)
```

Fit the model by running 200 EM updates for the mixture weights.

```{r fit-model-em}
fit1 <- mr_ash(X,y,se,s0,w0,b,maxiter = 200,verbose = FALSE)
```
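
For reference, the textbook EM update for mixture weights can be
sketched as follows. This is a generic illustration on a made-up
likelihood matrix, not the code in `mr_ash.R`; the names
`demo_em_weights`, `demo_loglik`, `demo_x` and `demo_L` are
hypothetical.

```{r demo-em-weights}
# Toy n x k likelihood matrix: entry [i,j] is the likelihood of
# observation i under mixture component j (made-up values).
demo_x <- seq(-3,3,length.out = 50)
demo_L <- cbind(dnorm(demo_x,sd = 0.5),
                dnorm(demo_x,sd = 1),
                dnorm(demo_x,sd = 3))

# Log-likelihood of the mixture weights w.
demo_loglik <- function (L, w)
  sum(log(drop(L %*% w)))

# Repeat the EM update: compute the posterior assignment
# probabilities (E step), then average them over the observations
# (M step).
demo_em_weights <- function (L, w, numiter = 200) {
  for (iter in seq_len(numiter)) {
    P <- sweep(L,2,w,"*")
    P <- P / rowSums(P)
    w <- colMeans(P)
  }
  return(w)
}

demo_w <- demo_em_weights(demo_L,rep(1/3,3))
print(demo_w)
print(demo_loglik(demo_L,demo_w))
```

Each EM update cannot decrease the log-likelihood, but progress can be
slow, which is one motivation for trying a different solver.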

Fit the model a second time using the mix-SQP updates for the mixture
weights. The "EM", "mix" and "alpha" columns of the printed output
give, for each iteration, the number of coordinate ascent ("inner
loop") updates run, the number of mix-SQP iterations performed, and
the step size for the mix-SQP update (as determined by backtracking
line search).

```{r fit-model-mixsqp}
fit2 <- mr_ash_with_mixsqp(X,y,se,s0,w0,b,numiter = 10)
```
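
mix-SQP addresses the same kind of weight-estimation problem with a
sequential quadratic programming method instead of EM. Below is a
minimal sketch that calls `mixsqp()` from the mixsqp package directly
on a made-up likelihood matrix. How `mr_ash_with_mixsqp` invokes the
solver internally is not shown here, so this only illustrates the
general pattern; `demo_x`, `demo_L` and `demo_fit` are hypothetical
names.

```{r demo-mixsqp}
library(mixsqp)

# Toy n x k likelihood matrix: entry [i,j] is the likelihood of
# observation i under mixture component j (made-up values).
demo_x <- seq(-3,3,length.out = 50)
demo_L <- cbind(dnorm(demo_x,sd = 0.5),
                dnorm(demo_x,sd = 1),
                dnorm(demo_x,sd = 3))

# Estimate the mixture weights; the solution is stored in element "x".
demo_fit <- mixsqp(demo_L,control = list(verbose = FALSE))
print(demo_fit$x)
```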

## Review model fit

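Plot the improvement in the solution over iterations. The mix-SQP
results are plotted against the cumulative number of inner-loop
updates so that the two methods share a comparable horizontal axis.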

```{r plot-progress, fig.height=4, fig.width=5}
elbo.best <- max(c(fit1$elbo,fit2$elbo))
pdat      <- rbind(data.frame(update = "em",
                              iter   = seq_along(fit1$elbo),
                              elbo   = fit1$elbo),
                   data.frame(update = "mixsqp",
                              iter   = cumsum(fit2$numem),
                              elbo   = fit2$elbo))
pdat$elbo <- elbo.best - pdat$elbo + 1e-4
ggplot(pdat,aes(x = iter,y = elbo,color = update)) +
  geom_line() +
  geom_point() +
  scale_y_log10() +
  scale_color_manual(values = c("royalblue","darkorange")) +
  labs(y = "distance to \"best\" elbo") +
  theme_cowplot()
```
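
The quantity on the vertical axis can be sketched in isolation.
Because a log scale cannot display zero, the gap to the best ELBO is
shifted by a small constant (1e-4 in the chunk above) so that the best
iterate remains visible. The function `demo_elbo_gap` and the toy ELBO
values are hypothetical.

```{r demo-elbo-gap}
# Distance to the best ELBO, shifted by a small offset so that the
# smallest value is positive and can be drawn on a log scale.
demo_elbo_gap <- function (elbo, best = max(elbo), offset = 1e-4)
  best - elbo + offset

# Made-up ELBO trajectory that increases toward its maximum.
demo_elbo <- c(-120,-105,-101,-100.2,-100)
demo_gap  <- demo_elbo_gap(demo_elbo)
print(demo_gap)
```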

The algorithm with the mix-SQP updates of the mixture weights provides
a much better fit to the data, as measured by the ELBO.

Compare the posterior mean estimates of the regression coefficients
against the values used to simulate the data.

```{r plot-coefs-1, fig.height=3, fig.width=6}
p1 <- ggplot(data.frame(true = beta,em = fit1$b),
             aes(x = true,y = em)) +
  geom_point(color = "darkblue") +
  geom_abline(intercept = 0,slope = 1,col = "magenta",lty = "dotted") +
  xlim(-4,4) +
  ylim(-4,4) +
  theme_cowplot()
p2 <- ggplot(data.frame(true = beta,mixsqp = fit2$b),
             aes(x = true,y = mixsqp)) +
  geom_point(color = "darkblue") +
  geom_abline(intercept = 0,slope = 1,col = "magenta",lty = "dotted") +
  xlim(-4,4) +
  ylim(-4,4) +
  theme_cowplot()
plot_grid(p1,p2)
```

Compare the two sets of posterior mean estimates against each other.

```{r plot-coefs-2, fig.height=3, fig.width=6}
ggplot(data.frame(em = fit1$b,mixsqp = fit2$b),
       aes(x = em,y = mixsqp)) +
  geom_point(color = "darkblue") +
  geom_abline(intercept = 0,slope = 1,col = "magenta",lty = "dotted") +
  xlim(-2.25,0.75) +
  ylim(-2.25,0.75) +
  theme_cowplot()
```
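
The scatterplots can be complemented by a single-number summary such
as the root mean squared error. Below is a minimal sketch;
`demo_rmse` is a hypothetical helper, shown on made-up numbers rather
than on the fitted coefficients.

```{r demo-rmse}
# Root mean squared error between estimates and reference values.
demo_rmse <- function (est, truth)
  sqrt(mean((est - truth)^2))

# Made-up example: two of the three estimates are off by 0.5.
demo_err <- demo_rmse(c(0,1.5,-2),c(0,1,-2.5))
print(demo_err)
```

Applied to this analysis, `demo_rmse(fit1$b,beta)` and
`demo_rmse(fit2$b,beta)` would summarize the accuracy of the two fits.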