analysis/mixem.Rmd

---
title: "Accelerating co-ordinate ascent updates for linear regression using DAAREM"
date: June 4, 2019
site: workflowr::wflow_site
output: workflowr::wflow_html
---

Here we illustrate the use of [DAAREM][daarem-paper] to accelerate a
very simple EM algorithm---the E and M steps are implemented in three
lines of R code---for computing maximum-likelihood estimates of
mixture proportions in a mixture model.

```{r knitr, echo=FALSE}
knitr::opts_chunk$set(comment = "#",results = "hold",collapse = TRUE,
                      fig.align = "center")
```

## Set up environment

Load some packages and function definitions used in the example below.

```{r load-pkgs, warning=FALSE, message=FALSE}
library(ggplot2)
library(cowplot)
library(daarem)
source("../code/misc.R")
source("../code/mixem.R")
```

## Load data set

Load the 100,000 x 100 conditional likelihood matrix computed from a
simulated data set.

```{r load-data}
load("../data/mixdata.RData")
n <- nrow(L)
m <- ncol(L)
cat(sprintf("Loaded %d x %d data matrix.\n",n,m))
```

Set the initial estimate of the mixture proportions.

```{r}
x0 <- rep(1/m,m)
```

## Run basic EM updates

Compute maximum-likelihood estimates of the mixture proportions by
running 200 iterations of the standard EM updates. Note that the E and
M steps are very simple, and easy to implement in R; in particular, in
function `mixem.update`, the E step is implemented in 2 lines of R
code, and the M step requires only one more line of code.

```{r fit-em}
out <- system.time(fit1 <- mixem(L,x0,numiter = 200))
f1  <- mixobjective(L,fit1$x)
cat(sprintf("Computation took %0.2f seconds.\n",out["elapsed"]))
cat(sprintf("Log-likelihood at EM estimate is %0.12f.\n",f1))
```

## Run accelerated EM

Re-run the EM updates, this time using DAAREM to accelerate
convergence toward the solution.

```{r fit-daarem}
out <- system.time(fit2 <- mixdaarem(L,x0,numiter = 200,order = 4))
f2  <- mixobjective(L,fit2$x)
cat(sprintf("Computation took %0.2f seconds.\n",out["elapsed"]))
cat(sprintf("Objective value at DAAREM estimate is %0.12f.\n",f2))
```

Observe that the this second estimate has a much higher likelihood.

## Plot improvement in solution over time

This plot shows the improvement in the solution over time for the two
co-ordinate ascent algorithms: the vertical axis ("distance to best
solution") shows the difference between the largest log-likelihood
obtained, and the log-likelihood at the "gold-standard" solution. The
gold-standard solution was computed using [mixsqp][mixsqp].

```{r plot-iter-vs-objective, fig.height=4, fig.width=6}
f    <- mixobjective(L,x)
pdat <-
  rbind(data.frame(iter = 1:200,dist = f - fit1$value,method = "EM"),
        data.frame(iter = 1:200,dist = f - fit2$value,method = "DAAREM"))
p <- ggplot(pdat,aes(x = iter,y = dist,col = method)) +
  geom_line(size = 1) +
  scale_y_continuous(trans = "log10",breaks = 10^seq(-4,4)) +
  scale_color_manual(values = c("darkorange","dodgerblue")) +
  labs(x = "iteration",y = "distance from solution")
print(p)
```

From this plot, we see that the accelerated EM methods gets much
closer to the solution, although it seems to "plateau" after about 100
iterations. Nonetheless, it is much improved over the basic EM
algorithm.

[daarem-paper]: https://doi.org/10.1080/10618600.2019.1594835
[mixsqp]: https://github.com/stephenslab/mixsqp