---
title: "Accelerating EM updates for a mixture model using DAAREM"
date: June 4, 2019
site: workflowr::wflow_site
output: workflowr::wflow_html
---
A small script illustrating the use of DAAREM to accelerate the EM
updates for computing maximum-likelihood estimates of the mixture
proportions in a mixture model.
```{r knitr, echo=FALSE}
knitr::opts_chunk$set(comment = "#",results = "hold",collapse = TRUE,
                      fig.align = "center")
```
## Set up environment
Load some packages and function definitions used in the example below.
```{r load-pkgs, warning=FALSE, message=FALSE}
library(ggplot2)
library(cowplot)
library(daarem)
source("../code/misc.R")
source("../code/mixem.R")
```
## Load data set
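Load the $n \times m$ data matrix `L` used to fit the mixture model.
The number of columns, $m$, is the number of mixture components.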
```{r load-data}
load("../data/mixdata.RData")
n <- nrow(L)
m <- ncol(L)
cat(sprintf("Loaded %d x %d data matrix.\n",n,m))
```
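The source does not say what `L` contains. One common setup for this kind of model, assumed here purely for illustration, is a likelihood matrix in which entry (i, k) is the likelihood of data point i under mixture component k. The sketch below builds such a matrix from simulated data; `L_toy`, `z_toy` and the component standard deviations `s_toy` are hypothetical and are not taken from `mixdata.RData`.

```r
# Illustrative only: simulate data from a 3-component mixture of
# zero-mean normals and compute a matrix of component likelihoods.
set.seed(1)
n_toy <- 100
s_toy <- c(0.5,1,2)           # assumed component standard deviations
w_toy <- c(0.5,0.3,0.2)       # assumed true mixture proportions
k_toy <- sample(3,n_toy,replace = TRUE,prob = w_toy)
z_toy <- rnorm(n_toy,sd = s_toy[k_toy])

# L_toy[i,k] is the density of z_toy[i] under component k.
L_toy <- outer(z_toy,s_toy,function (z, s) dnorm(z,sd = s))
cat(sprintf("Built %d x %d toy likelihood matrix.\n",
            nrow(L_toy),ncol(L_toy)))
```

With a matrix of this form, the mixture proportions are the only unknowns, which is the estimation problem addressed in the sections below.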
Set the initial estimate of the mixture proportions.
```{r}
x0 <- rep(1/m,m)
```
## Run basic EM updates
Compute maximum-likelihood estimates of the mixture proportions by
running 200 iterations of the EM updates.
```{r fit-em}
cat("Fitting mixture model with basic EM method.\n")
out <- system.time(fit1 <- mixem(L,x0,numiter = 200))
f1 <- mixobjective(L,fit1$x)
cat(sprintf("Computation took %0.2f seconds.\n",out["elapsed"]))
cat(sprintf("Log-likelihood at EM estimate is %0.12f.\n",f1))
```
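The implementation of `mixem` lives in `../code/mixem.R` and is not shown here. As a rough guide to what one EM update for mixture proportions looks like, the following minimal sketch assumes `L` is a matrix of component likelihoods with strictly positive row sums. The names `em_update_sketch`, `loglik_sketch` and `L_toy` are hypothetical and may differ from what `mixem` actually does.

```r
# One EM update: compute the posterior assignment probabilities given
# the current proportions x (E step), then average them over the data
# points to get the new proportions (M step).
em_update_sketch <- function (L, x) {
  P <- t(t(L) * x)       # Multiply column k of L by x[k].
  P <- P / rowSums(P)    # Normalize each row to sum to 1.
  return(colMeans(P))
}

# Log-likelihood of the mixture proportions x.
loglik_sketch <- function (L, x)
  sum(log(drop(L %*% x)))

# Toy example with a random positive matrix (not the real data).
set.seed(1)
L_toy <- matrix(runif(50 * 3),50,3)
x_toy <- rep(1/3,3)
ll    <- rep(0,20)
for (i in 1:20) {
  x_toy <- em_update_sketch(L_toy,x_toy)
  ll[i] <- loglik_sketch(L_toy,x_toy)
}
print(x_toy)
```

Each update keeps the proportions non-negative and summing to one, and the log-likelihood never decreases from one iteration to the next. That monotonicity is what makes plain EM reliable, if sometimes slow.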
## Run accelerated EM
Re-run the EM updates, this time using DAAREM to accelerate
convergence toward the solution.
```{r fit-daarem}
out <- system.time(fit2 <- mixdaarem(L,x0,numiter = 200))
f2 <- mixobjective(L,fit2$x)
cat(sprintf("Computation took %0.2f seconds.\n",out["elapsed"]))
cat(sprintf("Log-likelihood at DAAREM estimate is %0.12f.\n",f2))
```
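How `mixdaarem` wraps the accelerator is defined in `../code/mixem.R` and is not shown here. As a hedged illustration, the sketch below passes a fixed-point map and an objective directly to `daarem()` from the daarem package, which treats the objective as a quantity to be maximized. The functions `em_update_toy` and `loglik_toy` and the matrix `L_toy` are hypothetical stand-ins, and returning `-Inf` for infeasible points is an assumption made here so that extrapolated iterates with negative proportions are rejected.

```r
library(daarem)

# Fixed-point map: one EM update of the mixture proportions.
em_update_toy <- function (x, L) {
  P <- t(t(L) * x)
  return(colMeans(P / rowSums(P)))
}

# Objective to be maximized: the log-likelihood. Points with negative
# proportions are assigned -Inf so that they are never accepted.
loglik_toy <- function (x, L) {
  if (any(x < 0))
    return(-Inf)
  return(sum(log(drop(L %*% x))))
}

# Toy example with a random positive matrix (not the real data).
set.seed(1)
L_toy   <- matrix(runif(50 * 3),50,3)
x0_toy  <- rep(1/3,3)
fit_toy <- daarem(par = x0_toy,fixptfn = em_update_toy,
                  objfn = loglik_toy,L = L_toy,
                  control = list(maxiter = 200))
print(fit_toy$par)
print(fit_toy$value.objfn)
```

Extra arguments such as `L` are forwarded to both functions, so the same update used for plain EM can be reused unchanged as the fixed-point map.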
## Plot improvement in solution over time
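Plot the difference between the objective value at the solution and
the objective value at each iteration, for both methods. The vertical
axis is on a log scale, so a straight line corresponds to a geometric
rate of convergence.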
```{r plot-iter-vs-objective, fig.height=4, fig.width=6}
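# Take the best objective value attained by either method as the
# value at the solution; this assumes larger values are better.
f <- max(c(fit1$value,fit2$value))
pdat <-
  rbind(data.frame(iter = 1:200,dist = f - fit1$value,method = "EM"),
        data.frame(iter = 1:200,dist = f - fit2$value,method = "DAAREM"))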
p <- ggplot(pdat,aes(x = iter,y = dist,col = method)) +
  geom_line(size = 1) +
  scale_y_continuous(trans = "log10",breaks = 10^seq(-4,4)) +
  scale_color_manual(values = c("darkorange","dodgerblue")) +
  labs(x = "iteration",y = "distance from solution")
print(p)
```