# Comparing performance and accuracy of EM, IP and mix-SQP algorithms

In this example, we compare the runtime and accuracy of three algorithms: the EM algorithm, the mix-SQP algorithm, and the interior-point (IP) method implemented by the commercial solver MOSEK (called via the `KWDual` function in the R package `REBayes`).

## Analysis setup

*Before attempting to run this Julia code, make sure your computer is properly set up by following the "Quick Start" instructions in the README of the [git repository](https://github.com/stephenslab/mixsqp-paper).*

We begin by loading the Distributions, LowRankApprox and RCall Julia packages, as well as some function definitions used in the code chunks below.

In [1]:
using Distributions
using LowRankApprox
using RCall
include("../code/datasim.jl");
include("../code/likelihood.jl");
include("../code/mixEM.jl");
include("../code/mixSQP.jl");
include("../code/REBayes.jl");

Next, initialize the sequence of pseudorandom numbers.

In [2]:
srand(1);

## Generate a small data set

Let's begin with a smaller example with 50,000 samples.

In [3]:
z = normtmixdatasim(round(Int,5e4));

## Compute the likelihood matrix

Compute the $n \times k$ likelihood matrix for a mixture of zero-centered normals, with $k = 20$. Note that the rows of the likelihood matrix are normalized by default.

In [4]:
sd = autoselectmixsd(z,nv = 20);
L  = normlikmatrix(z,sd = sd);
size(L)

(50000, 20)
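To make the structure of this matrix concrete, here is a minimal sketch of how such a matrix could be built using only base Julia. It is not the repository's `normlikmatrix`: the names `normpdf` and `likmatrix_sketch` are hypothetical, the data and grid are toy values, and dividing each row by its largest entry is an assumption about what "normalized" means here. Scaling row $i$ by a positive constant $c_i$ only shifts the log-likelihood by $\sum_i \log c_i$, so it does not change which mixture weights are optimal.

```julia
# Density of the normal distribution N(0, s^2) evaluated at z.
normpdf(z, s) = exp(-z^2 / (2 * s^2)) / (s * sqrt(2 * pi))

# Build the n x k matrix with entries L[i,j] = N(z[i]; 0, sd[j]^2), then
# (assumption) scale each row so that its largest entry is 1.
function likmatrix_sketch(z, sd)
  n, k = length(z), length(sd)
  L = zeros(n, k)
  for j in 1:k, i in 1:n
    L[i,j] = normpdf(z[i], sd[j])
  end
  for i in 1:n
    L[i,:] = L[i,:] / maximum(L[i,:])
  end
  return L
end

z  = [-2.0, -0.5, 0.0, 1.0, 3.0]   # toy data, not from the notebook
sd = [0.1, 1.0, 10.0]              # toy grid of standard deviations
L_toy = likmatrix_sketch(z, sd)
println(size(L_toy))
```

The result is stored in `L_toy` so that it does not overwrite the matrix `L` used in the rest of the notebook.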

## Fit mixture model 

First we run each of the optimization algorithms once to precompile the relevant functions.

In [14]:
outem  = mixEM(L,maxiter = 100);
outip  = REBayes(L);
outsqp = mixSQP(L,verbose = false);
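The reason for this warm-up run is that Julia compiles a method the first time it is called with a given combination of argument types, so the first call's timing includes compilation. The toy sketch below illustrates the effect with a hypothetical function `sumsq`, which is not part of the mix-SQP code; the exact timings will vary from machine to machine.

```julia
# Hypothetical stand-in for an expensive solver call.
sumsq(v) = sum(abs2, v)

v  = collect(1.0:1000.0)
t1 = @elapsed sumsq(v)   # first call: includes just-in-time compilation
t2 = @elapsed sumsq(v)   # second call: runs the already-compiled code
println("first call:  ", t1, " seconds")
println("second call: ", t2, " seconds")
```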

Next, let's fit the model using the three algorithms. 

In [6]:
@time xem, tem = mixEM(L,tol = 1e-4,maxiter = 1000);
@time xip, tip = REBayes(L);
@time outsqp   = mixSQP(L,verbose = false);

  3.479721 seconds (25.40 k allocations: 7.243 GiB, 21.60% gc time)
  1.314076 seconds (579 allocations: 30.897 KiB)
  0.276557 seconds (38.24 k allocations: 404.534 MiB, 15.43% gc time)


The mix-SQP algorithm is much faster than the other two methods (about 0.28 seconds, versus 1.31 seconds for IP and 3.48 seconds for EM), and EM is the slowest.
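For context on the EM timing, the standard EM update for the weights of a mixture with fixed components can be sketched in a few lines. This is an illustration only, not the repository's `mixEM`: the names `em_update` and `em_sketch` are hypothetical, the toy matrix is made up, and the sketch runs a fixed number of updates rather than checking a convergence tolerance. Each update costs two matrix-vector products with the $n \times k$ matrix, and EM often needs many such updates.

```julia
# One EM update for the mixture weights x, given the n x k likelihood matrix L.
# The posterior "responsibilities" are P[i,j] = L[i,j]*x[j] / (L*x)[i], and the
# new weight x[j] is the average of column j of P.
function em_update(L, x)
  n = size(L, 1)
  w = L * x                        # mixture likelihood of each sample
  return x .* (L' * (1 ./ w)) / n
end

# Run a fixed number of EM updates, starting from uniform weights.
function em_sketch(L; maxiter = 100)
  k = size(L, 2)
  x = fill(1/k, k)
  for iter in 1:maxiter
    x = em_update(L, x)
  end
  return x
end

L_toy = [1.0 0.5; 0.2 1.0; 0.9 0.9; 0.1 1.0]   # toy matrix, not from the notebook
x_toy = em_sketch(L_toy, maxiter = 50)
println(x_toy)
```

The updated weights stay nonnegative and sum to one at every iteration, so no projection step is needed.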

The IP and SQP solutions are also very similar in quality, whereas the EM solution is much worse:

In [7]:
fem  = mixobjective(L,xem);
fip  = mixobjective(L,xip);
fsqp = mixobjective(L,outsqp["x"]);
fbest = minimum([fem fip fsqp]);
@printf "Difference between EM and best solutions:  %0.2e\n" fem - fbest
@printf "Difference between IP and best solutions:  %0.2e\n" fip - fbest
@printf "Difference between SQP and best solutions: %0.2e\n" fsqp - fbest

Difference between EM and best solutions:  7.97e+00
Difference between IP and best solutions:  0.00e+00
Difference between SQP and best solutions: 2.22e-06
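The comparison above can be sketched on a toy problem. This sketch assumes the objective is the negative log-likelihood $-\sum_{i=1}^n \log \sum_{j=1}^k L_{ij} x_j$; the repository's `mixobjective` may differ in scaling or numerical safeguards. The name `nll_sketch` and the toy inputs are hypothetical.

```julia
# Negative log-likelihood of the mixture weights x (assumed form of the objective).
nll_sketch(L, x) = -sum(log.(L * x))

L_toy = [1.0 0.5; 0.2 1.0; 0.9 0.9]   # toy likelihood matrix, not from the notebook
xa = [0.5, 0.5]                       # two candidate weight vectors
xb = [0.2, 0.8]
fa = nll_sketch(L_toy, xa)
fb = nll_sketch(L_toy, xb)
fbest = min(fa, fb)                   # smaller is better
println("Difference between xa and best: ", fa - fbest)
println("Difference between xb and best: ", fb - fbest)
```

Because the objective is minimized, a difference of zero marks the best of the candidates, and larger differences indicate worse solutions.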


## Comparison using a larger data set

Next, let's see what happens when we apply these three algorithms to a larger data set.

In [8]:
z = normtmixdatasim(round(Int,1e5));

As before, we compute the $n \times k$ likelihood matrix for a mixture of zero-centered normals. This time, we use a finer grid of $k = 100$ normal densities.

In [9]:
sd = autoselectmixsd(z,nv = 100);
L  = normlikmatrix(z,sd = sd);
size(L)

(100000, 100)

Now we fit the model using the three approaches. 

In [10]:
@time xem, tem = mixEM(L,tol = 1e-4,maxiter = 1000);
@time xip, tip = REBayes(L);
@time outsqp   = mixSQP(L,verbose = false);

 17.030483 seconds (12.71 k allocations: 11.820 GiB, 53.78% gc time)
 19.056084 seconds (315 allocations: 16.172 KiB)
  1.019957 seconds (134.62 k allocations: 872.181 MiB, 12.84% gc time)


In this example, the mix-SQP algorithm reaches a solution much faster than both the EM and IP approaches (about 1 second, versus 17 seconds for EM and 19 seconds for IP).

As before, the IP and SQP solutions are similar in quality, whereas the EM solution is much worse.

In [11]:
fem  = mixobjective(L,xem);
fip  = mixobjective(L,xip);
fsqp = mixobjective(L,outsqp["x"]);
fbest = minimum([fem fip fsqp]);
@printf "Difference between EM and best solutions:  %0.2e\n" fem - fbest
@printf "Difference between IP and best solutions:  %0.2e\n" fip - fbest
@printf "Difference between SQP and best solutions: %0.2e\n" fsqp - fbest

Difference between EM and best solutions:  1.23e+02
Difference between IP and best solutions:  0.00e+00
Difference between SQP and best solutions: 2.00e-02


## Session information

This section records the computing environment used to generate the results in this notebook, including the versions of Julia, R and the packages used.

In [12]:
Pkg.status("Distributions")
Pkg.status("LowRankApprox")
Pkg.status("RCall")
versioninfo()

 - Distributions                 0.15.0
 - LowRankApprox                 0.1.1
 - RCall                         0.10.2
Julia Version 0.6.2
Commit d386e40c17 (2017-12-13 18:08 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin14.5.0)
  CPU: Intel(R) Core(TM) i7-7567U CPU @ 3.50GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Prescott)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, broadwell)


Since we called the `KWDual` function in R, it is also useful to record information about R.

In [13]:
R"sessionInfo()"

RCall.RObject{RCall.VecSxp}
R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.4

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] REBayes_1.3   Matrix_1.2-12

loaded via a namespace (and not attached):
[1] compiler_3.4.3  Rmosek_8.0.69   grid_3.4.3      lattice_0.20-35
