# Illustration of the mix-SQP solver applied to a small data set and a large one

## Analysis setup

We begin by loading the Distributions and LowRankApprox Julia packages, as well as some function definitions used in the code chunks below.

In [1]:
using Distributions
using LowRankApprox
include("../code/julia/datasim.jl");
include("../code/julia/likelihood.jl");
include("../code/julia/ash.jl");
include("../code/julia/mixSQP.jl");

Next, initialize the pseudorandom number generator so that the results are reproducible.

In [2]:
srand(1);
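
`srand` belongs to the Julia version used in this notebook (0.6.2, per the session information below). As a side note for readers on Julia 0.7 or later, where `srand` was replaced, the equivalent seeding call lives in the `Random` standard library. A minimal sketch, which will not run on Julia 0.6:

```julia
# Julia 0.7 and later only: srand(1) was replaced by Random.seed!(1).
using Random

Random.seed!(1)
a = rand(3)       # three pseudorandom draws after seeding

Random.seed!(1)   # reseeding restarts the same sequence
b = rand(3)
```

Because both sets of draws follow the same seed, `a` and `b` should be identical.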

## Generate a small data set

Let's start with a small example of 50,000 samples.

In [3]:
x = normtmixdatasim(round(Int,5e4));

## Compute the likelihood matrix

Compute the $n \times k$ likelihood matrix for a mixture of zero-centered normals, with $k = 20$. Note that the rows of the likelihood matrix are normalized by default.

In [4]:
sd = autoselectmixsd(x, gridmult = 1.425);
L  = normlikmatrix(x,sd = sd);
size(L)

(50000, 20)
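
For intuition, the likelihood matrix can be sketched directly. The function below is a hypothetical stand-in, not the `normlikmatrix` defined in `likelihood.jl`: it assumes that entry $(i,j)$ is the density of a zero-centered normal with standard deviation `sd[j]` evaluated at `x[i]`, and that "normalized" means each row is divided by its largest entry. Both assumptions may differ from the actual implementation.

```julia
# Hypothetical sketch of a row-normalized likelihood matrix for a mixture
# of zero-centered normals; not the normlikmatrix from likelihood.jl.
function normlik_sketch(x, sd)
  n = length(x)
  k = length(sd)
  L = zeros(n, k)
  for j in 1:k
    for i in 1:n
      # Density of N(0, sd[j]^2) at x[i].
      L[i,j] = exp(-x[i]^2 / (2 * sd[j]^2)) / (sqrt(2 * pi) * sd[j])
    end
  end
  # Assumed normalization: scale each row so its largest entry is 1.
  for i in 1:n
    m = maximum(L[i,:])
    for j in 1:k
      L[i,j] /= m
    end
  end
  return L
end

Ldemo = normlik_sketch([-1.0, 0.0, 2.5], [0.5, 1.0, 2.0])
size(Ldemo)
```

Scaling a row by a positive constant changes the log-likelihood only by an additive constant, so this kind of normalization does not change which mixture weights are optimal.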

## Fit mixture model using SQP algorithm 

First we run the mix-SQP algorithm once to precompile the function.

In [5]:
out = mixSQP(L, x = ones(size(L,2))/size(L,2), lowrank = "svd", verbose = false);

Observe that only a small number of iterations is needed to converge to the solution of the constrained optimization problem.

In [6]:
out = mixSQP(L, x = ones(size(L,2))/size(L,2), convtol = 1e-8,
             pqrtol = 1e-10, eps = 1e-8, sptol = 1e-3,
             maxiter = 100, maxqpiter = 100,
             lowrank = "svd", seed = 1, verbose = true);

Running SQP algorithm with the following settings:
- 50000 x 20 data matrix
- convergence tolerance = 1.00e-08
- zero threshold        = 1.00e-03
- partial SVD tolerance  = 1.00e-10
- partial SVD max. error = 4.49e-09
iter       objective -min(g+1) #nnz #qp
   1 2.90798847e+04 +5.81e-01   20   0
   2 2.01513705e+04 +5.61e+04    2  32
   3 1.25759810e+04 +1.96e+04    3  55
   4 1.11441786e+04 +8.60e+03    3   9
   5 1.08869695e+04 +4.04e+03    3  24
   6 1.07736291e+04 +2.01e+03    3  12
   7 1.06041886e+04 +9.91e+02    2  13
   8 1.04170898e+04 +5.06e+02    3  21
   9 1.03458093e+04 +2.56e+02    3  10
  10 1.02580490e+04 +1.28e+02    3  13
  11 1.00593430e+04 +6.35e+01    3  21
  12 9.97666976e+03 +3.23e+01    3   7
  13 9.93357372e+03 +1.60e+01    3   9
  14 9.86711011e+03 +7.82e+00    3   7
  15 9.79793040e+03 +3.79e+00    4  20
  16 9.76514031e+03 +1.81e+00    4   4
  17 9.74910313e+03 +8.03e-01    4   6
  18 9.73503631e+03 +3.31e-01    5  17
  19 9.72756507e+03 +1.23e-01    5   9
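
The `objective` column can be related to the mixture weights with a short sketch. It assumes the objective is the negative log-likelihood $-\sum_{i=1}^n \log \big( \sum_{j=1}^k L_{ij} x_j \big)$, summed rather than averaged over the samples, with a small constant added inside the logarithm for numerical stability. The function name `mixobjective_sketch` and the role given to the offset `e` are illustrative assumptions, not taken from `mixSQP.jl`.

```julia
# Hypothetical sketch: negative log-likelihood of mixture weights x given
# likelihood matrix L. The offset e guards against log(0).
function mixobjective_sketch(L, x; e = 1e-8)
  return -sum(log.(L * x .+ e))
end

# Tiny example with two samples and two mixture components.
Ltoy = [1.0 0.5;
        0.2 1.0]
xtoy = ones(size(Ltoy, 2)) / size(Ltoy, 2)   # uniform weights, as in the mixSQP calls
ftoy = mixobjective_sketch(Ltoy, xtoy, e = 0.0)
```

With uniform weights, the two mixture likelihoods are 0.75 and 0.6, so `ftoy` equals $-\log(0.75) - \log(0.6)$, or about 0.80.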
 

## Generate a larger data set

Next, let's see what happens when we use the SQP algorithm to fit a mixture model to a much larger data set.

In [7]:
x = normtmixdatasim(round(Int,1e6));

## Compute the likelihood matrix

As before, we compute the $n \times k$ likelihood matrix for a mixture of zero-centered normals. This time, we use a finer grid of $k = 99$ normal densities.

In [8]:
sd = autoselectmixsd(x, gridmult = 1.0705);
L  = normlikmatrix(x,sd = sd);
size(L)

(1000000, 99)
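
The role of `gridmult` can be illustrated with a geometric grid of standard deviations, in which consecutive grid points differ by a constant factor. The sketch below is an assumption about the general idea only: the function `geometricgrid_sketch` and its arguments `smin` and `smax` are hypothetical, and `autoselectmixsd` in `ash.jl` may choose the endpoints and the number of points differently.

```julia
# Hypothetical sketch: geometric grid that ends at smax and extends down
# to at most smin, with ratio mult between consecutive points.
function geometricgrid_sketch(smin, smax, mult)
  npoint = ceil(Int, log(smax / smin) / log(mult))
  return [smax / mult^j for j in npoint:-1:0]
end

coarse = geometricgrid_sketch(1.0, 10.0, 2.0)
fine   = geometricgrid_sketch(1.0, 10.0, 1.5)
(length(coarse), length(fine))
```

A multiplier closer to 1 gives more grid points over the same range, which is consistent with the smaller `gridmult` used here producing more mixture components than in the first example.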

## Fit mixture model using SQP algorithm 

Even on this much larger data set, only a small number of iterations is needed to compute the solution.

In [9]:
out = mixSQP(L, x = ones(size(L,2))/size(L,2), convtol = 1e-8,
             pqrtol = 1e-10, eps = 1e-8, sptol = 1e-3,
             maxiter = 100, maxqpiter = 100,
             lowrank = "qr", seed = 1, verbose = true);

Running SQP algorithm with the following settings:
- 1000000 x 99 data matrix
- convergence tolerance = 1.00e-08
- zero threshold        = 1.00e-03
- partial QR tolerance  = 1.00e-10
- partial QR max. error = 3.18e-08
iter       objective -min(g+1) #nnz #qp
   1 6.97208422e+05 +7.94e-01   99   0
   2 4.83513864e+05 +3.53e+04    2 100
   3 2.74936210e+05 +1.24e+04    3  97
   4 2.19452875e+05 +5.48e+03    3 100
   5 2.12324883e+05 +2.60e+03    3 100
   6 2.09485941e+05 +1.29e+03    3  69
   7 2.06672331e+05 +6.42e+02    2  76
   8 2.04404957e+05 +3.22e+02    3  54
   9 2.02563522e+05 +1.62e+02    3  37
  10 2.00904496e+05 +8.18e+01    3  43
  11 1.99290408e+05 +4.08e+01    2  43
  12 1.98204985e+05 +2.03e+01    3  48
  13 1.97444173e+05 +1.00e+01    3  27
  14 1.96866649e+05 +4.93e+00    4  39
  15 1.96504593e+05 +2.38e+00    4  45
  16 1.96244381e+05 +1.12e+00    5  48
  17 1.96041792e+05 +4.88e-01    4  35
  18 1.95903171e+05 +1.86e-01    5  58
  19 1.95828682e+05 +5.19e-02    5  54
 

## Session information

This section gives information about the computing environment used to generate the results in this notebook, including the version of Julia and the versions of the Julia packages used.

In [10]:
Pkg.status("Distributions");
Pkg.status("LowRankApprox");

 - Distributions                 0.15.0
 - LowRankApprox                 0.1.0


In [11]:
versioninfo()

Julia Version 0.6.2
Commit d386e40c17 (2017-12-13 18:08 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin14.5.0)
  CPU: Intel(R) Core(TM) i7-7567U CPU @ 3.50GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Prescott)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, broadwell)
