# Gaussian mixture  
A Gaussian mixture model (GMM) represents a composite distribution in which each point is drawn from one of k Gaussian sub-distributions, each with its own mixing probability. The MLlib implementation uses the expectation-maximization (EM) algorithm to estimate the maximum-likelihood model from a set of samples. The implementation has the following parameters:
* k is the number of desired clusters.
* convergenceTol is the convergence threshold: EM stops once the change in log-likelihood between iterations no longer exceeds this value.
* maxIterations is the maximum number of iterations to perform if convergence is not reached first.
* initialModel is an optional model from which to start the EM algorithm. If this parameter is omitted, a random starting point is constructed from the data.
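
The role of these parameters is easiest to see in the EM loop itself. The following is a minimal one-dimensional sketch in plain Scala, not MLlib's implementation: the `EmSketch` object, the `Component` class, and the variance floor of 1e-6 are assumptions made for illustration. Here k is simply the number of components in the initial model.

```scala
// Minimal 1-D EM sketch (illustrative only, not MLlib's code) showing where
// the initial model, convergenceTol, and maxIterations enter the loop.
object EmSketch {
  final case class Component(weight: Double, mu: Double, variance: Double)

  // Density of a univariate Gaussian at x.
  def normalPdf(x: Double, mu: Double, variance: Double): Double =
    math.exp(-(x - mu) * (x - mu) / (2 * variance)) / math.sqrt(2 * math.Pi * variance)

  // Log-likelihood of the data under the current mixture.
  def logLikelihood(data: Seq[Double], model: Seq[Component]): Double =
    data.map(x => math.log(model.map(c => c.weight * normalPdf(x, c.mu, c.variance)).sum)).sum

  // One EM iteration: responsibilities (E-step), then parameter updates (M-step).
  def step(data: Seq[Double], model: Seq[Component]): Seq[Component] = {
    val resp = data.map { x =>
      val p = model.map(c => c.weight * normalPdf(x, c.mu, c.variance))
      val total = p.sum
      p.map(_ / total)
    }
    model.indices.map { i =>
      val r = resp.map(_(i))
      val n = r.sum
      val mu = data.zip(r).map { case (x, ri) => ri * x }.sum / n
      val variance = data.zip(r).map { case (x, ri) => ri * (x - mu) * (x - mu) }.sum / n
      // Assumed variance floor to avoid a degenerate (zero-width) component.
      Component(n / data.size, mu, math.max(variance, 1e-6))
    }
  }

  // Iterate until the log-likelihood changes by less than convergenceTol
  // or maxIterations is reached. Returns the model and the iterations used.
  def fit(data: Seq[Double], initialModel: Seq[Component],
          convergenceTol: Double, maxIterations: Int): (Seq[Component], Int) = {
    var model = initialModel
    var previous = logLikelihood(data, model)
    var iterations = 0
    var converged = false
    while (iterations < maxIterations && !converged) {
      model = step(data, model)
      val current = logLikelihood(data, model)
      converged = math.abs(current - previous) < convergenceTol
      previous = current
      iterations += 1
    }
    (model, iterations)
  }
}
```

For example, `EmSketch.fit(data, init, 0.01, 100)` with a two-component `init` fits two clusters and stops at whichever limit is hit first. In MLlib itself, the corresponding settings are made through the setK, setConvergenceTol, setMaxIterations, and setInitialModel methods of GaussianMixture.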

## Examples
In the following example, after loading and parsing the data, we use a GaussianMixture object to cluster the data into two clusters. The number of desired clusters is passed to the algorithm through setK. We then save and reload the model and print the parameters of the mixture model.

```scala
// Base directory containing data/mllib/gmm_data.txt (adjust to your environment)
val PATH = "file:///Users/lzz/work/SparkML/"

import org.apache.spark.mllib.clustering.GaussianMixture
import org.apache.spark.mllib.clustering.GaussianMixtureModel
import org.apache.spark.mllib.linalg.Vectors

// Load and parse the data (sc is the SparkContext provided by the shell or notebook)
val data = sc.textFile(PATH + "data/mllib/gmm_data.txt")
val parsedData = data.map(s => Vectors.dense(s.trim.split(' ').map(_.toDouble))).cache()

// Cluster the data into two clusters using GaussianMixture
val gmm = new GaussianMixture().setK(2).run(parsedData)

// Save and load the model
gmm.save(sc, "myGMMModel")
val sameModel = GaussianMixtureModel.load(sc, "myGMMModel")

// Output the parameters of the maximum-likelihood model
for (i <- 0 until gmm.k) {
  println("weight=%f\nmu=%s\nsigma=\n%s\n" format
    (gmm.weights(i), gmm.gaussians(i).mu, gmm.gaussians(i).sigma))
}
```

```
weight=0.520539
mu=[-0.10417102078802391,0.0427872221534813]
sigma=
4.899332819692213   -2.002581397114315
-2.002581397114315  1.0098665429782792

weight=0.479461
mu=[0.07229831267945304,0.01670331409541629]
sigma=
4.787904921970295   1.8805068924401194
1.8805068924401194  0.9161624104539309
```
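
To see what the printed weights, means, and covariances imply for an individual point, the following minimal sketch hard-codes the values printed above and computes each component's posterior probability for a point. The `SoftAssignment` object, the `Gaussian2D` class, and the query point (2.0, -1.0) are illustrative assumptions, not part of MLlib; within MLlib, the model's predictSoft method serves this purpose.

```scala
// Posterior cluster membership for one point under the fitted 2-D mixture.
// Plain Scala, no Spark required; names here are hypothetical.
object SoftAssignment {
  // A 2-D Gaussian component: weight, mean (m1, m2), and the symmetric
  // covariance [[s11, s12], [s12, s22]].
  final case class Gaussian2D(weight: Double, m1: Double, m2: Double,
                              s11: Double, s12: Double, s22: Double) {
    private val det = s11 * s22 - s12 * s12

    // Bivariate normal density, using the closed-form inverse of a 2x2 matrix.
    def pdf(x1: Double, x2: Double): Double = {
      val d1 = x1 - m1
      val d2 = x2 - m2
      val mahalanobis = (s22 * d1 * d1 - 2 * s12 * d1 * d2 + s11 * d2 * d2) / det
      math.exp(-0.5 * mahalanobis) / (2 * math.Pi * math.sqrt(det))
    }
  }

  // Parameters copied from the output printed above.
  val components: Seq[Gaussian2D] = Seq(
    Gaussian2D(0.520539, -0.10417102078802391, 0.0427872221534813,
      4.899332819692213, -2.002581397114315, 1.0098665429782792),
    Gaussian2D(0.479461, 0.07229831267945304, 0.01670331409541629,
      4.787904921970295, 1.8805068924401194, 0.9161624104539309)
  )

  // Normalized (weight * density) for each component; the entries sum to 1.
  def responsibilities(x1: Double, x2: Double): Seq[Double] = {
    val weighted = components.map(c => c.weight * c.pdf(x1, x2))
    val total = weighted.sum
    weighted.map(_ / total)
  }
}
```

Calling `SoftAssignment.responsibilities(2.0, -1.0)` returns one probability per component. Because the first component has a negative covariance between the two coordinates and the second a positive one, a point with a positive first coordinate and a negative second coordinate should be attributed mostly to the first component.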

