# Gaussian Mixture Models

<img src='macro_gaus.png' width=600px>
<img src='1D_gaus.png' width=600px>

# Gaussian Distributions (Normal)
<img src='norm_dist.png' width=600px>
<img src='gaus_dist1.png' width=600px>
<img src='gaus_dist2.png' width=600px>
<img src='gaus_dist3.png' width=600px>
<img src='gaus_dist4.png' width=600px>

__________

## Gaussian Mixture Model (GMM) Clustering
Because both tests are scored on the same scale (0-100), their data points can be combined into a single dataset, which mixes the two Gaussian distributions.
<img src='gaus_mix1.png' width=600px>
* The distributions are preserved, but together they do not form one Gaussian distribution; rather, mixing lets them coexist over the same range
* Without knowing which data points came from which distribution, the distributions can be inferred, and each point is then assigned to the test whose Gaussian distribution gives it the higher probability
> 1 Mix the data
<img src='gaus_mix2.png' width=600px>
> 2 Find the data's distribution as a whole
<img src='gaus_mix3.png' width=600px>
> 3 Determine the different distributions and assign each data point to the distribution it most probably belongs to
<img src='gaus_mix4.png' width=600px>
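
The three steps above can be sketched with scikit-learn. This is a minimal illustration, not the notebook's own code: the two sets of test scores are synthetic, and their means (40 and 75) and spreads are made-up assumptions chosen so the two Gaussians are easy to tell apart.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Hypothetical test scores on a 0-100 scale (means and spreads are made up).
test_a = rng.normal(loc=40.0, scale=5.0, size=500)
test_b = rng.normal(loc=75.0, scale=6.0, size=500)

# Step 1: mix the data and forget which test each score came from.
scores = np.concatenate([test_a, test_b]).reshape(-1, 1)
true_labels = np.repeat([0, 1], 500)

# Steps 2-3: fit a two-component mixture to the pooled scores, then assign
# each score to the component under which it is most probable.
gmm = GaussianMixture(n_components=2, random_state=0).fit(scores)
predicted = gmm.predict(scores)

# Component numbering is arbitrary, so check agreement under both labelings.
agreement = max(np.mean(predicted == true_labels),
                np.mean(predicted != true_labels))
recovered_means = np.sort(gmm.means_.ravel())
print("recovered means:", recovered_means)
print("fraction assigned to the correct test:", agreement)
```

Because the component indices carry no meaning of their own, the comparison against the true labels has to allow for the two components being swapped.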

# Gaussian Distribution | Two Dimensions
<img src='gaus_2d1.png' width=900px>

* Plotting the two scores against one another yields the above visualization

> * Scatter in the middle
> * Histograms at the top and side (revealing that each score follows a Gaussian distribution)
> * Orange "+" is the mean of each
> * Circles are the standard deviations (just as before)
>> * First circle contains 68% of the data
>> * Second contains 95%
>> * Third contains 99%

**Note the two different mixes (total of 4 distributions)**
<img src='gaus_2d2.png' width=600px>
**Combining them, we can infer their original labels**
<img src='gaus_2d3.png' width=600px>
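
The same idea carries over to two dimensions, where each Gaussian has a mean vector and a covariance matrix. The sketch below is illustrative only: the two groups of score pairs are synthetic, and every mean and covariance value is an assumption made up for the example.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)

# Two hypothetical groups of (score_1, score_2) pairs; all parameters are made up.
group_a = rng.multivariate_normal(mean=[35.0, 40.0],
                                  cov=[[30.0, 10.0], [10.0, 25.0]], size=400)
group_b = rng.multivariate_normal(mean=[70.0, 75.0],
                                  cov=[[20.0, -5.0], [-5.0, 30.0]], size=400)

# Combine the groups and discard the labels.
X = np.vstack([group_a, group_b])
true_labels = np.repeat([0, 1], 400)

# covariance_type="full" lets each component learn its own 2x2 covariance,
# i.e. its own ellipse shape and orientation.
gmm = GaussianMixture(n_components=2, covariance_type="full",
                      random_state=0).fit(X)
predicted = gmm.predict(X)

# Component numbering is arbitrary, so check agreement under both labelings.
agreement = max(np.mean(predicted == true_labels),
                np.mean(predicted != true_labels))
print("estimated means:\n", gmm.means_)
print("estimated covariances:\n", gmm.covariances_)
print("fraction of labels recovered:", agreement)
```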

# Using GMM for prediction - Expectation Maximization

<img src='gaus_exp1.png' width=900px>

* To begin, it is common to run K-Means first to get a sense of what clusters there may be, and then use that to decide the number of distributions for `Step 1: Initialize Gaussian Distributions`

* `Step 1:` Initialize Gaussian Distributions by choosing an initial mean and variance for each
<img src='gaus_exp2.png' width=900px>


* `Step 2:` Soft-cluster the Data Points (Expectation Step)
<img src='gaus_exp3.png' width=900px>

* `Step 3:` Re-Estimate Parameters of Gaussians (Maximization Step)
<img src='gaus_exp4.png' width=900px>
<img src='gaus_exp4a.png' width=900px>

* `Step 4:` Evaluate log-likelihood
> the higher this value, the better the fitted mixture explains the data; Steps 2-4 are repeated until it stops improving
<img src='gaus_exp5.png' width=900px>
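
The four steps above can be sketched in plain NumPy for the one-dimensional case. This is a minimal illustration under stated assumptions, not a reference implementation: the helper names `gaussian_pdf` and `em_gmm_1d` are hypothetical, the data is synthetic, and the starting means, variances and weights are picked by hand rather than by K-Means.

```python
import numpy as np

def gaussian_pdf(x, mean, var):
    """Density of a 1D Gaussian with the given mean and variance."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def em_gmm_1d(x, means, variances, weights, n_iter=200, tol=1e-6):
    """Fit a 1D Gaussian mixture with EM, starting from the given parameters."""
    # Step 1: initial parameters are supplied by the caller.
    means = np.asarray(means, dtype=float).copy()
    variances = np.asarray(variances, dtype=float).copy()
    weights = np.asarray(weights, dtype=float).copy()
    col = x[:, None]  # shape (n, 1), broadcasts against the k components
    prev_ll = -np.inf
    for _ in range(n_iter):
        # Step 2 (E-step): soft-assign each point to each component.
        weighted = weights * gaussian_pdf(col, means, variances)  # (n, k)
        resp = weighted / weighted.sum(axis=1, keepdims=True)
        # Step 3 (M-step): re-estimate parameters from the soft assignments.
        nk = resp.sum(axis=0)
        means = (resp * col).sum(axis=0) / nk
        variances = (resp * (col - means) ** 2).sum(axis=0) / nk
        weights = nk / x.size
        # Step 4: log-likelihood of the data under the updated mixture.
        mixture_density = (weights * gaussian_pdf(col, means, variances)).sum(axis=1)
        ll = np.log(mixture_density).sum()
        if ll - prev_ll < tol:  # stop once the improvement is negligible
            break
        prev_ll = ll
    return means, variances, weights, ll

# Synthetic scores from two made-up Gaussians (assumed values, for illustration).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(40.0, 5.0, 500), rng.normal(75.0, 6.0, 500)])

means, variances, weights, ll = em_gmm_1d(
    x, means=[30.0, 80.0], variances=[100.0, 100.0], weights=[0.5, 0.5])
print("means:", means)
print("variances:", variances)
print("weights:", weights)
print("log-likelihood:", ll)
```

The sketch works with raw densities for readability; a more robust version would compute the E-step in log space to avoid underflow when a point is far from every component.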

# Implementing with SKLearn

```python
from sklearn import datasets, mixture

# Load the dataset (first 10 samples of iris)
X = datasets.load_iris().data[:10]

# Specify the parameters for the clustering
gmm = mixture.GaussianMixture(n_components=3)
gmm.fit(X)
clustering = gmm.predict(X)

# `clustering` now contains an array giving the cluster each point belongs to, e.g.
# [1 0 0 0 1 2 0 1 0 0]
# (the component numbering can change between runs because no random_state is set)
```
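
Beyond hard labels, a fitted `GaussianMixture` also exposes the soft assignments from the expectation step and the log-likelihood of the data. The sketch below is a hedged illustration rather than part of the original notebook: it fits on all 150 iris samples instead of the 10-sample slice, fixes `random_state=0` for repeatability, and uses BIC as one possible way to compare component counts (the range 1 to 5 is an arbitrary assumption).

```python
import numpy as np
from sklearn import datasets, mixture

# Assumption: use the full iris data rather than the first 10 samples.
X = datasets.load_iris().data

gmm = mixture.GaussianMixture(n_components=3, random_state=0).fit(X)

# Soft clustering: one row per sample, one column per component;
# each row holds membership probabilities that sum to 1.
probs = gmm.predict_proba(X)

# Hard clustering: index of the most probable component for each sample.
hard = gmm.predict(X)

# Average log-likelihood per sample under the fitted mixture.
avg_log_likelihood = gmm.score(X)

# Compare candidate component counts with BIC (lower is better).
bic = {k: mixture.GaussianMixture(n_components=k, random_state=0).fit(X).bic(X)
       for k in range(1, 6)}
best_k = min(bic, key=bic.get)

print("membership probabilities of the first sample:", probs[0])
print("average log-likelihood:", avg_log_likelihood)
print("BIC by number of components:", bic)
print("component count with the lowest BIC:", best_k)
```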

# GMM Overview
Paper: [Nonparametric discovery of human routines from sensor data](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.681.3152&rep=rep1&type=pdf)

Paper: [Application of the Gaussian mixture model in pulsar astronomy](https://arxiv.org/abs/1205.6221)

Paper: [Speaker Verification Using Adapted Gaussian Mixture Models](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.117.338&rep=rep1&type=pdf)

Paper: [Adaptive background mixture models for real-time tracking](http://www.ai.mit.edu/projects/vsam/Publications/stauffer_cvpr98_track.pdf)

Video: https://www.youtube.com/watch?v=lLt9H6RFO6A

<img src='gaus_overv.png' width=900px>