<a href="https://colab.research.google.com/github/somilasthana/MachineLearningSkills/blob/master/GMM.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [0]:
"""
GMM: A Gaussian mixture model is a probabilistic model that assumes all the 
data points are generated from a mixture of a finite number of Gaussian 
distributions with unknown parameters.

One can think of mixture models as generalizing k-means clustering to 
incorporate information about the covariance structure of the data as well as 
the centers of the latent Gaussians.

The GaussianMixture object implements the expectation-maximization (EM) 
algorithm for fitting mixture-of-Gaussian models. Its fitted means and 
covariances can be used to draw confidence ellipsoids for multivariate models, 
and its bic() method computes the Bayesian Information Criterion to assess the 
number of clusters in the data.

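+ : Speed: it is the fastest algorithm for learning mixture models.
    Agnostic: because it maximizes only the likelihood, it will not bias the 
    means towards zero, or bias the cluster sizes to have specific structures.

- : Singularities: when there are too few points per mixture component, 
    estimating the covariance matrices becomes difficult, and the algorithm is 
    known to diverge unless the covariances are regularized.

    Number of components: this algorithm will always use all the components it 
    has access to, so held-out data or an information criterion is needed to 
    decide how many components to use.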
"""
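
Because the model always uses every component it is given, the number of components has to be chosen from outside. A minimal sketch of doing that with the BIC, assuming the same two iris features used later in this notebook; the search range of 1 to 6 components and the names `X_demo`, `candidate_counts` and `bic_scores` are illustrative choices, not part of the original notebook.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.mixture import GaussianMixture

# Same data as the rest of the notebook: the first two iris features.
X_demo = load_iris().data[:, :2]

# Assumed search range; widen or narrow it as needed.
candidate_counts = list(range(1, 7))
bic_scores = []
for k in candidate_counts:
    model = GaussianMixture(n_components=k, covariance_type='full',
                            random_state=0).fit(X_demo)
    bic_scores.append(model.bic(X_demo))  # lower BIC is better

best_k = candidate_counts[int(np.argmin(bic_scores))]
print(dict(zip(candidate_counts, np.round(bic_scores, 1))))
print("lowest BIC at n_components =", best_k)
```

The BIC penalizes extra parameters, so the value it prefers need not equal the number of species in the data.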

In [0]:
"""
The main difficulty in learning Gaussian mixture models from unlabeled data 
is that one usually doesn't know which points came from which latent 
component (if one has access to this information, it becomes very easy to fit 
a separate Gaussian distribution to each set of points).

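EM algorithm:

First, one assumes random components and computes for each point the 
probability of it being generated by each component of the model (E-step).
Then, one tweaks the parameters to maximize the likelihood of the data given 
those assignments (M-step).
Repeating these two steps never decreases the likelihood, so the procedure 
converges to a local optimum.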

"""
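
The two EM steps described above can be sketched directly in NumPy. This is an illustrative, simplified version and not scikit-learn's implementation: the function name `em_gmm`, the fixed iteration count, the small `reg` term added to each covariance, and the synthetic two-blob data are all assumptions made for the example.

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, n_components, n_iter=50, reg=1e-6, seed=0):
    """Toy EM for a full-covariance Gaussian mixture (illustrative only)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Random start: distinct data points as means, shared data covariance.
    means = X[rng.choice(n, size=n_components, replace=False)]
    covs = np.array([np.cov(X.T) + reg * np.eye(d)] * n_components)
    weights = np.full(n_components, 1.0 / n_components)
    log_likelihoods = []
    for _ in range(n_iter):
        # E-step: probability that each component generated each point.
        dens = np.column_stack([
            weights[k] * multivariate_normal.pdf(X, means[k], covs[k])
            for k in range(n_components)
        ])
        total = dens.sum(axis=1, keepdims=True)
        resp = dens / total
        log_likelihoods.append(np.log(total).sum())
        # M-step: re-estimate parameters from the soft assignments.
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        for k in range(n_components):
            diff = X - means[k]
            covs[k] = (resp[:, k, None] * diff).T @ diff / nk[k]
            covs[k] += reg * np.eye(d)  # guard against singular covariances
    return weights, means, covs, resp, log_likelihoods

# Synthetic data: two well-separated blobs (made up for this example).
rng = np.random.default_rng(1)
X_toy = np.vstack([rng.normal([0, 0], 0.5, size=(100, 2)),
                   rng.normal([4, 4], 0.5, size=(100, 2))])
weights, means, covs, resp, ll = em_gmm(X_toy, n_components=2)
print("mixture weights:", weights.round(3))
print("log-likelihood, first vs last iteration:", ll[0], ll[-1])
```

The log-likelihood is recorded before each M-step, so the printed pair shows how much the fit improved over the run. A production implementation would work in log space and stop on a convergence tolerance instead of a fixed number of iterations.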

In [0]:
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns; sns.set()
import numpy as np

In [0]:
from sklearn.datasets import load_iris

iris = load_iris()

X = iris.data
y = iris.target

In [0]:
# Keep only the first two features (sepal length and width) for 2D plotting.
X = X[:, :2]

In [0]:
X.shape

In [0]:
plt.scatter(X[:, 0], X[:, 1], c=y, s=40, cmap='viridis')  # colored by true species

In [0]:
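from sklearn import mixture

# Fit a 3-component mixture with one full covariance matrix per component.
gmm = mixture.GaussianMixture(n_components=3, covariance_type='full').fit(X)
# Hard assignment: the most probable component for each point.
labels = gmm.predict(X)
# Cluster ids are arbitrary, so colors need not match the true-label plot.
plt.scatter(X[:, 0], X[:, 1], c=labels, s=40, cmap='viridis')  # GMM clusters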
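
Comparing the two scatter plots by eye is awkward because the cluster ids returned by the model are arbitrary. One possible check, sketched here under the assumption of the same data and model settings as above, is the adjusted Rand index, which ignores label permutations. The `_demo` variable names and `random_state=0` are choices made for this example only; `predict_proba` is also shown to expose the soft assignments behind the hard labels.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import adjusted_rand_score
from sklearn.mixture import GaussianMixture

iris_demo = load_iris()
X_demo, y_demo = iris_demo.data[:, :2], iris_demo.target

gmm_demo = GaussianMixture(n_components=3, covariance_type='full',
                           random_state=0).fit(X_demo)
hard_labels = gmm_demo.predict(X_demo)        # one cluster id per point
soft_labels = gmm_demo.predict_proba(X_demo)  # shape (n_samples, 3)

# 1.0 means identical partitions; values near 0 mean chance-level agreement.
ari = adjusted_rand_score(y_demo, hard_labels)
print("adjusted Rand index vs. species:", round(ari, 3))
print("membership probabilities of first point:", soft_labels[0].round(3))
```

Points whose membership probabilities are spread across components are the ones the mixture is least sure about, which a hard-label plot hides.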