# Gaussian Mixture Model
* In this demo, we will compare k-means and Gaussian mixture models.

### 1. Generate and visualize the data
* As in the previous demo, we use the function **make_blobs** from sklearn to generate the clusters and their labels.
* We then apply a linear transformation to the data so that each cluster becomes an ellipse instead of a circle.

In [1]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs

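n_samples = 1500
# Three isotropic blobs; y holds the true cluster label of each point
X, y = make_blobs(n_samples=n_samples, n_features=2, centers=3, random_state=170)

# Linear transformation that stretches and rotates each blob into an ellipse
transformation = [[0.60834549, -0.63667341], [-0.40887718, 0.85253229]]
X = np.dot(X, transformation)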

plt.scatter(X[:, 0], X[:, 1])
plt.show()
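To see why the transformation above turns round blobs into ellipses, the following minimal sketch applies the same matrix to a single synthetic blob and compares the resulting covariance with the value predicted by linear algebra. The sample size, the seed, and the names `blob`, `stretched`, `expected_cov`, and `empirical_cov` are assumptions made for illustration; they are not part of the original demo.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for this sketch

# Same matrix as in the cell above
transformation = np.array([[0.60834549, -0.63667341],
                           [-0.40887718, 0.85253229]])

# One isotropic (circular) blob: unit variance, no correlation.
# make_blobs uses cluster_std=1.0 by default, so its blobs look like this too.
blob = rng.standard_normal(size=(200_000, 2))

# Apply the transformation to row vectors, as np.dot(X, transformation) does
stretched = blob @ transformation

# For row vectors with identity covariance, the transformed covariance is A^T A
expected_cov = transformation.T @ transformation
empirical_cov = np.cov(stretched, rowvar=False)

print(np.round(expected_cov, 3))
print(np.round(empirical_cov, 3))
```

The off-diagonal entries of the covariance are no longer zero, which is what makes each cluster elongated and tilted.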

### 2. Test the k-means algorithm

In [2]:
from sklearn.cluster import KMeans

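# Fix n_init and random_state so the result is reproducible across scikit-learn versions
model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
y_pred = model.predict(X)
plt.scatter(X[:, 0], X[:, 1], c=y_pred)
# Mark the cluster centers in red
plt.scatter(model.cluster_centers_[:, 0], model.cluster_centers_[:, 1], s=50, c='red')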
plt.show()
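One way to read the k-means plot above: each point is assigned to the centroid that is closest in plain Euclidean distance, so the boundaries between clusters ignore the elongated shape of the data. The sketch below recomputes the assignment by hand and compares it with `model.predict`. The data is regenerated so the block runs on its own, and `n_init=10`, `random_state=0`, and the names `sq_dist` and `nearest` are assumptions of this sketch.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Rebuild the transformed dataset from section 1
X, _ = make_blobs(n_samples=1500, n_features=2, centers=3, random_state=170)
X = X @ np.array([[0.60834549, -0.63667341], [-0.40887718, 0.85253229]])

model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# Squared Euclidean distance from every point to every centroid: shape (1500, 3)
diffs = X[:, None, :] - model.cluster_centers_[None, :, :]
sq_dist = (diffs ** 2).sum(axis=2)

# Index of the closest centroid for each point
nearest = sq_dist.argmin(axis=1)

# Fraction of points where the manual assignment matches model.predict
agreement = (nearest == model.predict(X)).mean()
print(agreement)
```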

### 3. Train a Gaussian mixture model
* The API for Gaussian mixture is at: https://scikit-learn.org/stable/modules/generated/sklearn.mixture.GaussianMixture.html

In [3]:
from sklearn.mixture import GaussianMixture

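# covariance_type="full" lets every component have its own unrestricted covariance matrix;
# random_state makes the fit reproducible
gm_model = GaussianMixture(n_components=3, covariance_type="full", random_state=0).fit(X)
y_pred = gm_model.predict(X)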
plt.scatter(X[:, 0], X[:, 1], c=y_pred)
plt.show()
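Unlike k-means, a fitted `GaussianMixture` also provides soft assignments: `predict_proba` returns, for each point, the posterior probability of belonging to each component, and `predict` picks the most probable one. The sketch below illustrates this on the same data, which is regenerated so the block runs on its own. The variable names `gm`, `proba`, and `hard`, as well as `random_state=0`, are assumptions of this sketch.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

# Rebuild the transformed dataset from section 1
X, _ = make_blobs(n_samples=1500, n_features=2, centers=3, random_state=170)
X = X @ np.array([[0.60834549, -0.63667341], [-0.40887718, 0.85253229]])

gm = GaussianMixture(n_components=3, covariance_type="full", random_state=0).fit(X)

proba = gm.predict_proba(X)  # posterior probabilities, shape (n_samples, n_components)
hard = gm.predict(X)         # hard labels, one component index per point

print(proba.shape)
# Each row is a probability distribution over the components
print(np.allclose(proba.sum(axis=1), 1.0))
# Fraction of points whose hard label is the most probable component
print((hard == proba.argmax(axis=1)).mean())
```

Points near the border between two ellipses get probabilities that are split between components, which a hard k-means label cannot express.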

### 4. Plot the Gaussian components

In [4]:
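from scipy import linalg
from matplotlib.patches import Ellipse

means = gm_model.means_
covariances = gm_model.covariances_

# Plot the points
plt.scatter(X[:, 0], X[:, 1], c=y_pred)

# Plot one ellipse per Gaussian component
for (mean, covar) in zip(means, covariances):
    # Eigenvalues v and unit eigenvectors w of the covariance; eigenvectors are the columns of w
    v, w = linalg.eigh(covar)
    # Axis lengths (diameters): 2 * sqrt(2) standard deviations along each principal axis
    v = 2.0 * np.sqrt(2.0) * np.sqrt(v)
    # Direction of the axis that belongs to v[0] (a column of w, not a row)
    u = w[:, 0]

    angle = np.degrees(np.arctan2(u[1], u[0]))  # rotation of that axis, in degrees
    # angle must be passed by keyword in recent matplotlib versions
    ell = Ellipse(mean, v[0], v[1], angle=angle, color='red')
    ell.set_alpha(0.5)
    plt.gca().add_artist(ell)
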
plt.show()
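The plots above compare the two models by eye. As a rough numerical check, the true labels `y` returned by make_blobs can be compared with each clustering using the adjusted Rand index, which is 1 for a perfect match and close to 0 for a random labeling. This is a minimal sketch rather than part of the original demo: the choice of metric, the seeds, `n_init=10` for k-means, and `n_init=5` for the mixture model are all assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score
from sklearn.mixture import GaussianMixture

# Rebuild the transformed dataset from section 1, keeping the true labels y
X, y = make_blobs(n_samples=1500, n_features=2, centers=3, random_state=170)
X = X @ np.array([[0.60834549, -0.63667341], [-0.40887718, 0.85253229]])

kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Several restarts (n_init=5) reduce the chance of a poor local optimum
gmm_labels = GaussianMixture(
    n_components=3, covariance_type="full", n_init=5, random_state=0
).fit_predict(X)

# The adjusted Rand index ignores how the cluster indices are numbered
ari_kmeans = adjusted_rand_score(y, kmeans_labels)
ari_gmm = adjusted_rand_score(y, gmm_labels)

print(f"k-means ARI: {ari_kmeans:.3f}")
print(f"GMM ARI:     {ari_gmm:.3f}")
```

On this dataset one would expect the mixture model to score higher, since full covariance matrices can follow the elliptical shape of the clusters.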