# Gaussian Mixture Models (GMM)

Gaussian Mixture Models are a **probabilistic clustering method** that assumes data points are generated from a mixture of several Gaussian distributions.

Unlike KMeans, which assigns each point to exactly one cluster (a hard assignment), GMM gives each point a probability of belonging to every cluster (a soft assignment).
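
The mixture assumption and the soft assignment described above can be written compactly. The symbols here are notational assumptions, not taken from the notebook: $K$ is the number of components, $\pi_k$ are the mixing weights, and $\mu_k$, $\Sigma_k$ are the mean and covariance of component $k$.

$$
p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1
$$

The probability that point $x_i$ belongs to component $k$ (often called the responsibility) follows from Bayes' rule:

$$
\gamma_{ik} = \frac{\pi_k \, \mathcal{N}(x_i \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_i \mid \mu_j, \Sigma_j)}
$$

For each point, the values $\gamma_{i1}, \dots, \gamma_{iK}$ sum to 1, which is what makes them a probability distribution over clusters.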

## Why use GMM?
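- Can model **elliptical clusters**, because each component has its own covariance matrix (KMeans works best with roughly spherical clusters).
- Provides **probabilistic cluster assignments** rather than a single hard label per point.
- Clusters may differ in size, shape, and orientation, which often fits real-world datasets better.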
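
How much freedom each cluster shape gets is controlled by the `covariance_type` parameter of scikit-learn's `GaussianMixture`. The sketch below is only an illustration: it fits the four available settings on Iris and prints the shape of the fitted covariance parameters. The choice of 3 components and `random_state=42` mirrors the rest of this notebook, and the `results` dictionary is a name introduced here.

```python
from sklearn.datasets import load_iris
from sklearn.mixture import GaussianMixture

X = load_iris().data  # 150 samples, 4 features

# Fit one model per covariance structure and record the shape of the
# fitted covariance parameters and the average log-likelihood per sample.
results = {}
for cov_type in ["spherical", "diag", "tied", "full"]:
    gmm = GaussianMixture(n_components=3, covariance_type=cov_type, random_state=42)
    gmm.fit(X)
    results[cov_type] = (gmm.covariances_.shape, gmm.score(X))

for cov_type, (shape, avg_loglik) in results.items():
    print(f"{cov_type:>9}: covariances_ shape = {shape}, avg log-likelihood = {avg_loglik:.3f}")
```

`"spherical"` stores one variance per component, while `"full"` (the default) stores a complete covariance matrix per component, which is what allows arbitrarily oriented ellipses. More flexible settings will generally reach a higher training log-likelihood, so that number alone should not be read as evidence of a better model.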

## Import Libraries and Dataset

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.mixture import GaussianMixture

# Load Iris dataset
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df.head()

## Applying Gaussian Mixture Model

In [None]:
# Create GMM with 3 components (since Iris has 3 species)
gmm = GaussianMixture(n_components=3, random_state=42)
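# Fit on the four feature columns only, so re-running this cell
# does not feed the 'cluster' column back into the model
df['cluster'] = gmm.fit_predict(df[iris.feature_names])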

df.head()
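
The `cluster` column above holds only the most likely component for each row. As a minimal sketch of the soft assignments, the example below refits the same model on the raw feature array and calls `predict_proba`. The 0.9 threshold for flagging "uncertain" samples is an arbitrary choice made for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.mixture import GaussianMixture

X = load_iris().data

gmm = GaussianMixture(n_components=3, random_state=42).fit(X)

hard_labels = gmm.predict(X)       # one cluster index per sample
soft_probs = gmm.predict_proba(X)  # shape (n_samples, n_components), rows sum to 1

# Membership probabilities for the first five samples
print(np.round(soft_probs[:5], 3))

# Samples whose most likely cluster has probability below 0.9
# (threshold chosen arbitrarily for this illustration)
confidence = soft_probs.max(axis=1)
uncertain = np.flatnonzero(confidence < 0.9)
print(f"{uncertain.size} of {len(X)} samples have max probability below 0.9")
```

Rows with a low maximum probability typically sit where two components overlap, which is information a hard KMeans label cannot express.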

## Visualizing the Clusters

In [None]:
# Plot the first two of the four features, colored by GMM cluster label
plt.scatter(df['sepal length (cm)'], df['sepal width (cm)'],
            c=df['cluster'], cmap='coolwarm', alpha=0.7)
plt.xlabel('Sepal Length (cm)')
plt.ylabel('Sepal Width (cm)')
plt.title('Gaussian Mixture Model Clustering on Iris Dataset')
plt.show()

## Key Notes:
- GMM models data as a **combination of Gaussian distributions**.
- Each point gets a **probability distribution** across clusters.
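- More flexible than KMeans because each component's covariance lets clusters differ in shape, size, and orientation.
- Handles overlapping clusters gracefully: points in the overlap get split probabilities instead of a forced label.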
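
One way to check the KMeans comparison on this dataset is to score both clusterings against the known species labels. The sketch below uses the adjusted Rand index (ARI), where 1.0 means perfect agreement with the labels and values near 0 mean chance-level agreement. Using the species as ground truth is an assumption made purely for evaluation; neither model sees the labels during fitting, and the exact scores depend on the random seed and library version.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import adjusted_rand_score
from sklearn.mixture import GaussianMixture

iris = load_iris()
X, y = iris.data, iris.target  # y is used only for scoring, never for fitting

kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(X)
gmm_labels = GaussianMixture(n_components=3, random_state=42).fit_predict(X)

# ARI is invariant to how cluster indices are numbered
ari_kmeans = adjusted_rand_score(y, kmeans_labels)
ari_gmm = adjusted_rand_score(y, gmm_labels)

print(f"KMeans ARI: {ari_kmeans:.3f}")
print(f"GMM ARI:    {ari_gmm:.3f}")
```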

GMM is widely used in **speech recognition, anomaly detection, and image segmentation**.