## 1. Gaussian Mixture Models (GMM)

* **What is GMM?**
  A GMM is a probabilistic model that groups data by assuming each point was generated by one of several clusters, each following a Gaussian (normal) distribution, the familiar bell curve.

* **Why use GMM?**
  Unlike K-Means, which assigns each data point to exactly one group, a GMM lets each data point belong to several groups at once, with a probability for each.

* **How does it work?**
  The overall distribution of the data is modeled as a weighted mix of several bell curves. Each curve has its own center (mean), shape (covariance), and weight (the fraction of the data it accounts for; the weights sum to 1).
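
As a rough illustration of the "mix of bell curves" idea, the sketch below evaluates a one-dimensional mixture density, p(x) = sum over k of weight_k * Normal(x | mean_k, variance_k). The helper names `normal_pdf` and `mixture_pdf` and the parameter values are assumptions chosen for the example, not fitted results. It uses only the standard `math` module and works with variances rather than full covariance matrices, since the data is 1-D.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a 1-D normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def mixture_pdf(x, weights, means, variances):
    """Weighted sum of the component densities at x."""
    return sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))

# Illustrative parameters (assumed, not fitted): two bell curves, weights sum to 1
weights = [0.3, 0.7]
means = [0.0, 4.0]
variances = [1.0, 0.5]

for x in (0.0, 2.0, 4.0):
    print(f"p({x}) = {mixture_pdf(x, weights, means, variances):.4f}")
```

Because the weights sum to 1 and each component is a valid density, the mixture is itself a valid density: it is non-negative and integrates to 1.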

---

## 2. Expectation-Maximization (EM) Algorithm

* **What is EM?**
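  It’s an iterative method for estimating the parameters (centers, shapes, and weights) of those bell curves, designed so that the likelihood of the data never decreases from one step to the next. It converges to a local optimum, so the result can depend on the initial guess.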

* **How EM works in GMM:**

  1. **Start:** Guess the centers, shapes, and weights for each group.
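  2. **E-step:** For each data point, calculate the probability that it belongs to each group under the current parameters (soft assignment). These probabilities are often called responsibilities.
  3. **M-step:** Update the centers, shapes, and weights, using these probabilities as weights on the data points.
  4. **Repeat** these two steps until the parameters (or the log-likelihood) stop changing by more than a small tolerance.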
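
The four steps above can be sketched for 1-D data as follows. This is a minimal illustration, not a definitive implementation: the function name `fit_gmm_1d`, the tolerance `tol`, and the cap `max_iter` are assumptions made for the example, and the stopping rule used here is "the log-likelihood gain falls below `tol`", which is one common way to decide that the parameters have stopped changing much.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a 1-D normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm_1d(data, weights, means, variances, tol=1e-6, max_iter=200):
    """Run EM on 1-D data until the log-likelihood gain drops below tol (assumed rule)."""
    weights, means, variances = list(weights), list(means), list(variances)
    n, k = len(data), len(weights)
    history = []  # log-likelihood under the parameters used in each E-step
    for _ in range(max_iter):
        # E-step: resp[i][j] is the probability that point i came from component j
        resp = []
        log_lik = 0.0
        for x in data:
            joint = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(joint)
            log_lik += math.log(total)
            resp.append([p / total for p in joint])
        history.append(log_lik)
        # Stop once the log-likelihood has stopped improving
        if len(history) > 1 and history[-1] - history[-2] < tol:
            break
        # M-step: responsibility-weighted means, variances, and weights
        for j in range(k):
            n_j = sum(r[j] for r in resp)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / n_j
            variances[j] = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / n_j
            weights[j] = n_j / n
    return weights, means, variances, history

data = [1.0, 1.5, 2.0, 5.0, 6.0, 6.5]
weights, means, variances, history = fit_gmm_1d(data, [0.5, 0.5], [1.0, 5.0], [1.0, 1.0])
print("E-steps run:", len(history))
print("means:", [round(m, 3) for m in means])
print("weights:", [round(w, 3) for w in weights])
```

Tracking the log-likelihood also gives a useful sanity check: it should never decrease from one iteration to the next, so a drop usually points to a bug in the update formulas.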

---

## 3. Soft Clustering

* **What is soft clustering?**
  Instead of forcing a point into just one cluster, soft clustering says a point can belong to multiple clusters with some probability.

* **Why is it useful?**
  It helps when groups overlap or when we aren’t sure which group a point belongs to. It gives more information about the data structure.
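
To make the difference from hard labels concrete, the sketch below computes soft memberships for a few points and then collapses them to a hard label by picking the most probable cluster. The helper `responsibilities` and the two overlapping clusters are assumptions chosen for the example, not values taken from a fitted model.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a 1-D normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def responsibilities(x, weights, means, variances):
    """Probability of each cluster for the point x, normalized to sum to 1."""
    joint = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(joint)
    return [p / total for p in joint]

# Illustrative parameters (assumed): two overlapping, equally weighted clusters
weights = [0.5, 0.5]
means = [0.0, 2.0]
variances = [1.0, 1.0]

for x in (-1.0, 1.0, 3.0):
    soft = responsibilities(x, weights, means, variances)
    hard = soft.index(max(soft))  # hard label: the most probable cluster
    print(f"x={x:+.1f}  soft={[round(p, 3) for p in soft]}  hard={hard}")
```

The point x = 1.0 lies exactly halfway between the two centers, so its soft membership is an even 50/50 split. A hard label still has to pick one cluster for it, which hides how uncertain that choice is.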

---

### Simple Summary Table

| Concept                  | What it Means                                | Main Idea                                   |
| ------------------------ | -------------------------------------------- | ------------------------------------------- |
| Gaussian Mixture Model   | Data comes from several bell curves          | Mix of multiple Gaussian distributions      |
| Expectation-Maximization | Iterative way to fit the mixture parameters  | Calculate probabilities, update parameters  |
| Soft Clustering          | Points belong to clusters with probabilities | Probabilistic membership, not hard labels   |


In [8]:
import numpy as np

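# Toy 1-D data: two visibly separated groups
y = np.array([1.0, 1.5, 2.0, 5.0, 6.0, 6.5])
K = 2  # number of mixture components

# Initial parameters are fixed by hand, so no random seed is needed
pi = np.array([0.5, 0.5])      # mixing weights, sum to 1
mu = np.array([1.0, 5.0])      # component means
sigma2 = np.array([1.0, 1.0])  # component variances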

In [9]:
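def gaussian_pdf(x, mean, var):
    # Density of a 1-D normal distribution with the given mean and variance
    return (1/np.sqrt(2 * np.pi * var)) * np.exp(-(x - mean)**2 / (2 * var))

def em_step(y, pi, mu, sigma2):
    # One EM iteration for a 1-D GMM. Note: pi, mu, and sigma2 are updated in place.
    N = len(y)
    K = len(pi)
    # E-step: gamma[i, k] is the responsibility of component k for point i
    gamma = np.zeros((N, K))
    for i in range(N):
        denom = 0
        for k in range(K):
            gamma[i, k] = pi[k] * gaussian_pdf(y[i], mu[k], sigma2[k])
            denom += gamma[i, k]
        gamma[i, :] /= denom  # normalize so each row sums to 1
    # M-step: re-estimate the parameters from the responsibility-weighted data
    N_k = np.sum(gamma, axis=0)  # effective number of points per component
    for k in range(K):
        mu[k] = np.sum(gamma[:, k] * y) / N_k[k]
        sigma2[k] = np.sum(gamma[:, k] * (y - mu[k])**2) / N_k[k]  # uses the updated mean
        pi[k] = N_k[k] / N
    return pi, mu, sigma2, gamma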

# Train for 5 iterations
for _ in range(5):
    pi, mu, sigma2, gamma = em_step(y, pi, mu, sigma2)

# New data point to predict
x_new = 4.0

# Calculate responsibilities for the new point
probs = np.array([pi[k] * gaussian_pdf(x_new, mu[k], sigma2[k]) for k in range(K)])
probs /= probs.sum()  # normalize to sum to 1

print(f"Predicted cluster probabilities for x={x_new}: {probs}")


Predicted cluster probabilities for x=4.0: [8.27444470e-07 9.99999173e-01]
