A Gaussian Mixture Model fit with the Expectation-Maximization (EM) algorithm, implemented from scratch in NumPy. Demonstrated on a 3D toy dataset (3,500 points clustered around the corners of a cube).
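Fits a Gaussian mixture to the dataset (dataGMM.txt), which has 3,500 points clustered near the corners of a cube. A quick visualization suggests 8 components (one per cube corner); one octant turns out to be empty, so the refined model uses 7.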
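Because the points sit near cube corners, each cluster falls in its own octant, so counting points per octant is a quick way to see how many corners are actually populated. The sketch below is illustrative only: the function name `octant_summary` and the `center` parameter are hypothetical, and it assumes the octants are taken relative to the data mean unless a center is supplied (the source does not say where the cube is centered).

```python
import numpy as np

def octant_summary(X, center=None):
    """Group 3D points by octant; return {octant_id: (proportion, mean)}.

    Assumption: octants are defined relative to `center`, which defaults
    to the data mean because the cube's true center is not given.
    """
    X = np.asarray(X, dtype=float)
    if center is None:
        center = X.mean(axis=0)
    # Encode the sign pattern of each point as an integer in 0..7.
    bits = (X > center).astype(int)
    octant = bits @ np.array([4, 2, 1])
    summary = {}
    for k in np.unique(octant):  # empty octants never appear here
        members = X[octant == k]
        summary[int(k)] = (len(members) / len(X), members.mean(axis=0))
    return summary
```

The number of keys in the returned dictionary is the number of non-empty octants, and the per-octant proportions and means are the kind of quantities a geometry-aware initialization could start from.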
GMMs are the canonical example of a latent-variable generative model. The EM algorithm derived for GMMs is the same algorithmic skeleton you use for HMMs, factor analysis, and probabilistic PCA. Knowing the recipe by heart, and spotting numerical issues such as density underflow and collapsing covariances before they bite, is baseline literacy for anyone working with probabilistic models.
- E-step: compute responsibilities $\gamma_{nj} = P(z_n = j \mid x_n)$ using log-densities and log-sum-exp.
- M-step: closed-form ML updates for $\pi_j, \mu_j, \Sigma_j$, plus a small diagonal regularization $\varepsilon I$ on each covariance to keep them positive definite.
- Initialization: per-octant sample means and proportions (geometry-aware, not k-means).
- Stopping: relative change in log-likelihood below tolerance (default 1e-8).
- Inference: hard cluster id = $\arg\max_j \gamma_{nj}$, soft probs = the full $\gamma$ vector.
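The E-step, M-step, and stopping rule listed above can be sketched in a few NumPy functions. This is a minimal illustration, not the notebook's actual code: the function names, the `eps=1e-6` value, and the `max_iter` cap are assumptions, and the Cholesky-based log-density is one common choice among several.

```python
import numpy as np

def log_gaussian(X, mu, Sigma):
    """Log-density of N(mu, Sigma) at each row of X, via a Cholesky factor."""
    d = X.shape[1]
    L = np.linalg.cholesky(Sigma)
    # Solve L y = (x - mu)^T so that ||y||^2 is the Mahalanobis distance.
    y = np.linalg.solve(L, (X - mu).T)
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + np.sum(y**2, axis=0))

def e_step(X, pi, mu, Sigma):
    """Return responsibilities gamma (N x K) and the total log-likelihood."""
    log_p = np.stack(
        [np.log(pi[j]) + log_gaussian(X, mu[j], Sigma[j]) for j in range(len(pi))],
        axis=1,
    )
    # log-sum-exp: subtract the row maximum before exponentiating.
    m = log_p.max(axis=1, keepdims=True)
    log_norm = m + np.log(np.exp(log_p - m).sum(axis=1, keepdims=True))
    return np.exp(log_p - log_norm), float(log_norm.sum())

def m_step(X, gamma, eps=1e-6):
    """Closed-form ML updates, with eps * I added to each covariance."""
    N, d = X.shape
    Nj = gamma.sum(axis=0)  # effective number of points per component
    pi = Nj / N
    mu = (gamma.T @ X) / Nj[:, None]
    Sigma = np.empty((len(Nj), d, d))
    for j in range(len(Nj)):
        diff = X - mu[j]
        Sigma[j] = (gamma[:, j, None] * diff).T @ diff / Nj[j] + eps * np.eye(d)
    return pi, mu, Sigma

def fit_gmm(X, pi, mu, Sigma, tol=1e-8, eps=1e-6, max_iter=200):
    """Run EM until the relative change in log-likelihood drops below tol."""
    gamma, ll = e_step(X, pi, mu, Sigma)
    for it in range(1, max_iter + 1):
        pi, mu, Sigma = m_step(X, gamma, eps)
        gamma, new_ll = e_step(X, pi, mu, Sigma)
        converged = abs(new_ll - ll) <= tol * abs(ll)
        ll = new_ll
        if converged:
            break
    return pi, mu, Sigma, gamma, ll, it
```

Working in log space matters here because tight clusters produce densities that underflow to zero in double precision for far-away points; subtracting the row maximum keeps at least one term equal to 1 before normalizing.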
git clone https://github.com/Mathos34/gmm-from-scratch
cd gmm-from-scratch
python -m venv .venv && source .venv/bin/activate # or .venv\Scripts\activate on Windows
pip install -r requirements.txt
jupyter notebook lab_gmm.ipynb

The notebook runs end-to-end in a few seconds.
| Quantity | Value |
|---|---|
| Number of points | 3,500 |
| Dimension | 3 |
| Visual guess for number of components | 8 (one per cube corner) |
| Refined number of components | 7 (one octant is empty) |
| EM iterations to converge | 6 |
| Final log-likelihood | -21,626 |
| Recovered | within 0.1 of |
| Mixing proportions | uniform 1/7 |
The 5 test points are all classified into the matching corner cluster with responsibility ~1.
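Classifying new points amounts to evaluating the responsibilities under the fitted parameters and taking the argmax. The sketch below shows one way to do this in NumPy; the function names `responsibilities` and `predict` are hypothetical, and it assumes `pi`, `mu`, and `Sigma` are arrays of shape `(K,)`, `(K, d)`, and `(K, d, d)`.

```python
import numpy as np

def responsibilities(X_new, pi, mu, Sigma):
    """Soft assignments gamma (N x K) for new points under a fitted mixture."""
    X_new = np.atleast_2d(np.asarray(X_new, dtype=float))
    d = X_new.shape[1]
    diff = X_new[:, None, :] - mu[None, :, :]  # shape (N, K, d)
    prec = np.linalg.inv(Sigma)                # batched inverse, shape (K, d, d)
    maha = np.einsum("nki,kij,nkj->nk", diff, prec, diff)
    _, log_det = np.linalg.slogdet(Sigma)      # shape (K,)
    log_p = np.log(pi) - 0.5 * (d * np.log(2.0 * np.pi) + log_det + maha)
    # Normalize in log space to avoid underflow for distant components.
    log_p -= log_p.max(axis=1, keepdims=True)
    gamma = np.exp(log_p)
    return gamma / gamma.sum(axis=1, keepdims=True)

def predict(X_new, pi, mu, Sigma):
    """Return (hard cluster ids, soft probabilities) for each new point."""
    gamma = responsibilities(X_new, pi, mu, Sigma)
    return gamma.argmax(axis=1), gamma
```

For well-separated, tight clusters the largest responsibility is numerically indistinguishable from 1, which matches the behavior reported for the test points.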
Lab from the Advanced Machine Learning course at ECE Paris (4th-year engineering, Major Data & AI).
Built by Mathis Lacombe, AI Maker at the Intelligence Lab, ECE Paris. LinkedIn · Hugging Face