# 📜 Clustering in AI: ML → DL Evolution

---

## 🔹 Definition
- **Clustering** = unsupervised learning task that groups data points so that items in the same cluster are more similar to each other than to items in other clusters.  
- **Goal:** Discover structure in unlabeled data.  

**Types:**  
- **Hard clustering:** each point belongs to exactly one cluster.  
- **Soft clustering:** each point has a degree of membership (often a probability) in every cluster.  
- **Hierarchical clustering:** nested clusters organized as a tree (dendrogram), built bottom-up (agglomerative) or top-down (divisive).  
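
The difference between hard and soft assignment can be illustrated with a small sketch. It assumes fixed cluster centers and uses a softmax over negative squared distances as the soft membership. That choice is only an illustration (a GMM would use posterior probabilities instead), and the names `hard_assign`, `soft_assign`, and `temperature` are hypothetical.

```python
import math

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def hard_assign(point, centers):
    """Hard clustering: return the index of the single nearest center."""
    dists = [squared_distance(point, c) for c in centers]
    return dists.index(min(dists))

def soft_assign(point, centers, temperature=1.0):
    """Soft clustering: return one membership weight per center (sums to 1).

    Assumption: memberships are a softmax over negative squared distances.
    """
    logits = [-squared_distance(point, c) / temperature for c in centers]
    top = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

centers = [(0.0, 0.0), (4.0, 0.0)]
point = (1.0, 0.0)
hard = hard_assign(point, centers)   # a single cluster index
soft = soft_assign(point, centers)   # one weight per cluster
midpoint_soft = soft_assign((2.0, 0.0), centers)
```

For the point `(1.0, 0.0)` the hard assignment picks the first center, while the soft assignment places most (but not all) of the weight on it. A point exactly halfway between the two centers receives equal membership in both.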

---

## 🔹 Clustering in Classical ML

### 1. Early Statistical & Partitioning Methods
| **Method** | **Year** | **Authors** | **Key Idea** |
|------------|----------|--------------|--------------|
| **Hierarchical Clustering** | 1958 | Sokal & Michener | Builds a tree of nested clusters by agglomerative merging or divisive splitting, driven by a distance metric and linkage rule. |
| **k-Means** | 1967 | MacQueen | Partition into *k* clusters, minimizing the within-cluster sum of squared distances to each centroid. |
| **GMM (Gaussian Mixture Models)** | 1977 (EM algorithm) | Dempster, Laird & Rubin | Probabilistic clustering with soft assignments, typically fit with EM. |
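
As a concrete illustration of the partitioning idea, here is a minimal sketch of the standard alternating k-means procedure (assign each point to its nearest center, then move each center to the mean of its points). It is not an optimized implementation. Taking the initial centers as an explicit argument is an assumption made to keep the example deterministic; practical implementations usually pick them randomly or with k-means++.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, init_centers, max_iters=100):
    """Alternate assignment and update steps until the labels stop changing."""
    centers = [tuple(c) for c in init_centers]
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # converged: no assignment changed
        labels = new_labels
        # Update step: move each center to the mean of its assigned points.
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, labels

# Two well-separated toy blobs (hypothetical data).
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0),
]
# Deliberately poor start: both initial centers sit in the first blob.
centers, labels = kmeans(points, init_centers=[points[0], points[1]])
```

Even from this poor start the sketch recovers the two blobs on this data, with centers at `(0.5, 0.5)` and `(10.5, 10.5)`. That is not guaranteed in general: k-means only finds a local optimum, so results depend on initialization.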

### 2. Graph & Density-Based Methods
| **Method** | **Year** | **Authors** | **Key Idea** |
|------------|----------|--------------|--------------|
| **DBSCAN** | 1996 | Ester et al. | Density-based clustering, robust to noise, arbitrary shapes. |
| **OPTICS** | 1999 | Ankerst et al. | Extension of DBSCAN that orders points by reachability distance, so clusters of varying density can be extracted. |
| **Spectral Clustering** | 2000 | Shi & Malik | Uses graph Laplacian eigenvectors for non-convex clusters. |
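
To make the density-based idea concrete, the following is a minimal sketch of DBSCAN using a brute-force neighbor search. The parameter names `eps` and `min_pts` follow common usage, while the helper `region_query` and the toy data are hypothetical. Real implementations typically use a spatial index instead of scanning every point.

```python
NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i)."""
    pi = points[i]
    return [
        j for j, pj in enumerate(points)
        if sum((a - b) ** 2 for a, b in zip(pi, pj)) <= eps ** 2
    ]

def dbscan(points, eps, min_pts):
    """Label each point with a cluster index, or NOISE if it is in no cluster."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # core point: keep expanding
    return labels

# Two dense blobs plus one isolated point (hypothetical data).
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0),
    (5.0, 50.0),
]
labels = dbscan(points, eps=1.5, min_pts=3)
# labels == [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

Unlike k-means, the number of clusters is not specified up front, and the isolated point is reported as noise rather than being forced into a cluster.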

➡️ Classical ML clustering provided **statistical and geometric grouping tools** widely used in exploratory data analysis.  

---

## 🔹 Clustering in Deep Learning

### 1. Autoencoder-Based
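- **Deep Autoencoders (Hinton & Salakhutdinov, 2006):** Low-dimensional latent codes serve as features for a downstream clustering algorithm such as k-means.  
- **Denoising Autoencoders (Vincent et al., 2008):** Trained to reconstruct clean inputs from corrupted ones, giving more robust representations for unsupervised grouping.  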

### 2. Deep Embedded Clustering (DEC)
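- **Xie, Girshick & Farhadi (2016):** Jointly learns **representations + clusters**: the encoder is initialized by autoencoder pretraining, then the embedding and cluster centroids are refined together by minimizing a KL divergence between soft assignments and a sharpened target distribution.  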
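
The clustering objective can be sketched numerically. The snippet below is a minimal illustration of the two distributions in the DEC formulation: soft assignments from a Student's t kernel, and a target distribution that squares and renormalizes them. It works on fixed, hypothetical embeddings and centroids and leaves out the encoder and gradient updates entirely. The function names are assumptions made for this example.

```python
import math

def dec_soft_assignments(embeddings, centroids, alpha=1.0):
    """q[i][j]: Student's t similarity of embedding i to centroid j, row-normalized."""
    q = []
    for z in embeddings:
        sims = [
            (1.0 + sum((a - b) ** 2 for a, b in zip(z, mu)) / alpha)
            ** (-(alpha + 1.0) / 2.0)
            for mu in centroids
        ]
        total = sum(sims)
        q.append([s / total for s in sims])
    return q

def dec_target_distribution(q):
    """p[i][j] proportional to q[i][j]**2 / f_j, with f_j = sum_i q[i][j]."""
    freq = [sum(col) for col in zip(*q)]  # soft cluster sizes f_j
    p = []
    for row in q:
        weights = [qij ** 2 / fj for qij, fj in zip(row, freq)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p

def kl_divergence(p, q):
    """KL(P || Q) summed over all points and clusters."""
    return sum(
        pij * math.log(pij / qij)
        for prow, qrow in zip(p, q)
        for pij, qij in zip(prow, qrow)
        if pij > 0
    )

# Hypothetical 2-D embeddings and two centroids.
embeddings = [(0.0, 0.0), (0.5, 0.0), (3.0, 0.0), (4.0, 0.0)]
centroids = [(0.0, 0.0), (4.0, 0.0)]
q = dec_soft_assignments(embeddings, centroids)
p = dec_target_distribution(q)
loss = kl_divergence(p, q)
```

On this toy input every row of `p` is more peaked than the matching row of `q`, which is the sense in which the target distribution is "sharpened". In training, the loss would be minimized with respect to both the encoder weights and the centroids.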

### 3. Generative Models
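- **ClusterGAN (2018):** Samples GAN latent codes from a mix of discrete (one-hot) and continuous variables and trains an encoder back to the latent space, so clusters emerge in that space.  
- **VAE-based clustering (2017–2019):** VAEs extended with Gaussian mixture priors over the latent space (e.g., VaDE), so each mixture component corresponds to a cluster.  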

### 4. Contrastive & Self-Supervised Clustering
| **Model** | **Year** | **Authors** | **Contribution** |
|-----------|----------|--------------|------------------|
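| **DeepCluster** | 2018 | Caron et al. (Facebook AI) | Alternates between k-means clustering of CNN features and training the CNN to predict the resulting cluster assignments as pseudo-labels. |
| **SwAV** | 2020 | Caron et al. (Facebook AI) | Online clustering: predicts the cluster assignment of one augmented view from another view of the same image, with assignments balanced across clusters; rivals supervised pretraining on transfer tasks. |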
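
One ingredient of online clustering in SwAV is keeping assignments spread across clusters so that not every sample collapses onto the same one. The sketch below illustrates that balancing step with Sinkhorn-Knopp normalization on a small score matrix. It is a simplified illustration, not the reference implementation: the function name, the `epsilon` and `iters` values, and the toy scores are assumptions. In SwAV the scores would be similarities between image features and learned prototype vectors.

```python
import math

def sinkhorn_assignments(scores, epsilon=0.5, iters=50):
    """Turn a (samples x prototypes) score matrix into balanced soft assignments.

    Each returned row sums to 1, and each prototype receives roughly the
    same total mass (samples / prototypes).
    """
    n_samples, n_protos = len(scores), len(scores[0])
    top = max(max(row) for row in scores)  # subtract the max for stability
    q = [[math.exp((s - top) / epsilon) for s in row] for row in scores]
    for _ in range(iters):
        # Column step: every prototype gets total mass 1 / n_protos.
        col_sums = [sum(col) for col in zip(*q)]
        q = [[v / (cs * n_protos) for v, cs in zip(row, col_sums)] for row in q]
        # Row step: every sample gets total mass 1 / n_samples.
        balanced = []
        for row in q:
            row_sum = sum(row)
            balanced.append([v / (row_sum * n_samples) for v in row])
        q = balanced
    # Rescale so that each sample's assignment sums to 1.
    return [[v * n_samples for v in row] for row in q]

# Toy scores: every sample prefers prototype 0 before balancing.
scores = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.6, 0.1]]
raw_choice = [row.index(max(row)) for row in scores]
q = sinkhorn_assignments(scores)
balanced_choice = [row.index(max(row)) for row in q]
```

Before balancing, all four samples pick prototype 0. After balancing, both prototypes receive about half of the total mass, so the samples with the weakest preference for prototype 0 are reassigned to prototype 1.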

➡️ Deep learning pushed clustering from **static feature grouping** to **joint representation + cluster learning**, tightly integrated with self-supervised learning (SSL).  

---

## 🔹 Applications of Clustering
- **Computer Vision:** Image grouping, unsupervised feature learning.  
- **NLP:** Topic modeling, document clustering, word sense induction.  
- **Healthcare:** Patient stratification, gene expression clustering.  
- **Recommender Systems:** Grouping users/items for collaborative filtering.  
- **Anomaly Detection:** Flagging outliers as points that fall in no dense cluster (e.g., DBSCAN noise points) or that form very small clusters.  

---

## ✅ Key Insights
- **Classical ML:** k-Means, GMMs, DBSCAN, and Spectral Clustering were dominant.  
- **Deep Learning:** Autoencoders, DEC, GAN/VAE clustering, and self-supervised clustering advanced the field.  
- **Today:** Clustering is **integrated into representation learning** (e.g., DeepCluster, SwAV) and is used in self-supervised pretraining pipelines for large models.  
