# 📜 Unsupervised Learning in AI, ML, and Deep Learning

---

## 🔹 Definition
- **Unsupervised learning** = algorithms that discover patterns, structures, or representations in **unlabeled data**.  
- **Goal:** Learn the underlying distribution, cluster data, or compress it.  
- **Key Tasks:** Clustering, dimensionality reduction, density estimation, generative modeling.  

---

## 🔹 Unsupervised Learning in Classical ML

Before deep nets, unsupervised AI/ML relied on **statistical methods**:

| **Category** | **Algorithm / Concept** | **Year** | **Authors** | **Key Idea** |
|--------------|--------------------------|----------|-------------|--------------|
| **Clustering** | k-Means | 1967 | MacQueen | Iterative centroid-based clustering. |
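| | Gaussian Mixture Models (GMMs) | 1977 (EM) | Dempster, Laird & Rubin | Probabilistic mixture of Gaussians, typically fit with expectation-maximization (EM); mixture models themselves date back to Pearson (1894). |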
| **Dimensionality Reduction** | PCA | 1901 | Pearson | Orthogonal projection for variance maximization. |
| | ICA | 1994 | Comon | Separation of independent components. |
| | t-SNE | 2008 | van der Maaten & Hinton | Nonlinear embedding for visualization. |
| **Association Rules** | Apriori | 1994 | Agrawal & Srikant | Market basket analysis via frequent itemsets. |

➡️ These methods dominated **exploratory data analysis** in AI/ML from the 1960s–2000s.  
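
As an illustration of the "iterative centroid-based clustering" idea in the table, here is a minimal NumPy sketch of Lloyd-style k-means. The function names, the toy data, and the farthest-first initialization are assumptions chosen to keep the example deterministic; random or k-means++ initialization is more common in practice.

```python
import numpy as np

def farthest_first_init(X, k):
    """Pick X[0], then repeatedly add the point farthest from the chosen centroids."""
    centroids = [X[0]]
    for _ in range(k - 1):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centroids], axis=0)
        centroids.append(X[d.argmax()])
    return np.array(centroids)

def kmeans(X, k, n_iters=100):
    """Alternate an assignment step and a centroid-update step until stable."""
    centroids = farthest_first_init(X, k)
    for _ in range(n_iters):
        # Assignment step: squared Euclidean distance from every point to every centroid.
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Update step: mean of the assigned points (keep the old centroid if a cluster is empty).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels

# Toy data: two well-separated 2-D blobs (illustrative values only).
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.3, size=(50, 2)),
    rng.normal(loc=[5.0, 5.0], scale=0.3, size=(50, 2)),
])
centroids, labels = kmeans(X, k=2)
```

On this toy data the two recovered centroids should sit near the blob centers; on real data the result depends on initialization and on the choice of `k`.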

---

## 🔹 Unsupervised Learning in Deep Learning

Deep learning extended unsupervised learning into **representation learning** and **generative modeling**:

### 1. Autoencoders (AEs)
- **Early Autoencoders** – Rumelhart, Hinton & Williams (1986).  
- **Deep Autoencoders** – Hinton & Salakhutdinov (2006, *Science*): Dimensionality reduction with deep nets.  
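
To make the autoencoder idea concrete, the following is a minimal sketch of a single-hidden-layer autoencoder trained with plain gradient descent in NumPy. It is far smaller than the deep, pretrained networks of Hinton & Salakhutdinov; the toy data, bottleneck width, learning rate, and step count are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 points in 5-D lying close to a 2-D linear subspace.
latent = rng.normal(size=(200, 2))
X = latent @ rng.normal(size=(2, 5)) + 0.01 * rng.normal(size=(200, 5))
X = X - X.mean(axis=0)   # center each feature
X = X / X.std()          # scale to unit overall variance

n, d = X.shape
h = 2                    # bottleneck width (assumed)
W_enc = 0.1 * rng.normal(size=(d, h))
W_dec = 0.1 * rng.normal(size=(h, d))
lr = 0.01                # learning rate (assumed)

def forward(X):
    Z = np.tanh(X @ W_enc)   # encoder: nonlinear low-dimensional code
    return Z, Z @ W_dec      # decoder: linear reconstruction

losses = []
for step in range(2000):
    Z, X_hat = forward(X)
    err = X_hat - X
    # Reconstruction loss: mean squared error per sample.
    losses.append((err ** 2).sum(axis=1).mean())
    # Backpropagate the loss through the decoder, then the tanh encoder.
    grad_out = 2.0 * err / n
    grad_W_dec = Z.T @ grad_out
    grad_pre = (grad_out @ W_dec.T) * (1.0 - Z ** 2)
    grad_W_enc = X.T @ grad_pre
    W_dec -= lr * grad_W_dec
    W_enc -= lr * grad_W_enc
```

After training, `Z` holds a 2-D code for each 5-D input, which is the dimensionality-reduction use case described above. No labels are used: the input itself is the training target.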

### 2. Probabilistic Models
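- **Boltzmann Machines** – Ackley, Hinton & Sejnowski (1985): Stochastic, energy-based networks of binary units.  
- **Restricted Boltzmann Machines (RBM)** – Smolensky (1986, introduced as the "Harmonium"): Bipartite visible/hidden layers with no within-layer connections.  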
- **Deep Belief Nets (DBNs)** – Hinton, Osindero & Teh (2006): Stacked RBMs for unsupervised pretraining.  

### 3. Generative Models
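- **Variational Autoencoder (VAE)** – Kingma & Welling (2013): Latent-variable model trained by maximizing a variational lower bound (ELBO) via the reparameterization trick.  
- **GANs** – Goodfellow et al. (2014): Adversarial training of a generator against a discriminator for unsupervised sample synthesis.  
- **Flow-based Models** – NICE (2014), RealNVP (2016), Glow (2018): Invertible transformations with exact, tractable likelihoods.  
- **Diffusion Models** – Sohl-Dickstein et al. (2015) → DDPM (Ho et al., 2020) → Stable Diffusion (2022): Learn to reverse a gradual noising process.  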
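
As one example from this family, the forward (noising) half of a diffusion model can be sketched in a few lines. This follows the DDPM-style closed form (Ho et al., 2020), in which a noisy sample at step `t` is drawn directly from the clean input; the linear schedule values and the helper name `q_sample` are assumptions for illustration, and the learned reverse (denoising) network is not shown.

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)     # linear noise schedule (assumed values)
alpha_bar = np.cumprod(1.0 - betas)    # fraction of signal variance kept after t steps

def q_sample(x0, t, rng):
    """Draw x_t ~ N(sqrt(alpha_bar[t]) * x0, (1 - alpha_bar[t]) * I) in one shot."""
    noise = rng.normal(size=x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    return x_t, noise

rng = np.random.default_rng(0)
x0 = np.ones((4, 8))                       # stand-in for a batch of clean data
x_early, _ = q_sample(x0, 10, rng)         # still close to x0
x_late, _ = q_sample(x0, T - 1, rng)       # almost pure Gaussian noise
```

A denoising model would then be trained to predict `noise` from `x_t` and `t`, which needs no labels beyond the data itself.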

### 4. Clustering & Representation Learning with Deep Nets
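- **Deep Embedded Clustering (DEC)** – Xie et al. (2016): Jointly learns feature embeddings and cluster assignments.  
- **Contrastive Learning** – SimCLR (2020, Google Brain), MoCo (2020, Facebook AI): Pull augmented views of the same input together and push other inputs apart. BYOL (2020, DeepMind) is closely related but trains without negative pairs.  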
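
The contrastive objective can be sketched with the normalized temperature-scaled cross-entropy (NT-Xent) loss associated with SimCLR. This is a NumPy sketch of the loss only, under assumed shapes and temperature; it omits the encoder, the data augmentations, and everything else in a real training loop, and the function name `nt_xent_loss` is hypothetical.

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent over a batch: z1[i] and z2[i] are embeddings of two views of item i."""
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # cosine similarity via unit vectors
    sim = (z @ z.T) / temperature
    np.fill_diagonal(sim, -np.inf)                     # a sample is never its own negative
    # The positive for row i is the other view of the same item.
    pos_idx = (np.arange(2 * n) + n) % (2 * n)
    pos = sim[np.arange(2 * n), pos_idx]
    # Numerically stable log-sum-exp over all other samples in the batch.
    row_max = sim.max(axis=1, keepdims=True)
    log_denominator = row_max[:, 0] + np.log(np.exp(sim - row_max).sum(axis=1))
    return float((log_denominator - pos).mean())

rng = np.random.default_rng(0)
anchor = rng.normal(size=(8, 16))
aligned = anchor + 0.05 * rng.normal(size=(8, 16))   # views that agree
unrelated = rng.normal(size=(8, 16))                 # views that do not
loss_aligned = nt_xent_loss(anchor, aligned)
loss_unrelated = nt_xent_loss(anchor, unrelated)
```

The loss should be lower when the paired views agree (`loss_aligned`) than when they are unrelated (`loss_unrelated`), which is the signal that drives representation learning without labels.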

---

## 🔹 Applications
- **Computer Vision:** Image clustering, anomaly detection, unsupervised pretraining.  
- **NLP:** Word embeddings (Word2Vec, 2013; FastText, 2016).  
- **Speech:** Self-supervised audio embeddings (Wav2Vec 2.0, 2020).  
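- **Sciences:** Protein modeling (protein language models such as ESM learn embeddings from unlabeled sequences; AlphaFold 2 adds a masked-MSA objective to its otherwise supervised structure training).  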

---

## ✅ Key Insights
- **Classical ML (pre-deep):** Unsupervised = clustering & dimensionality reduction.  
- **Deep Learning (modern):** Unsupervised = generative modeling (VAE, GAN, Diffusion) + deep representation learning.  
- **AI as a whole:** Unsupervised learning is **indispensable when labels are scarce** → it underpins **self-supervised methods** and today’s **foundation models**.  
