# 📜 Classification in AI: ML → DL Evolution

---

## 🔹 Definition
- **Classification** = a supervised learning task where an algorithm learns to assign inputs \( x \) to discrete categories (labels \( y \)).  
- **Goal:** Learn a decision function that maps each input to a label:  
  \[
  f : \mathcal{X} \to \mathcal{Y}, \qquad \hat{y} = f(x)
  \]  
- **Types:**  
  - **Binary classification:** e.g., spam vs not spam.  
  - **Multiclass classification:** e.g., digit recognition, ImageNet.  
  - **Multi-label classification:** one sample may belong to multiple categories.  
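
The three task types differ mainly in how raw model scores are turned into labels. The following is a minimal sketch in plain Python, not a reference implementation. The score values, the 0.5 threshold, and the function names are illustrative assumptions.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    """Map a list of scores to probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_binary(score, threshold=0.5):
    """Binary: one score, one yes/no decision."""
    return int(sigmoid(score) >= threshold)

def predict_multiclass(scores):
    """Multiclass: exactly one label, the most probable class."""
    probs = softmax(scores)
    return max(range(len(probs)), key=probs.__getitem__)

def predict_multilabel(scores, threshold=0.5):
    """Multi-label: an independent yes/no decision per label."""
    return [int(sigmoid(s) >= threshold) for s in scores]

# Illustrative scores (made up for this example)
print(predict_binary(2.0))                   # 1
print(predict_multiclass([0.1, 2.5, -1.0]))  # 1
print(predict_multilabel([1.2, -0.3, 0.8]))  # [1, 0, 1]
```

Softmax forces the classes to compete for a single label, while per-label sigmoids let several labels be active at once. That is the practical difference between the multiclass and multi-label settings.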

---

## 🔹 Classification in Classical ML

Before deep learning, most AI classification used **shallow ML models**:

| **Category** | **Algorithm** | **Year** | **Authors** | **Notes** |
|--------------|---------------|----------|--------------|-----------|
| **Linear Models** | Logistic Regression | 1958 | Cox, others | Probabilistic binary classification. |
| | Perceptron | 1958 | Rosenblatt | First neural classifier. |
| **Discriminant Analysis** | LDA / QDA | 1936 (LDA) | Fisher, others | Linear / quadratic decision boundaries. |
| **Decision Trees** | CART | 1984 | Breiman et al. | Tree-based supervised classification. |
| | C4.5 | 1993 | Quinlan | Successor to ID3, widely used. |
| **Ensembles** | Random Forests | 2001 | Breiman | Ensemble of trees. |
| | Gradient Boosting | 2001 | Friedman | Boosted weak classifiers. |
| | XGBoost | 2016 | Chen & Guestrin | Scalable gradient boosting, strong results on tabular data. |
| **SVM** | Support Vector Machines | 1995 | Cortes & Vapnik | Dominant for high-dimensional features pre-DL. |

➡️ These models dominated from the **1980s through the 2000s**, especially in **text & tabular classification**; gradient-boosted trees remain a common choice for tabular data.  
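
As a concrete example of a shallow linear classifier from the table, here is a minimal sketch of the perceptron learning rule in plain Python. The toy dataset (logical AND), the learning rate, the epoch limit, and the function names are assumptions chosen for illustration.

```python
def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Perceptron rule: update the weights only on misclassified samples."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            pred = 1 if activation > 0 else 0
            error = y - pred  # -1, 0, or +1
            if error != 0:
                w = [wi + lr * error * xi for wi, xi in zip(w, x)]
                b += lr * error
                mistakes += 1
        if mistakes == 0:  # every sample classified correctly: stop early
            break
    return w, b

def predict(w, b, x):
    """Threshold the linear score w . x + b at zero."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# Toy, linearly separable data: logical AND
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 0, 0, 1]
w, b = train_perceptron(X, y)
print([predict(w, b, x) for x in X])  # [0, 0, 0, 1]
```

The rule only converges when the classes are linearly separable, which is one reason kernel SVMs and tree ensembles became the preferred tools for harder problems.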

---

## 🔹 Classification in Deep Learning

### 1. Feedforward & Early Neural Nets
- **Multilayer Perceptrons (MLP)** – 1980s/1990s.  
  - Trained with backpropagation for supervised classification, but deeper networks were hindered by vanishing gradients.  
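
To make the vanishing-gradient problem concrete: the derivative of the sigmoid is at most 0.25, and backpropagation multiplies one such factor per layer. The sketch below uses a deliberately simplified setup, a chain of scalar sigmoid layers with every weight fixed to 1.0. That setup is an assumption for illustration, not a realistic network.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def chain_gradient(x, depth, weight=1.0):
    """d(output)/d(input) for `depth` stacked scalar sigmoid layers.

    Each layer computes a = sigmoid(weight * a_prev); the chain rule
    contributes one factor weight * sigmoid'(z) per layer.
    """
    a, grad = x, 1.0
    for _ in range(depth):
        z = weight * a
        a = sigmoid(z)
        grad *= weight * a * (1.0 - a)  # sigmoid'(z) = a * (1 - a) <= 0.25
    return grad

for depth in (1, 5, 10, 20):
    print(depth, chain_gradient(0.5, depth))
```

With unit weights the gradient is bounded by `0.25 ** depth`, so the printed values shrink geometrically as depth grows. Early layers therefore receive almost no learning signal.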

### 2. CNNs for Image Classification
- **LeNet-5** – LeCun et al. (1998): Handwritten digit recognition.  
- **AlexNet** – Krizhevsky et al. (2012): ImageNet breakthrough, CNN dominance.  
- **VGGNet (2014), GoogLeNet (2014), ResNet (2015):** Deeper & more accurate supervised classifiers.  

### 3. RNNs for Sequence Classification
- **LSTM (1997):** Applied to supervised text & speech classification.  
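- **Deep Speech (2014):** End-to-end speech recognition, predicting a character distribution for each audio frame.  
- **Seq2Seq (2014):** Translation framed as a sequence of classification steps, each predicting the next output token from the vocabulary.  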

### 4. Transformers in NLP Classification
- **BERT (2018):** Fine-tuning for sentiment, QA, NLI.  
- **GPT models (2018–2023):** Autoregressive Transformers adapted for classification via fine-tuning or prompting.  
- **T5 (2020):** Casts classification as text-to-text: the model generates the label as a string.  
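
To illustrate what treating classification as text-to-text involves, here is a minimal sketch of the data formatting only, with no model involved. The task prefix, the label strings, and the helper names are hypothetical and are not part of any library's API.

```python
# Hypothetical label vocabulary for a sentiment task
LABELS = ["negative", "positive"]

def to_text_pair(sentence, label_id, prefix="sentiment: "):
    """Turn a (sentence, class index) pair into (input text, target text)."""
    return prefix + sentence, LABELS[label_id]

def parse_prediction(generated_text):
    """Map generated text back to a class index; -1 if it is not a known label."""
    text = generated_text.strip().lower()
    return LABELS.index(text) if text in LABELS else -1

source, target = to_text_pair("A warm, funny film.", 1)
print(source)                          # sentiment: A warm, funny film.
print(target)                          # positive
print(parse_prediction(" Positive "))  # 1
print(parse_prediction("maybe"))       # -1
```

Because the model emits free text, the parsing step needs a fallback for outputs that match no label. A classifier with a fixed softmax head never faces that case.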

### 5. Modern Vision Transformers
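- **ViT (2021):** Pre-trained on large datasets such as ImageNet-21k and JFT-300M, then fine-tuned on the target task.  
- Matched or exceeded strong CNNs on **vision classification benchmarks** when given enough pre-training data.  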

---

## 🔹 Applications of Classification in AI
- **Vision:** Object recognition, defect detection, medical image diagnosis.  
- **NLP:** Sentiment analysis, intent detection, topic classification.  
- **Speech & Audio:** Speaker ID, emotion classification.  
- **Finance:** Fraud detection, credit scoring.  
- **Healthcare:** Disease classification, biomarker detection.  

---

## ✅ Key Insights
- **In Classical ML:** Classification = logistic regression, trees, SVM, ensembles.  
- **In Deep Learning:** CNNs (vision), RNNs (sequences), Transformers (NLP/vision).  
- **Today:** Classification is often solved by fine-tuning **foundation models** (BERT, ViT, GPT) on labeled datasets.  
