# 📊 Week 2: Probability and Machine Learning Foundations

---

## 🎯 Objective

To understand the **mathematical foundations** necessary for working with generative models. This includes essential **probability theory**, **statistics**, and **core machine learning concepts** that underpin generative AI algorithms.

---

## 🎲 Probability & Statistics Refresher

### 🔢 Key Concepts

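- **Random Variables**: Quantities whose values are determined by the outcome of a random phenomenon (formally, functions that map outcomes to numbers).
- **Probability Distributions**: Specify how likely each value of a random variable is, via a probability mass function for discrete variables (e.g., Bernoulli, Binomial, Multinomial) or a density for continuous ones (e.g., Normal).
- **Mean (μ)** and **Variance (σ²)**: The mean \( \mu = \mathbb{E}[X] \) measures central tendency; the variance \( \sigma^2 = \mathbb{E}[(X - \mu)^2] \) measures spread around the mean.
- **Expectation**: Probability-weighted average value of a random variable (discrete case shown; the sum becomes an integral for continuous variables):  
  \( \mathbb{E}[X] = \sum_x x \cdot P(x) \)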
- **Conditional Probability**:  
  \( P(A|B) = \frac{P(A \cap B)}{P(B)} \)
- **Bayes’ Theorem**:  
  \( P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)} \)
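- **KL Divergence**: A non-negative measure of how much two distributions \( P \) and \( Q \) differ. It is zero only when \( P = Q \), and it is not symmetric: in general \( D_{KL}(P \| Q) \neq D_{KL}(Q \| P) \):  
  \( D_{KL}(P \| Q) = \sum_x P(x) \log \frac{P(x)}{Q(x)} \)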
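
The formulas above can be checked numerically with a few lines of standard-library Python. This is a minimal sketch for discrete distributions only; the function names, the fair-die example, and the diagnostic-test numbers (1% prior, 95% sensitivity, 5% false-positive rate) are illustrative assumptions, not part of the course material.

```python
import math

def expectation(values, probs):
    """E[X] = sum over x of x * P(x) for a discrete random variable."""
    return sum(x * p for x, p in zip(values, probs))

def variance(values, probs):
    """Var(X) = E[(X - mu)^2]."""
    mu = expectation(values, probs)
    return sum(p * (x - mu) ** 2 for x, p in zip(values, probs))

def kl_divergence(p, q):
    """D_KL(P || Q) in nats; terms with P(x) = 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def bayes_posterior(prior, likelihood, evidence):
    """P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood * prior / evidence

# Hypothetical example 1: a fair six-sided die.
faces = [1, 2, 3, 4, 5, 6]
fair = [1 / 6] * 6
die_mean = expectation(faces, fair)   # about 3.5
die_var = variance(faces, fair)       # about 2.917 (= 35/12)

# Hypothetical example 2: KL divergence is not symmetric.
p = [0.5, 0.5]
q = [0.9, 0.1]
kl_pq = kl_divergence(p, q)           # about 0.511
kl_qp = kl_divergence(q, p)           # about 0.368

# Hypothetical example 3: Bayes' theorem for a diagnostic test.
prior = 0.01                          # P(disease)
sensitivity = 0.95                    # P(positive | disease)
false_positive_rate = 0.05            # P(positive | no disease)
evidence = sensitivity * prior + false_positive_rate * (1 - prior)
posterior = bayes_posterior(prior, sensitivity, evidence)  # about 0.161

print(die_mean, die_var, kl_pq, kl_qp, posterior)
```

Even with a 95% sensitive test, the posterior stays near 16% because the prior is so low, which is the kind of intuition Bayes' theorem is meant to build.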

---

## 🧠 Machine Learning Foundations

### 🧮 Supervised vs. Unsupervised Learning

|                      | **Supervised Learning**         | **Unsupervised Learning**         |
|----------------------|----------------------------------|------------------------------------|
| **Goal**             | Predict labels                  | Discover hidden patterns           |
| **Input**            | Features + labels               | Features only                      |
| **Examples**         | Regression, Classification      | Clustering, Dimensionality Reduction |

### 📈 Loss Functions
- **MSE (Mean Squared Error)** – Regression tasks
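- **Cross-Entropy Loss** – Multi-class classification tasks
- **Binary Cross-Entropy** – Binary classification (the two-class special case of cross-entropy)
- **KL Divergence** – Measures the mismatch between two distributions; used as a regularization term in VAEs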
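
As a rough illustration of the first three losses, here is a minimal standard-library sketch. The function names, the clipping constant `eps`, and the sample inputs are assumptions made for this example; deep learning frameworks ship their own, more numerically careful versions.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error: average of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Average of -[t*log(p) + (1-t)*log(1-p)] over labels t in {0, 1}."""
    total = 0.0
    for t, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

def cross_entropy(true_class, probs, eps=1e-12):
    """Multi-class cross-entropy for one example: -log of the true class probability."""
    return -math.log(max(probs[true_class], eps))

# Hypothetical inputs, chosen only to exercise each function.
mse_value = mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])        # 4/3
bce_value = binary_cross_entropy([1, 0], [0.9, 0.1])     # -log(0.9), about 0.105
ce_value = cross_entropy(2, [0.1, 0.2, 0.7])             # -log(0.7), about 0.357

print(mse_value, bce_value, ce_value)
```

All three losses shrink toward zero as predictions approach the targets, which is what makes them usable as training objectives.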

### 🔧 Optimization Techniques
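- **Gradient Descent** – Iteratively updates parameters in the direction of the negative gradient of the loss
- **Backpropagation** – Computes the gradients the optimizer needs by applying the chain rule through the network (a gradient-computation method rather than an optimizer itself)
- **Learning Rate Scheduling** – Adjusts the step size over the course of training
- **Stochastic Gradient Descent (SGD)** – Gradient descent using a random sample or mini-batch at each step
- **Adam Optimizer** – Adapts per-parameter step sizes using running estimates of the gradient's first and second moments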
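
To make the basic idea concrete, the sketch below runs plain (full-batch) gradient descent to fit a line \( y = wx + b \) under an MSE loss. The data, learning rate, and step count are assumptions picked so that this toy problem converges; the gradients are derived by hand here, whereas in a neural network backpropagation would compute them.

```python
def fit_line(xs, ys, lr=0.05, steps=2000):
    """Fit y = w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient, scaled by the learning rate.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical noise-free data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]

w, b = fit_line(xs, ys)
print(w, b)  # close to 2.0 and 1.0
```

SGD would replace the full sums with a single random example or a mini-batch per step, and a learning rate schedule would change `lr` as training progresses.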

---

## 🤖 Why These Foundations Matter for Generative AI

| Concept                   | Importance in Generative Models                                 |
|---------------------------|------------------------------------------------------------------|
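| **Probability**           | Generative models learn to approximate, and sample from, the **data distribution** |
| **Bayes’ Theorem**        | Basis for probabilistic inference, e.g., inferring latent variables in **VAEs** |
| **KL Divergence**         | Appears in the VAE training objective; measures the mismatch between two distributions |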
| **Latent Variables**      | Represent **hidden features** in data (e.g., z in VAE)           |
| **Loss Functions**        | Define the training objective for models like **GANs, VAEs, and diffusion models** |
| **Optimization**          | Used in training **deep neural networks** underlying GenAI      |
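
As one concrete instance of the KL divergence row, a common VAE setup (assumed here, not specified in these notes) uses a diagonal Gaussian over the latent variable z and a standard normal prior. In that case the KL term has the closed form \( \frac{1}{2} \sum_i (\mu_i^2 + \sigma_i^2 - 1 - \log \sigma_i^2) \). The function name and the log-variance parameterization below are illustrative choices.

```python
import math

def gaussian_kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    return sum(
        0.5 * (m * m + math.exp(lv) - 1.0 - lv)
        for m, lv in zip(mu, log_var)
    )

# Hypothetical 2-dimensional latent codes.
kl_at_prior = gaussian_kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])  # 0.0
kl_shifted = gaussian_kl_to_standard_normal([1.0, 0.0], [0.0, 0.0])   # 0.5

print(kl_at_prior, kl_shifted)
```

The term is zero when the encoder's distribution already matches the prior and grows as the mean or variance drifts away from it.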

---

## 🧪 Optional Practice / Assignments

1. Compute:
   - Mean, variance, and entropy of a small dataset
   - KL Divergence between two simple distributions
2. Classify points using:
   - Logistic Regression
   - Decision Tree
3. Visualize:
   - Probability distributions using matplotlib/seaborn

---

## 📚 Suggested Resources

- [StatQuest YouTube](https://www.youtube.com/user/joshstarmer) – Excellent visual explanations
- *Pattern Recognition and Machine Learning* by Christopher Bishop
- *Probabilistic Machine Learning* by Kevin Murphy
- [Khan Academy – Statistics & Probability](https://www.khanacademy.org/math/statistics-probability)

---

> 🧠 **Pro Tip for Students:** A strong foundation in probability and optimization will help you deeply understand how generative models *learn* to generate content.
