# Energy-Based Models and the Score Function -- Learning Path

## Welcome to the Vizuara Course on Energy-Based Models

This series of four notebooks builds from the foundations of energy-based modeling to the score function, the gradient of the log-density that underlies modern diffusion models.

### Prerequisites
- Basic PyTorch (tensors, autograd, nn.Module)
- Familiarity with probability distributions (Gaussian, sampling)
- Calculus (gradients, derivatives)

### Notebook Series

**Notebook 1: Energy Functions and the Boltzmann Distribution**
- What is an energy function?
- How does the Boltzmann distribution convert energy to probability?
- Why is the partition function intractable?
- Hands-on: implement and visualize energy landscapes
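
As a preview of the first notebook, here is a minimal sketch of how the Boltzmann distribution turns energies into probabilities, p(x) = exp(-E(x) / T) / Z. The double-well `energy` function, the grid, and the helper name `boltzmann_on_grid` are illustrative assumptions, not the notebook's own code. It uses only the Python standard library so it runs without PyTorch.

```python
import math

def energy(x):
    # Hypothetical double-well energy with minima at x = -1 and x = +1.
    return (x ** 2 - 1.0) ** 2

def boltzmann_on_grid(energy_fn, xs, temperature=1.0):
    """Return probabilities proportional to exp(-E(x) / T) on a finite grid."""
    weights = [math.exp(-energy_fn(x) / temperature) for x in xs]
    z = sum(weights)  # partition function, restricted to the grid points
    return [w / z for w in weights], z

# 41 evenly spaced points on [-2, 2].
xs = [-2.0 + 0.1 * i for i in range(41)]
probs, z = boltzmann_on_grid(energy, xs)
```

On a 1D grid the normalizer `z` is a sum over 41 terms. The same sum over a grid in d dimensions has 41^d terms, which is one way to see why the partition function becomes intractable for high-dimensional data.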

**Notebook 2: The Score Function and Langevin Dynamics**
- The score function as a compass toward high probability
- How the score bypasses the partition function
- Langevin dynamics: sampling with score + noise
- Hands-on: implement Langevin sampling from known distributions
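
The topics above can be previewed with a small sketch of Langevin sampling from a known distribution. It assumes a 1D Gaussian target with energy E(x) = (x - mu)^2 / (2 sigma^2), so the score is -dE/dx and the partition function never appears. The function names, step size, and chain counts are illustrative choices, not values taken from the notebook.

```python
import math
import random

MU, SIGMA = 2.0, 0.5  # assumed target: N(2.0, 0.5^2)

def score(x):
    # Score = d/dx log p(x) = -dE/dx. The normalizer Z drops out entirely.
    return -(x - MU) / SIGMA ** 2

def langevin_sample(score_fn, n_chains=1000, n_steps=300, step=0.01, seed=0):
    """Run independent unadjusted Langevin chains and return their final states."""
    rng = random.Random(seed)
    xs = [rng.uniform(-3.0, 3.0) for _ in range(n_chains)]
    noise_scale = math.sqrt(step)
    for _ in range(n_steps):
        # x <- x + (step / 2) * score(x) + sqrt(step) * z,  z ~ N(0, 1)
        xs = [x + 0.5 * step * score_fn(x) + noise_scale * rng.gauss(0.0, 1.0)
              for x in xs]
    return xs

samples = langevin_sample(score)
sample_mean = sum(samples) / len(samples)
sample_var = sum((x - sample_mean) ** 2 for x in samples) / len(samples)
```

With a small step size, `sample_mean` and `sample_var` should land close to 2.0 and 0.25. The finite step introduces a small bias in the variance, which shrinks as `step` decreases.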

**Notebook 3: Score Matching and Denoising Score Matching**
- Learning the score from data (Hyvärinen 2005)
- Denoising Score Matching: a cheaper alternative that avoids second derivatives of the model (Vincent 2010)
- Connection to DDPM and noise prediction
- Hands-on: train a score network and generate 2D samples
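
To make the denoising objective concrete, the sketch below fits a score model by denoising score matching at a single noise level. It swaps the notebook's neural network for a linear model s(x) = a * x + b fitted by closed-form least squares, which is an assumption made to keep the example dependency-free. The data distribution N(1, 1), the noise level, and the name `dsm_linear_fit` are likewise illustrative.

```python
import random

def dsm_linear_fit(data, sigma, seed=0):
    """Fit s(x) = a * x + b by least squares on the DSM regression problem.

    Inputs are noisy points x + sigma * eps; targets are -eps / sigma,
    the score of the Gaussian noise kernel at the noisy point.
    """
    rng = random.Random(seed)
    noisy, targets = [], []
    for x in data:
        eps = rng.gauss(0.0, 1.0)
        noisy.append(x + sigma * eps)
        targets.append(-eps / sigma)
    n = len(data)
    mean_x = sum(noisy) / n
    mean_t = sum(targets) / n
    cov = sum((x - mean_x) * (t - mean_t) for x, t in zip(noisy, targets)) / n
    var = sum((x - mean_x) ** 2 for x in noisy) / n
    a = cov / var            # slope of the fitted score
    b = mean_t - a * mean_x  # intercept of the fitted score
    return a, b

# Assumed toy data: samples from N(1, 1).
data_rng = random.Random(1)
data = [data_rng.gauss(1.0, 1.0) for _ in range(20000)]
a, b = dsm_linear_fit(data, sigma=0.5)
```

For this setup the noisy data follow N(1, 1 + 0.25), whose true score is -(x - 1) / 1.25, so the fit should give `a` near -0.8 and `b` near 0.8. The regression target is the added noise rescaled by -1 / sigma, which is why predicting the noise and predicting the score are the same task up to a known factor.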

**Notebook 4: Full Pipeline -- Multi-Scale Score Matching**
- Why multiple noise scales matter
- Noise-conditioned score networks
- Annealed Langevin dynamics
- Hands-on: generate samples from complex multimodal distributions
- Bridge to modern diffusion models (DDPM, Score SDE)
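
The annealing idea can be sketched on a toy problem where the noise-perturbed score is known exactly, so no network is needed. The example assumes a two-component 1D Gaussian mixture and a step size proportional to sigma^2, following the schedule in Song & Ermon (2019). The constants, the noise levels, and the function names are illustrative, and `perturbed_score` stands in for a trained noise-conditioned score network.

```python
import math
import random

MEANS = (-3.0, 3.0)   # assumed mixture means
WEIGHTS = (0.5, 0.5)  # assumed mixture weights
BASE_STD = 0.1        # assumed std of each component

def perturbed_score(x, sigma):
    """Exact score of the mixture after adding N(0, sigma^2) noise."""
    var = BASE_STD ** 2 + sigma ** 2
    log_w = [math.log(w) - (x - m) ** 2 / (2.0 * var)
             for w, m in zip(WEIGHTS, MEANS)]
    top = max(log_w)  # subtract the max for numerical stability
    resp = [math.exp(lw - top) for lw in log_w]
    total = sum(resp)
    return sum(r / total * (m - x) / var for r, m in zip(resp, MEANS))

def annealed_langevin(sigmas, n_chains=400, steps_per_level=60, eps=0.01, seed=0):
    """Run Langevin dynamics at each noise level, from largest to smallest."""
    rng = random.Random(seed)
    xs = [rng.uniform(-8.0, 8.0) for _ in range(n_chains)]
    for sigma in sigmas:
        alpha = eps * sigma ** 2 / sigmas[-1] ** 2  # step size for this level
        for _ in range(steps_per_level):
            xs = [x + 0.5 * alpha * perturbed_score(x, sigma)
                  + math.sqrt(alpha) * rng.gauss(0.0, 1.0) for x in xs]
    return xs

# Geometric schedule of 8 noise levels from 3.0 down to 0.1.
sigmas = [3.0 * (0.1 / 3.0) ** (i / 7) for i in range(8)]
samples = annealed_langevin(sigmas)
left_fraction = sum(1 for x in samples if x < 0.0) / len(samples)
```

At the largest noise level the two modes blur together, so chains can move freely between them. As sigma shrinks, the chains settle into one mode or the other, and `left_fraction` should end up close to the mixture weight of 0.5.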

### How to Use These Notebooks
1. Open each notebook in Google Colab and enable a GPU (Runtime > Change runtime type > GPU)
2. Run the cells in order, from top to bottom
3. Complete the TODO sections before looking at solutions
4. Use the reflection questions to deepen your understanding

### Estimated Time
- Notebook 1: 45 minutes
- Notebook 2: 60 minutes
- Notebook 3: 75 minutes
- Notebook 4: 90 minutes
- Total: ~4.5 hours

### References
- Hyvärinen (2005): Estimation of Non-Normalized Statistical Models by Score Matching
- Vincent (2010): A Connection Between Score Matching and Denoising Autoencoders
- Song & Ermon (2019): Generative Modeling by Estimating Gradients of the Data Distribution
- Ho et al. (2020): Denoising Diffusion Probabilistic Models