(work in progress)
These are my notes and exercises from Andrej Karpathy's course Neural Networks: Zero to Hero.
The course on GitHub: https://github.com/karpathy/nn-zero-to-hero
The course starts neural networks from scratch, explicitly computing the chain rule used in backpropagation for various neural network operations, and culminates in building a transformer model along the lines of the famous 'Attention Is All You Need' paper. Along the way, a simple neural network library is built, starting with a scalar, tensor-like class that tracks gradients and ending with neural network layers that expose a PyTorch-like interface.
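The gradient-tracking class described above can be sketched roughly as follows. This is not the course's actual micrograd code, just a minimal illustration of the idea: each operation records how to propagate gradients to its inputs, and `backward()` applies the chain rule in reverse topological order.

```python
import math

class Value:
    """A scalar value that tracks its gradient (a sketch, not micrograd itself)."""
    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None  # how to push out.grad to the children
        self._prev = set(_children)

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            # d(out)/d(self) = 1 and d(out)/d(other) = 1
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            # product rule: each input's gradient is scaled by the other input
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))
        def _backward():
            self.grad += (1 - t * t) * out.grad  # d/dx tanh(x) = 1 - tanh(x)^2
        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the computation graph, then apply the chain rule
        # from the output back to the leaves.
        topo, visited = [], set()
        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

# Example: a single neuron, y = tanh(w*x + b)
x, w, b = Value(2.0), Value(-1.0), Value(1.0)
y = (w * x + b).tanh()
y.backward()
print(y.data, w.grad)  # w.grad == (1 - tanh(-1)^2) * x
```

Calling `backward()` on the output fills in `.grad` for every `Value` in the graph, which is exactly the machinery a neural network needs for gradient descent.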
Lecture 1: The spelled-out intro to neural networks and backpropagation: building micrograd
Lecture 2: The spelled-out intro to language modeling: building makemore
Lecture 3: Building makemore Part 2: MLP
Lecture 4: Building makemore Part 3: Activations & Gradients, BatchNorm
Lecture 5: Building makemore Part 4: Becoming a Backprop Ninja
Lecture 6: Building makemore Part 5: Building WaveNet
Lecture 7: Let's build GPT: from scratch, in code, spelled out.