
dewith/road-to-transformers


Road to Transformers 🌀

A YouTube Playlist to learn the foundations of deep learning and how to build a transformer.

Video summaries

| # | Title | Video | TL;DW |
|---|-------|-------|-------|
| 1 | Stanford CS230 Lecture 1 - Class Introduction & Logistics | Link | Andrew Ng explains that deep learning's success is driven by the availability of large datasets and powerful computational resources, which make it possible to train larger neural networks that reach higher performance. CS230 aims to help students understand deep learning systems and become experts in building and applying them. |
| 2 | Stanford CS230 Lecture 2 - Deep Learning Intuition | Link | The lecture builds intuition for deep learning and shows how to approach different projects with it. It discusses the components of a model (architecture and parameters) and the importance of choosing the right loss function. It also covers encodings and how successive neural network layers capture different levels of information. |
| 3 | Building a neural network FROM SCRATCH (no Tensorflow/Pytorch, just numpy & math) | Link | Samson Zhang builds a neural network from scratch with NumPy to classify handwritten digits from the MNIST dataset. The network takes a 784-node input (one node per pixel) and has two layers: a hidden layer with 10 units (ReLU activation) and an output layer with 10 units (softmax activation), one per digit from 0 to 9. Training uses forward propagation, backward propagation to compute the gradients, and gradient descent to update the weights and biases. 👨‍💻 The implementation notebook is named nn_from_scratch.ipynb. |
| 4 | Stanford CS230 Lecture 3 - Full-Cycle Deep Learning Projects | Link | The lecture walks through the full cycle of a deep learning project and the steps involved in building a successful machine learning application, using a voice-activated device such as a smart speaker as the running example. |
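
The training loop summarized for video 3 can be sketched roughly as follows. This is a minimal illustration, not the contents of nn_from_scratch.ipynb: the function names, the weight initialization, the learning rate, and the step count are all assumptions made for this example, and it runs on any integer-labeled data rather than loading MNIST.

```python
import numpy as np

def init_params(rng, n_in=784, n_hidden=10, n_out=10):
    # Small random weights and zero biases (an assumed initialization).
    W1 = rng.normal(0.0, 0.01, (n_hidden, n_in))
    b1 = np.zeros((n_hidden, 1))
    W2 = rng.normal(0.0, 0.01, (n_out, n_hidden))
    b2 = np.zeros((n_out, 1))
    return W1, b1, W2, b2

def relu(Z):
    return np.maximum(Z, 0)

def softmax(Z):
    # Subtract the column-wise max for numerical stability.
    E = np.exp(Z - Z.max(axis=0, keepdims=True))
    return E / E.sum(axis=0, keepdims=True)

def forward(X, W1, b1, W2, b2):
    # X has shape (n_in, m): one column per example.
    Z1 = W1 @ X + b1
    A1 = relu(Z1)
    A2 = softmax(W2 @ A1 + b2)
    return Z1, A1, A2

def one_hot(y, n_classes):
    Y = np.zeros((n_classes, y.size))
    Y[y, np.arange(y.size)] = 1
    return Y

def backward(X, y, Z1, A1, A2, W2):
    # Gradients of the mean cross-entropy loss with respect to each parameter.
    m = y.size
    dZ2 = A2 - one_hot(y, A2.shape[0])
    dW2 = dZ2 @ A1.T / m
    db2 = dZ2.sum(axis=1, keepdims=True) / m
    dZ1 = (W2.T @ dZ2) * (Z1 > 0)  # ReLU derivative is 1 where Z1 > 0
    dW1 = dZ1 @ X.T / m
    db1 = dZ1.sum(axis=1, keepdims=True) / m
    return dW1, db1, dW2, db2

def cross_entropy(A2, y):
    # Mean negative log-probability assigned to the correct class.
    return -np.mean(np.log(A2[y, np.arange(y.size)] + 1e-12))

def train(X, y, steps=100, lr=0.1, seed=0):
    # Plain gradient descent: backprop computes gradients, then each
    # parameter moves a small step against its gradient.
    params = init_params(np.random.default_rng(seed), n_in=X.shape[0])
    for _ in range(steps):
        Z1, A1, A2 = forward(X, *params)
        grads = backward(X, y, Z1, A1, A2, params[2])
        params = tuple(p - lr * g for p, g in zip(params, grads))
    return params
```

With MNIST, `X` would be the pixel matrix scaled to [0, 1] with shape (784, m) and `y` the integer digit labels; predictions are then `forward(X, *params)[2].argmax(axis=0)`.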

— More videos will be added as I watch them.

ML concepts to explore

These are some concepts that came up in the videos and that I want to understand more clearly.

  • Weight decay
  • L2 Normalization
  • Exploding gradient
  • Adam Optimizer
  • L2 Distance
  • Gram matrix
  • This loss (binary cross-entropy): L = -(y log(ypred) + (1 - y) log(1 - ypred))
  • YOLO Loss Function
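
Three of the items above are short enough to write out directly. The following is a small NumPy sketch of the loss from the list, the L2 distance, and the Gram matrix; the function names and the `eps` clipping constant are assumptions chosen for this example, not taken from the videos.

```python
import numpy as np

def binary_cross_entropy(y, y_pred, eps=1e-12):
    # L = -(y log(y_pred) + (1 - y) log(1 - y_pred))
    # Clipping keeps log() away from 0; eps is an assumed safety margin.
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -(y * np.log(y_pred) + (1 - y) * np.log(1 - y_pred))

def l2_distance(a, b):
    # Euclidean distance between two vectors.
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return np.sqrt(np.sum((a - b) ** 2))

def gram_matrix(F):
    # F holds one feature vector per row; G[i, j] is the dot product
    # of row i with row j.
    F = np.asarray(F, dtype=float)
    return F @ F.T
```

For instance, `binary_cross_entropy(1, 0.5)` equals log 2, and `l2_distance([0, 0], [3, 4])` equals 5.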

Math concepts to review

  • Scalar calculus: This is the study of the derivatives of scalar functions. You'll need to know how to find the derivative of a function using the limit definition, as well as the basic differentiation rules for sums, products, quotients, and composite functions.

  • Vector calculus: This is the study of the derivatives of functions of several variables. You'll need to know how to find the gradient of a scalar field, and the divergence and curl of a vector field.

  • Linear algebra: This is the study of matrices and vectors. You'll need to know how to add, subtract, multiply, and transpose matrices, as well as how to find the determinant and inverse of a matrix.
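
One way to review these topics is to check hand-derived results numerically. The sketch below approximates a derivative and a gradient with central differences (a symmetric variant of the limit definition) and runs the listed matrix operations in NumPy. The helper names, the step size `h`, and the example matrices are assumptions made for illustration.

```python
import numpy as np

def derivative(f, x, h=1e-6):
    # Central difference: (f(x + h) - f(x - h)) / (2h) for small h.
    return (f(x + h) - f(x - h)) / (2 * h)

def gradient(f, x, h=1e-6):
    # Numerical gradient of a scalar function of a 1-D vector,
    # perturbing one coordinate at a time.
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = h
        g[i] = (f(x + step) - f(x - step)) / (2 * h)
    return g

# Basic matrix operations on two example 2x2 matrices.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])
total = A + B
product = A @ B             # matrix product, not element-wise
transpose = A.T
det = np.linalg.det(A)      # 2*3 - 1*1 = 5
inverse = np.linalg.inv(A)  # exists because det != 0
```

Comparing `derivative(lambda x: x**2, 3.0)` with the analytic answer 2x = 6 is the same kind of check that is often used to validate a backpropagation implementation.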