- What is machine learning? (10 min) [slides]
- Supervised and unsupervised learning (6 min) [slides]
- Practical notebook: Python and NumPy [colab]
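As a taste of the NumPy style used throughout these practicals, here is a small warm-up sketch (my own illustration, not the notebook's contents; the array values are made up) contrasting an explicit Python loop with the equivalent vectorised computation:

```python
import numpy as np

# A toy design matrix (3 examples, 2 features) and a weight vector.
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
w = np.array([0.5, -1.0])

# Loop version: one dot product per example.
y_loop = np.array([sum(x_j*w_j for x_j, w_j in zip(x, w)) for x in X])

# Vectorised version: a single matrix-vector product.
y_vec = X @ w

assert np.allclose(y_loop, y_vec)
print(y_vec)  # [-1.5 -2.5 -3.5]
```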
- Simple linear regression (14 min) [slides]
- Vector and matrix derivatives (13 min) [slides]
- Multiple linear regression - Model and loss (16 min) [slides]
- Multiple linear regression - Optimisation (8 min) [slides]
- Polynomial regression and basis functions (15 min) [slides]
- Overfitting (10 min) [slides]
- Regularisation (15 min) [slides]
- Evaluation and interpretation (11 min) [slides]
- Practical notebook: Linear regression [colab]
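To give a concrete feel for the regression videos above, here is a minimal NumPy sketch (again my own illustration, not the notebook itself) that fits a line in closed form and adds a ridge-regularised variant; the synthetic data and the regularisation weight are assumptions:

```python
import numpy as np

# Synthetic data from y = 2x - 1 plus noise (values chosen for illustration).
rng = np.random.default_rng(0)
x = rng.uniform(0, 5, size=50)
y = 2.0*x - 1.0 + rng.normal(scale=0.5, size=50)

# Design matrix with a bias column, so that y ≈ w[0] + w[1]*x.
X = np.column_stack([np.ones_like(x), x])

# Closed-form least squares: lstsq solves min_w ||Xw - y||^2.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# Ridge-regularised solution: w = (X'X + lam*I)^{-1} X'y.
# (For simplicity this also penalises the bias term.)
lam = 0.1
w_ridge = np.linalg.solve(X.T @ X + lam*np.eye(X.shape[1]), X.T @ y)

print(w)        # Roughly [-1, 2]
print(w_ridge)  # Similar, shrunk slightly towards zero
```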
- Training, validating and testing (18 min) [slides]
- Maximum likelihood estimation (20 min) [slides]
- Multivariate Gaussian distribution (5 min) [slides]
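Since maximum likelihood estimation can feel abstract, this short sketch (data and parameters made up for illustration) checks numerically that the ML estimates for a multivariate Gaussian are the sample mean and the divide-by-N sample covariance:

```python
import numpy as np

# Sample from a 2-D Gaussian with known parameters.
rng = np.random.default_rng(0)
mu_true = np.array([1.0, -2.0])
Sigma_true = np.array([[2.0, 0.5],
                       [0.5, 1.0]])
X = rng.multivariate_normal(mu_true, Sigma_true, size=5000)

# Maximum likelihood estimates: the sample mean and the (biased,
# divide-by-N) sample covariance.
mu_hat = X.mean(axis=0)
Sigma_hat = (X - mu_hat).T @ (X - mu_hat)/len(X)

print(mu_hat)     # Close to [1, -2]
print(Sigma_hat)  # Close to [[2, 0.5], [0.5, 1]]
```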
- The classification task (9 min) [slides]
- K-nearest neighbours (15 min) [slides]
- Bayes classifier and naive Bayes (17 min) [slides]
- Generative vs discriminative (8 min) [slides]
- Logistic regression - Model and loss (14 min) [slides]
- Gradient descent - Fundamentals (11 min) [slides]
- Logistic regression - Optimisation (7 min) [slides]
- Logistic regression - The decision boundary and weight vector (21 min) [slides]
- Logistic regression - Basis functions and regularisation (6 min) [slides]
- Multiclass - One-vs-rest classification (5 min) [slides]
- Multiclass - Softmax regression (15 min) [slides]
- Feature normalisation and scaling (14 min) [slides]
- Categorical features and categorical output (9 min) [slides]
- Accuracy, precision, recall, F1 (18 min) [slides]
- Precision, recall example (10 min) [slides]
- Practical notebook: Classification [data1, data2, data3, colab]
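As a rough companion to the logistic regression and evaluation videos above (not the notebook's actual contents; the toy data, learning rate and number of steps are my own choices), here is binary logistic regression trained with batch gradient descent, followed by precision and recall:

```python
import numpy as np

# Toy two-class data: class 0 around (-1, -1), class 1 around (1, 1).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-1.0, 1.0, (50, 2)),
               rng.normal(1.0, 1.0, (50, 2))])
y = np.concatenate([np.zeros(50), np.ones(50)])

# Add a bias column to the design matrix.
Xb = np.column_stack([np.ones(len(X)), X])

def sigmoid(a):
    return 1.0/(1.0 + np.exp(-a))

# Batch gradient descent on the mean negative log likelihood.
w = np.zeros(Xb.shape[1])
eta = 0.1
for _ in range(1000):
    p = sigmoid(Xb @ w)              # Predicted P(y = 1 | x)
    w -= eta*(Xb.T @ (p - y))/len(y)

# Evaluation: accuracy, precision and recall at a 0.5 threshold.
pred = (sigmoid(Xb @ w) > 0.5).astype(int)
tp = np.sum((pred == 1) & (y == 1))
fp = np.sum((pred == 1) & (y == 0))
fn = np.sum((pred == 0) & (y == 1))
print("accuracy: ", np.mean(pred == y))
print("precision:", tp/(tp + fp))
print("recall:   ", tp/(tp + fn))
```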
- Intro - Decision trees for classification (10 min) [slides]
- Intro - Regression trees (12 min) [slides]
- Regression trees - Model (11 min) [slides]
- Regression trees - Algorithm (18 min) [slides]
- Regression trees - Tree pruning (9 min) [slides]
- Decision trees - Classification (7 min) [slides]
- Decision trees - Algorithm (16 min) [slides]
- Decision trees - In practice (8 min) [slides]
- Practical notebook: Decision trees [data, colab]
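The notebook's data files aren't reproduced here, so as a stand-in this sketch (an assumption on my part; the course notebook may use different data) fits a depth-limited scikit-learn tree, where restricting the depth plays a similar role to pruning:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Fit a depth-limited tree; restricting depth is one way to curb overfitting.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X_train, y_train)
print(tree.score(X_test, y_test))  # Held-out accuracy
```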
- Bagging (13 min) [slides]
- Random forests (7 min) [slides]
- Boosting for regression (21 min) [slides]
- AdaBoost for classification - Setup (10 min) [slides]
- AdaBoost for classification - Step-by-step (15 min) [slides]
- AdaBoost for classification - Details (11 min) [slides]
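To connect the ensemble videos to runnable code, here is a hedged scikit-learn sketch (the synthetic data and hyperparameters are arbitrary choices, not from the course) comparing a random forest with AdaBoost under cross-validation:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

# A synthetic binary classification problem (parameters are arbitrary).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Random forest: bagged trees with a random feature subset at each split.
forest = RandomForestClassifier(n_estimators=100, random_state=0)

# AdaBoost: sequentially reweighted weak learners (stumps by default).
boost = AdaBoostClassifier(n_estimators=100, random_state=0)

for name, model in [("forest", forest), ("adaboost", boost)]:
    scores = cross_val_score(model, X, y, cv=5)
    print(name, scores.mean())
```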
- Introduction to unsupervised learning (19 min) [slides]
- K-means clustering - Algorithm (16 min) [slides]
- K-means clustering - Details (14 min) [slides]
- Practical notebook: Clustering [data, colab]
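For readers who want to see the k-means algorithm from the videos above in code, here is a minimal from-scratch NumPy sketch (the toy data are made up, and for simplicity it assumes no cluster ever ends up empty):

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Basic k-means: alternate assignment and mean-update steps."""
    rng = np.random.default_rng(seed)
    # Initialise centroids as k randomly chosen data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: each point goes to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centroid becomes the mean of its points
        # (assumes no cluster is empty, which holds for this toy data).
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centroids, centroids):
            break  # Converged
        centroids = new_centroids
    return centroids, labels

# Toy data: two well-separated blobs (values chosen for illustration).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(4, 0.5, (50, 2))])
centroids, labels = kmeans(X, k=2)
print(centroids)  # One centroid near (0, 0), the other near (4, 4)
```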
- PCA - Introduction (16 min) [slides]
- PCA - Mathematical background (7 min) [slides]
- PCA - Setup (17 min) [slides]
- PCA - Learning (19 min) [slides]
- PCA - Minimising reconstruction (7 min) [slides]
- PCA - Relationship to SVD (9 min) [slides]
- PCA - Steps (6 min) [slides]
- Practical notebook: Dimensionality reduction [colab]
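The "Relationship to SVD" video above suggests the most compact implementation, so here is a minimal PCA sketch via the SVD (my own illustration with made-up data, not the notebook's contents):

```python
import numpy as np

def pca(X, k):
    """Project X onto its top-k principal components using the SVD."""
    mu = X.mean(axis=0)
    Xc = X - mu  # Centre the data first.
    # The rows of Vt are the principal directions (eigenvectors of the
    # covariance matrix), ordered by decreasing singular value.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:k].T            # (D, k) projection matrix
    Z = Xc @ W              # (N, k) low-dimensional codes
    X_recon = Z @ W.T + mu  # Reconstruction from k components
    return Z, X_recon

# Toy 3-D data that mostly lies along a single direction.
rng = np.random.default_rng(0)
t = rng.normal(size=(200, 1))
X = t @ np.array([[1.0, 2.0, 3.0]]) + 0.1*rng.normal(size=(200, 3))

Z, X_recon = pca(X, k=1)
print(np.mean((X - X_recon)**2))  # Small reconstruction error
```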
- Neural network preliminaries: Vector and matrix derivatives (5 min) [slides]
- Neural network preliminaries: The chain rule for vector derivatives (7 min) [slides]
- Neural network preliminaries: Gradient descent (4 min) [slides]
- Neural network preliminaries: Logistic regression, softmax regression and basis functions (6 min) [slides]
- From logistic regression with basis functions to neural networks (19 min) [slides]
- Why is it called a neural network? (4 min) [slides]
- Backpropagation (without forks) (31 min) [slides]
- Backprop for a multilayer feedforward neural network (4 min) [slides]
- Computational graphs and automatic differentiation for neural networks (7 min) [slides]
- Common derivatives for neural networks (7 min) [slides]
- A general notation for derivatives (in neural networks) (8 min) [slides]
- Forks in neural networks (14 min) [slides]
- Backpropagation in general (now with forks) (4 min) [slides]
- What is the difference between negative log likelihood and cross entropy? (in neural networks) (9 min) [slides]
- Neural networks in practice (7 min) [slides]
- Neural networks examples: Natural language processing (8 min) [slides]
- Embedding layers in neural networks (10 min) [slides]
- What should I read to learn about neural networks? (3 min) [slides]
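To tie the backpropagation videos together, here is a minimal sketch of a one-hidden-layer network trained with manually derived gradients; the XOR task, architecture, learning rate and number of steps are my own choices for illustration, not from the course:

```python
import numpy as np

# XOR is not linearly separable, so a hidden layer is needed.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

rng = np.random.default_rng(0)
W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)

def sigmoid(a):
    return 1.0/(1.0 + np.exp(-a))

eta = 1.0
for _ in range(5000):
    # Forward pass.
    h = np.tanh(X @ W1 + b1)  # Hidden layer
    p = sigmoid(h @ W2 + b2)  # Output probability
    # Backward pass for the mean binary cross-entropy (negative log
    # likelihood) loss, applying the chain rule layer by layer.
    dz2 = (p - y)/len(X)      # dL/d(output pre-activation)
    dW2 = h.T @ dz2; db2 = dz2.sum(axis=0)
    dh = dz2 @ W2.T
    dz1 = dh*(1 - h**2)       # tanh'(a) = 1 - tanh(a)^2
    dW1 = X.T @ dz1; db1 = dz1.sum(axis=0)
    # Gradient descent update.
    W1 -= eta*dW1; b1 -= eta*db1
    W2 -= eta*dW2; b2 -= eta*db2

print(np.round(p.ravel(), 2))  # Should approach [0, 1, 1, 0]
```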
I use the following notes when presenting the above material in in-person lectures.
- Introduction to machine learning
- Simple linear regression
- Multiple linear regression
- Vector and matrix derivatives
- Polynomial regression and basis functions
- Overfitting and regularisation
- Regression: Evaluation and interpretation
- Training, validating and testing
- Probability refresher
- Maximum likelihood estimation
- Multivariate Gaussian distribution
- Classification
- Gradient descent
- Binary logistic regression
- Multiclass logistic regression
- Preprocessing: Normalisation, scaling and categorical data
- Classification evaluation
- Introduction to trees
- Ensemble methods
- Introduction to unsupervised learning
- K-means clustering
- Principal components analysis
- Introduction to neural networks
- Revisiting supervised and unsupervised learning
These videos are heavily inspired by three courses: the MLPR course taught by Iain Murray at the University of Edinburgh; the machine learning course taught by Greg Shakhnarovich at TTI-Chicago; and the Coursera machine learning course taught by Andrew Ng. I also consulted the textbook An Introduction to Statistical Learning, especially for examples.
Herman Kamper, 2020-2024
This work is released under a Creative Commons Attribution-ShareAlike license (CC BY-SA 4.0).