
DSCI 572: Supervised Learning II

An introduction to optimization for machine learning. Computation of derivatives. Deep learning.

2019 Instructor: Mike Gelbart

2019 Lecture Schedule

Course structure: this course will be delivered as a flipped classroom, which means you are expected to watch the lecture videos before class. While you're not expected to understand everything just from watching a video once, you're expected to come to class with a basic familiarity with what it's all about; we won't be starting again from scratch in class. During the lecture time itself, we will work on practical examples in Python. To offset the extra 1-2 hours per week spent watching videos, we will aim to make the labs shorter than in other courses.

Some context: these videos are from Mike's undergraduate machine learning course, CPSC 340, which contains a lot of the same material. They were filmed in January-April 2018. You can find the accompanying slides, and some supplementary readings, here. By the end of MDS we will cover roughly everything in CPSC 340, although often skipping over the implementation details. On the other hand, we cover machine learning topics that do not appear in CPSC 340, like time series data and natural language data. In MDS we also benefit greatly from the statistical perspective of "stat stream" courses.

Video timings: video links have start times embedded in them, which is where you are supposed to start watching from. End times are specified below if you're not supposed to watch the whole video. I recommend watching the videos at 1.25x speed.

| # | Topic | To watch before class | Optional reading |
|---|-------|-----------------------|------------------|
| 1 | Gradient descent | Optimization video, Gradient descent video | |
| 2 | Numerical errors | none | Chapter 2 of Ascher and Greif's book, available online for students (you must be on campus wifi) |
| 3 | Computing derivatives | none | Autograd tutorial; Automatic Differentiation in Machine Learning: a Survey; Dougal Maclaurin's thesis, section 2.5 and Chapter 4 |
| 4 | Neural networks: predict | 3Blue1Brown's But what is a Neural Network?; second part of neural network predict video | DL Book chapter 6; video; Fortune article; various resources below |
| 5 | Stochastic gradient, neural networks: fit | 3Blue1Brown's Gradient descent, how neural networks learn; Stochastic gradient video; first part of neural network fit video up to 36:00 | Why Momentum Really Works; Bottoming out; DL Book chapters 7-8; An overview of gradient descent optimization algorithms |
| 6 | Convolutions, intro to convolutional neural networks | Second part of neural network fit video | Convolutional Neural Networks (many students found this one helpful! ⭐️); An Intuitive Explanation of Convolutional Neural Networks; ConvNet notes; DL Book chapter 9 |
| 7 | Convolutional neural networks: predict | Convolutional neural network video | Notes from Coursera Deep Learning courses by Andrew Ng; Visualizing and Understanding Convolutional Neural Networks; The Building Blocks of Interpretability |
| 8 | CNN review, deep learning wrap-up, supervised learning diagnostics | Part 2 of Ali Rahimi @ NIPS 2017 (video, 10 min at 1x speed) | The entire Ali Rahimi video; some disturbing results and follow-up papers here (see Figure 1 😱) and here; deepart; paper on artwork; kaggle competition writeup |
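To give a flavour of topic 1 above: gradient descent minimizes a differentiable function by repeatedly stepping against the gradient, x ← x − α f′(x). Here is a minimal sketch (the quadratic objective, step size, and iteration count are illustrative assumptions, not course material):

```python
# Minimal gradient descent sketch on f(x) = (x - 3)^2,
# whose derivative is f'(x) = 2(x - 3). The minimizer is x = 3.

def gradient_descent(grad, x0, alpha=0.1, n_iters=100):
    """Run n_iters steps of x <- x - alpha * grad(x) starting from x0."""
    x = x0
    for _ in range(n_iters):
        x = x - alpha * grad(x)
    return x

grad = lambda x: 2 * (x - 3)          # derivative of (x - 3)^2
x_min = gradient_descent(grad, x0=0.0)
print(x_min)                           # converges toward 3
```

With a well-chosen step size the iterates contract toward the minimizer geometrically; a step size that is too large makes them diverge, which the lectures discuss.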


  1. Optimization, gradient descent
  2. Floating point issues, computing derivatives
  3. Neural networks
  4. Convolutional neural networks

Reference Material

ML-related textbooks

  • James, Gareth; Witten, Daniela; Hastie, Trevor; and Tibshirani, Robert. An Introduction to Statistical Learning: with Applications in R. 2014. Plus Python code and more Python code.
  • Russell, Stuart, and Peter Norvig. Artificial intelligence: a modern approach. 1995.
  • David Poole and Alan Mackworth. Artificial Intelligence: foundations of computational agents. 2nd edition (2017). Free e-book.
  • Kevin Murphy. Machine Learning: A Probabilistic Perspective. 2012.
  • Christopher Bishop. Pattern Recognition and Machine Learning. 2007.
  • Pang-Ning Tan, Michael Steinbach, Vipin Kumar. Introduction to Data Mining. 2005.
  • Jure Leskovec, Anand Rajaraman, and Jeffrey David Ullman. Mining of Massive Datasets. 2nd edition, 2014.

Math for ML

Other ML resources

Deep learning resources
