Lecture | Topics | Readings and useful links | Handouts |
---|---|---|---|
| Intro to ML; Decision Trees | Mitchell: Ch 3; Bishop: Ch 14.4; The Discipline of Machine Learning | Slides |
| Decision Tree Learning; Review of Probability | Mitchell: Ch 3; Andrew Moore's Basic Probability Tutorial | Slides; Annotated Slides |
| Probability and Estimation | Mitchell: Estimating Probabilities | Slides; Annotated Slides |
| Naive Bayes | Mitchell: Naive Bayes and Logistic Regression | Slides; Annotated Slides |
| Gaussian Naive Bayes | Mitchell: Naive Bayes and Logistic Regression | Slides; Annotated Slides |
| Logistic Regression | Mitchell: Naive Bayes and Logistic Regression | Slides; Annotated Slides |
| Linear Regression | | Slides; Annotated Slides |
| Learning Theory I | Mitchell: Ch 7; Notes on Generalization Guarantees | Slides |
| Learning Theory II | Mitchell: Ch 7; Notes on Generalization Guarantees | Slides |
| Learning Theory III | | Slides |
| Graphical Models I | Bishop: Ch 8, through 8.2 | Slides; Annotated Slides |
| Graphical Models II | | Annotated Slides |
| Graphical Models III | Bishop: Ch 8; Mitchell: Ch 6 | Slides; Annotated Slides |
| Exam #1 | | |
| EM and Clustering | Bishop: Ch 8; Mitchell: Ch 6 | Slides; Annotated Slides |
| Spring Break | | |
| Boosting | | Slides |
| AdaBoost, Margins, Perceptron | Notes on Perceptron | Slides; Slides (PPT) |
| Kernels | Bishop: 6.1 and 6.2 | Slides |
| SVM | Notes on SVM by Andrew Ng | Slides |
| Semi-supervised Learning | | Slides |
| Active Learning | | Slides |
| | | Slides |
| | | Slides |
| Never Ending Learning | | Slides |
| Neural Networks; Deep Learning | Mitchell: Ch 4 | Slides |
| Reinforcement Learning | | Slides |
| Deep Learning; Differential Privacy; Discussion on the Future of ML | | Slides (Privacy); Slides (Deep Nets) |
Andrew Ng's review notes on:
- Linear Algebra
- Probability Theory
- Multivariate Gaussian - I
- Multivariate Gaussian - II
- Convex Optimization - I
- Convex Optimization - II
- Hidden Markov Models
- Andrew Ng's notes on Loss Functions
- Andrew Ng's course notes on Linear and Logistic Regression
- The Theory Behind Overfitting, Cross Validation, Regularization, Bagging and Boosting: Tutorial by Ghojogh and Crowley.
- Andrew Ng's notes on Naive Bayes
- Andrew Ng's notes on SVM and Kernel Methods
- Andrew Ng's notes on Neural Networks and Deep Learning
- [Practical Tips] - Efficient Backprop by Yann LeCun
- Vapnik's paper on "Principles of Risk Minimization for Learning Theory", NIPS 1992
- Andrew Ng's ML Review Notes
- Spectral Methods for Dimensionality Reduction
- Nonlinear Dimensionality Reduction by Locally Linear Embedding
- MATLAB Workshop 2: An introduction to Support Vector Machine implementations in MATLAB
- Neighbourhood Components Analysis
- A Global Geometric Framework for Nonlinear Dimensionality Reduction
- Graphics Principles Cheat Sheet
- Neural Machine Translation and Sequence-to-sequence Models: A Tutorial
- Exponential Families
- Variational Inference: A Review for Statisticians
- Foundations of Data Science
- Optimization Methods for Large-Scale Machine Learning
- Convex Optimization
- Theoretical Computer Science Cheat Sheet
- Entropy, Relative Entropy, And Mutual Information
- Information Theory And Statistics
- Lecture #7: Understanding and Using Principal Component Analysis (PCA)
- Lecture #9: The Singular Value Decomposition (SVD) and Low-Rank Matrix Approximations
- CS224n: Natural Language Processing with Deep Learning (Lecture Notes: Part I)
- Super VIP Cheatsheet: Machine Learning
- CSE176 Introduction to Machine Learning — Lecture notes
- Linear Algebra Review and Reference
- Lecture notes (Roger Grosse)
- Data Mining, Inference, and Prediction
- A Primer on Neural Network Models for Natural Language Processing
- Machine Learning and Data Mining Lecture Notes
- Common tests are linear models
- Linear Algebra Abridged
- The Matrix Calculus You Need For Deep Learning
- Measure, Integration and Real Analysis
- Concise Machine Learning
- Machine Learning Basic Concepts
- Notation: Overview
- Lecture Notes 1: Vector spaces
- Probability Cheatsheet
- An overview of gradient descent optimization algorithms
- Lecture 0: Course Introduction
- A Structural Approach to Selection Bias
- Troubleshooting Deep Neural Networks: A Field Guide to Fixing Your Model
- Introduction to Applied Linear Algebra
- Python For Data Science Cheat Sheet (NumPy Basics)
- Data Wrangling with pandas Cheat Sheet
- Python For Data Science Cheat Sheet (Scikit-Learn)
- Python For Data Science Cheat Sheet (Keras)
- Data Wrangling with dplyr and tidyr Cheat Sheet
- Data Visualization with ggplot2 Cheat Sheet
- CS725: Foundations of Machine Learning - Lecture Notes
- Lecture Notes: Optimization for Machine Learning
- CS446: Machine Learning Lecture Notes
- Data Science Lecture Notes
- Lectures on Numerical Methods for Data Science
(Prof. Qiangfu Zhao, Prof. Yong Liu, and Prof. Yuichi Yaguchi)
Contents:
- History of AI and ML
- Fundamentals of machine learning
- Introduction to concept learning
- Basic statistical learning
- Bayesian networks
- Project I
- Multilayer perceptron
- Convolutional neural network
- Autoencoder
- Restricted Boltzmann machine
- Decision trees
- Project II
- [pdf] Introduction (notes)
- [pdf] Linear regression
- [pdf] Linear classification
- [pdf] Neural networks, kernel methods intro.
- [pdf] Kernel methods
- [pdf] EM and clustering (Notes)
- [pdf] Approximate inference (Notes)
- [pdf] Graphical models
- [pdf] Graphical models and message passing
- [pdf] MCMC and sampling methods
- [pdf] Bayesian nonparametric models
- Discrete Adversarial Attacks and Submodular Optimization with Applications to Text Classification
- Towards Deep Learning Models Resistant to Adversarial Attacks
- Synthesizing Robust Adversarial Examples
- Robust Physical-World Attacks on Deep Learning Visual Classification
- Audio Adversarial Examples: Targeted Attacks on Speech-to-Text
- PoTrojan: powerful neuron-level trojan designs in deep learning models
- Semantic Adversarial Examples
- Programmable Neural Network Trojan for Pre-Trained Feature Extractor
- STRIP: A Defence Against Trojan Attacks on Deep Neural Networks
- A General Framework for Adversarial Examples with Objectives
- Adversarial Attacks on Neural Network Policies
- Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods
- Adversarial Examples that Fool both Computer Vision and Time-Limited Humans
- Analyzing Federated Learning through an Adversarial Lens
- Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks
- Generating Natural Language Adversarial Examples
- Sparse Bayesian Adversarial Learning Using Relevance Vector Machine Ensembles
- ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness
- Imperceptible, Robust, and Targeted Adversarial Examples for Automatic Speech Recognition
- Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks
- Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples
- On Evaluating Adversarial Robustness
- Adversarial Support Vector Machine Learning
- Trojaning Attack on Neural Networks
- Learning Conjunctive Concepts in Structural Domains
- Reducing Multiclass to Binary: A Unifying Approach for Margin Classifiers
- Approximation and estimation bounds for artificial neural networks
- Training a 3-Node Neural Network is NP-Complete
- Online Passive-Aggressive Algorithms
- Adaptive Subgradient Methods for Online Learning and Stochastic Optimization
- Quantifying Inductive Bias: AI Learning Algorithms and Valiant's Learning Framework
- The Perceptron algorithm versus Winnow: linear versus logarithmic mistake bounds when few input variables are relevant
- Large Margin Classification Using the Perceptron Algorithm
- Learning Quickly When Irrelevant Attributes Abound: A New Linear-threshold Algorithm
- General Convergence Results for Linear Discriminant Updates
- Learning representations by back-propagating errors
- Efficient Learning of Linear Perceptrons
- On the Computational Efficiency of Training Neural Networks
- Learnability and the Vapnik-Chervonenkis Dimension
- Computational Limitations on Learning from Examples
- Learning Decision Lists
- A Theory of the Learnable
- On the Uniform Convergence of Relative Frequencies of Events to Their Probabilities
- On-Line Algorithms in Machine Learning
- Neural Programmer-Interpreters
- Dueling Network Architectures for Deep Reinforcement Learning
- scikit-learn user guide
- The Matrix Cookbook
- Big Data: New Tricks for Econometrics
- Crash Course on Basic Statistics
- Think Stats: Probability and Statistics for Programmers
ML Lecture Slides [Sargur Srihari]
ML Course Notes [Nando de Freitas]
- Lecture 1: Introduction slides
- Lecture 2: Linear prediction slides
- Lecture 3: Maximum likelihood slides
- Lectures 4 & 5: Regularizers, basis functions and cross-validation slides
- Lecture 6: Optimisation slides
- Lecture 7: Logistic regression slides
- Lecture 8: Back-propagation and layer-wise design of neural nets slides
- Lecture 9: Neural networks and deep learning with Torch slides
- Lecture 10: Convolutional neural networks slides
- Lecture 11: Max-margin learning and siamese networks slides
- Lecture 12: Recurrent neural networks and LSTMs slides
- Lecture 15: Reinforcement learning with direct policy search slides
- Lecture 16: Reinforcement learning with action-value functions slides
- Practical on week 2: (1) Learning Lua and the tensor library (pdf)
- Practical on week 3: (2) Online and batch linear regression (pdf)
- Practical on week 4: (3) Logistic regression and optimization (pdf)
- Practical on week 6: (4) Feedforward neural networks, and implementing your own layer (pdf)
- Practical on week 7: (5) Intro to nngraph for graph-shaped modules (pdf)
- Practical on week 8: (6) Training an LSTM language model (pdf)
- Class on Week 3: Problem set.
- Class on Week 5: Problem set.
- Class on Week 7: Problem set.
- Class on Week 8: Problem set.
Lectures on Machine Learning (Mark Schmidt)
- Overview
- Exploratory Data Analysis
- Decision Trees (Notes on Big-O Notation)
- Fundamentals of Learning (Notation Guide)
- Probabilistic Classifiers (Probability Slides, Notes on Probability)
- Non-Parametric Models
- Ensemble Methods
- Least Squares (Notes on Calculus, Notes on Linear Algebra, Notes on Linear/Quadratic Gradients)
- Nonlinear Regression
- Gradient Descent
- Robust Regression
- Feature Selection
- Regularization
- More Regularization
- Linear Classifiers
- More Linear Classifiers
- Feature Engineering
- Convolutions
- Kernel Methods
- Stochastic Gradient
- Boosting
- MLE and MAP (Notes on Max and Argmax)
- Principal Component Analysis
- More PCA
- Sparse Matrix Factorization
- Recommender Systems
- Nonlinear Dimensionality Reduction
- Structure Learning
- Sequence Mining
- Tensor Basics
- Semi-Supervised Learning
- PageRank
- Markov Chains and Monte Carlo
- Structured Prediction Motivation
- Density Estimation
- Multivariate Gaussians
- Mixture Models
- Generative Classifiers
- Expectation Maximization (Notes on EM)
- Kernel Density Estimation
- Probabilistic PCA, Factor Analysis, Independent Component Analysis
- Markov Chains
- Monte Carlo Methods
- Message Passing
- Hidden Markov Models
- DAG Models
- More DAGs
- Undirected Graphical Models
- Approximate Inference
- Log-Linear Models
- Boltzmann Machines
- Conditional Random Fields
- Structured SVMs
- Bayesian Statistics
- Empirical Bayes
- Conjugate Priors
- Hierarchical Bayes
- Topic Models
- Rejection/Importance Sampling
- Metropolis-Hastings
- Variational Inference
- Non-Parametric Bayes
- Infinite Mixture Models
- Neural Networks
- Double Descent Curves
- Deep Structured Models
- Fully-Convolutional Networks
- Recurrent Neural Networks
- Long Short Term Memory
- Faster Algorithms for Deep Learning?
- VAEs and GANs
- Attention
- Convex Optimization (Notes on Norms)
- Gradient Descent Progress (Notes on Convexity Inequalities, Notes on Implementing Gradient Descent)
- Gradient Descent Convergence
- Linear and Superlinear Convergence
- Subgradient Methods
- Projected-Gradient
- Proximal-Gradient
- Structured Regularization
- Coordinate Optimization
- Mirror Descent and Multi-Level Methods
- Randomized Algorithms
- Stochastic Subgradient
- Variance-Reduced Stochastic Gradient
- Kernel Methods and Fenchel Duality
- Online Learning
- Over-Parameterized Models
- Parallel and Distributed Machine Learning
- Online, Active, and Causal Learning
- Reinforcement Learning
- Overview of Other Large/Notable Topics
Lecture Notes on Data Analysis, Statistics, and Machine Learning (Leland Wilkinson)
- Introduction
- Data
- Visualizing
- Exploring
- Summarizing
- Distributions
- Inference
- Predicting
- Smoothing
- Time Series
- Comparing
- Grouping
- Reducing
- Learning
- Anomalies
- Analyzing