Machine Learning

  • Course: COMP 4432-1, Class time: Mon, Wed 5:00-6:50 p.m., Engineering and Computer Science - 410
  • Instructor: Pooran Singh Negi, pooran.negi@gmail.com, Office 470, Office Hours: Tue, Thu 3:00 p.m. - 4:30 p.m. Email for 1-on-1 help.
  • Head TA: Lombe Chileshe (lombe.chileshe@du.edu), Office ECS 358
  • TA: Daniel Parada (daniel.parada1@gmail.com), Office ECS 358, Mon, Tue, Wed 8-11 a.m.
  • TA: Nidhi Madabhushi (nidhi.madabhushi@du.edu), Office ECS 358, Mon, Wed 3-5 p.m.

Credit: Content on this page contains links to various external resources, and images are from the book Machine Learning: A Probabilistic Perspective by Kevin Patrick Murphy.

Prerequisite

  • Linear algebra, probability, and statistics
  • Optimization and programming experience in Python and its scientific libraries
  • linear algebra overview
  • Read chapter 2 of Kevin Murphy for a probability and statistics review, or any other text you have used in the past

Textbooks

More resources

Linear algebra

YouTube resources

discriminative vs generative models

Deep learning

Course Description

We will go through the theory behind machine learning using tools from probability, linear algebra, and optimization. We will use Python, its scientific libraries (NumPy, SciPy, Matplotlib, Pandas, etc.), and scikit-learn: Machine Learning in Python during the course. For the deep neural network part, we will use the highly popular TensorFlow machine intelligence library from Google. For assignments, starter code or hints will be given. At the end of the course, one should have a unifying probabilistic perspective on most machine learning algorithms and be comfortable using open source tools for building machine learning systems.

Software

There are a couple of choices for running the code for this class. Option 1 is the most straightforward and supports most of the scientific Python stack, including TensorFlow and Keras for deep learning. A quick environment check is sketched after this list.

  1. Google colab. https://colab.research.google.com/notebooks/welcome.ipynb
  2. Or install the Anaconda Distribution. See the YouTube link Installing Anaconda, Jupyter Notebook. For deep neural networks, we will go over TensorFlow and Keras installation instructions later in the course.
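Whichever option you choose, a minimal sanity check is to run a few imports and print versions; this sketch assumes the Anaconda or Colab defaults, and the try/except is only because TensorFlow comes later in the course:

```python
# Environment sanity check: print versions of the scientific Python stack.
import matplotlib
import numpy
import pandas
import scipy
import sklearn

for lib in (numpy, scipy, matplotlib, pandas, sklearn):
    print(lib.__name__, lib.__version__)

# TensorFlow/Keras are only needed for the deep learning part of the course,
# so the import is guarded; it may not be installed yet.
try:
    import tensorflow as tf
    print("tensorflow", tf.__version__)
except ImportError:
    print("tensorflow not installed yet")
```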

Deep learning: TensorFlow and Keras resources

Python learning resources

Syllabus

This syllabus is subject to change at the discretion of the instructor. Here are the main topics for the class; more topics may be added depending on class interest and available time.

  • Basic ideas of machine learning and probability
  • Generative models, parametric estimation, and supervised learning
    • Naive Bayes classifier etc. (a minimal sketch follows this list)
  • Gaussian models
  • Linear and logistic regression
  • Support vector machines, kernels
  • Decision trees
  • Probabilistic graphical models
  • Bias-variance tradeoff and model selection
  • Ensemble methods, bagging and boosting
  • Unsupervised learning
    • Clustering, topic modelling etc.
  • Deep learning
    • Artificial Neural Networks (ANN), end-to-end learning, cost functions
    • Convolutional Neural Networks (CNN) for (image) classification and regression
    • Recurrent Neural Networks for natural language processing (NLP) and time series data
    • Generative adversarial networks (GANs)
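To give a feel for the scikit-learn tooling behind these topics, here is a minimal naive Bayes sketch; the iris dataset and the default train/test split are illustrative choices, not course requirements:

```python
# Minimal Gaussian naive Bayes example on scikit-learn's built-in iris dataset.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# GaussianNB models each feature as an independent Gaussian per class.
clf = GaussianNB().fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```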

Grading

There will be one midterm, a final exam, homework assignments, and in-class quizzes. A final machine learning related project and presentation will be due at the end of the quarter. We'll drop your worst homework assignment grade and your worst quiz grade. We'll allow 2 late homework submissions, with a cutoff of 36 hours. We'll give

ceil(total_marks_obtained * exp(-(minutes_late) / (24 * 60))) marks

for assignments submitted late via email.
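A minimal sketch of this late penalty in Python (the function name is illustrative):

```python
from math import ceil, exp

# Marks after the exponential late penalty above; decays by a factor of e per day.
def late_marks(total_marks_obtained, minutes_late):
    return ceil(total_marks_obtained * exp(-minutes_late / (24 * 60)))

# Example: a 10-point assignment submitted 12 hours (720 minutes) late
# keeps ceil(10 * exp(-0.5)) = ceil(6.07) = 7 marks.
print(late_marks(10, 720))
```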

| Component | Weight |
| --- | --- |
| Homework + Quizzes | 35% (25 + 10) |
| Midterm exam, 22 July, in class, closed book and notes | 20% |
| Final exam, comprehensive, 14th August, in class, closed books and notes | 27% |
| ML competition, notebook submission, 17 August 11:59 p.m. | 18% |

I have to cancel the extra class on Friday 16th August.

Grade ranges: A >= 93, A- >= 89, B+ >= 85, B >= 81, B- >= 77, C+ >= 73, C >= 69, C- >= 65, D+ >= 61, D >= 57, D- >= 53, F < 53.
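Since the ranges above already read like a Python list, a minimal lookup sketch (names are illustrative):

```python
# Letter grade lookup; cutoffs copied from the grade ranges above.
GRADE_CUTOFFS = [("A", 93), ("A-", 89), ("B+", 85), ("B", 81), ("B-", 77),
                 ("C+", 73), ("C", 69), ("C-", 65), ("D+", 61), ("D", 57), ("D-", 53)]

def letter_grade(percent):
    for grade, cutoff in GRADE_CUTOFFS:
        if percent >= cutoff:
            return grade
    return "F"  # below every cutoff

print(letter_grade(90.5))  # A-
```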

Please respect the DU Honor Code: Honor Yourself, Honor the Code.

Quiz

| Quiz | Solution |
| --- | --- |
| 1 | sol |
| 2 | sol |
| 3 | sol |

Midterm

| Exam | Solution |
| --- | --- |
| Midterm | solution |
| Practice midterm | sol |

Homework

Homework numbers are as per the Kevin Murphy ebook from the library.

Note that we will merge parts a and b of the homeworks to create a final grade for each homework, i.e. HW1a and HW1b will be merged to create HW1 for recording the final grade of HW1.

| HW | Assignment | Due date | Sol |
| --- | --- | --- | --- |
| 1a | Coding part: python_numpy questions | 3rd July 11:59 p.m. | |
| 1b | Written part; problem numbers are from the Kevin Murphy book (use the DU library version). Submit written solutions for chapter 2: 2.1 (use Bayes rule, conditioning on the event actually observed; as in part a, let N_b = number of boys, N_g = number of girls) (2 = 1 + 1 points), 2.3 (0.5 point), 2.4 (1 point), 2.6 (1 = 0.5 + 0.5 points), 2.16 (1.5 = 0.5 + 0.5 + 0.5 points). Look in chapter 2 for definitions, e.g. section 2.2.4 for independence and conditional independence. Explain the various steps in your work. | 5th July 11:59 p.m. | |
| 2b | Chapter 2, 2.13 (1 point; hint: I(X, Y) = H(X) + H(Y) - H(X, Y), see the sketch after this table); chapter 3, 3.6 (1 point), 3.7 (1 point each), 3.11 (0.5 point each), 3.20 (0.5 point each) | 12th July 11:59 p.m. | |
| 2a | Implementing naive Bayes: airlines sentiment | 22nd July 11:59 p.m. | |
| 3a | Implementing QDA notebook | 24th July 11:59 p.m. | |
| 3b | Q1 (2 points): prove that if $\Sigma_c$ (the covariance matrix for class c) is diagonal, then Gaussian discriminant analysis is equivalent to naive Bayes. From the book: 4.1 (1 point; see section 2.5.1 for the definition of the correlation coefficient), 4.14 (2 points, 0.5 points each), 4.21 (2 = 1 + 1 points), 4.22 (1 = 0.5 + 0.5 points) | 20th July 11:59 a.m. | |
| 4a | Linear ridge regression using TensorFlow | 31 July 11:59 | |
| 4b | (2 points) From the book, using equations 7.30 and 7.31, derive equation 7.32 (ridge regression); 7.2 (1 point; check the formula for W in the book, the X transpose is missing), 7.4 (2 points), 7.9 (2 = 1.5 + 0.5 points), 8.3 (2 = 0.5 + 1.5 + 1 points) | 2 August 11:59 p.m. | |
| 5a | TensorFlow multiclass logistic regression | 8th August 11:59 p.m. | |
| 5b | LDA, PCA | 10th August 11:59 p.m. | |
| - | ML competition notebook and sample code | 16th August 11:59 p.m. | |
| 6a | HW6a SVM sklearn questions | 15th August 11:59 p.m. | |
| 6b | Written homework. For the first part, consider a 1x1 Gram matrix: find vectors x, y such that $\tanh(x^T y)$ is negative (dimension-1 vectors, i.e. scalars, are OK but trivial, so prefer dimension >= 2). | 13th August 11:59 | sol |
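As a numeric sanity check of the entropy identity in the 2.13 hint, here is a minimal sketch; the joint distribution is made up for illustration, not taken from the book:

```python
import numpy as np

def entropy(p):
    # Shannon entropy in bits; zero-probability entries contribute nothing.
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Toy joint distribution p(x, y) over two binary variables (illustrative numbers).
pxy = np.array([[0.4, 0.1],
                [0.1, 0.4]])
px, py = pxy.sum(axis=1), pxy.sum(axis=0)

# I(X, Y) = H(X) + H(Y) - H(X, Y)
mi = entropy(px) + entropy(py) - entropy(pxy.ravel())
print(f"I(X, Y) = {mi:.4f} bits")  # about 0.278 bits for this table
```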

Course Lectures

  • 24 June. Reading: chapter 1 of Kevin Murphy and basics of probability from chapter 2 up to 2.4.1, plus 2.4.6. Covered: review of basic linear algebra and the notion of dot product and similarity (this is very fundamental and we'll use it a lot); the detailed Scipy Lecture Notes (practice 1.3.1, 1.3.2, and 1.4.1 to 1.4.2.8 in a Jupyter notebook); properties of vectors and matrices and the connections between them; the notion of linear combinations and spanned space; common discrete random variables; a review assignment on eigenvalues and eigenvectors, SVD, and positive definite matrices from your linear algebra notes; continuous distributions like the normal, multivariate normal, beta, and Dirichlet.
  • 26 June. Reading: sections 2.2, 2.3, 2.4.1-2.4.6, 2.5.1-2.5.2, 2.5.4, 2.6.1, and 2.8 of Kevin Murphy; 3.1-3.2.4. Covered: basic machine learning categories; generative classifiers; Bayesian concept learning. Notebooks: ml motivation notebook, numpy basic notebook, generative models notebook.
  • 1st July. Covered: information theory, the beta distribution, MLE, MAP.
  • 3rd July. Reading: rest of chapter 3. Covered: MLE and MAP estimation of parameters, selection of prior. Here is the link to the mechanics of the Lagrangian multiplier; for more detail see this link at metacademy (go over the free section). If you want to go over optimization theory in detail, here is the link to the book by Prof. Stephen Boyd and Lieven Vandenberghe; check out the Stanford related link.
  • 8th July. Reading: K.M. book 4.1 up to 4.2.5. Notebook: MVN demo.
  • 10th July. Covered: modelling class-conditional densities using the multivariate Gaussian distribution (Gaussian discriminant analysis, QDA, LDA); the idea of a decision boundary and a discriminant function; the polynomial fitting issue.
  • 15th July. Reading: K.M. book 7.1-7.3.3, 7.5.1. Covered: started linear models and MLE estimation of parameters. Here is the link to the pseudo-inverse I talked about: Least Squares, Pseudo-Inverses, SVD.
  • 17th July. Reading: 8.1, 8.2, 8.3.1-8.3.3. Covered: the linear model and its MAP estimate; Gaussian prior (ridge regression) and Laplace prior (LASSO) with their geometric interpretation; how the linear model can be extended to a nonlinear model, and the polynomial fitting issue; convex sets and functions; started discriminative models (the logistic model, ...).
  • 22nd July. In class, closed-notebook midterm.
  • 24th July. Covered: TensorFlow overview and tensorflow examples; finished MLE for logistic regression; gradient descent, stochastic gradient descent, and mini-batch gradient descent in the context of convex and non-convex loss function optimization; issues like getting out of local minima and handling saddle points; started TensorFlow for building machine learning models. Optional reading: Understanding Learning Rates and How It Improves Performance in Deep Learning; Visualizing the Loss Landscape of Neural Nets; Snapshot Ensembles: Train 1, get M for free; Intro to optimization in deep learning: Momentum, RMSProp and Adam.
  • 29th July. Reading: 8.3.6, 8.3.7, 8.6.3-8.6.3.2. Covered: finished multiclass logistic regression and PCA.
  • 31 July. Reading: 12.2. Covered: finished Fisher LDA and started kernels.
  • 5th August. Reading: 8.6.3, 14.1-14.5. Covered: KNN link; kernels, and classification and regression SVMs. Please go through kernel ridge regression, kernel PCA, and the classification SVM in the book, and read the soft-margin SVM from the book. This is the paper we talked about in the context of the XOR problem; it is optional and not related to the coursework: XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks.
  • 7th August. Reading: K.M. book 5.7 up to 5.7.2.2. Covered: Bayes decision theory notebook; Bayesian decision theory, the confusion matrix, the issue with accuracy, and the ideas of recall, precision, merging them (F1, Fb score), and the ROC (AUC) curve (a small sketch follows this schedule).
  • 12th August. Covered: the bias-variance tradeoff; ANN. Look into these resources too: Chapter 16, Adaptive basis function models (decision trees, random forests, boosting (AdaBoost), ensemble learning, etc.); Chapter 25, Clustering (should be covered in data mining); Chapter 11, Mixture models and the EM algorithm (can be covered in data mining); SVM dual, kernels and regression; SVMs, Duality and the Kernel Trick.
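Since the 7th August lecture leans on these evaluation metrics, here is a minimal scikit-learn sketch; the labels, scores, and the 0.5 threshold are toy choices for illustration:

```python
import numpy as np
from sklearn.metrics import (confusion_matrix, f1_score, precision_score,
                             recall_score, roc_auc_score)

# Toy binary ground truth and classifier scores (illustrative only).
y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.7, 0.2, 0.6, 0.55])
y_pred = (y_score >= 0.5).astype(int)  # hard labels from a 0.5 threshold

print(confusion_matrix(y_true, y_pred))  # rows: true class, columns: predicted
print("precision:", precision_score(y_true, y_pred))
print("recall:", recall_score(y_true, y_pred))
print("F1:", f1_score(y_true, y_pred))
print("ROC AUC:", roc_auc_score(y_true, y_score))  # threshold-free ranking quality
```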
