# Introduction to Machine Learning

## Summary

### This course provides an overview of machine learning fundamentals on modern Intel® architecture. Topics covered include:

* Reviewing the types of problems that can be solved
* Understanding building blocks
* Learning the fundamentals of building models in machine learning
* Exploring key algorithms

### By the end of this course, students will have practical knowledge of:

* Supervised learning algorithms
* Key concepts like under- and over-fitting, regularization, and cross-validation
* How to identify the type of problem to be solved, choose the right algorithm, tune parameters, and validate a model

The course is structured around 12 weeks of lectures and exercises. Each week requires three hours to complete. 

The exercises are implemented in Python*, so familiarity with the language is encouraged (you can learn along the way).


## Prerequisites

* Python* programming
* Calculus
* Linear algebra
* Statistics

## Week 1

<div class="warning" style='background-color:#EDF2F7; color:#1A2067; border-left: solid #718096 4px; border-radius: 4px;'>
<p style='padding:0.7em; margin-left:0.7em; display: inline-block;'>
<img src="assets/week01.png" style="zoom:70%;  float:right; padding:0.7em"/>
<b>This class introduces the basic data science toolset: </b>
<br>
<br>&bull; Jupyter Notebook for interactive coding 
<br>&bull; NumPy, SciPy, and pandas for numerical computation 
<br>&bull; Matplotlib and seaborn for data visualization
<br>&bull; scikit-learn for machine learning algorithms
<br><br>You’ll use these tools to work through the exercises each week.<br>
</p>
</div>

<p style="color:red;">To download all the necessary material, run the cell below and click the button if needed.</p>

In [1]:
%run download.py 1

Button(description='Download', style=ButtonStyle())

Output()

## Week 2

<div class="warning" style='background-color:#EDF2F7; color:#1A2067; border-left: solid  #718096 4px; border-radius: 4px;'>
<p style='padding:0.7em; margin-left:0.7em; display: inline-block;'>
<img src="assets/week02.png" style="zoom:70%;  float:right; padding:0.7em"/>
<b>This class introduces the basic concepts and vocabulary of machine learning: </b>
<br>
<br>&bull; Supervised learning and how it can be applied to regression and classification problems 
<br>&bull; K-Nearest Neighbor (KNN) algorithm for classification
</p>
</div>
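
As a rough illustration of the KNN idea listed above, the sketch below classifies a point by majority vote among its nearest neighbors. It uses only the Python standard library; the function name `knn_predict` and the toy data are hypothetical and not taken from the course exercises.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbors and count their labels.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters.
train_points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.5), (6.5, 6.0)]
train_labels = ["a", "a", "a", "b", "b", "b"]

prediction_near_a = knn_predict(train_points, train_labels, (1.2, 1.4))
prediction_near_b = knn_predict(train_points, train_labels, (6.4, 6.8))
```

The choice of `k` is a hyper-parameter: small values follow the training data closely, while larger values smooth the decision boundary.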

<p style="color:red;">To download all the necessary material, run the cell below and click the button if needed.</p>

In [2]:
%run download.py 2

Button(description='Download', style=ButtonStyle())

Output()

## Week 3

<div class="warning" style='background-color:#EDF2F7; color:#1A2067; border-left: solid #718096 4px; border-radius: 4px;'>
<p style='padding:0.7em; margin-left:0.7em; display: inline-block;'>
<img src="assets/week03.png" style="zoom:70%;  float:right; padding:0.7em"/>
<b>This class reviews the principles of core model generalization:</b>
<br>
<br>&bull; The difference between over-fitting and under-fitting a model 
<br>&bull; Bias-variance trade-offs
<br>&bull; Choosing training and test dataset splits, using cross-validation, and relating model complexity to error
<br>&bull; Introduction to the linear regression model for supervised learning
</p>
</div>
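
To make the cross-validation idea above concrete, here is a minimal sketch that estimates held-out error for a one-feature linear regression using k folds. It assumes a simple interleaved fold assignment and made-up data; the helper names `fit_line` and `k_fold_mse` are hypothetical, and the course exercises may use library routines instead.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def k_fold_mse(xs, ys, k=5):
    """Average held-out mean squared error over k folds."""
    n = len(xs)
    fold_errors = []
    for fold in range(k):
        # Every k-th sample (offset by `fold`) is held out for testing.
        test_idx = set(range(fold, n, k))
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        slope, intercept = fit_line(train_x, train_y)
        squared = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in test_idx]
        fold_errors.append(sum(squared) / len(squared))
    return sum(fold_errors) / k


# Hypothetical data: y = 2x + 1 with a small alternating perturbation.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + (0.1 if i % 2 else -0.1) for i, x in enumerate(xs)]

cv_error = k_fold_mse(xs, ys, k=5)
```

Because each sample is scored only by a model that never saw it during fitting, `cv_error` approximates how the model would perform on new data.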

<p style="color:red;">To download all the necessary material, run the cell below and click the button if needed.</p>

In [3]:
%run download.py 3

Button(description='Download', style=ButtonStyle())

Output()

## Week 4

<div class="warning" style='background-color:#EDF2F7; color:#1A2067; border-left: solid #718096 4px; border-radius: 4px;'>
<p style='padding:0.7em; margin-left:0.7em; display: inline-block;'>
<img src="assets/week04.png" style="zoom:70%;  float:right; padding:0.7em"/>
<b>This class builds on concepts taught in previous weeks. Additionally, you will:</b>
<br>
<br>&bull; Learn about cost functions, regularization, feature selection, and hyper-parameters 
<br>&bull; Understand more complex statistical optimization algorithms like gradient descent and its application to linear regression
</p>
</div>
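
The application of gradient descent to linear regression mentioned above can be sketched as follows. This is an illustrative version under simple assumptions (one feature, a fixed learning rate, a fixed number of steps, mean squared error as the cost); the function name `gradient_descent` and the data are hypothetical.

```python
def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by minimizing mean squared error with gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residuals of the current fit.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of the cost J = (1 / (2n)) * sum(errors ** 2).
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        # Step against the gradient to reduce the cost.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1

w, b = gradient_descent(xs, ys)
```

The learning rate is a hyper-parameter: too large a value makes the updates diverge, while too small a value makes convergence slow.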

<p style="color:red;">To download all the necessary material, run the cell below and click the button if needed.</p>

In [4]:
%run download.py 4

Button(description='Download', style=ButtonStyle())

Output()

## Week 5

<div class="warning" style='background-color:#EDF2F7; color:#1A2067; border-left: solid #718096 4px; border-radius: 4px;'>
<p style='padding:0.7em; margin-left:0.7em; display: inline-block;'>
<img src="assets/week05.png" style="zoom:70%;  float:right; padding:0.7em"/>
<b>This class discusses the following:</b>
<br>
<br>&bull; Logistic regression and how it differs from linear regression 
<br>&bull; Metrics for classification errors and scenarios in which they can be used
</p>
</div>
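
As a small example of the classification metrics mentioned above, the sketch below computes accuracy, precision, recall, and F1 from true and predicted labels. The function name `classification_metrics` and the label vectors are hypothetical, chosen only for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute common binary classification metrics from label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels: 3 positives and 5 negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]

metrics = classification_metrics(y_true, y_pred)
```

Accuracy alone can mislead when classes are imbalanced, which is why precision and recall are usually reported alongside it.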

<p style="color:red;">To download all the necessary material, run the cell below and click the button if needed.</p>

In [5]:
%run download.py 5

Button(description='Download', style=ButtonStyle())

Output()

## Week 6

<div class="warning" style='background-color:#EDF2F7; color:#1A2067; border-left: solid #718096 4px; border-radius: 4px;'>
<p style='padding:0.7em; margin-left:0.7em; display: inline-block;'>
<img src="assets/week06.png" style="zoom:70%;  float:right; padding:0.7em"/>
<b>During this session, we review:</b>
<br>
<br>&bull; The basics of probability theory and its application to the Naïve Bayes classifier 
<br>&bull; The different types of Naïve Bayes classifiers and how to train a model using this algorithm
</p>
</div>

<p style="color:red;">To download all the necessary material, run the cell below and click the button if needed.</p>

In [6]:
%run download.py 6

Button(description='Download', style=ButtonStyle())

Output()

## Week 7

<div class="warning" style='background-color:#EDF2F7; color:#1A2067; border-left: solid #718096 4px; border-radius: 4px;'>
<p style='padding:0.7em; margin-left:0.7em; display: inline-block;'>
<img src="assets/week07.png" style="zoom:70%;  float:right; padding:0.7em; "/>
<b>This week covers the following topics:</b>
<br>
<br>&bull; Support vector machines (SVMs) — a popular algorithm used for classification problems 
<br>&bull; Examples that show how SVMs resemble logistic regression
<br>&bull; How to calculate the cost function of SVMs
<br>&bull; Regularization in SVMs and some tips to obtain nonlinear classifications with SVMs
</p>
</div>

<p style="color:red;">To download all the necessary material, run the cell below and click the button if needed.</p>

In [7]:
%run download.py 7

Button(description='Download', style=ButtonStyle())

Output()

## Week 8

<div class="warning" style='background-color:#EDF2F7; color:#1A2067; border-left: solid #718096 4px; border-radius: 4px;'>
<p style='padding:0.7em; margin-left:0.7em; display: inline-block;'>
<img src="assets/week08.png" style="zoom:70%;  float:right; padding:0.7em"/>
<b>Continuing with the topic of advanced supervised learning algorithms, this class covers:</b>
<br>
<br>&bull; Decision trees and how to use them for classification problems 
<br>&bull; How to identify the best split and the factors for splitting
<br>&bull; Strengths and weaknesses of decision trees
<br>&bull; Regression trees, which predict continuous values
</p>
</div>

<p style="color:red;">To download all the necessary material, run the cell below and click the button if needed.</p>

In [8]:
%run download.py 8

Button(description='Download', style=ButtonStyle())

Output()

## Week 9

<div class="warning" style='background-color:#EDF2F7; color:#1A2067; border-left: solid #718096 4px; border-radius: 4px;'>
<p style='padding:0.7em; margin-left:0.7em; display: inline-block;'>
<img src="assets/week09.png" style="zoom:70%;  float:right; padding:0.7em"/>
<b>Following on what was learned in Week 8, this class teaches:</b>
<br>
<br>&bull; The concepts of bootstrapping and aggregating (commonly known as “bagging”) to reduce variance
<br>&bull; The Random Forest algorithm, which further reduces the correlation between trees seen in bagging models
</p>
</div>

<p style="color:red;">To download all the necessary material, run the cell below and click the button if needed.</p>

In [9]:
%run download.py 9

Button(description='Download', style=ButtonStyle())

Output()

## Week 10

<div class="warning" style='background-color:#EDF2F7; color:#1A2067; border-left: solid #718096 4px; border-radius: 4px;'>
<p style='padding:0.7em; margin-left:0.7em; display: inline-block;'>
<img src="assets/week10.png" style="zoom:70%;  float:right; padding:0.7em"/>
<b>This week, learn about the boosting algorithm that helps reduce variance and bias.</b>
<br>
</p>
</div>

<p style="color:red;">To download all the necessary material, run the cell below and click the button if needed.</p>

In [10]:
%run download.py 10

Button(description='Download', style=ButtonStyle())

Output()

## Week 11

<div class="warning" style='background-color:#EDF2F7; color:#1A2067; border-left: solid #718096 4px; border-radius: 4px;'>
<p style='padding:0.7em; margin-left:0.7em; display: inline-block;'>
<img src="assets/week11.png" style="zoom:70%;  float:right; padding:0.7em"/>
<br>So far, the course has been heavily focused on supervised learning algorithms.
<br>
<br>
<b>This week, learn about unsupervised learning algorithms and how they can be applied to clustering and dimensionality reduction problems.</b>
<br>
</p>
</div>

<p style="color:red;">To download all the necessary material, run the cell below and click the button if needed.</p>

In [11]:
%run download.py 11

Button(description='Download', style=ButtonStyle())

Output()

## Week 12

<div class="warning" style='background-color:#EDF2F7; color:#1A2067; border-left: solid #718096 4px; border-radius: 4px;'>
<p style='padding:0.7em; margin-left:0.7em; display: inline-block;'>
<img src="assets/week12.png" style="zoom:70%;  float:right; padding:0.7em"/>
<br>Dimensionality refers to the number of features in the dataset. In theory, more features should produce better models, but in practice this often does not hold.
<br>Too many features can result in spurious correlations, more noise, and slower performance.
<br>
<br>
<b>This week, learn algorithms that can be used to achieve a reduction in dimensionality, such as:</b>
<br>
<br>&bull; Principal component analysis (PCA)
<br>&bull; Multidimensional scaling (MDS)
</p>
</div>
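
To illustrate the kind of dimensionality reduction PCA performs, the sketch below finds the first principal component of a small two-feature dataset and projects each sample onto it. It assumes power iteration on the sample covariance matrix as a stand-in for a full eigendecomposition; the function names and data are hypothetical, and the course exercises may rely on a library implementation instead.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return feature means and the unit direction of greatest variance."""
    n, d = len(rows), len(rows[0])
    # Center each feature so variance is measured around the mean.
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeatedly apply the covariance matrix and renormalize.
    vector = [1.0] * d
    for _ in range(iterations):
        vector = [sum(cov[i][j] * vector[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(v * v for v in vector))
        vector = [v / norm for v in vector]
    return means, vector


def project(rows, means, direction):
    """Reduce each sample to a single score along `direction`."""
    return [
        sum((row[j] - means[j]) * direction[j] for j in range(len(direction)))
        for row in rows
    ]


# Hypothetical data: the second feature is roughly twice the first.
rows = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]

means, direction = first_principal_component(rows)
scores = project(rows, means, direction)
```

Here two correlated features collapse to one score per sample while keeping most of the variance, which is the trade-off dimensionality reduction aims for.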

<p style="color:red;">To download all the necessary material, run the cell below and click the button if needed.</p>

In [12]:
%run download.py 12

Button(description='Download', style=ButtonStyle())

Output()