![Supervised learning banner](./images/0_supervised_learning_banner.png)

# 0. Supervised Learning

A supervised learning algorithm takes a known set of input data and the known responses to that data (the labels, or outputs) and trains a model to generate reasonable predictions for the responses to new data.

Supervised learning uses classification and regression techniques to develop predictive models.

![Supervised learning techniques](./images/0_supervised_learning_techniques.png)
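To make the "known inputs plus known labels, then predict on new data" idea concrete, here is a minimal sketch using a 1-nearest-neighbor model in plain Python. The technique is chosen only because it is short; the function names `train` and `predict` and the toy animal data are hypothetical and do not come from any particular library.

```python
from math import dist  # Euclidean distance, Python 3.8+


def train(inputs, labels):
    """'Training' a 1-nearest-neighbor model simply stores the labeled examples."""
    if len(inputs) != len(labels):
        raise ValueError("each input needs exactly one label")
    return list(zip(inputs, labels))


def predict(model, new_input):
    """Return the label of the stored example closest to new_input."""
    _, label = min(model, key=lambda example: dist(example[0], new_input))
    return label


# Hypothetical toy data: (height_cm, weight_kg) -> known label
inputs = [(20.0, 4.0), (25.0, 5.0), (60.0, 25.0), (70.0, 30.0)]
labels = ["cat", "cat", "dog", "dog"]

model = train(inputs, labels)            # learn from known inputs and labels
prediction = predict(model, (22.0, 4.5))  # respond to data the model has not seen
```

The key point is the shape of the workflow: labels are required at training time, and the trained model is then asked about inputs that carry no label.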

#### What is unsupervised learning?

In unsupervised learning, the algorithm is given unlabeled data as a training set. Unlike supervised learning, there are no correct output values; the algorithm determines the patterns and similarities within the data, as opposed to relating it to some external measurement.

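Note that the main difference between the two lies in the training stage, which is the "learning" in each name: supervised training uses labeled responses, while unsupervised training does not.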
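As a rough illustration of training without labels, the sketch below groups one-dimensional points with a simplified k-means loop. It is only one example of an unsupervised technique; the function name `k_means_1d`, the initialization scheme, and the data are assumptions made for this example.

```python
def k_means_1d(points, k, iterations=10):
    """Group 1-D points into k clusters. Note that no labels are passed in."""
    # Assumption: start the centroids at values spread evenly across the sorted data.
    ordered = sorted(points)
    step = (len(ordered) - 1) / (k - 1) if k > 1 else 0
    centroids = [ordered[round(i * step)] for i in range(k)]

    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical unlabeled measurements
points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centroids, clusters = k_means_1d(points, k=2)
```

The algorithm receives only `points`; the two groups it returns come from similarities within the data rather than from any external "correct answer".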

![Supervised Learning vs Unsupervised](./images/0_supervised_learning_vs_unsupervised.png)

## Regression vs Classification

![Supervised Learning Classification vs Regression](./images/0_supervised_learning_classification_vs_regression.png)

#### Regression models predict a continuous response. 

Typical applications include algorithmic trading, electricity load forecasting, predicting fluctuations in power demand, and sales forecasting.
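A minimal sketch of a regression model, assuming a single input feature and a straight-line relationship fitted by ordinary least squares. The function name `fit_line` and the spend/sales numbers are hypothetical, chosen so the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the common 1/n factor cancels).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: advertising spend -> sales, constructed as sales = 10 + 2 * spend
spend = [1.0, 2.0, 3.0, 4.0]
sales = [12.0, 14.0, 16.0, 18.0]

slope, intercept = fit_line(spend, sales)
forecast = slope * 5.0 + intercept  # continuous prediction for an unseen input
```

The prediction is a number on a continuous scale, which is what distinguishes regression from classification.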

#### Classification models classify input data into categories.

Typical applications include medical imaging, image and speech recognition, credit scoring and spam detection.
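A minimal sketch of a classification model, assuming a nearest-centroid rule: each category is summarized by the mean of its training examples, and a new input is assigned to the category with the closest mean. The function names, the two email features, and the data are hypothetical and only loosely modeled on spam detection.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def fit_centroids(inputs, labels):
    """Compute one mean feature vector (centroid) per category."""
    groups = defaultdict(list)
    for features, label in zip(inputs, labels):
        groups[label].append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in groups.items()
    }


def classify(centroids, features):
    """Return the category whose centroid is nearest to the given features."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical features per email: (number of links, occurrences of "free")
emails = [(0, 0), (1, 0), (8, 5), (10, 7)]
labels = ["ham", "ham", "spam", "spam"]

centroids = fit_centroids(emails, labels)
category = classify(centroids, (7, 4))  # output is a discrete category, not a number
```

Unlike the continuous output of a regression model, the output here is always one of the categories seen during training.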

Note: For reference, **unsupervised learning** finds hidden patterns or intrinsic structures in datasets that consist of input data without labeled responses. A typical application is exploratory data analysis, where the goal is to discover groupings in the data.

## Table of Contents

1. Data Collection
   * 1.1\. Data Sources
   * 1.2\. Data Collection Considerations
2. Data Exploration and Preparation
   * 2.1\. Data Exploration
   * 2.2\. Data Preparation/Cleaning
3. Split Data into Training and Test Sets
   * 3.1\. Holdout Method
   * 3.2\. Cross Validation
   * 3.3\. Data Leakage
   * 3.4\. Best Practices
4. Choose a Supervised Learning Algorithm
   * 4.1\. Consider algorithm categories
   * 4.2\. Evaluate algorithm characteristics
   * 4.3\. Try multiple algorithms
5. Train the Model
   * 5.1\. Objective Function (Loss/Cost Function)
   * 5.2\. Optimization Algorithms
   * 5.3\. Overfitting and Underfitting
6. Evaluate Model Performance
   * 6.1\. Performance Metrics for Regression Models
   * 6.2\. Performance Metrics for Classification Models
7. Model Tuning and Selection
   * 7.1\. Hyperparameter Tuning
   * 7.2\. Ensemble Methods