<h1 style='color: #C9C9C9'>Machine Learning with Python<img style="float: right; margin-top: 0;" width="240" src="../../Images/cf-logo.png" /></h1> 
<p style='color: #C9C9C9'>&copy; Coding Fury 2022 - all rights reserved</p>

<hr style='color: #C9C9C9' />

# Supervised Learning

Supervised Machine Learning falls into two categories: 

1. Regression
2. Classification


![Supervised Machine Learning](../../Images/machine-learning-types.png)

Imagine that you are a chef and you want to know what the best way to cook a Pork Tenderloin is. You embark on an experiment that is similar to the kind you used to do in GCSE Science class. 

The Pork Tenderloin is placed in the oven at 350℉. A probe is placed into the Tenderloin so that you can monitor its internal temperature as it cooks in the oven. 

![Pork Tenderloin: Raw Data](../../Images/pork-raw-data.png)

In this case the **target variable** will relate to how well cooked the pork is. 

Supervised Machine Learning can help you analyse this data in several ways. We're going to discuss three.

1. Linear Regression
2. Binary Classifier
3. Multi-class classifier


### Linear Regression

Linear Regression is essentially the line of best fit through the points you have collected. It would look like this:

![Pork Tenderloin: Linear Regression](../../Images/pork-linear-regression.png)

Linear Regression allows you to interpolate, that is, to predict values of the internal temperature of the pork *between* the points you already have. For example, you could predict the temperature at the core of the pork tenderloin after 60 minutes, even though you don't have a temperature reading at exactly this time. 

Linear Regression also allows you to extrapolate, that is, to predict values of the internal temperature of the pork *beyond* the points you already have. For example, you could predict the temperature at the core of the pork tenderloin after 90 minutes, even though you took the last temperature reading at 84 minutes. 

The point here is that **Regression** can be used to produce a **continuous** range of values for the target variable, temperature.


### Binary Classification

A Binary Classifier could be used to classify the pork as either Cooked or Raw, for a given amount of time in the oven. 


![Pork Tenderloin: Binary Classifier](../../Images/pork-binary-classifier.png)

If you use this model, you won't be able to predict the internal temperature of the meat, but you will be able to say if it's cooked or not. 



### Multi-Class Classifier

A multi-class classifier could be used to predict **how well** cooked the meat is for a given amount of time in the oven.

![Pork Tenderloin: Multi-Class Classifier](../../Images/pork-multi-classifier.png)

If you use this model, you won't be able to predict the internal temperature of the meat, but you will be able to say how well cooked it is. 



## Glossary of terms: 
- **Features**: columns of input data that are used to determine the value of a target variable. These are sometimes referred to as "dimensions".
- **Target Variable**: think of this as an empty column in your dataframe, that will hold the value you are trying to determine.
- **Labelled Data**: Supervised learning requires a training set where the target variable is known. This is known as "labelled data".
- **Observations**: Each row in the dataset is a new "observation".
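
To make these terms concrete, here is a minimal sketch in plain Python. It uses the first three rows of the Linear Regression training table shown later in this section. The names `observations`, `X` and `y` are my own choice for illustration (`X` and `y` are a common convention for features and target, not something this course has defined yet).

```python
# Labelled data: each tuple is one observation (row) of (time, temperature).
# The temperature is known for every row, which is what makes the data "labelled".
observations = [(2, 18), (8, 23), (13, 28)]

# Features: the input column(s). One inner list per observation,
# so adding a second feature later just means a longer inner list.
X = [[time] for time, _ in observations]

# Target variable: the single column of values we want to predict.
y = [temperature for _, temperature in observations]

print(X)  # [[2], [8], [13]]
print(y)  # [18, 23, 28]
```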


## Necessary Data

Let's consider the necessary training data for each model.

1. Linear Regression
2. Binary Classifier
3. Multi-Class Classifier


### Linear Regression - Training Data

For Linear Regression the data we'd use to train our model comprises: 
* 1 "Feature" column: the time 
* the "target value" column
  - a continuous value column (temperature) 
* 18 observations

|Time (feature)|Temperature (target)|
|----|-----------|
|2   |18         |
|8   |23         |
|13  |28         |
|17  |35         |
|22  |46         |
|26  |55         |
|29  |74         |
|33  |92         |
|38  |102        |
|40  |116        |
|44  |131        |
|49  |139        |
|55  |150        |
|59  |161        |
|64  |171        |
|71  |178        |
|77  |186        |
|84  |190        |
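
As a rough sketch of how the table above becomes a line of best fit, the code below fits a straight line using ordinary least squares in plain Python, then predicts the temperature at 60 minutes (interpolation) and 90 minutes (extrapolation). The function names `fit_line` and `predict_temperature` are hypothetical helpers written for this example, and this is only one way to do it. A machine learning library would normally handle the fitting for you.

```python
# The 18 observations from the table above.
times = [2, 8, 13, 17, 22, 26, 29, 33, 38, 40, 44, 49, 55, 59, 64, 71, 77, 84]
temps = [18, 23, 28, 35, 46, 55, 74, 92, 102, 116, 131, 139, 150, 161, 171, 178, 186, 190]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, divided by how much x varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(times, temps)


def predict_temperature(minutes):
    """Predict the internal temperature after the given time in the oven."""
    return slope * minutes + intercept


print(round(predict_temperature(60), 1))  # interpolation: between known points
print(round(predict_temperature(90), 1))  # extrapolation: beyond the last reading
```

Notice that the straight line keeps climbing at the same rate beyond 84 minutes, whereas the last few readings in the table rise more slowly, so treat extrapolated values with more caution than interpolated ones.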


### Binary Classifier - Training Data

For a Binary Classifier the data we'd use to train our model comprises: 

* 1 "Feature" column: the time 
* the "target value" column 
  - a categorical column with 2 values (cooked or uncooked)
* 18 observations


|Time (feature)|Status (target)|
|----|------|
|2   |uncooked|
|8   |uncooked|
|13  |uncooked|
|17  |uncooked|
|22  |uncooked|
|26  |uncooked|
|29  |uncooked|
|33  |uncooked|
|38  |uncooked|
|40  |uncooked|
|44  |uncooked|
|49  |cooked|
|55  |cooked|
|59  |cooked|
|64  |cooked|
|71  |cooked|
|77  |cooked|
|84  |cooked|
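
To illustrate what "training" on this table could look like, here is a minimal sketch of about the simplest possible binary classifier: a single cut-off time, placed halfway between the last "uncooked" and the first "cooked" observation. This is my own illustrative choice (sometimes called a decision stump), not necessarily the algorithm used later in the course, and it assumes the two classes can be cleanly separated by time, as they are here. `fit_threshold` and `predict_status` are hypothetical names.

```python
# The 18 observations from the table above.
times = [2, 8, 13, 17, 22, 26, 29, 33, 38, 40, 44, 49, 55, 59, 64, 71, 77, 84]
labels = ["uncooked"] * 11 + ["cooked"] * 7


def fit_threshold(times, labels):
    """Learn a cut-off halfway between the last uncooked and first cooked time."""
    last_uncooked = max(t for t, label in zip(times, labels) if label == "uncooked")
    first_cooked = min(t for t, label in zip(times, labels) if label == "cooked")
    return (last_uncooked + first_cooked) / 2


threshold = fit_threshold(times, labels)


def predict_status(minutes):
    """Classify a cooking time as 'cooked' or 'uncooked'."""
    return "cooked" if minutes >= threshold else "uncooked"


print(threshold)           # 46.5
print(predict_status(30))  # uncooked
print(predict_status(60))  # cooked
```

The output is always one of two categories. Unlike the regression model, there is no temperature anywhere in the prediction.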


### Multi-Class Classifier - Training Data

For a Multi-Class Classifier the data we'd use to train our model comprises: 

* 1 "Feature" column: the time 
* the "target value" column
  - a categorical column with 6 distinct values describing how well the meat is cooked
* 18 observations

|Time (feature)|Status (target)|
|----|------|
|2   |uncooked|
|8   |uncooked|
|13  |uncooked|
|17  |uncooked|
|22  |uncooked|
|26  |uncooked|
|29  |uncooked|
|33  |uncooked|
|38  |uncooked|
|40  |uncooked|
|44  |medium rare|
|49  |medium|
|55  |medium well|
|59  |well done|
|64  |overcooked|
|71  |overcooked|
|77  |overcooked|
|84  |overcooked|
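
As a hedged sketch of a multi-class classifier, the code below uses a 1-nearest-neighbour rule: it predicts the label of whichever training observation has the closest cooking time. This technique is my own pick for illustration, not necessarily the one the course goes on to use, and `predict_doneness` is a hypothetical name.

```python
# Same time readings as the other two training tables, paired with the 6 classes.
times = [2, 8, 13, 17, 22, 26, 29, 33, 38, 40, 44, 49, 55, 59, 64, 71, 77, 84]
labels = (
    ["uncooked"] * 10
    + ["medium rare", "medium", "medium well", "well done"]
    + ["overcooked"] * 4
)
training = list(zip(times, labels))


def predict_doneness(minutes):
    """Return the label of the training observation closest in time."""
    nearest_time, nearest_label = min(training, key=lambda row: abs(row[0] - minutes))
    return nearest_label


print(predict_doneness(50))  # medium      (closest reading is 49 minutes)
print(predict_doneness(58))  # well done   (closest reading is 59 minutes)
print(predict_doneness(90))  # overcooked  (closest reading is 84 minutes)
```

With only one observation for some classes, predictions near the class boundaries depend heavily on individual rows, which is one reason collecting more observations helps.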


# Exercise

## Improving the model

Obviously this model is oversimplified. It could be improved in various ways: 

1. Collecting more rows of data i.e. "observations".
    - This would involve cooking more pork tenderloins and recording the results.
2. Adding more columns to the dataset. i.e. "features"
    - Using your "domain knowledge" of the subject, what other variables do you think might impact the cooking time of the meat?

As a chef, which of the three ML models do you think would be best? Note that while the temperature and the cooking status of the meat could both be useful, you can only have one target variable. 

**Question:**

Use Excel to create a template of the data you'd like to collect. Clearly identify the features and target variable by colour-coding the spreadsheet.


## Conclusion

It turns out that cooking a pork tenderloin has lots of variables. That's probably the reason why this wasn't a GCSE Science experiment that you carried out in school! 

How does your spreadsheet look? Hopefully you remembered that in Machine Learning there can only be one Target Variable.

Thinking back to your days at school, most of the science experiments you carried out only had 2 variables that were easily plotted on an X & Y axis. 

Consider the Hooke's law example from earlier: it had 2 variables, plotted on an x and y axis. 

![Hooke's Law Graph](../../Images/hookes-law-graph.png)


### How many Dimensions?

Features are sometimes called dimensions. So if I talk about dimensions from here on, I really mean "input" dimensions.  You can think of the target variable as being the output. 

You might think of the chart above as having 2 dimensions, an x and y axis. 

However, to avoid confusion, I want you to consider that we have 1 input dimension and 1 target value.

- **Feature:** The weight added to the spring (1 input dimension)
- **Target Value:** The extension of the spring

So plotting 1 dimension of (input) data looks like this: 

![Plotting 1 Dimension](../../Images/plotting-1-dimension.png)

You might immediately assume that plotting an extra dimension would require an X, Y and Z axis. However, we can plot the target variable as a colour on the chart.

In the example below we can measure 2 parts of an iris flower (2 input dimensions) and infer the species (target variable). 

![Plotting 2 Dimensions](../../Images/plotting-2-dimensions.png)

If we add measurements from another part of the flower to the plot, we now need the Z axis.

![Plotting 3 Dimensions](../../Images/plotting-3-dimensions.png)


Most of what we're taught about analysing data in school revolves around visualising the data first, and looking for trends. But what if we want to work with more dimensions of data, like the pork tenderloin? 

Unfortunately it's not easy to plot more dimensions of data on a chart in any meaningful way, but it turns out that the machine learning algorithms we're about to utilise allow us to input as many dimensions as we like. Mathematically, the algorithms work the same way whether there are 2 dimensions of input data or 10. 
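
To see why the number of dimensions doesn't matter to the maths, here is a small sketch of a nearest-neighbour classifier whose code is identical for 1 input dimension and for 3. The function `nearest_label` is a hypothetical helper, and the data points and the labels "A" and "B" are made-up values purely for illustration, not real measurements.

```python
import math


def nearest_label(training_rows, query):
    """Return the label of the training row closest to the query point.

    Each training row is (features, label), where features is a tuple of
    any length. math.dist computes the straight-line distance between two
    points, however many dimensions they have.
    """
    features, label = min(training_rows, key=lambda row: math.dist(row[0], query))
    return label


# 1 input dimension (made-up values).
one_d = [((1.0,), "A"), ((5.0,), "B")]

# 3 input dimensions (made-up values).
three_d = [((1.0, 2.0, 0.5), "A"), ((5.0, 6.0, 3.5), "B")]

print(nearest_label(one_d, (1.5,)))             # A
print(nearest_label(three_d, (4.5, 6.5, 3.0)))  # B
```

Nothing in `nearest_label` mentions how many features there are. The only thing that gets harder with more dimensions is drawing the chart.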




