# ML Models: Classification

In the previous course, we saw linear regression, which lets you predict a number (such as someone's salary). Classification models are another type of model: they help you predict a category.

## What you will learn in this course 🧐🧐

- What are classification models
- Building a logistic regression model
- What are false positives and false negatives
- Evaluating the performance of your model using confusion matrices

## Logistic Regression

### Definition 📖

Unlike linear regressions, which predict a number, classification models predict a category. For example, if you are trying to predict whether someone will buy a product from you based on certain independent variables, you are dealing with a classification problem because the categories you are trying to predict are "yes, the person will buy the product" or "no, the person won't buy the product".

Logistic regression is one kind of classification model, but there are many others, such as _decision trees_, _SVM (support vector machine)_ or _Naive Bayes_.

### Equation 👩‍🔬

#### Logistic Regression

![](https://essentials-assets.s3.eu-west-3.amazonaws.com/M04-Machine-learning/Logistic_regression.png)

With linear regression, we saw that our predictor was the line drawn by our model. In a logistic regression, the curve is used to separate two categories. In the graph above, we are trying to see whether a person will buy a product (represented by the number 1) or not buy it (represented by the number 0). The curve gives the probability that a person will buy the product (_purchased_) based on their age.

Mathematically, the logistic regression is based on the Logistic function:

$$ y = \frac{1}{1+e^{-k(x-x_0)}} $$

Where: 

* $y$ is your target variable expressed as a probability (i.e. a number from 0 to 1)
* $k$ is the steepness of the curve (i.e. how fast the curve goes from the lower to the upper values)
* $x$ is your feature variable
* $x_0$ is the curve midpoint, i.e. the value of $x$ at which $y$ equals 0.5 (50%)

In this example, we have only one independent variable and one constant. The equation is slightly more complex than the one for a linear regression, but its purpose is simply to represent categories as probabilities. Instead of having only `0` and `1`, you can get any value between `0` and `1`. Depending on the probability obtained, the algorithm decides in which category to place an individual.
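To make the formula concrete, here is a minimal sketch of the logistic function in plain Python. The parameter values (`k=0.2`, `x0=40`) and the list of ages are made up for illustration; they do not come from a fitted model.

```python
import math

def logistic(x, k=1.0, x0=0.0):
    """Logistic function: maps a real number x to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-k * (x - x0)))

# Hypothetical parameters: midpoint at age 40, steepness 0.2
ages = [20, 30, 40, 50, 60]
probabilities = [logistic(age, k=0.2, x0=40) for age in ages]

for age, p in zip(ages, probabilities):
    print(f"age={age}: P(purchase)={p:.3f}")
```

At `x = x0` the function returns exactly 0.5, and a larger `k` makes the transition from low to high probabilities sharper.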

## Log Loss 📉

### Definition 

As you know, in Machine Learning, we always try to minimize what we call a cost function. In the case of logistic regression, we use the *Log Loss* function, which looks like this:

$$ \text{Log Loss} = \sum_{i=1}^n -y_i \log(\hat{y_i}) - (1-y_i) \log(1-\hat{y_i}) $$

Where: 

* $y_i$ is your actual target value (0 or 1)
* $\hat{y_i}$ is your model's prediction, expressed as a probability
* $n$ is the number of observations

This formula is a little more complicated, but you don't need to master every detail of it. The idea is that predictions far from the actual target value are penalized heavily, so minimizing the Log Loss brings the algorithm's predictions closer to the actual target values.
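As a rough illustration of the formula above, the sketch below computes the Log Loss as a plain sum in Python. The `eps` clipping is an added assumption to avoid taking `log(0)`, and the sample values are invented. Note that many libraries report the mean over observations rather than the sum.

```python
import math

def log_loss(y_true, y_pred, eps=1e-15):
    """Sum of -y*log(p) - (1-y)*log(1-p) over all observations."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        # Assumption: clip probabilities so that log(0) never occurs
        p = min(max(p, eps), 1 - eps)
        total += -y * math.log(p) - (1 - y) * math.log(1 - p)
    return total

# Invented example: same targets, two sets of predicted probabilities
y_true = [1, 0, 1, 0]
close_predictions = [0.9, 0.1, 0.8, 0.2]
far_predictions = [0.4, 0.6, 0.3, 0.7]

print(log_loss(y_true, close_predictions))  # smaller loss
print(log_loss(y_true, far_predictions))    # larger loss
```

Predictions that sit close to the actual targets produce a smaller loss than predictions that sit far from them, which is exactly what the algorithm exploits during training.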


#### How to make our classification

Now that the model has been fitted, we can begin our interpretations. Since our model is probabilistic this time, data points with a probability greater than 50% will belong to one category (here, "buyer"), while points with a probability less than 50% will belong to the other (here, "non-buyer").

Let's take an example. Based on some independent variables, we have found that person A has a 60% chance of buying the product. Our model will therefore consider them a "buyer". On the other hand, person B, who has only a 45% chance of buying the product, will be considered a "non-buyer".
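The decision rule above can be sketched in a few lines of Python. The function name `classify` is hypothetical, and treating a probability of exactly 50% as "non-buyer" is an assumption, since the text does not say what happens at the boundary.

```python
def classify(probability, threshold=0.5):
    """Turn a predicted probability into a category label."""
    # Assumption: a probability exactly equal to the threshold counts as "non-buyer"
    return "buyer" if probability > threshold else "non-buyer"

print(classify(0.60))  # person A -> buyer
print(classify(0.45))  # person B -> non-buyer
```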

### False Positives & False Negatives 😈

Since our model is based on probability, it can be wrong sometimes. False positives and false negatives represent the errors in our model.

#### False Positive

Let's continue with the example from above. If our model categorizes person A as a "buyer" and that person in reality does not buy the product then we are dealing with a false positive. The model expected a positive result, which in the end did not happen.

#### False Negative

Conversely, if person B, whom the model predicted as a non-buyer, ends up buying the product, this is a false negative. The model predicted a negative result, but a positive one occurred.
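The two kinds of error, along with the two kinds of correct prediction, can be told apart by comparing what the model predicted with what actually happened. Here is a minimal sketch; the function name `outcome` and the boolean encoding (`True` means "buys") are illustrative choices.

```python
def outcome(predicted_buyer, actually_bought):
    """Name the result of one prediction, where True means 'buys the product'."""
    if predicted_buyer and actually_bought:
        return "true positive"
    if predicted_buyer and not actually_bought:
        return "false positive"
    if not predicted_buyer and actually_bought:
        return "false negative"
    return "true negative"

# Person A: predicted buyer, did not buy
print(outcome(predicted_buyer=True, actually_bought=False))  # false positive
# Person B: predicted non-buyer, did buy
print(outcome(predicted_buyer=False, actually_bought=True))  # false negative
```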

#### Pay attention to false positives and ESPECIALLY to false negatives.

Be aware of false positives and negatives because a prediction error can have more or less serious consequences depending on what you are trying to predict. For example, if you are a scientist trying to predict an earthquake and you come across a false negative (i.e. you predicted that the earthquake would not happen when it did), no one was actually prepared for the event.

In cases like this one, false negatives are worse than false positives: with a false negative, no one is prepared for the event when it happens. With a false positive, you are prepared for it, and even if it doesn't happen, the consequences are less severe.

### How to evaluate your model

#### Confusion matrix

![](https://essentials-assets.s3.eu-west-3.amazonaws.com/M04-Machine-learning/confusion_matrix.png)

One quick and easy way to measure the performance of your model is to use a confusion matrix. The idea is to count the predictions your model got right alongside its false positives and false negatives. Dividing the number of correct predictions by the total number of predictions gives you the accuracy rate of your model.
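As a small worked example, the sketch below builds the four cells of a confusion matrix and derives the accuracy from them. The labels in `y_true` and `y_pred` are invented for illustration (1 means "buys", 0 means "does not buy"), and the helper names are hypothetical.

```python
from collections import Counter

def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    counts = Counter({"TP": 0, "FP": 0, "TN": 0, "FN": 0})
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1:
            counts["TP" if actual == 1 else "FP"] += 1
        else:
            counts["TN" if actual == 0 else "FN"] += 1
    return counts

def accuracy(counts):
    """Share of correct predictions (TP + TN) among all predictions."""
    return (counts["TP"] + counts["TN"]) / sum(counts.values())

# Invented labels: what really happened vs. what the model predicted
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
print(dict(counts))      # {'TP': 3, 'FP': 1, 'TN': 3, 'FN': 1}
print(accuracy(counts))  # 0.75
```

Here 6 of the 8 predictions are correct, so the accuracy is 75%, with one false positive and one false negative making up the errors.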

## Resources 📚📚

- Implementing Logistic Regression - [https://bit.ly/2FFUjAn](https://bit.ly/2FFUjAn)
- Logistic Regression - [http://bit.ly/2bdDELb](http://bit.ly/2bdDELb)
- Summary of Probability - [http://bit.ly/2m8YgDR](http://bit.ly/2m8YgDR)
- Confusion Matrix - [http://bit.ly/2xApsRz](http://bit.ly/2xApsRz)
- False Positives & False Negatives - [http://bit.ly/2FmhMql](http://bit.ly/2FmhMql)