 <div>
<img src="https://edlitera-images.s3.amazonaws.com/new_edlitera_logo.png" width="500"/>
</div>

# Classification measures (visual example)

## Dataset

* we have five data points of one class and five data points of another class displayed on the image below

    <br>
    
    * one class is represented with the color blue
    * the other class is represented with the color red


<img src="https://edlitera-images.s3.amazonaws.com/binary_classification_starting_image.png" width="350">
<p><center><b>Image 1</b></center></p>

<br>

* we want to train our model to classify data into one of these two classes
    <br>
    
    * this is a binary classification problem
    * the model is basically being trained to draw a line that will separate the two classes in the image

* to measure how well our model performs, we will use:
    <br>
    
    * precision
    * recall
    * F1 score
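
The three measures can be written as small functions. This is a minimal sketch using the standard definitions, assuming the counts for one class are already known. The function and parameter names (`true_positives`, `false_positives`, `false_negatives`) are illustrative and do not come from the material above, and returning 0 when a denominator is empty is only one possible convention.

```python
def precision(true_positives, false_positives):
    """Share of the dots assigned to a class that really belong to it."""
    assigned = true_positives + false_positives
    # Assumption: report 0.0 when nothing was assigned to the class
    return true_positives / assigned if assigned else 0.0


def recall(true_positives, false_negatives):
    """Share of the dots that really belong to a class that the model found."""
    actual = true_positives + false_negatives
    # Assumption: report 0.0 when the class has no dots in the dataset
    return true_positives / actual if actual else 0.0


def f1_score(precision_value, recall_value):
    """Harmonic mean of precision and recall: 2 / (1/recall + 1/precision)."""
    if precision_value == 0 or recall_value == 0:
        return 0.0
    return 2 / (1 / recall_value + 1 / precision_value)


# Hypothetical counts: 3 dots labeled correctly, 1 wrongly included, 2 missed
p = precision(true_positives=3, false_positives=1)
r = recall(true_positives=3, false_negatives=2)
print(p, r, round(f1_score(p, r), 2))  # 0.75 0.6 0.67
```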

## _Example 1: good model_


<img src="https://edlitera-images.s3.amazonaws.com/binary_classification_example_1.png" width="350">
<p><center><b>Image 2</b></center></p>

## Blue

### $Recall = \frac{5}{5} = 1$

* 5 blue dots exist in the dataset
* our model managed to draw a line that will include all of the blue dots in the blue dot class
* therefore, the recall is equal to the number of blue dots we classified correctly (5), divided by the actual number of blue dots in the dataset (5)
* that is how we get a recall of 100%


### $Precision = \frac{5}{6} \approx 0.83$

* the line divides our data so that 6 dots were assigned the blue class label
* however, out of those 6 only 5 are actually blue dots - there is one impostor red dot
* therefore, the precision is equal to the number of correctly classified dots (5), divided by the total number of dots assigned to that label (6)
* rounded to two decimals, this gives a precision of 83%


### $F1 = \frac{2}{\frac{1}{1} + \frac{1}{0.83}} = \frac{2}{1 + 1.2} \approx 0.91$

* F1 is the harmonic mean of recall and precision, so inserting the values we calculated for precision and recall gives an F1 score of approximately 0.91

## Red

### $Recall = \frac{4}{5} = 0.8$

* 5 red dots exist in the dataset
* our model drew a line that will NOT include all of the red dots in the red dot class
* therefore, the recall is equal to the number of red dots we classified correctly (4), divided by the actual number of red dots in the dataset (5)
* that is how we get a recall of 80%


### $Precision = \frac{4}{4} = 1 $

* the line divides our data so that 4 dots were assigned the red class label
* out of those 4 all 4 are red dots 
* therefore, the precision is equal to the number of correctly classified dots (4), divided by the total number of dots assigned to that label (also 4)
* this is how we get a precision of 100%


### $F1 = \frac{2}{\frac{1}{0.8} + \frac{1}{1}} = \frac{2}{1.25 + 1} \approx 0.89$

* following the formula for F1, and inserting the values we calculated for precision and recall we get an F1 score of approximately 0.89
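
The numbers for Example 1 can be checked with a few lines of Python. This is only a sketch: the order of the dots in the two lists is arbitrary, and the helper `class_scores` is a hypothetical name introduced for illustration. It assumes every class has at least one correctly labeled dot, otherwise the divisions would fail.

```python
# Labels for the ten dots in Image 2 (the order of the dots is arbitrary).
# One red dot ends up on the blue side of the line.
actual = ["blue"] * 5 + ["red"] * 5
predicted = ["blue"] * 5 + ["blue"] + ["red"] * 4


def class_scores(actual, predicted, label):
    """Return (recall, precision, F1) for one class from two label lists."""
    pairs = list(zip(actual, predicted))
    correct = sum(1 for a, p in pairs if a == label and p == label)
    in_dataset = sum(1 for a, _ in pairs if a == label)  # recall denominator
    assigned = sum(1 for _, p in pairs if p == label)    # precision denominator
    recall = correct / in_dataset
    precision = correct / assigned
    f1 = 2 / (1 / recall + 1 / precision)
    return recall, precision, f1


for label in ("blue", "red"):
    recall, precision, f1 = class_scores(actual, predicted, label)
    print(f"{label}: recall={recall:.2f} precision={precision:.2f} F1={f1:.2f}")
# blue: recall=1.00 precision=0.83 F1=0.91
# red: recall=0.80 precision=1.00 F1=0.89
```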

## _Example 2: bad model_


<img src="https://edlitera-images.s3.amazonaws.com/binary_classification_example_2.png" width="350">
<p><center><b>Image 3</b></center></p>

## Blue

### $Recall = \frac{2}{5} = 0.4$

* 5 blue dots exist in the actual dataset
* our model drew a line that will include only 2 blue dots in the blue dot class
* therefore, the recall is equal to the number of blue dots we classified correctly (2), divided by the actual number of blue dots in the dataset (5)
* that is how we get a recall of 40%


### $Precision = \frac{2}{2} = 1 $

* the line divides our data so that 2 dots were assigned the blue class label
* both dots are blue dots 
* therefore, the precision is equal to the number of correctly classified dots (2), divided by the total number of dots assigned to that label (2)
* this is how we get a precision of 100%


### $F1 = \frac{2}{\frac{1}{0.4} + \frac{1}{1}} = \frac{2}{2.5 + 1} \approx 0.57$

* following the formula for F1, and inserting the values we calculated for precision and recall we get an F1 score of 0.57

## Red

### $Recall = \frac{5}{5} = 1$

* 5 red dots exist in the actual dataset
* our model drew a line that will include all of them in the red dot class
* therefore, the recall is equal to the number of red dots we classified correctly (5), divided by the actual number of red dots in the dataset (also 5)
* that is how we get a recall of 100%


### $Precision = \frac{5}{8} = 0.625 $

* the line divides our data so that 8 dots were assigned the red class label
* there are 5 red dots and 3 blue dots 
* therefore, the precision is equal to the number of correctly classified dots (5), divided by the total number of dots assigned to that label (8)
* this is how we get a precision of 62.5%


### $F1 = \frac{2}{\frac{1}{1} + \frac{1}{0.625}} = \frac{2}{1 + 1.6} \approx 0.77$

* following the formula for F1, and inserting the values we calculated for precision and recall we get an F1 score of 0.77
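
Example 2 can also be checked by counting every combination of actual and predicted label, which is a 2x2 confusion matrix stored as a dictionary. This is a sketch under the assumption that three blue dots land on the red side, as the counts above imply. The names `tp`, `fn` and `fp` (true positives, false negatives, false positives) are not used in the text, and the F1 line uses the algebraically equivalent form 2 * precision * recall / (precision + recall).

```python
from collections import Counter

# Labels for the ten dots in Image 3: three blue dots end up on the red side.
actual = ["blue"] * 5 + ["red"] * 5
predicted = ["blue"] * 2 + ["red"] * 3 + ["red"] * 5

# Count every (actual, predicted) combination
counts = Counter(zip(actual, predicted))

for label, other in (("blue", "red"), ("red", "blue")):
    tp = counts[(label, label)]  # dots of this class labeled correctly
    fn = counts[(label, other)]  # dots of this class the model missed
    fp = counts[(other, label)]  # dots of the other class given this label
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    print(f"{label}: recall={recall:.3f} precision={precision:.3f} F1={f1:.2f}")
# blue: recall=0.400 precision=1.000 F1=0.57
# red: recall=1.000 precision=0.625 F1=0.77
```

Although both models reach a perfect score on one measure per class, the second model pays for it with much lower values on the other measure, which is what the F1 score captures.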

<br>

# Classification scores in multiclass classification

* calculating classification measures for multiclass classification is not very different from calculating them for binary classification

* the change in the number of classes changes the confusion matrix somewhat, and calculating the F1 score for the whole model is a bit different

* in terms of visualization, the axes stay the same
    <br>
    
    * the dimensionality of the data depends on the number of features, not on the number of classes
    * each added class means one more group of points that has to be separated from the others

**Example:**

* for a multiclass problem with 3 classes, a single line can no longer separate the data in a 2D plot
* instead of finding one line, the model tries to find several boundaries (lines in 2D, planes in 3D) that will separate the data

## _Confusion matrix changes_

* the main difference is in the size of the matrix itself

* each new class adds another row and column to the matrix

* the procedure we go through is very simple:
    <br>
    
    * calculate the precision and recall of each class separately
    * calculate the F1 score of each class separately
    * **then we calculate an averaged F1 score to get the overall F1 score of the whole model**
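
To see how each class adds a row and a column, here is a minimal sketch that builds a confusion matrix from two label lists. The function name `confusion_matrix` and the sample labels are hypothetical. The orientation (rows are predicted classes, columns are actual classes) is an assumption chosen so that column sums give the number of samples per class; some libraries, for example scikit-learn, use the opposite orientation.

```python
def confusion_matrix(actual, predicted, labels):
    """Square matrix: rows are predicted classes, columns are actual classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[p]][index[a]] += 1
    return matrix


# Hypothetical labels for seven samples and three classes
actual = ["A", "A", "A", "B", "B", "C", "C"]
predicted = ["A", "B", "A", "B", "C", "C", "C"]

for row in confusion_matrix(actual, predicted, labels=["A", "B", "C"]):
    print(row)
# [2, 0, 0]
# [1, 1, 0]
# [0, 1, 2]
```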

### Multiclass confusion matrix


<img src="https://edlitera-images.s3.amazonaws.com/multiclass_confusion_matrix.png" width="400">
<p><center><b>Image 4</b></center></p>

**Precision and recall made simple:**

* you can use the values in the multiclass confusion matrix to calculate the precision / recall for each class (see below)
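
The same calculation can be sketched in Python. The matrix below is reconstructed from the per-class formulas in this section, so its layout (rows are predicted classes, columns are actual classes) is an assumption that should be checked against Image 4. Variable names are illustrative.

```python
# Confusion matrix reconstructed from the formulas in this section
# (assumption: rows are predicted classes, columns are actual classes).
labels = ["A", "B", "C"]
matrix = [
    [20, 3, 0],  # predicted A
    [26, 7, 0],  # predicted B
    [20, 2, 5],  # predicted C
]

scores = {}
for i, label in enumerate(labels):
    correct = matrix[i][i]                        # value on the diagonal
    predicted_total = sum(matrix[i])              # row sum
    actual_total = sum(row[i] for row in matrix)  # column sum
    scores[label] = {
        "precision": correct / predicted_total,
        "recall": correct / actual_total,
    }
    print(label, round(scores[label]["precision"], 2), round(scores[label]["recall"], 2))
# A 0.87 0.3
# B 0.21 0.58
# C 0.19 1.0
```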

## Class A


<img src="https://edlitera-images.s3.amazonaws.com/class_A_precision_and_recall.png" width="400">
<p><center><b>Image 5</b></center></p>

### $Precision = \frac{20}{20+3+0} \approx 0.87$

### $Recall = \frac{20}{20+26+20} \approx 0.3$

### $F1 = \frac{2}{\frac{1}{0.87} + \frac{1}{0.3}} = \frac{2}{1.15+3.33} \approx 0.45$


 

## Class B


<img src="https://edlitera-images.s3.amazonaws.com/class_B_precision_and_recall.png" width="400">
<p><center><b>Image 6</b></center></p>

### $Precision = \frac{7}{26+7+0} \approx 0.21$

### $Recall = \frac{7}{3+7+2} \approx 0.58$

### $F1 = \frac{2}{\frac{1}{0.21} + \frac{1}{0.58}} = \frac{2}{4.76 + 1.72} \approx 0.31$


 

## Class C


<img src="https://edlitera-images.s3.amazonaws.com/class_C_precision_and_recall.png" width="400">
<p><center><b>Image 7</b></center></p>

### $Precision = \frac{5}{20+2+5} \approx 0.19$

### $Recall = \frac{5}{5 + 0 + 0} = 1$

### $F1 = \frac{2}{\frac{1}{0.19} + \frac{1}{1}} = \frac{2}{5.26 + 1} \approx 0.32$
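
The per-class F1 scores can be recomputed without rounding the intermediate values. This is a sketch that reuses the sums from the formulas above; the dictionary keys are hypothetical names. With unrounded inputs class C comes out as 0.31 rather than 0.32, because the text rounds the precision to 0.19 before inserting it into the F1 formula.

```python
# Counts taken from the precision and recall formulas of each class
classes = {
    "A": {"correct": 20, "predicted_total": 20 + 3 + 0, "actual_total": 20 + 26 + 20},
    "B": {"correct": 7, "predicted_total": 26 + 7 + 0, "actual_total": 3 + 7 + 2},
    "C": {"correct": 5, "predicted_total": 20 + 2 + 5, "actual_total": 5 + 0 + 0},
}

f1_scores = {}
for label, c in classes.items():
    precision = c["correct"] / c["predicted_total"]
    recall = c["correct"] / c["actual_total"]
    f1_scores[label] = 2 / (1 / precision + 1 / recall)
    print(label, round(f1_scores[label], 2))
# A 0.45
# B 0.31
# C 0.31
```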


 

## _Averaged F1 scores_

* once we have calculated the per-class F1-scores we can combine them to get the overall F1 score of the model

* there are a few variants of the averaged F1 score, but the two most popular are:
    <br>
    
    * ***macro averaged F1 score***
    * ***weighted average F1 score***

### Macro-averaged F1 score

* basically the arithmetic mean of the per-class F1 scores, with every class counted equally

### $\text{Macro averaged F1} = \frac{0.45+0.31+0.32}{3} = 0.36 = 36\%$
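
As a quick check, the macro average is a plain mean over the per-class values. This is a minimal sketch that uses the rounded F1 scores from the text; the variable names are illustrative.

```python
# Per-class F1 scores as rounded in the text above
f1_per_class = {"A": 0.45, "B": 0.31, "C": 0.32}

# Macro average: every class counts the same, no matter how many samples it has
macro_f1 = sum(f1_per_class.values()) / len(f1_per_class)
print(round(macro_f1, 2))  # 0.36
```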

### Weighted average F1 score

* we don't need to assign equal weights to each class
    <br>
    
    * equal weights are what we use when we calculate the macro averaged F1 score

* in the case of the weighted average F1 score, we weight the individually calculated F1 scores by the number of samples in each class

* in our case we have a total of 83 samples
    <br>
    
    * you get the number of samples per class by summing up the values in the columns
        <br>
        
        * we have 66 samples of class A
        * we have 12 samples of class B
        * we have 5 samples of class C

* finally, to get the weighted F1 score we multiply the F1 score of each class by the number of samples we have from that class, sum the results of all the classes and divide by the total number of samples

### $\text{Weighted average F1} = \frac{0.45 \cdot 66 + 0.31 \cdot 12 + 0.32 \cdot 5}{83} = \frac{29.7+3.72+1.6}{83} \approx 0.42 = 42\%$
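
The weighted average can be sketched the same way. The F1 scores and sample counts are the ones given in the text; the variable names are illustrative.

```python
f1_per_class = {"A": 0.45, "B": 0.31, "C": 0.32}
samples_per_class = {"A": 66, "B": 12, "C": 5}

total_samples = sum(samples_per_class.values())  # 83

# Each class contributes in proportion to its number of samples
weighted_f1 = sum(
    f1_per_class[label] * samples_per_class[label] for label in f1_per_class
) / total_samples
print(round(weighted_f1, 2))  # 0.42
```

The weighted value is higher than the macro average of 0.36 because class A, which has by far the most samples, also has the highest F1 score.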
