This page presents the most popular classification metrics, with their formulas and brief comments on each.

[Article with a good analysis of accuracy, precision, and recall](https://www.evidentlyai.com/classification-metrics/accuracy-precision-recall)

# Binary classification

Binary classification metrics are usually computed from the confusion matrix, shown below:

<div>
<img src="https://i.ytimg.com/vi/eQrkcFUWhJI/maxresdefault.jpg" width="500"/>
</div>


- ***TP*** (True positives) - objects of the ***positive*** class, classified ***correctly*** (predicted positive)
- ***FP*** (False positives) - objects of the ***negative*** class, classified ***incorrectly*** (predicted positive)
- ***FN*** (False negatives) - objects of the ***positive*** class, classified ***incorrectly*** (predicted negative)
- ***TN*** (True negatives) - objects of the ***negative*** class, classified ***correctly*** (predicted negative)
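
The four counts can be obtained directly from true and predicted labels. Below is a minimal sketch in plain Python; the function name `confusion_counts`, the `positive` argument, and the toy labels are illustrative assumptions, not part of any library.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for binary labels (hypothetical helper)."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # positive object, predicted positive
            else:
                fp += 1  # negative object, predicted positive
        else:
            if actual == positive:
                fn += 1  # positive object, predicted negative
            else:
                tn += 1  # negative object, predicted negative
    return tp, fp, fn, tn


# Toy data, invented for illustration only
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print(tp, fp, fn, tn)  # 3 1 1 3
```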


## Accuracy

Accuracy - the number of correctly classified objects divided by the total number of objects
$$Accuracy = \frac{TP+TN}{TP+FP+FN+TN}$$ 
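
As a rough illustration of the formula, the sketch below computes accuracy from the four counts. The second call uses invented numbers for an imbalanced dataset (95 negatives, 5 positives) and a model that always predicts the negative class; returning `0.0` for an empty dataset is an assumed convention.

```python
def accuracy(tp, fp, fn, tn):
    """Share of correctly classified objects (hypothetical helper)."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


# Balanced toy example
print(accuracy(tp=3, fp=1, fn=1, tn=3))   # 0.75

# Always predicting "negative" on 95 negatives and 5 positives:
# accuracy looks high although no positive object is ever found
print(accuracy(tp=0, fp=0, fn=5, tn=95))  # 0.95
```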


## Precision/Recall
***Precision*** - the share of objects predicted as positive that actually belong to the positive class

***Recall*** - the share of actual positive objects in the dataset that the model identifies as positive

$$Precision = \frac{TP}{TP+FP}$$ 
$$Recall = \frac{TP}{TP+FN}$$ 
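
A minimal sketch of both formulas, assuming the counts are already known. The helper names and the numbers are made up for illustration, and returning `0.0` when the denominator is zero is one possible convention, not the only one.

```python
def precision(tp, fp):
    """TP / (TP + FP); 0.0 if nothing was predicted positive (assumed convention)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """TP / (TP + FN); 0.0 if there are no positive objects (assumed convention)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Invented counts: the model is right in 8 of its 10 positive predictions,
# but finds only 8 of the 16 actual positives
print(precision(tp=8, fp=2))  # 0.8
print(recall(tp=8, fn=8))     # 0.5
```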

## F1-score
***F1 score*** is the ***harmonic mean of precision and recall***, which means that the F1 score will tell you the ***model’s balanced ability to both capture positive cases (recall) and be accurate with the cases it does capture (precision)***

$$F1 = 2\frac{Precision*Recall}{Precision+Recall} = \frac{2*TP}{2*TP+FN+FP}$$ 
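
The sketch below checks on invented counts that the two forms of the formula agree, and shows that the harmonic mean lies below the arithmetic mean when precision and recall differ. The function names are hypothetical.

```python
import math


def f1_from_rates(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def f1_from_counts(tp, fp, fn):
    """Equivalent form written with confusion-matrix counts."""
    denominator = 2 * tp + fn + fp
    return 2 * tp / denominator if denominator else 0.0


# Invented counts: precision = 8/10 = 0.8, recall = 8/16 = 0.5
tp, fp, fn = 8, 2, 8
p, r = tp / (tp + fp), tp / (tp + fn)

print(round(f1_from_rates(p, r), 3))         # 0.615
print(round(f1_from_counts(tp, fp, fn), 3))  # 0.615
print(math.isclose(f1_from_rates(p, r), f1_from_counts(tp, fp, fn)))  # True
print((p + r) / 2)                           # 0.65, the arithmetic mean is higher
```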

## $F_{\beta}$
There are situations where we need to pay more attention to either recall or precision. For example, in cancer diagnosis it is much more important to catch those who really have cancer, because a missed diagnosis could be life-threatening. Thus, we need to maximise recall. Many other tasks are sensitive to a specific metric because of their domain.

$$F_{\beta} = (1+\beta^2)\frac{Precision*Recall}{\beta^2Precision+Recall} = \frac{(1+\beta^2)*TP}{(1+\beta^2)*TP+\beta^2*FN+FP}$$ 

- $0<\beta<1$: precision is more important than recall
- $\beta>1$: recall is more important than precision
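
To see how $\beta$ shifts the score, here is a small sketch that evaluates the formula for the same invented precision and recall with three values of $\beta$. Values of $\beta$ below 1 pull the score towards precision, values above 1 pull it towards recall, and $\beta = 1$ gives F1. The function name `f_beta` is an assumption.

```python
def f_beta(precision, recall, beta):
    """F-beta score from precision and recall (hypothetical helper)."""
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


p, r = 0.8, 0.5  # invented values: high precision, low recall

print(round(f_beta(p, r, beta=0.5), 3))  # 0.714, closer to precision
print(round(f_beta(p, r, beta=1.0), 3))  # 0.615, the F1 score
print(round(f_beta(p, r, beta=2.0), 3))  # 0.541, closer to recall
```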

## Precision-Recall (PR) Curve
A PR curve is a graph with precision on the y-axis and recall on the x-axis, where each point corresponds to a different classification threshold


<div>
<img src="https://www.codecamp.ru/content/images/2021/09/precisionRecall2.png" width="500"/>
</div>
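
The points of a PR curve can be obtained by sweeping a threshold over the model scores. The sketch below assumes an object is predicted positive when its score is greater than or equal to the threshold, and uses each distinct score as a threshold; the function name and the toy scores are illustrative.

```python
def pr_curve_points(y_true, scores):
    """Return (threshold, precision, recall) for every distinct score."""
    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        fn = sum(1 for y, s in zip(y_true, scores) if s < threshold and y == 1)
        # tp + fp >= 1 here, because the threshold is one of the scores
        precision = tp / (tp + fp)
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        points.append((threshold, precision, recall))
    return points


# Toy data, invented for illustration only
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

for threshold, precision, recall in pr_curve_points(y_true, scores):
    print(f"threshold={threshold:.1f} precision={precision:.2f} recall={recall:.2f}")
```

Lowering the threshold can only keep recall the same or increase it, while precision may go up or down.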

## ROC Curve
A ROC curve is a plot of the true positive rate (TPR) against the false positive rate (FPR) at each threshold setting.

<div>
<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/1/13/Roc_curve.svg/1200px-Roc_curve.svg.png" width="500"/>
</div>


$$TPR = Recall = \frac{TP}{TP+FN}$$

$$FPR = \frac{FP}{TN+FP}$$
***TPR = True positive rate***

***FPR = False positive rate***
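
A minimal sketch of how ROC points and the area under the curve (ROC-AUC) could be computed by hand. It assumes both classes are present, that an object is predicted positive when its score is at least the threshold, and that the area is approximated with the trapezoidal rule; all names and data are illustrative.

```python
def roc_curve_points(y_true, scores):
    """Return a list of (FPR, TPR) points, starting at (0, 0)."""
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal rule over consecutive (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy data, invented for illustration only
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

roc_points = roc_curve_points(y_true, scores)
print(roc_points[-1])                          # (1.0, 1.0)
print(round(area_under_curve(roc_points), 3))  # 0.889
```

In this toy example the area equals 8/9, which matches the share of (positive, negative) pairs in which the positive object receives the higher score.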

## Comparison of metrics

Important note:
- ***Type 1 error*** - False Positive
- ***Type 2 error*** - False Negative

<style type="text/css">
.tg  {border-collapse:collapse;border-spacing:0;}
.tg td{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
  overflow:hidden;padding:10px 5px;word-break:normal;}
.tg th{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
  font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg .tg-98ao{border-color:inherit;color:#000C2D;font-weight:bold;text-align:left;vertical-align:bottom}
.tg .tg-za14{border-color:inherit;text-align:left;vertical-align:bottom}
.tg .tg-0pky{border-color:inherit;text-align:left;vertical-align:top}
.tg .tg-dusx{border-color:inherit;color:#000C2D;text-align:left;vertical-align:bottom}
.tg .tg-7zrl{text-align:left;vertical-align:bottom}
.tg .tg-0lax{text-align:left;vertical-align:top}
</style>
<table class="tg">
<thead>
  <tr>
    <th class="tg-98ao">Metrics</th>
    <th class="tg-98ao">Resistance to type 2 errors (FN)</th>
    <th class="tg-za14">Resistance to type 1 errors (FP)</th>
    <th class="tg-za14">Resistance to imbalanced classes</th>
    <th class="tg-0pky"></th>
  </tr>
</thead>
<tbody>
  <tr>
    <td class="tg-dusx">Accuracy</td>
    <td class="tg-dusx">No</td>
    <td class="tg-za14">No</td>
    <td class="tg-za14">No</td>
    <td class="tg-0pky"></td>
  </tr>
  <tr>
    <td class="tg-dusx">Precision</td>
    <td class="tg-dusx">No</td>
    <td class="tg-za14">Yes</td>
    <td class="tg-za14">Yes</td>
    <td class="tg-0pky"></td>
  </tr>
  <tr>
    <td class="tg-dusx">Recall</td>
    <td class="tg-dusx">Yes</td>
    <td class="tg-za14">No</td>
    <td class="tg-za14">Yes</td>
    <td class="tg-0pky"></td>
  </tr>
  <tr>
    <td class="tg-dusx">F1-score</td>
    <td class="tg-dusx">Yes</td>
    <td class="tg-za14">Yes</td>
    <td class="tg-za14">Yes</td>
    <td class="tg-0pky"></td>
  </tr>
  <tr>
    <td class="tg-7zrl">F-beta score</td>
    <td class="tg-7zrl">Yes</td>
    <td class="tg-7zrl">Yes</td>
    <td class="tg-7zrl">Yes</td>
    <td class="tg-0lax"></td>
  </tr>
  <tr>
    <td class="tg-7zrl">PR-AUC</td>
    <td class="tg-7zrl">Yes</td>
    <td class="tg-7zrl">Yes</td>
    <td class="tg-7zrl">Yes</td>
    <td class="tg-0lax"></td>
  </tr>
  <tr>
    <td class="tg-7zrl">ROC-AUC</td>
    <td class="tg-7zrl">Yes</td>
    <td class="tg-7zrl">Yes</td>
    <td class="tg-7zrl">Yes</td>
    <td class="tg-0lax"></td>
  </tr>
</tbody>
</table>

# Multi-class classification

There are two main approaches to multi-class classification:

1. ***One-vs-One***: train K (K − 1) / 2 binary classifiers; each receives the samples of one pair of classes from the original training set and must learn to distinguish these two classes. At prediction time, a voting scheme is applied: all K (K − 1) / 2 classifiers are applied to an unseen sample, and the combined classifier predicts the class that received the highest number of "+1" votes
2. ***One-vs-Rest***: train a single classifier per class, with the samples of that class as positives and all other samples as negatives. Then one chooses the class whose classifier gives the highest score:
$$\hat{y} = \underset{k \in \{1, \dots, K\}}{\operatorname{arg\,max}}\, f_k(x)$$
where $K$ is the number of classes (and therefore of classifiers) and $f_k(x)$ is the score of the $k$-th classifier on an object $x$
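
Both decision rules can be sketched in a few lines. The example below does not train anything: it assumes the per-class scores $f_k(x)$ and the pairwise decisions are already available for one object, and all names and numbers are hypothetical.

```python
from collections import Counter
from itertools import combinations


def predict_one_vs_rest(scores):
    """scores[k] is the score f_k(x) of the classifier for class k; take the argmax."""
    return max(scores, key=scores.get)


def predict_one_vs_one(pairwise_winner, classes):
    """pairwise_winner[(a, b)] is the class chosen by the classifier trained on a vs b."""
    votes = Counter(pairwise_winner[pair] for pair in combinations(classes, 2))
    return votes.most_common(1)[0][0]  # class with the most votes


classes = ["cat", "dog", "bird"]  # K = 3 classes

# One-vs-Rest: K classifiers, one score per class (invented values)
ovr_scores = {"cat": 0.2, "dog": 0.7, "bird": 0.4}
print(predict_one_vs_rest(ovr_scores))  # dog

# One-vs-One: K (K - 1) / 2 = 3 classifiers, one decision per pair (invented values)
pairwise_winner = {
    ("cat", "dog"): "dog",
    ("cat", "bird"): "cat",
    ("dog", "bird"): "dog",
}
print(predict_one_vs_one(pairwise_winner, classes))  # dog
```

This sketch does not handle ties in the vote; a real One-vs-One scheme needs a tie-breaking rule.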

Most binary classification metrics can also be used for multi-class problems: the metric is calculated for each class separately (treating that class as positive and all other classes as negative) and then averaged over the classes
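
As an illustration of per-class calculation followed by averaging, the sketch below computes precision and recall for each class in a one-vs-rest fashion and takes the unweighted mean over classes (often called macro averaging). The function names and the toy labels are assumptions made for this example.

```python
def per_class_precision_recall(y_true, y_pred, cls):
    """Treat `cls` as the positive class and every other class as negative."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == cls and t == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == cls and t != cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if p != cls and t == cls)
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # assumed convention for 0/0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def macro_precision_recall(y_true, y_pred):
    """Unweighted mean of per-class precision and recall."""
    classes = sorted(set(y_true) | set(y_pred))
    per_class = [per_class_precision_recall(y_true, y_pred, c) for c in classes]
    macro_precision = sum(p for p, _ in per_class) / len(classes)
    macro_recall = sum(r for _, r in per_class) / len(classes)
    return macro_precision, macro_recall


# Toy data, invented for illustration only
y_true = ["a", "a", "b", "b", "c", "c"]
y_pred = ["a", "b", "b", "b", "c", "a"]

macro_p, macro_r = macro_precision_recall(y_true, y_pred)
print(round(macro_p, 3), round(macro_r, 3))  # 0.722 0.667
```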