# Accuracy

$$accuracy = \frac{\text{no. of correct predictions}}{\text{total no. of predictions}}$$

In [None]:
from sklearn.metrics import accuracy_score

# y_test is truth and y_pred is the prediction
accuracy_score(y_test, y_pred)

**How much accuracy is good?**

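It depends on the problem. There is no universal threshold: the same accuracy can be acceptable for one task and far too low for another where mistakes are costly.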

**The problem with the accuracy metric:**

- It does not tell you the type of error. Suppose a model predicts whether a student will get a placement and its accuracy is 90%, so 10% of its predictions are wrong. Accuracy alone does not tell you how those errors split: the model may have predicted that a student would be placed when in reality they were not, or predicted that a student would not be placed when in reality they were. You cannot tell these two error types apart from the accuracy figure.
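As a small illustration of this point, the sketch below builds two sets of predictions that reach the same accuracy while making opposite kinds of mistakes. The labels are made up for the example (1 = placed, 0 = not placed) and are not taken from any real dataset.

```python
from sklearn.metrics import accuracy_score

# Hypothetical placement labels: 1 = placed, 0 = not placed
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

# Model A makes one mistake: it predicts "placed" for a student who was not placed
y_pred_a = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
# Model B makes one mistake: it predicts "not placed" for a student who was placed
y_pred_b = [0, 1, 1, 1, 1, 0, 0, 0, 0, 0]

acc_a = accuracy_score(y_true, y_pred_a)
acc_b = accuracy_score(y_true, y_pred_b)

# Both models score the same accuracy even though their errors differ in kind
print(acc_a, acc_b)
```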

# Confusion Matrix

![Type1 & Type2 Error](https://cdn.inblog.in/user/uploads/edZRgLLVcZjPQogeKsmOk4lZq2tF5z.jpg)

**In scikit-learn's convention, the rows of the confusion matrix are the true labels and the columns are the predicted labels.**

$$accuracy = \frac{TP + TN}{TP + TN + FP + FN}$$
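The formula can be checked against scikit-learn on a toy example. For binary labels, `confusion_matrix(...).ravel()` returns the counts in the order TN, FP, FN, TP. The labels below are invented for illustration.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical binary labels, chosen only to illustrate the formula
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0, 1, 0]

# For binary problems, ravel() flattens the 2x2 matrix into TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

# Accuracy from the four counts, as in the formula above
manual_accuracy = (tp + tn) / (tp + tn + fp + fn)
sklearn_accuracy = accuracy_score(y_true, y_pred)

print(tn, fp, fn, tp)
print(manual_accuracy, sklearn_accuracy)
```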

**Confusion matrix for multi-class classification problem**

| Actual ↓ Predicted → | 0 | 1 | 2 |
| :--: | :--: | :--: | :--: |
| 0 | 7 | 0 | 5 |
| 1 | 2 | 21 | 6 |
| 2 | 9 | 1 | 13 |
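For a multi-class matrix like this one, the correct predictions sit on the diagonal, so accuracy is the diagonal sum divided by the total count. A minimal sketch with NumPy, using the numbers from the table above:

```python
import numpy as np

# Confusion matrix from the table: rows = actual class, columns = predicted class
cm = np.array([
    [7, 0, 5],
    [2, 21, 6],
    [9, 1, 13],
])

correct = np.trace(cm)           # sum of the diagonal = correctly classified samples
total = cm.sum()                 # all samples
accuracy = correct / total

actual_counts = cm.sum(axis=1)   # samples per actual class (row totals)

print(correct, total, accuracy)
print(actual_counts)
```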

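**When is accuracy misleading?**
- On an imbalanced dataset: a model that always predicts the majority class scores high accuracy while never detecting the minority class.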
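The effect of class imbalance can be sketched with a "model" that always predicts the majority class. The 95/5 split below is an assumed example, not data from this notebook.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix, recall_score

# Hypothetical imbalanced data: 95 negatives and 5 positives
y_true = np.array([0] * 95 + [1] * 5)
# A trivial "model" that always predicts the majority class (0)
y_pred = np.zeros_like(y_true)

acc = accuracy_score(y_true, y_pred)   # share of correct predictions
rec = recall_score(y_true, y_pred)     # share of actual positives that were found
cm = confusion_matrix(y_true, y_pred)  # rows = actual, columns = predicted

print(acc, rec)
print(cm)
```

Accuracy looks strong here, yet the recall shows that not a single positive sample was detected.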

In [None]:
from sklearn.metrics import confusion_matrix

confusion_matrix(y_test, y_pred)

# Precision

![Spam Classifier](./images/image-8.png)

Comparing the two models, the left model has more false positives than the right model, while the left model has fewer false negatives than the right model.

$$FP_{left} > FP_{right} \quad \text{and} \quad FN_{left} < FN_{right}$$

For a spam classifier, a false positive (a legitimate email marked as spam) is the costlier error, so the right model is the better one. Precision captures this: **What proportion of predicted positives is truly positive?**

$$Precision = \frac{TP}{TP + FP}$$

So the precision of the left model is $\frac{100}{100 + 30} \approx 0.77$ and the precision of the right model is $\frac{100}{100 + 10} \approx 0.91$, which gives $Precision_{left} < Precision_{right}$.
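The same comparison as a short sketch. The helper `precision` is a hypothetical function written for this example; the counts are the ones used above.

```python
def precision(tp: int, fp: int) -> float:
    """Share of predicted positives that are truly positive."""
    return tp / (tp + fp)

# Counts from the two spam classifiers above
precision_left = precision(tp=100, fp=30)
precision_right = precision(tp=100, fp=10)

print(round(precision_left, 2), round(precision_right, 2))
print("right model has higher precision:", precision_right > precision_left)
```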

# Recall

![Recall](./images/image-9.png)

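In this problem the costlier error is a false negative (an actual positive that the model misses), so the left model is better than the right model. Recall captures this: **What proportion of actual positives is correctly identified?**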

$$Recall = \frac{TP}{TP + FN}$$

So the recall of the left model is $\frac{1000}{1200} \approx 0.83$ and the recall of the right model is $\frac{1000}{1500} \approx 0.67$, which gives $Recall_{left} > Recall_{right}$.
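A matching sketch for recall. The helper `recall` is hypothetical, and the false-negative counts (200 and 500) are derived from the denominators above, assuming 1000 true positives in both models.

```python
def recall(tp: int, fn: int) -> float:
    """Share of actual positives that the model finds."""
    return tp / (tp + fn)

# 1000 true positives in both models; false negatives inferred from the totals
recall_left = recall(tp=1000, fn=200)    # 1000 / 1200
recall_right = recall(tp=1000, fn=500)   # 1000 / 1500

print(round(recall_left, 2), round(recall_right, 2))
print("left model has higher recall:", recall_left > recall_right)
```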

# F1 Score

**If you cannot decide whether a type I error (false positive) or a type II error (false negative) is more dangerous, use the F1 score. It is the harmonic mean of precision and recall.**

$$F1 = \frac{2 \times Precision \times Recall}{Precision + Recall}$$
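Because F1 is a harmonic mean, it stays low when either precision or recall is low, unlike a plain average. A minimal sketch with a hypothetical helper `f1` and made-up input values:

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Balanced case: F1 equals the common value
balanced = f1(0.5, 0.5)

# Lopsided case: the arithmetic mean hides the weak recall, F1 does not
lopsided_f1 = f1(0.9, 0.1)
lopsided_mean = (0.9 + 0.1) / 2

print(balanced, round(lopsided_f1, 2), lopsided_mean)
```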

In [None]:
from sklearn.metrics import precision_score
from sklearn.metrics import recall_score
from sklearn.metrics import f1_score

In [None]:
# precision score
precision_score(y_test, y_pred)

In [None]:
# recall score
recall_score(y_test, y_pred)

In [None]:
# f1 score
f1_score(y_test, y_pred)

# Multi-Class Precision & Recall

| Actual ↓ Predicted → | Dog | Cat | Rabbit | Total |
| :--: | :--: | :--: | :--: | :--: |
| **Dog** | 25 | 5 | 10 | 40 |
| **Cat** | 0 | 30 | 4 | 34 |
| **Rabbit** | 4 | 10 | 20 | 34 |
| **Total** | 29 | 45 | 34 | 108 |

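$Precision_{dog} = \frac{25}{29} \approx 0.86$

$Precision_{cat} = \frac{30}{45} \approx 0.67$

$Precision_{rabbit} = \frac{20}{34} \approx 0.59$

$\text{Macro Precision} = \frac{0.86 + 0.67 + 0.59}{3} \approx 0.71$

The weighted average uses each class's share of the actual samples (the row totals) as its weight. With the unrounded per-class values:

$\text{Weighted Precision} = \frac{40}{108} \times \frac{25}{29} + \frac{34}{108} \times \frac{30}{45} + \frac{34}{108} \times \frac{20}{34} \approx 0.71$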

-------------------------------------------------------------------------
$Recall_{dog} = \frac{25}{40} = 0.625$

$Recall_{cat} = \frac{30}{34} \approx 0.88$

$Recall_{rabbit} = \frac{20}{34} \approx 0.59$

$\text{Macro Recall} = \frac{0.625 + 0.88 + 0.59}{3} \approx 0.70$

$\text{Weighted Recall} = \frac{40}{108} \times 0.625 + \frac{34}{108} \times 0.88 + \frac{34}{108} \times 0.59 \approx 0.69$

--------------------------------------------------------------------------
$F1_{dog} = \frac{2 P_D R_D}{P_D + R_D} = \frac{2 \times 0.86 \times 0.625}{0.86 + 0.625} \approx 0.72$

$F1_{cat} = \frac{2 P_C R_C}{P_C + R_C} = \frac{2 \times 0.67 \times 0.88}{0.67 + 0.88} \approx 0.76$

$F1_{rabbit} = \frac{2 P_R R_R}{P_R + R_R} = \frac{2 \times 0.59 \times 0.59}{0.59 + 0.59} = 0.59$

$\text{Macro F1} = \frac{0.72 + 0.76 + 0.59}{3} \approx 0.69$

$\text{Weighted F1} = \frac{40}{108} \times 0.72 + \frac{34}{108} \times 0.76 + \frac{34}{108} \times 0.59 \approx 0.69$
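The hand calculations for this Dog/Cat/Rabbit example can be reproduced with NumPy directly from the confusion matrix. This is a sketch that assumes rows are actual classes and columns are predicted classes; the variable names are illustrative.

```python
import numpy as np

# Confusion matrix from the table: rows = actual, columns = predicted
labels = ["Dog", "Cat", "Rabbit"]
cm = np.array([
    [25, 5, 10],
    [0, 30, 4],
    [4, 10, 20],
])

tp = np.diag(cm)                  # correct predictions per class
support = cm.sum(axis=1)          # actual samples per class (row totals)
precision = tp / cm.sum(axis=0)   # divide by column totals (predicted counts)
recall = tp / support             # divide by row totals (actual counts)
f1 = 2 * precision * recall / (precision + recall)

# Macro = plain mean over classes; weighted = mean weighted by support
summary = {}
for name, scores in [("precision", precision), ("recall", recall), ("f1", f1)]:
    summary[name] = {
        "macro": float(scores.mean()),
        "weighted": float(np.average(scores, weights=support)),
    }
    print(name, dict(zip(labels, np.round(scores, 2))), summary[name])
```

Note that weighted recall equals overall accuracy (75 correct out of 108), since each class's recall is weighted by its own number of actual samples.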

In [None]:
from sklearn.metrics import precision_score, recall_score, f1_score

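# average=None returns one score per class.
# print() is used because a notebook cell only displays its last expression.
print(precision_score(y_test, y_pred, average=None))
print(recall_score(y_test, y_pred, average=None))
print(f1_score(y_test, y_pred, average=None))

In [None]:
# average="macro" is the unweighted mean of the per-class scores
print(precision_score(y_test, y_pred, average="macro"))
print(recall_score(y_test, y_pred, average="macro"))
print(f1_score(y_test, y_pred, average="macro"))

In [None]:
# average="weighted" weights each class by its number of true samples
print(precision_score(y_test, y_pred, average="weighted"))
print(recall_score(y_test, y_pred, average="weighted"))
print(f1_score(y_test, y_pred, average="weighted"))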

In [None]:
from sklearn.metrics import classification_report

# precision, recall and F1 per class, plus accuracy and the macro and weighted averages, in one report
# print() renders the returned string as a readable table
print(classification_report(y_test, y_pred))