## ML model evaluation metrics (Accuracy, Precision, Recall, Specificity, F1 Score and Negative Predictive Value) for a fintech case

What are precision and recall?

What error metric would you use to evaluate how good a binary classifier is?

How do you compute accuracy from a confusion matrix?

What is the F1 score in a confusion matrix?

What does 1% accuracy mean?

What is the difference between precision and accuracy?

In what cases should you not use accuracy as the main metric?

What is Specificity?

When should you use precision and recall?

What is negative predictive value?

### Confusion matrix

The confusion matrix shows where the model is “mistaken” when distinguishing between two classes. As the picture below shows, it is a 2x2 matrix: the rows are the actual classes from the test set, and the columns are the classes predicted by the model.

From its four cells, true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN), you can calculate metrics that provide additional information about your model:

Accuracy;

Precision;

Recall (Sensitivity);

F1-score;

Specificity;

Negative Predictive Value.
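As a minimal sketch of where the four cells come from, the snippet below counts them by hand. The labels are made-up illustration data (1 = positive class, 0 = negative class), not values from the fintech dataset. The `[[TN, FP], [FN, TP]]` layout, with rows as actuals and columns as predictions, is the one sklearn's `confusion_matrix` uses for labels 0 and 1.

```python
# Hypothetical labels for illustration only: 1 = positive, 0 = negative.
y_true = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0, 0, 0]

pairs = list(zip(y_true, y_pred))

# Count each of the four possible (actual, predicted) outcomes.
tp = sum(1 for actual, predicted in pairs if actual == 1 and predicted == 1)
tn = sum(1 for actual, predicted in pairs if actual == 0 and predicted == 0)
fp = sum(1 for actual, predicted in pairs if actual == 0 and predicted == 1)
fn = sum(1 for actual, predicted in pairs if actual == 1 and predicted == 0)

# Rows are actual classes (0, then 1); columns are predicted classes (0, then 1).
matrix = [[tn, fp],
          [fn, tp]]
print(matrix)
```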

### Accuracy 

accuracy = number of correct predictions / total number of predictions

accuracy = (True Positive + True Negative) / (True Positive + True Negative + False Positive + False Negative)

In [None]:
from sklearn.metrics import accuracy_score

score = accuracy_score(y_test, y_pred)

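Accuracy is misleading on imbalanced data: a model that always predicts the majority class scores high accuracy without ever identifying the minority class.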
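To make the imbalance problem concrete, here is a small sketch with made-up data: 95 negative samples, 5 positive samples, and a "model" that always predicts the negative class. The class sizes are assumptions chosen only for illustration.

```python
# Hypothetical imbalanced test set: 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5

# A trivial "model" that always predicts the majority (negative) class.
y_pred = [0] * len(y_true)

correct = sum(1 for actual, predicted in zip(y_true, y_pred) if actual == predicted)
accuracy = correct / len(y_true)

# How many of the actual positives did the model find?
positives_found = sum(
    1 for actual, predicted in zip(y_true, y_pred) if actual == 1 and predicted == 1
)

print(f"accuracy: {accuracy:.2f}, positives found: {positives_found} of 5")
```

The accuracy looks strong even though the model never identifies a single positive sample, which is why the metrics in the following sections are needed.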

### Precision

Precision = TP / (TP + FP)

Precision measures the accuracy of the model for the predicted positive class.

It reflects how reliable the model's positive predictions are.

In [None]:
from sklearn.metrics import precision_score

score = precision_score(y_test,y_pred)

### Recall (Sensitivity)

Recall (Sensitivity) = TP / (TP + FN)

Recall only cares about how actual positive samples are classified.

Unlike precision, this metric does not depend on how negative samples are classified.

In [None]:
from sklearn.metrics import recall_score

score = recall_score(y_test,y_pred)

## When to use Precision and Recall?

If your goal is to find all positive samples, and you do not mind some negative samples being classified as positive, use recall.

If false positives are a concern, that is, if classifying negative samples as positive is costly, use precision.
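The trade-off between the two metrics can be sketched by moving the decision threshold of a classifier. The scores, labels, thresholds, and the helper `precision_recall` below are all hypothetical and exist only for this illustration.

```python
# Hypothetical predicted probabilities for the positive class, with true labels.
y_true = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.55, 0.4, 0.3, 0.2, 0.1]


def precision_recall(y_true, scores, threshold):
    """Return (precision, recall) when scores >= threshold are predicted positive."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for a, p in zip(y_true, y_pred) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(y_true, y_pred) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(y_true, y_pred) if a == 1 and p == 0)
    # Guard against division by zero when nothing is predicted or labeled positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


strict = precision_recall(y_true, scores, threshold=0.7)    # few positive predictions
lenient = precision_recall(y_true, scores, threshold=0.35)  # many positive predictions

print("strict threshold  (precision, recall):", strict)
print("lenient threshold (precision, recall):", lenient)
```

On this toy data, the strict threshold gives perfect precision but misses half of the positives, while the lenient threshold finds every positive at the price of one false positive.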

## For Fintech Case

Failing to flag customers who will not pay the loan back would severely damage our fintech company.

We could focus on approving as many applicants as possible, which would increase our revenue. However, one customer who fails to pay the loan back costs us far more than we earn from ten customers who repay. Therefore, treating customers who default as the positive class, we optimize our ML model for recall, since we want to find all positive samples.

### F1-score

F1-score = 2 * Precision * Recall / (Precision + Recall)

In [None]:
from sklearn.metrics import f1_score

score = f1_score(y_test,y_pred)
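The F1-score formula is the harmonic mean of precision and recall, so it stays low unless both are reasonably high. The sketch below illustrates this with made-up precision and recall values; the helper name `f1` is hypothetical.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical values: same arithmetic mean (0.5), very different F1.
balanced = f1(0.5, 0.5)
lopsided = f1(0.9, 0.1)
arithmetic_mean = (0.9 + 0.1) / 2

print(f"balanced F1: {balanced:.2f}")
print(f"lopsided F1: {lopsided:.2f} (arithmetic mean would be {arithmetic_mean:.2f})")
```

A model with high precision but poor recall is therefore penalized much more by F1 than a plain average of the two would suggest.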

### Specificity

Specificity = TN / (TN + FP)

Specificity focuses on the actual negative class, just as recall focuses on the actual positive class.

If the negative class is more important to you, reconsider what problem your ML model solves and swap the positive and negative classes if needed.

For example, in the fintech case, you might label “customer is likely to pay the loan back” as the positive class rather than “customer defaults or charges off”. Because the dataset is imbalanced and most customers pay the loan back, you will tend to get high recall and precision, while specificity can be extremely low.
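sklearn has no dedicated specificity function, so one way to compute it is from the confusion matrix, as sketched below. Because specificity is simply recall for the negative class, `recall_score` with `pos_label=0` gives the same number. The labels are made-up illustration data, assuming the classes are encoded as 0 and 1.

```python
from sklearn.metrics import confusion_matrix, recall_score

# Hypothetical labels for illustration only: 1 = positive, 0 = negative.
y_true = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0, 0, 0]

# ravel() flattens the 2x2 matrix into tn, fp, fn, tp (for labels 0 and 1).
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
specificity = tn / (tn + fp)

# Equivalent: recall computed with the negative class treated as "positive".
specificity_via_recall = recall_score(y_true, y_pred, pos_label=0)

print(f"specificity: {specificity:.3f}")
print(f"recall with pos_label=0: {specificity_via_recall:.3f}")
```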

### Negative Predictive Value

Negative Predictive Value = TN / (TN + FN)

In [None]:
from sklearn.metrics import confusion_matrix

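# ravel() flattens the 2x2 matrix into tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()

# Parentheses are required: tn / tn + fn would evaluate to 1 + fn
NPV = tn / (tn + fn)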

![Screenshot%202022-09-18%20at%209.26.33%20AM.png](attachment:Screenshot%202022-09-18%20at%209.26.33%20AM.png)