# Precision and Recall

## Precision

$$
\text{Precision} = \frac{\text{TP}}{\text{TP} + \text{FP}}
$$

Precision is the fraction of positive predictions that are actually positive.

When the positive class is a rare, bad event, low precision means too many false alarms.

## Recall

$$
\text{Recall} = \frac{\text{TP}}{\text{FN} + \text{TP}}
$$

Recall is the fraction of actual positives that the model predicts as positive.

When the positive class is a rare, bad event, low recall means we miss too many of those events, which damages our system.

## Problem

Given a confusion matrix, write a function to compute precision and recall. The rows represent the actual classes (positive, then negative). The columns represent the predicted classes (positive, then negative).

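```python
# Rows are actual classes, columns are predicted classes,
# each ordered positive first, then negative:
#   [[TP, FN],
#    [FP, TN]]
conf_mat = [
    [121, 9],
    [17, 144]
]


def precision_recall(P):
    """Return (precision, recall) for a 2x2 confusion matrix P."""
    # Precision is about predictions: TP / (TP + FP), the first column
    precision = P[0][0] / (P[0][0] + P[1][0])

    # Recall is about actuals: TP / (TP + FN), the first row
    recall = P[0][0] / (P[0][0] + P[0][1])

    return precision, recall


print(precision_recall(conf_mat))
```

```
(0.8768115942028986, 0.9307692307692308)
```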


## Spam classifier

Your task is to build a spam classifier for emails. What metrics would you use to track accuracy and validity of the model you build?

This is an imbalanced-class binary classification problem, with spam as the positive class. Start by defining the four possible outcomes.

- A true positive is a spam prediction for an email that is actually spam.
- A true negative is a non-spam prediction for an email that is actually non-spam.
- A false positive is a spam prediction for an email that is actually non-spam.
- A false negative is a non-spam prediction for an email that is actually spam.
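The four outcomes can be counted directly from paired labels. The following is a minimal sketch, not a reference implementation: the helper name `confusion_counts` and the six-email sample are made up for illustration, and `True` is assumed to mean spam.

```python
def confusion_counts(actual, predicted):
    """Count (tp, tn, fp, fn), treating True as the positive (spam) class."""
    tp = tn = fp = fn = 0
    for is_spam, said_spam in zip(actual, predicted):
        if said_spam and is_spam:
            tp += 1  # predicted spam, actually spam
        elif said_spam and not is_spam:
            fp += 1  # predicted spam, actually non-spam
        elif not said_spam and is_spam:
            fn += 1  # predicted non-spam, actually spam
        else:
            tn += 1  # predicted non-spam, actually non-spam
    return tp, tn, fp, fn


# Hypothetical labels for six emails (illustrative data only)
actual = [True, True, False, False, False, True]
predicted = [True, False, False, True, False, True]

counts = confusion_counts(actual, predicted)
print(counts)
```

For this sample the counts come out as 2 true positives, 2 true negatives, 1 false positive, and 1 false negative.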

Accuracy is the sum of true positives and true negatives divided by the total number of emails. It is not a good metric for an imbalanced-class binary classification problem, because a model that predicts every email as non-spam still gets high accuracy.
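To make the accuracy pitfall concrete, here is a small sketch with an assumed class split: 1,000 emails, of which 50 are spam. These numbers are hypothetical, chosen only to illustrate the imbalance, and the helper names are not from any library.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def recall(tp, fn):
    """Fraction of actual positives that were predicted positive."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical inbox: 50 spam emails and 950 non-spam emails
n_spam, n_non_spam = 50, 950

# A trivial model that predicts "non-spam" for every email
tp, fp = 0, 0
fn, tn = n_spam, n_non_spam

baseline_accuracy = accuracy(tp, tn, fp, fn)
baseline_recall = recall(tp, fn)
print(baseline_accuracy, baseline_recall)
```

Under these assumptions the trivial model scores 95% accuracy while catching no spam at all, so its recall is 0.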

Use the following metrics instead.

Precision is the number of true positives divided by the sum of true positives and false positives. Recall is the number of true positives divided by the sum of true positives and false negatives. The F1 score is the harmonic mean of precision and recall, which is useful when you want a single score.
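As a rough illustration of the harmonic mean, the sketch below computes F1 from the counts in the confusion matrix of the Problem section (TP = 121, FN = 9, FP = 17). The function name `f1_score` is chosen here for readability, and returning 0.0 when both inputs are zero is an assumed convention.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # assumed convention when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Counts taken from the confusion matrix in the Problem section
tp, fn, fp = 121, 9, 17

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = f1_score(precision, recall)
print(f1)
```

The result is about 0.903, which sits between the precision (about 0.877) and the recall (about 0.931) and is pulled toward the smaller of the two.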

AUC is the area under the ROC (receiver operating characteristic) curve. The curve plots the false positive rate on the X axis against the true positive rate on the Y axis for different threshold values.
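The threshold sweep can be sketched in plain Python. This is a minimal illustration, assuming AUC refers to the ROC curve with the false positive rate on the X axis and the true positive rate on the Y axis, and using the trapezoidal rule for the area. The function name `roc_auc` and the sample labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the trapezoidal rule.

    labels: 1 for positive (spam), 0 for negative (non-spam).
    scores: model scores, where higher means more likely positive.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    # Lower the threshold step by step, from the highest score down
    pairs = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    points = [(0.0, 0.0)]  # (false positive rate, true positive rate)
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Items with tied scores cross the threshold together
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))

    # Sum the trapezoids between consecutive curve points
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical labels and scores for six emails (illustrative data only)
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

auc = roc_auc(labels, scores)
print(auc)
```

For this sample the area is 8/9: of the nine possible (spam, non-spam) pairs, eight have the spam email scored higher. A perfect ranking gives 1.0 and a random one gives about 0.5.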

The F1 score is a good metric for judging model capability on imbalanced-class binary classification. AUC is a good metric for comparing models.

## Resource

- [Precision and recall](https://en.wikipedia.org/wiki/Precision_and_recall)