# Classification Metrics

This notebook walks through the most common classification metrics. It is a companion workbook for the 365 Data Science course on ML Process and focuses only on implementation. See the course or the documentation for in-depth explanations of each approach.

We will cover:

- Accuracy, Precision, Recall
- F1-Score
- ROC-AUC
- PR-AUC
- Log Loss

In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

%matplotlib inline

## The Dataset

This dataset is a [marketing analytics dataset](https://www.kaggle.com/datasets/jackdaoud/marketing-data) around customer food preferences. Imagine you're a food company attempting to grow food purchases using a marketing campaign. 

With this dataset, we're going to build a propensity model. Propensity models estimate a customer's propensity to make a purchase, which helps marketing teams understand which users to target. The team can then reduce marketing spend by making more targeted offers.

## Load Data

First, we'll load our dataset.

In [3]:
df = pd.read_csv("ifood_df.csv")

We won't walk through each column, but the dataset contains variables describing the customer (education level, marital status, etc.) and their past purchasing behavior (for example, `MntFishProducts` is the amount spent on fish products).

In [4]:
df.head()

Unnamed: 0,Income,Kidhome,Teenhome,Recency,MntWines,MntFruits,MntMeatProducts,MntFishProducts,MntSweetProducts,MntGoldProds,...,marital_Together,marital_Widow,education_2n Cycle,education_Basic,education_Graduation,education_Master,education_PhD,MntTotal,MntRegularProds,AcceptedCmpOverall
0,58138.0,0,0,58,635,88,546,172,88,88,...,0,0,0,0,1,0,0,1529,1441,0
1,46344.0,1,1,38,11,1,6,2,1,6,...,0,0,0,0,1,0,0,21,15,0
2,71613.0,0,0,26,426,49,127,111,21,42,...,1,0,0,0,1,0,0,734,692,0
3,26646.0,1,0,26,11,4,20,10,3,5,...,1,0,0,0,1,0,0,48,43,0
4,58293.0,1,0,94,173,43,118,46,27,15,...,0,0,0,0,0,0,1,407,392,0


## Data Processing

To build a propensity model, we're going to try to predict whether a user *accepted any offer at all*. To make this prediction, we'll need to do some pre-processing:

In [5]:
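# Columns to exclude from the feature matrix later on.
# Note: 'Response' also feeds the target below but is not in this list,
# so it stays in X as a feature and leaks part of the target into the model.
cols = ['AcceptedCmp1',
       'AcceptedCmp2',
       'AcceptedCmp3',
       'AcceptedCmp4',
       'AcceptedCmp5',
       'AcceptedCmp']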

# Create the target variable: 1 if the customer accepted any campaign at all, 0 otherwise
df['AcceptedCmp'] = df['AcceptedCmp1'] + df['AcceptedCmp2'] + df['AcceptedCmp3'] + df['AcceptedCmp4'] + df['AcceptedCmp5']+ df['Response']

df['AcceptedCmp'] = np.where(df['AcceptedCmp'] > 0, 1, 0)
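
The same "accepted anything" flag can also be built without summing the columns by hand. This is a minimal sketch on a made-up toy frame (the `toy` data and the `flag_cols` name are illustrative assumptions, not part of the dataset):

```python
import pandas as pd

# Toy data (made up) reusing a few of the column names from the dataset.
toy = pd.DataFrame({
    "AcceptedCmp1": [0, 1, 0],
    "AcceptedCmp2": [0, 0, 0],
    "Response":     [0, 1, 1],
})

# Hypothetical helper list: the columns that count as "accepted an offer".
flag_cols = ["AcceptedCmp1", "AcceptedCmp2", "Response"]

# 1 if any of the acceptance columns is non-zero in that row, else 0.
toy["AcceptedCmp"] = toy[flag_cols].any(axis=1).astype(int)

print(toy)
```

Using `any(axis=1)` keeps the intent (did the customer accept at least one offer?) visible in the code and scales to any number of flag columns.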

In [6]:
df[df['AcceptedCmp'] == 0]

Unnamed: 0,Income,Kidhome,Teenhome,Recency,MntWines,MntFruits,MntMeatProducts,MntFishProducts,MntSweetProducts,MntGoldProds,...,marital_Widow,education_2n Cycle,education_Basic,education_Graduation,education_Master,education_PhD,MntTotal,MntRegularProds,AcceptedCmpOverall,AcceptedCmp
1,46344.0,1,1,38,11,1,6,2,1,6,...,0,0,0,1,0,0,21,15,0,0
2,71613.0,0,0,26,426,49,127,111,21,42,...,0,0,0,1,0,0,734,692,0,0
3,26646.0,1,0,26,11,4,20,10,3,5,...,0,0,0,1,0,0,48,43,0,0
4,58293.0,1,0,94,173,43,118,46,27,15,...,0,0,0,0,0,1,407,392,0,0
5,62513.0,0,1,16,520,42,98,0,42,14,...,0,0,0,0,1,0,702,688,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2197,44802.0,0,0,71,853,10,143,13,10,20,...,0,0,0,0,1,0,1029,1009,0,0
2198,26816.0,0,0,50,5,1,6,3,4,3,...,0,0,0,1,0,0,19,16,0,0
2199,34421.0,1,0,81,3,3,7,6,2,9,...,0,0,0,1,0,0,21,12,0,0
2200,61223.0,0,1,46,709,43,182,42,118,247,...,0,0,0,1,0,0,1094,847,0,0


## Cross Validation

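We'll use a simple train-test split, holding out 33% of the rows for evaluation. Strictly speaking, this is a single holdout split rather than k-fold cross validation: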

In [7]:
from sklearn.model_selection import train_test_split

y = df['AcceptedCmp']
X = df.drop(cols,axis=1)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
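
For comparison, k-fold cross validation repeats the split several times and averages the score. The sketch below is only an illustration on synthetic data from `make_classification` (not the iFood dataset); the names `X_demo`, `y_demo`, and the choice of 5 folds are assumptions:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic, mildly imbalanced stand-in data (not the iFood dataset).
X_demo, y_demo = make_classification(
    n_samples=500, n_features=10, weights=[0.7, 0.3], random_state=42
)

# Stratified folds keep the class ratio roughly equal in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# One accuracy score per fold.
scores = cross_val_score(
    LogisticRegression(max_iter=1000), X_demo, y_demo, cv=cv, scoring="accuracy"
)

print("Fold accuracies:", scores)
print("Mean accuracy:", scores.mean())
```

A single holdout split is cheaper and is enough for this walkthrough; the fold-to-fold spread from k-fold gives a rough sense of how much a single split's score can vary.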

## Model

Since the dataset is small, logistic regression is a reasonable model choice:

In [8]:
from sklearn.linear_model import LogisticRegression

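# Fit a logistic regression classifier with default settings
model = LogisticRegression()

model.fit(X_train, y_train)

# Hard class predictions (0 or 1) for the test set
y_preds = model.predict(X_test)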
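
Logistic regression tends to converge more reliably when features are on comparable scales, and some features here (such as `Income`) are much larger than others. The sketch below shows one common way to handle that with a scaling pipeline. It runs on synthetic data, and the inflated first column, `max_iter=1000`, and all variable names are assumptions for illustration rather than settings used in this notebook:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (not the iFood dataset).
X_demo, y_demo = make_classification(n_samples=500, n_features=10, random_state=0)

# Inflate one feature to mimic a large-valued column such as income.
X_demo[:, 0] *= 10000

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.33, random_state=42
)

# Standardize features, then fit the classifier, as a single estimator.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_tr, y_tr)

demo_labels = clf.predict(X_te)              # hard 0/1 predictions
demo_probs = clf.predict_proba(X_te)[:, 1]   # probability of the positive class

print(demo_labels[:5], demo_probs[:5])
```

Keeping both the hard labels and the probabilities is useful because threshold metrics (accuracy, precision, recall, F1) use the labels, while ranking and probability metrics (ROC-AUC, PR-AUC, log loss) use the probabilities.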

## Evaluation

### Accuracy, Precision, Recall

The simplest metrics are Accuracy/Precision/Recall. Here are the formulas:

`Accuracy`: (True Positives + True Negatives)/Total Predictions

`Precision`: True Positives/(True Positives + False Positives) 

`Recall`: True Positives/(True Positives + False Negatives)

In [9]:
from sklearn.metrics import (
    accuracy_score,
    precision_score,
    recall_score,
    f1_score
)


accuracy = accuracy_score(y_test, y_preds)
precision = precision_score(y_test, y_preds)
recall = recall_score(y_test, y_preds)


print("Accuracy: {0}".format(accuracy))
print("Precision: {0}".format(precision))
print("Recall: {0}".format(recall))


Accuracy: 0.7857142857142857
Precision: 0.6239316239316239
Recall: 0.3945945945945946
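
To connect the formulas to the scikit-learn functions, the sketch below computes all three metrics by hand from a confusion matrix and checks them against the library. The labels are made up for illustration:

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_score,
    recall_score,
)

# Made-up labels: 4 actual positives, 6 actual negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

# For binary labels, ravel() returns the counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

accuracy_manual = (tp + tn) / (tp + tn + fp + fn)
precision_manual = tp / (tp + fp)
recall_manual = tp / (tp + fn)

print("Accuracy:", accuracy_manual, accuracy_score(y_true, y_pred))
print("Precision:", precision_manual, precision_score(y_true, y_pred))
print("Recall:", recall_manual, recall_score(y_true, y_pred))
```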


### F1-Score

The F1-Score is the harmonic mean of Precision and Recall. It is a useful metric for imbalanced datasets, where accuracy alone can look good even when the positive class is predicted poorly. The formula looks like this:

`F1-Score` = (2 x Precision x Recall)/(Precision + Recall)

In [10]:
f1 = f1_score(y_test, y_preds)
print("F1: {0}".format(f1))

F1: 0.48344370860927155
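
As a quick check of the formula, plugging the Precision and Recall printed earlier into it reproduces the F1 value above. The arithmetic mean is included only for contrast; the variable names are illustrative:

```python
# Precision and recall as printed in the Accuracy, Precision, Recall section.
precision = 0.6239316239316239
recall = 0.3945945945945946

# Harmonic mean: pulled toward the smaller of the two values.
f1_manual = (2 * precision * recall) / (precision + recall)

# Arithmetic mean, for comparison only.
arithmetic_mean = (precision + recall) / 2

print("F1 (harmonic mean):", f1_manual)
print("Arithmetic mean:", arithmetic_mean)
```

Because the harmonic mean sits closer to the lower value, a model cannot reach a high F1 by doing well on only one of Precision or Recall.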


### ROC-AUC

ROC-AUC evaluates the tradeoff between the True Positive Rate and the False Positive Rate. To review, the formulas are: 

`True Positive Rate` = True Positives/Actual Positives

`False Positive Rate` = False Positives/Actual Negatives

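It tells us how well the model's scores separate the two classes. ROC-AUC is meant to be computed from predicted probabilities or scores; the cell below passes the hard 0/1 labels in `y_preds`, which reduces the ROC curve to a single operating point: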

In [11]:
from sklearn.metrics import (
    roc_auc_score,
    average_precision_score
)


roc_auc = roc_auc_score(y_test, y_preds)

print("ROC-AUC: {0}".format(roc_auc))

ROC-AUC: 0.656781643521975
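
The sketch below illustrates the difference between scoring ROC-AUC on probabilities and on hard labels. The labels and scores are made up, and the 0.5 threshold is an assumption. With hard labels the result collapses to the average of the True Positive Rate and the True Negative Rate at that one threshold:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Made-up true labels and predicted probabilities for the positive class.
y_true = np.array([0, 0, 0, 0, 1, 0, 1, 1])
scores = np.array([0.1, 0.2, 0.3, 0.35, 0.4, 0.6, 0.7, 0.9])

# Hard labels obtained by thresholding the probabilities at 0.5.
labels = (scores >= 0.5).astype(int)

auc_from_scores = roc_auc_score(y_true, scores)   # uses the full ranking
auc_from_labels = roc_auc_score(y_true, labels)   # single operating point

# Average of TPR and TNR at the 0.5 threshold.
tpr = ((labels == 1) & (y_true == 1)).sum() / (y_true == 1).sum()
tnr = ((labels == 0) & (y_true == 0)).sum() / (y_true == 0).sum()

print("ROC-AUC from probabilities:", auc_from_scores)
print("ROC-AUC from hard labels:", auc_from_labels)
print("(TPR + TNR) / 2:", (tpr + tnr) / 2)
```

On real data, the probabilities would typically come from something like `predict_proba(X_test)[:, 1]`.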


### PR-AUC

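PR-AUC behaves similarly to ROC-AUC, except that we're evaluating the tradeoff between Precision and Recall (hence PR-AUC). Like ROC-AUC, it is meant to be computed from predicted probabilities or scores; the cell below passes the hard 0/1 labels in `y_preds`: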

In [12]:
from sklearn.metrics import (
    average_precision_score
)

pr_auc = average_precision_score(y_test, y_preds)

print("PR-AUC: {0}".format(pr_auc))

PR-AUC: 0.40004620004620006
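
A minimal sketch of PR-AUC computed from probabilities, using made-up labels and scores. It shows two common summaries: `average_precision_score`, and the trapezoidal area under the curve returned by `precision_recall_curve`. The two generally differ slightly because they interpolate between points differently:

```python
import numpy as np
from sklearn.metrics import auc, average_precision_score, precision_recall_curve

# Made-up true labels and predicted probabilities for the positive class.
y_true = np.array([0, 0, 0, 0, 1, 0, 1, 1])
scores = np.array([0.1, 0.2, 0.3, 0.35, 0.4, 0.6, 0.7, 0.9])

# Step-wise summary of the precision-recall curve.
ap = average_precision_score(y_true, scores)

# Trapezoidal area under the same curve.
precision, recall, thresholds = precision_recall_curve(y_true, scores)
pr_auc_trapezoid = auc(recall, precision)

# A random classifier scores roughly the positive rate, so this is the baseline.
baseline = y_true.mean()

print("Average precision:", ap)
print("Trapezoidal PR-AUC:", pr_auc_trapezoid)
print("Baseline (positive rate):", baseline)
```

Unlike ROC-AUC, whose chance level is 0.5, the chance level for PR-AUC depends on how rare the positive class is, which is why it is often preferred for imbalanced problems.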


### Log Loss

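The intuition behind log loss is that we're measuring how close our predicted probability is to the true label. The penalty for each prediction is the negative log of the probability assigned to the true class, so confident but wrong predictions are penalized very heavily. Note that log loss is defined on predicted probabilities (for example, the output of `predict_proba`) rather than on hard class labels: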

In [32]:
from sklearn.metrics import (
    log_loss
)

# Use a separate name so the imported log_loss function is not overwritten
log_loss_value = log_loss(y_test, y_preds)

print("Log Loss: {0}".format(log_loss_value))

Log Loss: 0.9963174476342117
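
To make the penalty concrete, the sketch below computes binary log loss by hand on made-up probabilities, checks it against scikit-learn, and prints the per-prediction penalty for a few probabilities assigned to the true class. All values and names are illustrative:

```python
import numpy as np
from sklearn.metrics import log_loss

# Made-up true labels and predicted probabilities for the positive class.
y_true = np.array([1, 0, 1, 0])
probs = np.array([0.9, 0.1, 0.6, 0.4])

# Binary log loss: mean negative log-probability assigned to the true class.
manual_loss = -np.mean(
    y_true * np.log(probs) + (1 - y_true) * np.log(1 - probs)
)
sklearn_loss = log_loss(y_true, probs)

print("Manual log loss:", manual_loss)
print("sklearn log loss:", sklearn_loss)

# Penalty for a single prediction, by probability given to the true class.
for p_true_class in [0.9, 0.5, 0.1, 0.01]:
    print(p_true_class, "->", -np.log(p_true_class))
```

The penalty is small when the true class gets a high probability and grows without bound as that probability approaches zero.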


## Conclusion

In review, we went over a variety of classification metrics:

- Accuracy, Precision, Recall
- F1-Score
- ROC-AUC
- PR-AUC
- Log Loss