# Awesome Classification Metrics

This notebook collects prebuilt helpers you need to validate classification models.

I will pepper in comments on what to look for, plus references to help you understand the results.

## Dependencies

In [None]:
!pip install scikit-learn==1.0.2

In [None]:
import itertools
import matplotlib.pyplot as plt
import numpy as np
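from sklearn.metrics import (
    ConfusionMatrixDisplay,
    auc,
    average_precision_score,
    brier_score_loss,
    classification_report,
    confusion_matrix,
    f1_score,
    log_loss,
    precision_recall_curve,
    precision_score,
    recall_score,
    roc_auc_score,
    roc_curve,
)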
from sklearn.calibration import calibration_curve, CalibrationDisplay
from sklearn.dummy import DummyClassifier


## Dummy Predictions

You need this to create dummy (no-skill) predictions to compare your model against.

In [None]:
def dummy_preds(x_train, y_train, x_test):
    """
    Fit a no-skill dummy classifier and return its predictions and probabilities for later use.

    Inputs: x_train, y_train and x_test as pandas DataFrames or NumPy arrays.
    Outputs: preds and probs arrays holding the dummy classifier's predicted labels
    and class probabilities for x_test.
    """
    model = DummyClassifier(strategy = 'stratified')
    model.fit(x_train, y_train)
    preds = model.predict(x_test)
    probs = model.predict_proba(x_test)
    
    return preds, probs
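
To make the no-skill baseline concrete, here is a minimal sketch of what a `stratified` dummy classifier returns. The synthetic data, the 30% positive rate and the `random_state` values are assumptions made only for illustration; they are not part of the notebook.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Hypothetical synthetic data: 3 features, roughly 30% positives.
rng = np.random.default_rng(0)
x_train = rng.normal(size=(200, 3))
y_train = (rng.random(200) < 0.3).astype(int)
x_test = rng.normal(size=(50, 3))

# Same strategy as dummy_preds; random_state is added here only for repeatability.
model = DummyClassifier(strategy="stratified", random_state=0)
model.fit(x_train, y_train)
preds = model.predict(x_test)
probs = model.predict_proba(x_test)

# With the stratified strategy, each probability row is a random one-hot vector
# drawn from the training class frequencies; the features are ignored.
print(probs[:5])
print("share of predicted positives:", preds.mean())
print("share of training positives:", y_train.mean())
```

Because the rows are one-hot, the no-skill curves in the later plots are built from 0/1 scores rather than graded probabilities.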

## Confusion Matrix

A good, reliable confusion matrix plot to see how the model performs.

It shows both normalized and count-based confusion matrices.

In [None]:
def plot_confusion_matrix(y_df, preds):
    """
    Plot the confusion matrix in both normalized and count formats.

    The normalized panel divides each row by its total, so every
    true-label row sums to 1.
    """
    f, axes = plt.subplots(1, 2, figsize=(20, 5), sharey='row')

    for i in range(2):
        cf_matrix = confusion_matrix(y_df, preds)
        if i == 0:
            title = "Normalized"
            cf_matrix = cf_matrix.astype('float') / cf_matrix.sum(axis = 1)[:,np.newaxis]
        else:
            title = "Count Based"
        
        disp = ConfusionMatrixDisplay(cf_matrix)
        disp.plot(ax=axes[i])
        disp.ax_.set_title(title)
        disp.ax_.set_xlabel('Predicted Label')
        if i!=0:
            disp.ax_.set_ylabel('')

    plt.subplots_adjust(wspace=0.40, hspace=0.1)

    plt.show()
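
The row normalization used above can be checked on a small example. This is a sketch with made-up labels; it also shows that `confusion_matrix(..., normalize="true")` gives the same result as dividing each row by its sum.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical labels, chosen only to keep the arithmetic easy to follow.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 0, 1, 0])
y_pred = np.array([0, 0, 1, 0, 1, 0, 1, 0, 1, 0])

# Rows are true labels, columns are predicted labels.
counts = confusion_matrix(y_true, y_pred)

# Manual row normalization, as in plot_confusion_matrix.
manual = counts.astype(float) / counts.sum(axis=1)[:, np.newaxis]

# Built-in equivalent: normalize over the true (row) labels.
builtin = confusion_matrix(y_true, y_pred, normalize="true")

print(counts)
print(manual)
```

After normalization, the diagonal of each row is the recall for that class.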

## Standard Metrics

Compute some standard metrics you may need, followed by a classification report.

Notes:
- average precision score
    - summarizes a precision-recall curve as the weighted mean of the precisions achieved at each threshold, weighted by the increase in recall from the previous threshold
- log loss
    - indicates how close the predicted probability is to the corresponding actual value (0 or 1 in the case of binary classification); lower is better
- precision score
    - useful in situations where false positives are costly and you don't care about false negatives or missed cases
    - you are picking books in a library and want to make sure the books you pick are awesome, with no duds, but don't care if you miss some awesome books
- recall score
    - useful in situations where you want to maximize true positives and don't care about false positives
    - you are robbing a house and grab as much jewelry as you can; you don't care about getting some party jewelry as long as you get all the good stuff you can sell
- f1 score
    - the harmonic mean of precision and recall, so it balances the two
- classification report
    - shows precision, recall, F1 and support for each class
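
As a quick check on the precision, recall and F1 definitions in the notes above, the sketch below computes them by hand from true/false positive counts and compares them with scikit-learn. The labels are made up for illustration.

```python
from sklearn.metrics import f1_score, precision_score, recall_score

# Hypothetical labels: 4 actual positives, 3 predicted positives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

pairs = list(zip(y_true, y_pred))
tp = sum(t == 1 and p == 1 for t, p in pairs)  # picked and actually positive
fp = sum(t == 0 and p == 1 for t, p in pairs)  # picked but actually negative
fn = sum(t == 1 and p == 0 for t, p in pairs)  # positive but missed

precision = tp / (tp + fp)  # of everything picked, how much was good
recall = tp / (tp + fn)     # of everything good, how much was picked
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean

print(precision, precision_score(y_true, y_pred))
print(recall, recall_score(y_true, y_pred))
print(f1, f1_score(y_true, y_pred))
```

Here precision (2/3) is higher than recall (1/2): the picks are mostly good, but half of the good items were missed.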

In [None]:
def standard_metrics(y_df, probs, preds):
    """
    Print some classification metrics that are useful in model validation.
    """
    # Average precision ranks by score, so pass the positive-class probabilities, not hard labels.
    print("Average Precision Score: {:.4}".format(average_precision_score(y_df, probs[:, 1])))
    print("Log Loss: {:.4}".format(log_loss(y_df, probs[:, 1])))
    print("Precision Score: {:.4}".format(precision_score(y_df, preds)))
    print("Recall Score: {:.4}".format(recall_score(y_df, preds)))
    print("F1 Score: {:.4}".format(f1_score(y_df, preds)))
    print("Classification Report: \n", classification_report(y_df, preds))
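
The two probability-based metrics can be illustrated on a tiny example. This is a sketch with made-up scores: it computes binary log loss by hand and shows how average precision changes when it is fed thresholded labels instead of probabilities. The 0.5 threshold is an assumption for the illustration.

```python
import numpy as np
from sklearn.metrics import average_precision_score, log_loss

# Hypothetical true labels and positive-class probabilities.
y_true = np.array([0, 0, 1, 1])
p_pos = np.array([0.1, 0.4, 0.35, 0.8])

# Binary log loss: mean negative log-likelihood of the true labels.
manual_log_loss = -np.mean(
    y_true * np.log(p_pos) + (1 - y_true) * np.log(1 - p_pos)
)
sk_log_loss = log_loss(y_true, p_pos)

# Average precision from probabilities uses the full ranking of the scores.
ap_from_probs = average_precision_score(y_true, p_pos)

# Thresholded labels collapse the ranking to two levels, so information is lost.
hard_labels = (p_pos >= 0.5).astype(int)
ap_from_labels = average_precision_score(y_true, hard_labels)

print(manual_log_loss, sk_log_loss)
print(ap_from_probs, ap_from_labels)
```

The two average precision values differ, which is why the ranking metrics are best computed from the probability column.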


## AUC ROC

This shows how adjusting the probability threshold trades off true positives against false positives.

We reuse the dummy predictions here, passed in as yhat, to plot how a no-skill classifier compares to the trained model.

In [None]:
def plot_auc_roc(y_df, probs, yhat):
    """
    plots auc roc for model and no skill classifier
    """
    
    fpr, tpr, _ = roc_curve(y_df, probs[::,1])
    n_fpr, n_tpr, _ = roc_curve(y_df, yhat[::,1])
    
    plt.plot(fpr, tpr, marker = '.', label = "Trained Model")
    plt.plot(n_fpr, n_tpr, ':r', label = "No Skill")
    
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    
    # {:.4f} prints four decimal places ({:4f} only sets a minimum field width).
    trained_auc = "\nTrained Model AUC ROC: {:.4f}".format(roc_auc_score(y_df, probs[:, 1]))
    no_skill_auc = "\nNo Skill AUC ROC: {:.4f}".format(roc_auc_score(y_df, yhat[:, 1]))
    plt.title("AUC ROC Plot and Scores:" + trained_auc + no_skill_auc)
    
    plt.legend()
    
    plt.show()
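
To see the threshold trade-off in numbers, the sketch below recomputes the true positive and false positive rates by hand at each threshold returned by `roc_curve`. The scores are made up, and `rates_at` is a hypothetical helper written only for this illustration.

```python
import numpy as np
from sklearn.metrics import roc_curve

# Hypothetical true labels and positive-class scores.
y_true = np.array([0, 0, 1, 1])
scores = np.array([0.1, 0.4, 0.35, 0.8])


def rates_at(threshold):
    """Return (fpr, tpr) when every score >= threshold is called positive."""
    pred = scores >= threshold
    tpr = np.sum(pred & (y_true == 1)) / np.sum(y_true == 1)
    fpr = np.sum(pred & (y_true == 0)) / np.sum(y_true == 0)
    return fpr, tpr


fpr, tpr, thresholds = roc_curve(y_true, scores)

# The first threshold is a sentinel above every score (nothing is predicted
# positive), so the comparison starts from the second point.
for f, t, th in zip(fpr[1:], tpr[1:], thresholds[1:]):
    print("threshold", th, "curve", (f, t), "by hand", rates_at(th))
```

Lowering the threshold can only add predicted positives, so both rates move up and to the right along the curve.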


## Precision Recall Curve

Plots the precision-recall curve for the trained model and the no-skill classifier, and shows the area under each curve.

In [None]:
def plot_precision_recall(y_df, probs, yhat):
    """
    plots precision recall for model and no skill classifier
    """
    
    precision, recall, _ = precision_recall_curve(y_df, probs[::,1])
    n_precision, n_recall, _ = precision_recall_curve(y_df, yhat[::,1])
    
    plt.plot(recall, precision, marker = '.', label = "Trained Model")
    plt.plot(n_recall, n_precision, ':r', label = "No Skill")
    
    plt.xlabel('Recall')
    plt.ylabel('Precision')
    
    # {:.4f} prints four decimal places ({:4f} only sets a minimum field width).
    trained_auc = "\nTrained Model AUC: {:.4f}".format(auc(recall, precision))
    no_skill_auc = "\nNo Skill AUC: {:.4f}".format(auc(n_recall, n_precision))
    plt.title("Precision Recall Plot and Scores:" + trained_auc + no_skill_auc)
    
    plt.legend()
    
    plt.show()
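
The area reported in the title is a trapezoidal area under the precision-recall points, which is not the same number as the average precision score. The sketch below compares the two on made-up scores and also prints the positive-class prevalence, which is the precision a random ranking would be expected to reach.

```python
import numpy as np
from sklearn.metrics import auc, average_precision_score, precision_recall_curve

# Hypothetical true labels and positive-class scores.
y_true = np.array([0, 0, 1, 1])
scores = np.array([0.1, 0.4, 0.35, 0.8])

precision, recall, _ = precision_recall_curve(y_true, scores)

# Trapezoidal rule: linear interpolation between precision-recall points.
trapezoid_auc = auc(recall, precision)

# Average precision: step-wise sum with no interpolation.
step_ap = average_precision_score(y_true, scores)

# Share of positives in the data, a reference level for a no-skill ranking.
prevalence = y_true.mean()

print(trapezoid_auc, step_ap, prevalence)
```

Either summary works for comparing models, as long as the same one is used for the trained model and the baseline.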

## Calibration Plots

These show how well the model's predicted probabilities match the observed frequencies:
- Below the diagonal: The model has over-forecast; the probabilities are too large.
- Above the diagonal: The model has under-forecast; the probabilities are too small.


In [None]:
def plot_calibration(y_df, probs, yhat):
    """
    Plot calibration curves and Brier scores for the model vs. the no-skill classifier.
    """
    trained_brier = "\nTrained Model Brier Score: {:.4f}".format(brier_score_loss(y_df, probs[:, 1]))
    no_skill_brier = "\nNo Skill Brier Score: {:.4f}".format(brier_score_loss(y_df, yhat[:, 1]))

    # fop = fraction of positives per bin, mpv = mean predicted value per bin.
    # normalize is left out: the inputs are already probabilities in [0, 1],
    # and the argument is deprecated in later scikit-learn releases.
    fop, mpv = calibration_curve(y_df, probs[:, 1], n_bins=10)
    n_fop, n_mpv = calibration_curve(y_df, yhat[:, 1], n_bins=10)

    plt.plot([0, 1], [0, 1], linestyle='--', label='Perfect Calibration')
    plt.plot(mpv, fop, marker='.', label = "Trained Model")
    plt.plot(n_mpv, n_fop, marker='_', label = "No Skill")
    
    plt.title("Calibration Plots and Brier Scores:" + trained_brier + no_skill_brier)
    
    plt.legend()
    plt.show()
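
As a small worked example of the two quantities in this plot, the sketch below computes the Brier score by hand (mean squared difference between predicted probability and outcome) and bins a handful of made-up probabilities with `calibration_curve`. The data and the choice of two bins are assumptions to keep the arithmetic visible.

```python
import numpy as np
from sklearn.calibration import calibration_curve
from sklearn.metrics import brier_score_loss

# Hypothetical true labels and positive-class probabilities.
y_true = np.array([0, 0, 0, 1, 0, 1, 1, 1])
p_pos = np.array([0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9])

# Brier score: mean squared error of the probabilities; lower is better.
manual_brier = np.mean((p_pos - y_true) ** 2)
sk_brier = brier_score_loss(y_true, p_pos)

# Two equal-width bins: [0, 0.5) and [0.5, 1].
# frac_pos is the observed share of positives in each bin,
# mean_pred is the average predicted probability in each bin.
frac_pos, mean_pred = calibration_curve(y_true, p_pos, n_bins=2)

print(manual_brier, sk_brier)
print(frac_pos, mean_pred)
```

In this toy case the observed fraction equals the mean prediction in both bins, so both points sit on the diagonal.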
    