# MMU walkthrough

This notebook briefly demonstrates the main capabilities of the package.

In [1]:
import pandas as pd
import numpy as np
import mmu

## Model generation

We use the model generator to create a dataset of classifiers.

We turn off the model noise as it is not needed at this point.
Additionally, we turn off random train/test splits for each model so
that each model has seen the same data.

In [2]:
model = mmu.ModelGenerator(random_state=112356)
fit = model.fit()
train_mask, y, probas, X, models, ground_truth = fit.transform(
    enable_model_noise=False,
    enable_sample_noise=False,
)

Create the test sets on which we will compute the metrics.

Note that all the functions require contiguous arrays, so slicing by column requires F-order.

In [3]:
y_test = np.asarray(y[~train_mask, :], order='F')
probas_test = np.asarray(probas[~train_mask, :], order='F')
yhat_test = np.asarray(np.rint(probas_test).astype(np.int64, copy=False), order='F')
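
The F-order requirement can be checked with plain NumPy. The following is a minimal sketch on a toy array (the names `arr_c`, `arr_f`, `col_c` and `col_f` are illustrative and not part of the package): a column taken from a C-ordered 2D array is a strided view, while the same column taken from an F-ordered copy is one contiguous block.

```python
import numpy as np

# Toy stand-in for the (n_samples, n_runs) arrays used above.
arr_c = np.arange(12, dtype=np.float64).reshape(4, 3)  # C-order (row-major)
arr_f = np.asarray(arr_c, order='F')                   # F-order (column-major) copy

col_c = arr_c[:, 0]
col_f = arr_f[:, 0]

# In C-order, consecutive column elements are one full row apart in memory.
print(col_c.flags['C_CONTIGUOUS'], col_c.strides)
# In F-order, consecutive column elements are adjacent in memory.
print(col_f.flags['C_CONTIGUOUS'], col_f.strides)
```

Both views hold the same values; only the memory layout differs.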

### Confusion matrix only

We can compute the confusion matrix for a single run using yhat, or using the probabilities and a classification threshold.

In [4]:
# based on yhat
mmu.confusion_matrix(y_test[:, 0], yhat_test[:, 0])

array([[1438,   43],
       [ 177, 1342]])

In [5]:
# based on proba with 0.5 classification threshold
mmu.confusion_matrix_proba(y_test[:, 0], probas_test[:, 0], threshold=0.5)

array([[1438,   43],
       [ 177, 1342]])
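
To make the layout of the output explicit, here is a minimal NumPy sketch of the same computation. It is not the package's implementation: `confusion_matrix_sketch` is a hypothetical helper, and it assumes a sample is labelled positive when its probability is strictly greater than the threshold. Rows are the observed labels and columns the estimated labels, i.e. `[[TN, FP], [FN, TP]]`.

```python
import numpy as np

def confusion_matrix_sketch(y, proba, threshold=0.5):
    """Return [[TN, FP], [FN, TP]] with rows = observed, columns = estimated."""
    y = np.asarray(y).astype(bool)
    # Assumption: positive when proba > threshold (strict inequality).
    yhat = np.asarray(proba) > threshold
    tn = np.sum(~y & ~yhat)
    fp = np.sum(~y & yhat)
    fn = np.sum(y & ~yhat)
    tp = np.sum(y & yhat)
    return np.array([[tn, fp], [fn, tp]], dtype=np.int64)

y_toy = np.array([0, 0, 1, 1, 1, 0])
proba_toy = np.array([0.1, 0.7, 0.8, 0.3, 0.9, 0.4])

cm_toy = confusion_matrix_sketch(y_toy, proba_toy)
# Rounding the probabilities gives the same labels as thresholding at 0.5 here.
yhat_rint = np.rint(proba_toy).astype(np.int64)
print(cm_toy)
```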

### Confusion matrix and metrics

The ``binary_metrics*`` functions compute ten classification metrics:
 *    0 - neg.precision aka Negative Predictive Value
 *    1 - pos.precision aka Positive Predictive Value
 *    2 - neg.recall aka True Negative Rate aka Specificity
 *    3 - pos.recall aka True Positive Rate aka Sensitivity
 *    4 - neg.f1 score
 *    5 - pos.f1 score
 *    6 - False Positive Rate
 *    7 - False Negative Rate
 *    8 - Accuracy
 *    9 - MCC
 
This index can be retrieved using:

In [6]:
col_index = mmu.metrics.col_index
col_index

{'neg.precision': 0,
 'npv': 0,
 'pos.precision': 1,
 'ppv': 1,
 'neg.recall': 2,
 'tnr': 2,
 'specificity': 2,
 'pos.recall': 3,
 'tpr': 3,
 'sensitivity': 3,
 'neg.f1': 4,
 'neg.f1_score': 4,
 'pos.f1': 5,
 'pos.f1_score': 5,
 'fpr': 6,
 'fnr': 7,
 'accuracy': 8,
 'acc': 8,
 'mcc': 9}
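
For reference, the ten metrics can be written out from the four confusion matrix entries using their standard definitions. The sketch below is plain NumPy and not the package's code; `binary_metrics_sketch` is a hypothetical helper and it does not handle zero denominators. It is applied to the confusion matrix of the first run shown above.

```python
import numpy as np

def binary_metrics_sketch(cm):
    """Ten metrics, in the index order listed above, from [[TN, FP], [FN, TP]]."""
    (tn, fp), (fn, tp) = np.asarray(cm, dtype=np.float64)
    mcc_denom = np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return np.array([
        tn / (tn + fn),                   # 0 neg.precision (NPV)
        tp / (tp + fp),                   # 1 pos.precision (PPV)
        tn / (tn + fp),                   # 2 neg.recall (TNR, specificity)
        tp / (tp + fn),                   # 3 pos.recall (TPR, sensitivity)
        2 * tn / (2 * tn + fn + fp),      # 4 neg.f1
        2 * tp / (2 * tp + fp + fn),      # 5 pos.f1
        fp / (fp + tn),                   # 6 false positive rate
        fn / (fn + tp),                   # 7 false negative rate
        (tp + tn) / (tp + tn + fp + fn),  # 8 accuracy
        (tp * tn - fp * fn) / mcc_denom,  # 9 MCC
    ])

cm_first_run = np.array([[1438, 43], [177, 1342]])
metrics_sketch = binary_metrics_sketch(cm_first_run)
print(np.round(metrics_sketch, 6))
```

Individual entries can then be looked up by position, for example index 9 for MCC.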

### Confusion matrix and binary metrics over a single run

In [7]:
cm, metrics = mmu.binary_metrics(y_test[:, 0], yhat_test[:, 0])

In [8]:
# the confusion matrix
cm

array([[1438,   43],
       [ 177, 1342]])

We can create a dataframe from the confusion matrix using:

In [9]:
mmu.confusion_matrix_to_dataframe(cm)

| observed \ estimated | negative | positive |
| --- | --- | --- |
| negative | 1438 | 43 |
| positive | 177 | 1342 |


In [10]:
# the metrics
metrics

array([0.89040248, 0.96895307, 0.97096556, 0.88347597, 0.92894057,
       0.92424242, 0.02903444, 0.11652403, 0.92666667, 0.85689502])

We can create a dataframe from the metrics using:

In [11]:
mmu.metrics_to_dataframe(metrics)

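| | neg.precision | pos.precision | neg.recall | pos.recall | neg.f1 | pos.f1 | fpr | fnr | acc | mcc |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0.890402 | 0.968953 | 0.970966 | 0.883476 | 0.928941 | 0.924242 | 0.029034 | 0.116524 | 0.926667 | 0.856895 |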


### Confusion matrix and binary metrics over a single run using probabilities

In [12]:
cm, metrics = mmu.binary_metrics_proba(y_test[:, 0], probas_test[:, 0], threshold=0.5)

In [13]:
mmu.confusion_matrix_to_dataframe(cm)

| observed \ estimated | negative | positive |
| --- | --- | --- |
| negative | 1438 | 43 |
| positive | 177 | 1342 |


In [14]:
mmu.metrics_to_dataframe(metrics)

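| | neg.precision | pos.precision | neg.recall | pos.recall | neg.f1 | pos.f1 | fpr | fnr | acc | mcc |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0.890402 | 0.968953 | 0.970966 | 0.883476 | 0.928941 | 0.924242 | 0.029034 | 0.116524 | 0.926667 | 0.856895 |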


### Confusion matrix and binary metrics over a single run using multiple thresholds

In [15]:
thresholds = np.linspace(1e-5, 1.0, 1000)

In [16]:
cm, metrics = mmu.binary_metrics_thresholds(
    y=y_test[:, 0],
    proba=probas_test[:, 0],
    thresholds=thresholds,
    fill=1.0
)

In [17]:
cm

array([[   0, 1481,    0, 1519],
       [  55, 1426,    0, 1519],
       [ 142, 1339,    1, 1518],
       ...,
       [1479,    2, 1097,  422],
       [1480,    1, 1194,  325],
       [1481,    0, 1519,    0]])

In [18]:
mmu.metrics_to_dataframe(metrics)

| | neg.precision | pos.precision | neg.recall | pos.recall | neg.f1 | pos.f1 | fpr | fnr | acc | mcc |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 1.000000 | 0.506333 | 0.000000 | 1.000000 | 0.000000 | 0.672273 | 1.000000 | 0.000000 | 0.506333 | 1.000000 |
| 1 | 1.000000 | 0.515789 | 0.037137 | 1.000000 | 0.071615 | 0.680556 | 0.962863 | 0.000000 | 0.524667 | 0.138401 |
| 2 | 0.993007 | 0.531327 | 0.095881 | 0.999342 | 0.174877 | 0.693784 | 0.904119 | 0.000658 | 0.553333 | 0.223447 |
| 3 | 0.995745 | 0.549005 | 0.158001 | 0.999342 | 0.272727 | 0.708683 | 0.841999 | 0.000658 | 0.584000 | 0.292767 |
| 4 | 0.993569 | 0.564150 | 0.208643 | 0.998683 | 0.344866 | 0.721008 | 0.791357 | 0.001317 | 0.608667 | 0.340044 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 995 | 0.597492 | 0.992424 | 0.997299 | 0.344964 | 0.747281 | 0.511969 | 0.002701 | 0.655036 | 0.667000 | 0.449340 |
| 996 | 0.589008 | 0.995910 | 0.998650 | 0.320606 | 0.740982 | 0.485060 | 0.001350 | 0.679394 | 0.655333 | 0.432132 |
| 997 | 0.574146 | 0.995283 | 0.998650 | 0.277814 | 0.729110 | 0.434380 | 0.001350 | 0.722186 | 0.633667 | 0.396770 |
| 998 | 0.553478 | 0.996933 | 0.999325 | 0.213957 | 0.712395 | 0.352304 | 0.000675 | 0.786043 | 0.601667 | 0.342626 |
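
The threshold sweep can be pictured as one confusion matrix per threshold, flattened to a row `[TN, FP, FN, TP]`, which is the column order visible in the `cm` output above. The following NumPy sketch illustrates the idea on synthetic data; `confusion_matrix_thresholds_sketch` is a hypothetical helper, not the package's implementation, and it assumes a strict `proba > threshold` rule.

```python
import numpy as np

def confusion_matrix_thresholds_sketch(y, proba, thresholds):
    """Return an (n_thresholds, 4) array with columns [TN, FP, FN, TP]."""
    y = np.asarray(y).astype(bool)
    # Broadcast to (n_thresholds, n_samples); assumes positive when proba > threshold.
    yhat = np.asarray(proba)[None, :] > np.asarray(thresholds)[:, None]
    tn = np.sum(~y & ~yhat, axis=1)
    fp = np.sum(~y & yhat, axis=1)
    fn = np.sum(y & ~yhat, axis=1)
    tp = np.sum(y & yhat, axis=1)
    return np.stack([tn, fp, fn, tp], axis=1).astype(np.int64)

# Synthetic labels and scores, for illustration only.
rng = np.random.default_rng(0)
y_toy = rng.integers(0, 2, size=200)
proba_toy = np.clip(0.5 * y_toy + rng.uniform(0.0, 0.6, size=200), 0.0, 1.0)
thresholds_toy = np.linspace(1e-5, 1.0, 50)

cm_toy = confusion_matrix_thresholds_sketch(y_toy, proba_toy, thresholds_toy)
print(cm_toy[[0, -1]])
```

As the threshold increases, fewer samples are labelled positive, so the TN and FN counts can only grow while FP and TP shrink.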


### Confusion matrix and binary metrics over multiple runs using multiple thresholds

In [19]:
cm, metrics = mmu.binary_metrics_runs_thresholds(
    y=y_test,
    proba=probas_test,
    thresholds=thresholds,
    fill=1.0
)

The confusion matrix and metrics are now cubes.

For the confusion matrix, the axes are:
* rows -- the thresholds
* columns -- the confusion matrix elements
* slices -- the runs

The memory layout is such that the largest stride is over the thresholds for the confusion matrix and over the metrics for the metrics array.

The rationale is that you will typically want to model the confusion matrices over the runs,
and the metrics individually over the thresholds and runs.

In [20]:
print('shape confusion matrix: ', cm.shape)
print('strides confusion matrix: ', cm.strides)

shape confusion matrix:  (1000, 4, 30)
strides confusion matrix:  (960, 8, 32)


In [21]:
print('shape metrics: ', metrics.shape)
print('strides metrics: ', metrics.strides)

shape metrics:  (1000, 10, 30)
strides metrics:  (240, 240000, 8)
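
The reported shapes and strides can be reproduced with plain NumPy by allocating C-ordered buffers and viewing them with permuted axes. This is only a sketch of the layout (the buffer names are illustrative, and the package may allocate its arrays differently), assuming 8-byte integers and floats.

```python
import numpy as np

n_thresholds, n_runs = 1000, 30

# Confusion matrices: stored as (thresholds, runs, elements) in C-order,
# viewed as (thresholds, elements, runs).
cm_buf = np.zeros((n_thresholds, n_runs, 4), dtype=np.int64)
cm_view = cm_buf.transpose(0, 2, 1)
print(cm_view.shape, cm_view.strides)

# Metrics: stored as (metrics, thresholds, runs) in C-order,
# viewed as (thresholds, metrics, runs).
mt_buf = np.zeros((10, n_thresholds, n_runs), dtype=np.float64)
mt_view = mt_buf.transpose(1, 0, 2)
print(mt_view.shape, mt_view.strides)

# All thresholds and runs of a single metric form one contiguous block.
one_metric = mt_view[:, 1, :]
print(one_metric.flags['C_CONTIGUOUS'])
```

With this layout, the four elements of one confusion matrix sit next to each other in memory, and each metric occupies its own contiguous (thresholds, runs) block.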


### Binary metrics over confusion matrices

In [22]:
# We use binary_metrics_thresholds to create confusion matrices
cm, _ = mmu.binary_metrics_thresholds(
    y=y_test[:, 0],
    proba=probas_test[:, 0],
    thresholds=thresholds,
    fill=0.0
)

In [23]:
metrics = mmu.binary_metrics_confusion(cm, 0.0)

In [24]:
mmu.metrics_to_dataframe(metrics)

| | neg.precision | pos.precision | neg.recall | pos.recall | neg.f1 | pos.f1 | fpr | fnr | acc | mcc |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0.000000 | 0.506333 | 0.000000 | 1.000000 | 0.000000 | 0.672273 | 1.000000 | 0.000000 | 0.506333 | 0.000000 |
| 1 | 1.000000 | 0.515789 | 0.037137 | 1.000000 | 0.071615 | 0.680556 | 0.962863 | 0.000000 | 0.524667 | 0.138401 |
| 2 | 0.993007 | 0.531327 | 0.095881 | 0.999342 | 0.174877 | 0.693784 | 0.904119 | 0.000658 | 0.553333 | 0.223447 |
| 3 | 0.995745 | 0.549005 | 0.158001 | 0.999342 | 0.272727 | 0.708683 | 0.841999 | 0.000658 | 0.584000 | 0.292767 |
| 4 | 0.993569 | 0.564150 | 0.208643 | 0.998683 | 0.344866 | 0.721008 | 0.791357 | 0.001317 | 0.608667 | 0.340044 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 995 | 0.597492 | 0.992424 | 0.997299 | 0.344964 | 0.747281 | 0.511969 | 0.002701 | 0.655036 | 0.667000 | 0.449340 |
| 996 | 0.589008 | 0.995910 | 0.998650 | 0.320606 | 0.740982 | 0.485060 | 0.001350 | 0.679394 | 0.655333 | 0.432132 |
| 997 | 0.574146 | 0.995283 | 0.998650 | 0.277814 | 0.729110 | 0.434380 | 0.001350 | 0.722186 | 0.633667 | 0.396770 |
| 998 | 0.553478 | 0.996933 | 0.999325 | 0.213957 | 0.712395 | 0.352304 | 0.000675 | 0.786043 | 0.601667 | 0.342626 |
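
Judging from the outputs in this notebook, `fill` appears to be the value substituted when a metric is undefined because its denominator is zero: row 0 has no estimated negatives, so neg.precision is reported as the fill value. The sketch below illustrates that behaviour for neg.precision only; `neg_precision_sketch` is a hypothetical helper and the interpretation of `fill` is an assumption based on the outputs shown, not on the package's source.

```python
import numpy as np

def neg_precision_sketch(cm, fill=0.0):
    """NPV = TN / (TN + FN) per row of an (n, 4) array with columns [TN, FP, FN, TP]."""
    cm = np.asarray(cm, dtype=np.float64)
    tn, fn = cm[:, 0], cm[:, 2]
    denom = tn + fn
    out = np.full(denom.shape, fill, dtype=np.float64)
    # Divide only where the denominator is non-zero; other rows keep the fill value.
    np.divide(tn, denom, out=out, where=denom > 0)
    return out

# First three rows of the confusion matrices shown earlier.
cm_rows = np.array([
    [0, 1481, 0, 1519],
    [55, 1426, 0, 1519],
    [142, 1339, 1, 1518],
])

npv_fill_one = neg_precision_sketch(cm_rows, fill=1.0)
npv_fill_zero = neg_precision_sketch(cm_rows, fill=0.0)
print(npv_fill_one, npv_fill_zero)
```

Only the first row differs between the two calls, since it is the only one with a zero denominator.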


## How to use

We generate 30 models, each consisting of 10k samples.

In [25]:
model = mmu.ModelGenerator(random_state=112356)
fit = model.fit()
train_mask, y, probas, X, models, ground_truth = fit.transform(
    enable_model_noise=False,
    enable_sample_noise=False,
)

Select the test set observations

In [26]:
y_test = np.asarray(y[~train_mask, :], order='F')
probas_test = np.asarray(probas[~train_mask, :], order='F')
yhat_test = np.asarray(np.rint(probas_test).astype(np.int64, copy=False), order='F')

In [27]:
gt_test = ground_truth.loc[~train_mask, :]

Compute the ground truth confusion matrix and metrics

In [28]:
gt_conf_mat, gt_metrics = mmu.binary_metrics_proba(
    gt_test['y'], gt_test['proba'], threshold=0.5
)

In [29]:
target_metrics = [
    'neg.precision', 'pos.precision', 'neg.recall', 'pos.recall', 'mcc'
]

In [30]:
gt_metrics = mmu.metrics_to_dataframe(gt_metrics)[target_metrics]

#### Sample covariance matrix

We take the first model and compute the confusion matrix for this model

In [31]:
sample_model_conf_mat = mmu.confusion_matrix_proba(
    y_test[:, 0], probas_test[:, 0], threshold=0.5
).flatten()

You can now test your method by using `sample_model_conf_mat` to generate confusion matrices, for example:


conf_mat_samples = beta_binomial...

In [32]:
estimated_metrics = mmu.binary_metrics_confusion(conf_mat_samples, fill=0.0)

NameError: name 'conf_mat_samples' is not defined

Compare the confidence interval (CI) with the ground truth metrics `gt_metrics`.
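
As an illustration of the workflow, the sketch below draws confusion matrices from a plain multinomial distribution over the four cells and derives a percentile interval for pos.precision. This is a stand-in chosen for simplicity, not the beta-binomial approach hinted at above, and all names (`observed`, `samples`, `ppv_samples`) are illustrative. The counts are those of the first model's confusion matrix shown earlier, flattened as `[TN, FP, FN, TP]`.

```python
import numpy as np

rng = np.random.default_rng(112356)

# Flattened confusion matrix [TN, FP, FN, TP] of the first model.
observed = np.array([1438, 43, 177, 1342])
n_obs = observed.sum()

# Assumption: multinomial resampling of the four cells (a stand-in method).
samples = rng.multinomial(n_obs, observed / n_obs, size=10_000)

# Positive precision TP / (TP + FP) for every sampled confusion matrix.
ppv_samples = samples[:, 3] / (samples[:, 3] + samples[:, 1])
ci_low, ci_high = np.percentile(ppv_samples, [2.5, 97.5])

# Point estimate from the observed counts, for comparison with the interval.
ppv_point = observed[3] / (observed[3] + observed[1])
print(ci_low, ppv_point, ci_high)
```

The resulting interval can then be checked against the `pos.precision` column of `gt_metrics`.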