# Linear Error Propagation




#### MMU Installation

`mmu` can be installed using:
```bash
pip install git+https://github.com/RUrlus/ModelMetricUncertainty.git
```

In [None]:
%load_ext autoreload
%autoreload 2

In [None]:
import arviz as az
import matplotlib.pyplot as plt  # needed by plot_metric_distributions below
import numpy as np
import pandas as pd
import seaborn as sns

import mmu
import mmur
from mmur.viz import _set_plot_style, plot_logstic_dgp, plot_probas, plot_ci_violin

In [None]:
%matplotlib inline
COLORS = _set_plot_style()

In [None]:
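def plot_metric_distributions(estimated_metrics, gt_metrics, coverage=None, label_alt='simulated'):
    """Plot one KDE panel per metric column, with the ground truth value marked."""
    # One panel per metric instead of a fixed number of panels
    n_metrics = len(estimated_metrics.columns)
    fig, axs = plt.subplots(ncols=n_metrics, figsize=(5 * n_metrics, 5), squeeze=False)
    axs = axs.ravel()
    for i, c in enumerate(estimated_metrics.columns):
        sns.kdeplot(estimated_metrics[c], ax=axs[i], label='estimated')
        if coverage is not None:
            sns.kdeplot(coverage[c], ax=axs[i], label=label_alt)
        # The first row of gt_metrics holds the ground truth value of the metric
        axs[i].axvline(gt_metrics[c].iloc[0], c='grey', lw=2, ls='--', label='population mean')
    axs[0].legend()
    return fig, axs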

### Logistic process

Let $X \in \mathbb{R}^{N \times 2}$, where for every $i \in \{1, \dots, N\}$:

$$\begin{align}
X_{i, 1} &= 1\\
X_{i, 2} &\sim \mathrm{Uniform}(-10, 10)\\
L_{i} &= \beta_{1}X_{i, 1} + \beta_{2}X_{i, 2}\\
P_{i} &= \mathrm{sigmoid}(L_{i})\\
\epsilon_{i} &\sim \mathrm{Normal}(0, \sigma)\\
L_{\mathrm{noisy}, i} &= L_{i} + \epsilon_{i}\\
P_{\mathrm{noisy}, i} &= \mathrm{sigmoid}(L_{\mathrm{noisy}, i})\\
y_{i} &\sim \mathrm{Bernoulli}(P_{\mathrm{noisy}, i})
\end{align}$$
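
As a rough illustration, the process above can be sketched in plain NumPy. This is not the `mmur` implementation: the function name `sample_logistic_process`, the coefficient values in `betas`, and the reading of $\sigma$ as a standard deviation are all assumptions made for this sketch.

```python
import numpy as np


def sample_logistic_process(n_samples, betas=(0.5, 1.2), noise_sigma=0.3, seed=0):
    """Draw (X, y, P) from a noisy logistic process.

    `betas` are hypothetical coefficients chosen for illustration only.
    """
    rng = np.random.default_rng(seed)
    # First column is the intercept, second column is Uniform(-10, 10)
    X = np.ones((n_samples, 2))
    X[:, 1] = rng.uniform(-10.0, 10.0, size=n_samples)
    # Noise-free linear predictor and probability
    L = X @ np.asarray(betas)
    P = 1.0 / (1.0 + np.exp(-L))
    # Gaussian noise is added on the linear predictor, not on the probability
    L_noisy = L + rng.normal(0.0, noise_sigma, size=n_samples)
    P_noisy = 1.0 / (1.0 + np.exp(-L_noisy))
    # Labels are Bernoulli draws from the noisy probability
    y = rng.binomial(1, P_noisy)
    return X, y, P


X, y, P = sample_logistic_process(1000)
```

Returning the noise-free `P` alongside `y` mirrors the idea of keeping a ground truth probability to compare a fitted model against.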

In [None]:
fig, ax = plot_logstic_dgp()

## Logistic Model generator

We simulate a scenario in which a logistic regression model has been trained, and we want to see how well the uncertainty on its metrics is modelled:

1. Generate train, test and hold-out samples from the logistic process
2. Fit a logistic regression on the train set
3. Using the fitted model, predict probabilities on:
    - the test set
    - all hold-out sets
4. Compute the confusion matrix on the test set
5. Model the uncertainty on the metrics based on the test set
6. Compare against the distribution of the metrics over the hold-out sets

### Generate data

In [None]:
generator = mmur.LogisticGenerator()
outp = generator.fit_transform(
    train_samples=10000,
    test_samples=10000,
    holdout_samples=10000,
    n_sets=10000,
    noise_sigma=0.3,
    random_state=123456
)

# Select the test sets
y_test = outp['test']['y']
probas_test = outp['test']['proba']

For now we only consider the metrics below:

In [None]:
target_metrics = [
    'pos.precision', 'pos.recall'
]

#### Test set performance

Compute the confusion matrix and metrics on the test set:

In [None]:
test_conf_mat, test_metrics = mmu.binary_metrics_proba(
    y_test, probas_test, threshold=0.5
)
test_conf_mat = test_conf_mat.flatten()
test_metrics = mmu.metrics_to_dataframe(test_metrics)[target_metrics]

mmu.confusion_matrix_to_dataframe(test_conf_mat)

In [None]:
test_metrics
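
For readers who want to see what the thresholding step amounts to, here is a minimal NumPy sketch. It does not reproduce `mmu` internals: the helper names, the `[TN, FP, FN, TP]` ordering, and the strict `>` comparison against the threshold are assumptions made for this illustration.

```python
import numpy as np


def confusion_counts(y, proba, threshold=0.5):
    """Return (tn, fp, fn, tp) for binary labels and predicted probabilities."""
    y = np.asarray(y).astype(bool)
    # Assumption: a prediction is positive when proba is strictly above the threshold
    y_hat = np.asarray(proba) > threshold
    tn = int(np.sum(~y & ~y_hat))
    fp = int(np.sum(~y & y_hat))
    fn = int(np.sum(y & ~y_hat))
    tp = int(np.sum(y & y_hat))
    return tn, fp, fn, tp


def precision_recall(tn, fp, fn, tp):
    """Positive-class precision and recall; undefined when a denominator is zero."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall


# Toy data, not taken from the notebook
y_toy = np.array([0, 0, 1, 1, 1, 0])
proba_toy = np.array([0.1, 0.7, 0.8, 0.4, 0.9, 0.2])
counts = confusion_counts(y_toy, proba_toy)
prec, rec = precision_recall(*counts)
```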

In [None]:
gt_proba_test = outp['ground_truth']['test']

Because the data are simulated, the ground truth probabilities are known.

Compute the ground truth confusion matrix and metrics at the same threshold:

In [None]:
gt_conf_mat, gt_metrics = mmu.binary_metrics_proba(
    y_test, gt_proba_test, threshold=0.5
)
mmu.confusion_matrix_to_dataframe(gt_conf_mat)

In [None]:
gt_metrics = mmu.metrics_to_dataframe(gt_metrics)[target_metrics]
gt_metrics

### Hold-out set

We compare the predicted probabilities for a sample from the hold-out sets with the ground truth probabilities:

In [None]:
y_holdout = outp['holdout']['y']
proba_holdout = outp['holdout']['proba']

In [None]:
fig, ax = plot_probas(proba_holdout, gt_proba_test)

Compute the metrics for each hold-out set:

In [None]:
holdout_conf_mat, holdout_metrics = mmu.binary_metrics_runs(y=y_holdout, proba=proba_holdout, threshold=0.5)
holdout_metrics = mmu.metrics_to_dataframe(holdout_metrics)[target_metrics]

At this point you can compare the observed metrics in `holdout_metrics` with the estimates produced by the uncertainty method.

Univariate uncertainties can be validated as in the example cells below:

In [None]:
from mmur.models import pr_uni_err_prop

In [None]:
lep_ci = pd.DataFrame(
    np.vstack(pr_uni_err_prop(test_conf_mat)),
    columns=['mu', 'sigma', 'lb', 'ub'],
    index=['pos.precision', 'pos.recall']
)
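
To make the idea behind the `mu`, `sigma`, `lb` and `ub` columns concrete, the sketch below applies first-order (linear) error propagation to precision and recall. It is an illustration under stated assumptions, not a description of what `pr_uni_err_prop` does internally: the function name `precision_recall_lep` is hypothetical, each count is assumed to have variance equal to its value, and the bounds assume a normal interval with `z=1.96` (roughly 95%). For precision $P = TP / (TP + FP)$ this gives $\sigma_P^2 = TP \cdot FP / (TP + FP)^3 = P(1 - P) / (TP + FP)$, and analogously for recall with $FN$ in place of $FP$.

```python
import numpy as np


def precision_recall_lep(tn, fp, fn, tp, z=1.96):
    """Rows [mu, sigma, lb, ub] for precision (row 0) and recall (row 1).

    Hypothetical helper: first-order propagation with Var(count) = count.
    `tn` is accepted for a uniform signature but does not enter either metric.
    """
    tp, fp, fn = float(tp), float(fp), float(fn)
    prec = tp / (tp + fp)
    rec = tp / (tp + fn)
    # Var(P) = (dP/dTP)^2 * TP + (dP/dFP)^2 * FP = TP * FP / (TP + FP)^3
    prec_sigma = np.sqrt(tp * fp / (tp + fp) ** 3)
    # Same derivation for recall with FN replacing FP
    rec_sigma = np.sqrt(tp * fn / (tp + fn) ** 3)
    return np.array([
        [prec, prec_sigma, prec - z * prec_sigma, prec + z * prec_sigma],
        [rec, rec_sigma, rec - z * rec_sigma, rec + z * rec_sigma],
    ])


# Toy counts, not taken from the notebook
lep_toy = precision_recall_lep(tn=50, fp=10, fn=20, tp=90)
```

Note that a symmetric normal interval can extend beyond $[0, 1]$ when a metric is close to its boundary, which is one reason to check it against the hold-out distribution.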

In [None]:
fig, ax = plot_ci_violin(lep_ci, holdout_metrics)
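
Besides the visual check, one simple numeric summary is the empirical coverage: the fraction of hold-out values that fall inside an estimated interval. The helper below is a hypothetical sketch, not part of `mmu` or `mmur`, and the simulated values stand in for a column of `holdout_metrics`.

```python
import numpy as np


def empirical_coverage(values, lb, ub):
    """Fraction of `values` inside the closed interval [lb, ub]."""
    values = np.asarray(values, dtype=float)
    return float(np.mean((values >= lb) & (values <= ub)))


# Stand-in data: normally distributed metric values around 0.9
rng = np.random.default_rng(0)
simulated = rng.normal(0.9, 0.03, size=10_000)
cov = empirical_coverage(simulated, 0.9 - 1.96 * 0.03, 0.9 + 1.96 * 0.03)
```

With the objects in this notebook, a call along the lines of `empirical_coverage(holdout_metrics['pos.precision'], lep_ci.loc['pos.precision', 'lb'], lep_ci.loc['pos.precision', 'ub'])` should give the coverage for precision; a value well below the nominal level would suggest the interval is too narrow.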