# Performance Metrics

---

This notebook is part of https://github.com/risc-mi/catabra.

This short example demonstrates how to change the hyperparameter optimization objective and the metrics reported during training. We focus on binary classification here, but everything applies equally to multiclass and multilabel classification, and to regression.

Familiarity with CaTabRa's main data analysis workflow is assumed. A step-by-step introduction can be found in [Workflow.ipynb](https://github.com/risc-mi/catabra/examples/Workflow.ipynb).

## Inspect Default Metrics

For each of the prediction tasks supported by CaTabRa, a default metric is optimized during hyperparameter tuning. In the case of binary classification this is ROC-AUC, the area under the [Receiver Operating Characteristic](https://en.wikipedia.org/wiki/Receiver_operating_characteristic) curve, as can be seen when inspecting `catabra.core.config.DEFAULT_CONFIG`:

In [2]:
from catabra.core import config
config.DEFAULT_CONFIG

{'automl': 'auto-sklearn',
 'ensemble_size': 10,
 'ensemble_nbest': 10,
 'memory_limit': 3072,
 'time_limit': 1,
 'jobs': 1,
 'copy_analysis_data': False,
 'copy_evaluation_data': False,
 'static_plots': True,
 'interactive_plots': False,
 'bootstrapping_repetitions': 0,
 'explainer': 'shap',
 'binary_classification_metrics': ['roc_auc', 'accuracy', 'balanced_accuracy'],
 'multiclass_classification_metrics': ['accuracy', 'balanced_accuracy'],
 'multilabel_classification_metrics': ['f1_macro'],
 'regression_metrics': ['r2', 'mean_absolute_error', 'mean_squared_error'],
 'ood_class': 'autoencoder',
 'ood_source': 'internal',
 'ood_kwargs': {},
 'auto-sklearn_include': None,
 'auto-sklearn_exclude': None,
 'auto-sklearn_resampling_strategy': None,
 'auto-sklearn_resampling_strategy_arguments': None}

The binary classification metrics are listed under `"binary_classification_metrics"`. The first entry in the list is the hyperparameter optimization objective; the remaining entries are additional metrics reported during model training. Likewise, `"multiclass_classification_metrics"`, `"multilabel_classification_metrics"` and `"regression_metrics"` contain the same information for the other prediction tasks.
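The "first entry is the objective" convention can be made explicit with a few lines of plain Python. This is only an illustrative sketch: the metric lists are copied from the output above, and `split_metrics` is a hypothetical helper, not part of CaTabRa.

```python
# Metric lists copied from the DEFAULT_CONFIG output shown above.
default_metrics = {
    'binary_classification_metrics': ['roc_auc', 'accuracy', 'balanced_accuracy'],
    'multiclass_classification_metrics': ['accuracy', 'balanced_accuracy'],
    'multilabel_classification_metrics': ['f1_macro'],
    'regression_metrics': ['r2', 'mean_absolute_error', 'mean_squared_error'],
}


def split_metrics(metrics):
    """Hypothetical helper: return (objective, additional metrics) for one task."""
    objective, *additional = metrics
    return objective, additional


for key, metrics in default_metrics.items():
    objective, additional = split_metrics(metrics)
    task = key[:-len('_metrics')]  # strip the common "_metrics" suffix
    print(f'{task}: objective={objective}, additional={additional}')
```

Note that a list with a single entry, as for multilabel classification, defines an objective but no additional metrics.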

**NOTE**<br>
For more information about the possible config parameters and their meaning, please refer to [config.md](https://github.com/risc-mi/catabra/doc/config.md).

## Change Metrics

Changing the optimization objective and/or list of metrics reported during model training is easy: simply update the config dict when calling `catabra.analysis.analyze()`, as demonstrated below.

In [3]:
# load dataset
from sklearn.datasets import load_breast_cancer
X, y = load_breast_cancer(as_frame=True, return_X_y=True)

In [4]:
# add target labels to DataFrame
X['diagnosis'] = y

In [5]:
# split into training and test sets by adding a Boolean column
# the name of the column is arbitrary; CaTabRa tries to "guess" which samples belong to which set based on the column name and values
X['train'] = X.index <= 0.8 * len(X)
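To get a feeling for the resulting split sizes, the same rule can be applied to plain integers. The sketch below assumes that the DataFrame has the default integer index 0, 1, ..., n-1 and uses the 569 samples of scikit-learn's breast cancer dataset; it does not need pandas or CaTabRa.

```python
n_samples = 569  # number of samples in scikit-learn's breast cancer dataset

# Same rule as in the cell above, applied to the index values 0, 1, ..., n_samples - 1.
is_train = [i <= 0.8 * n_samples for i in range(n_samples)]

n_train = sum(is_train)
n_test = n_samples - n_train
print(f'training samples: {n_train}, test samples: {n_test}')
```

Under these assumptions, the first 456 rows form the training set and the remaining 113 rows the test set. The split is positional, not random.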

The keyword argument `config` of `analyze()` updates the default config dict. In this example, we use it to specify different binary classification metrics. The value passed to `config` can be either a dict or the path to a JSON file containing such a dict. The latter is especially useful on the command line.
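For the JSON variant, a config file could be created as sketched below. The file name `metrics_config.json` and the temporary directory are arbitrary choices made for illustration, and the commented-out `analyze()` call only indicates how the path might be passed, based on the description above.

```python
import json
import tempfile
from pathlib import Path

# Same config update as used in this notebook.
custom_config = {
    'binary_classification_metrics': ['f1', 'sensitivity', 'specificity']
}

# File name and location are arbitrary choices for this illustration.
config_path = Path(tempfile.mkdtemp()) / 'metrics_config.json'
with open(config_path, 'w') as f:
    json.dump(custom_config, f, indent=4)

# Read the file back to confirm that it contains the same dict.
with open(config_path) as f:
    reloaded = json.load(f)
print(reloaded)

# The path could then be passed instead of the dict, e.g.:
# analyze(X, classify='diagnosis', split='train', time=1,
#         out='performance_metrics', config=str(config_path))
```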

In [6]:
from catabra.analysis import analyze

analyze(
    X,
    classify='diagnosis',     # name of column containing classification target
    split='train',            # name of column containing information about the train-test split (optional)
    time=1,                   # time budget for hyperparameter tuning, in minutes (optional)
    out='performance_metrics',
    config={
        'binary_classification_metrics': ['f1', 'sensitivity', 'specificity']
    }
)

[CaTabRa] ### Analysis started at 2023-02-07 12:50:54.424329
[CaTabRa] Saving descriptive statistics completed
[CaTabRa] Using AutoML-backend auto-sklearn for binary_classification
[CaTabRa] Successfully loaded the following auto-sklearn add-on module(s): xgb
[CaTabRa] New ensemble fitted:
    ensemble_val_f1: 0.937143
    n_constituent_models: 1
    total_elapsed_time: 00:04
[CaTabRa] New model #1 trained:
    val_f1: 0.937143
    val_sensitivity: 0.921348
    val_specificity: 0.935484
    train_f1: 1.000000
    type: random_forest
    total_elapsed_time: 00:04
[CaTabRa] New ensemble fitted:
    ensemble_val_f1: 0.966667
    n_constituent_models: 2
    total_elapsed_time: 00:05
[CaTabRa] New model #2 trained:
    val_f1: 0.961326
    val_sensitivity: 0.977528
    val_specificity: 0.919355
    train_f1: 0.983607
    type: mlp
    total_elapsed_time: 00:05
[CaTabRa] New ensemble fitted:
    ensemble_val_f1: 0.961326
    n_constituent_models: 2
    total_elapsed_time: 00:07
[CaTabRa] New model #3 trained:
    val_f1: 0.937143
    val_sensitivity: 0.921348
    val_specificity: 0.935484
    train_f1: 0.989011
    type: random_forest
    total_elapsed_time: 00:07
[CaTabRa] New ensemble fitted:
    ensemble_val_f1: 0.961326
    n_constituent_mode

[CaTabRa] New model #32 trained:
    val_f1: 0.741667
    val_sensitivity: 1.000000
    val_specificity: 0.000000
    train_f1: 0.744856
    type: mlp
    total_elapsed_time: 00:50
[CaTabRa] Final training statistics:
    n_models_trained: 32
    ensemble_val_f1: 0.9888888888888888
[CaTabRa] Creating shap explainer
[CaTabRa] Initialized out-of-distribution detector of type Autoencoder
[CaTabRa] Fitting out-of-distribution detector...
Iteration 1, loss = 0.06674697
Iteration 2, loss = 0.03886039
Iteration 3, loss = 0.02630481
Iteration 4, loss = 0.01931956
Iteration 5, loss = 0.01464805
Iteration 6, loss = 0.01285085
Iteration 7, loss = 0.01249570
Iteration 8, loss = 0.01238648
Iteration 9, loss = 0.01221173
Iteration 10, loss = 0.01181073
Iteration 11, loss = 0.01156576
Iteration 12, loss = 0.01151248
Iteration 13, loss = 0.01146056
Iteration 14, loss = 0.01140084
Iteration 15, loss = 0.01138180
Iteration 16, loss = 0.01134451
Iteration 17, loss = 0.01131035
Iteration 18, loss = 0.0113

Note that the F1-score, sensitivity and specificity are now reported during model training. The F1-score, being the first entry in the list, is the hyperparameter optimization objective.
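To relate the three reported numbers to each other, the following sketch computes them from confusion-matrix counts using the textbook definitions. It is not CaTabRa's implementation, and `binary_metrics` is a hypothetical helper. The counts are an assumption inferred from the values in the log above, which are consistent with a validation set of 89 positive and 62 negative samples.

```python
def binary_metrics(tp, fp, tn, fn):
    """Textbook F1, sensitivity and specificity from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    f1 = 2 * tp / (2 * tp + fp + fn)      # harmonic mean of precision and recall
    return {'f1': f1, 'sensitivity': sensitivity, 'specificity': specificity}


# Counts inferred from the log (assumption): 89 positive, 62 negative validation samples.
model_1 = binary_metrics(tp=82, fp=4, tn=58, fn=7)
model_32 = binary_metrics(tp=89, fp=62, tn=0, fn=0)  # predicts "positive" for every sample

for name, metrics in [('model #1', model_1), ('model #32', model_32)]:
    print(name, {k: round(v, 6) for k, v in metrics.items()})
```

With these counts, the values reported for models #1 and #32 are reproduced. Model #32 illustrates why a single metric can be misleading: a classifier that labels every sample as positive reaches a sensitivity of 1 and still an F1-score of about 0.74, while its specificity is 0.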

**NOTE**<br>
Regardless of the metrics specified in the config dict, evaluating a model with function `catabra.evaluation.evaluate()` always reports *all* suitable built-in performance metrics in `metrics.xlsx`.

## Available Metrics

Check out [metrics.md](https://github.com/risc-mi/catabra/doc/metrics.md) for an overview of all built-in metrics available in CaTabRa.