# User Guide Tutorial 09: Benchmarks

TemporAI provides benchmarking tools in `tempor.benchmarks`; this tutorial demonstrates them.

*Skip the cell below if you are not on Google Colab or already have TemporAI installed:*

In [None]:
%pip install temporai

# Or from the repo, for the latest version:
# %pip install git+https://github.com/vanderschaarlab/temporai.git

## Using `tempor.benchmarks.benchmark_models`

The `tempor.benchmarks.benchmark_models` function provides a quick way to benchmark a number of models (plugins) for
a particular task.

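It takes a list of `(name, model)` tuples (a model may also be a `Pipeline`) and a dataset, and performs
cross-validation with `n_splits` folds to get the mean and standard deviation of each metric.

It returns a tuple `(results_readable, results)`: a table of `mean +/- stddev` strings per model, and a dictionary
mapping each model name to a table of means and standard deviations, as shown below.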

In [1]:
from tempor.benchmarks import benchmark_models
from tempor.data.datasources import SineDataSource
from tempor import plugin_loader

from IPython.display import display

dataset = SineDataSource(random_state=42, no=25).load()

results_readable, results = benchmark_models(
    task_type="prediction.one_off.classification",
    tests=[
        ("model_1", plugin_loader.get("prediction.one_off.classification.nn_classifier", n_iter=10)),
        ("model_2", plugin_loader.get("prediction.one_off.classification.ode_classifier", n_iter=100)),
    ],
    data=dataset,
    n_splits=3,
)

print("Results in easily-readable format:")
display(results_readable)

print("Full results:\n")
for model, value in results.items():
    print(f"{model}:")
    display(value)

2023-10-09 18:03:21 | INFO     | tempor.benchmarks.benchmark:benchmark_models:100 | Test case: model_1
2023-10-09 18:03:23 | INFO     | tempor.models.ts_model:_train:379 | Epoch:0| train loss: 0.68637615442276, validation loss: 0.6680578589439392
2023-10-09 18:03:24 | INFO     | tempor.models.ts_model:_train:379 | Epoch:0| train loss: 0.6887814402580261, validation loss: 0.6939960718154907
2023-10-09 18:03:24 | INFO     | tempor.models.ts_model:_train:379 | Epoch:0| train loss: 0.6906945109367371, validation loss: 0.6966598033905029
2023-10-09 18:03:24 | INFO     | tempor.benchmarks.benchmark:benchmark_models:100 | Test case: model_2
2023-10-09 18:03:29 | INFO     | tempor.models.ts_ode:_train:607 | Epoch:99| train loss: 0.6465951204299927, validation loss: 0.5632617473602295
2023-10-09 18:03:33 | INFO     | tempor.models.ts_ode:_train:607 | Epoch:99| train loss: 0.913261890411377, validation loss: 0.8132617473602295
2023-10-09 18:03:38 | INFO     | tempor.models.ts_ode:_train:607 | Ep

Results in easily-readable format:


| metric | model_1 | model_2 |
| --- | --- | --- |
| aucroc | 0.311 +/- 0.083 | 0.583 +/- 0.312 |
| aucprc | 0.385 +/- 0.028 | 0.588 +/- 0.292 |
| accuracy | 0.602 +/- 0.033 | 0.519 +/- 0.105 |
| f1_score_micro | 0.602 +/- 0.033 | 0.519 +/- 0.105 |
| f1_score_macro | 0.375 +/- 0.013 | 0.338 +/- 0.048 |
| f1_score_weighted | 0.453 +/- 0.04 | 0.361 +/- 0.116 |
| kappa | 0.0 +/- 0.0 | 0.0 +/- 0.0 |
| kappa_quadratic | 0.0 +/- 0.0 | 0.0 +/- 0.0 |
| precision_micro | 0.602 +/- 0.033 | 0.519 +/- 0.105 |
| precision_macro | 0.301 +/- 0.016 | 0.259 +/- 0.053 |


Full results:

model_1:


| metric | mean | stddev |
| --- | --- | --- |
| aucroc | 0.311111 | 0.083148 |
| aucprc | 0.385185 | 0.028154 |
| accuracy | 0.601852 | 0.032736 |
| f1_score_micro | 0.601852 | 0.032736 |
| f1_score_macro | 0.375458 | 0.012951 |
| f1_score_weighted | 0.452788 | 0.039572 |
| kappa | 0.0 | 0.0 |
| kappa_quadratic | 0.0 | 0.0 |
| precision_micro | 0.601852 | 0.032736 |
| precision_macro | 0.300926 | 0.016368 |


model_2:


| metric | mean | stddev |
| --- | --- | --- |
| aucroc | 0.583333 | 0.311805 |
| aucprc | 0.587731 | 0.291568 |
| accuracy | 0.518519 | 0.105369 |
| f1_score_micro | 0.518519 | 0.105369 |
| f1_score_macro | 0.338162 | 0.047609 |
| f1_score_weighted | 0.360713 | 0.115623 |
| kappa | 0.0 | 0.0 |
| kappa_quadratic | 0.0 | 0.0 |
| precision_micro | 0.518519 | 0.105369 |
| precision_macro | 0.259259 | 0.052684 |
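
To make the relationship between the two result formats concrete, here is a minimal sketch of how per-fold scores could be reduced to a `mean`, a `stddev`, and a readable `mean +/- stddev` string. The function name `summarize_folds` and the fold scores are hypothetical and are not part of TemporAI. The use of a population standard deviation is an inference: it happens to reproduce the `accuracy` row for `model_1` above, but it is not a statement about how `benchmark_models` is implemented.

```python
from statistics import fmean, pstdev


def summarize_folds(scores):
    """Return (mean, stddev, readable) for a list of per-fold metric scores."""
    mean = fmean(scores)
    # Population standard deviation (divides by the number of folds).
    # Assumption: inferred from the printed numbers, not from TemporAI's source.
    std = pstdev(scores)
    # Round to 3 decimals; trailing zeros are dropped, e.g. 0.04 rather than 0.040.
    readable = f"{round(mean, 3)} +/- {round(std, 3)}"
    return mean, std, readable


# Hypothetical per-fold accuracies for 25 samples split into 3 folds
# (test folds of 9, 8 and 8 samples). They were chosen to be consistent with
# model_1's accuracy row and are not taken from the actual run.
fold_accuracies = [5 / 9, 5 / 8, 5 / 8]

mean, std, readable = summarize_folds(fold_accuracies)
print(f"mean={mean:.6f} stddev={std:.6f}")  # mean=0.601852 stddev=0.032736
print(readable)  # 0.602 +/- 0.033
```

With only three folds the standard deviation is a rough indicator, so small differences between models in the readable table should be interpreted with care.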


## Supported tasks

> ⚠️ Not all task types are supported by `benchmark_models` yet.

Supported values of the `task_type` argument:
* `task_type="prediction.one_off.classification"`
* `task_type="prediction.one_off.regression"`
* `task_type="time_to_event"`
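
Because only some task types are supported, it can be convenient to check the value before starting a long benchmark run. The sketch below is one way to do that. The helper `check_task_type` and the constant `SUPPORTED_TASK_TYPES` are hypothetical names introduced for illustration and are not part of TemporAI; the tuple simply mirrors the list above.

```python
# Task types listed as supported in this tutorial.
SUPPORTED_TASK_TYPES = (
    "prediction.one_off.classification",
    "prediction.one_off.regression",
    "time_to_event",
)


def check_task_type(task_type):
    """Hypothetical helper (not part of TemporAI): fail early on an unsupported task type."""
    if task_type not in SUPPORTED_TASK_TYPES:
        supported = ", ".join(SUPPORTED_TASK_TYPES)
        raise ValueError(f"Unsupported task_type {task_type!r}; expected one of: {supported}")
    return task_type


# A supported value is returned unchanged.
check_task_type("time_to_event")

# A value that is not in the list above raises a ValueError.
try:
    check_task_type("prediction.temporal.regression")
except ValueError as err:
    print(err)
```

The returned value could then be passed as the `task_type` argument of `benchmark_models`.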


## 🎉 Congratulations!

Congratulations on completing this notebook tutorial! If you enjoyed this and would like to join the movement towards *Machine learning and AI for Medicine*, you can do so in the following ways!



### ⭐ Star [TemporAI](https://github.com/vanderschaarlab/temporai) on GitHub

- The easiest way to help our community is by just starring the repos! This helps raise awareness of the tools we're building.



### Check out other projects from [vanderschaarlab](https://github.com/vanderschaarlab)
- 📝 [HyperImpute](https://github.com/vanderschaarlab/hyperimpute)
- 📊 [AutoPrognosis](https://github.com/vanderschaarlab/autoprognosis)
- 🤖 [SynthCity](https://github.com/vanderschaarlab/synthcity)
 