In [None]:
import os
os.chdir('..')

## Evaluating Experiments -- The Visualiser

This notebook guides you through the evaluation of a custom experiment.

The `Visualiser` is the main class that will help us achieve this.

In [None]:
from activetesting.visualize import Visualiser

%load_ext autoreload
%autoreload 2

Let's say you have run the experiment
```
python main.py +paper=SyntheticGPGP
```
which logs to
```
outputs/final/SyntheticGPGP
```

You can load the results for this experiment with

In [None]:
vis = Visualiser('outputs/final/SyntheticGPGP')

The visualiser then exposes a range of attributes and plotting methods for you to inspect.

General information about the run:

In [None]:
vis.config()

In [None]:
vis.n_runs

In [None]:
vis.acquisitions

In [None]:
vis.risks

(Note that $\hat{R}_{\text{LURE}} = $ `FancyUnbiasedRiskEstimator` and that $\hat{R}_{\text{iid}} = $ `BiasedRiskEstimator`.)
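
To make the difference between the two estimators concrete, here is a minimal sketch in plain Python. It assumes the LURE weights $v_m = 1 + \frac{N-M}{N-m}\left(\frac{1}{(N-m+1)\,q(i_m)} - 1\right)$ as defined in the paper, with $N$ the pool size, $M$ the number of acquired points, and $q(i_m)$ the acquisition probability of the $m$-th point. The function names `lure_weights`, `lure_risk` and `iid_risk` are hypothetical and are not the package's implementation.

```python
def lure_weights(n_pool, proposal_probs):
    """Return the LURE weights v_1, ..., v_M (hypothetical helper).

    n_pool: N, the size of the test pool.
    proposal_probs: q(i_m), the probability with which the m-th acquired
        point was sampled from the points still remaining in the pool.
    """
    n_acquired = len(proposal_probs)
    if not 0 < n_acquired < n_pool:
        raise ValueError("this sketch assumes 0 < M < N")
    weights = []
    for m, q in enumerate(proposal_probs, start=1):
        correction = 1.0 / ((n_pool - m + 1) * q) - 1.0
        weights.append(1.0 + (n_pool - n_acquired) / (n_pool - m) * correction)
    return weights


def lure_risk(losses, proposal_probs, n_pool):
    """Weighted average of the acquired losses (R_LURE)."""
    weights = lure_weights(n_pool, proposal_probs)
    return sum(v * loss for v, loss in zip(weights, losses)) / len(losses)


def iid_risk(losses):
    """Plain average of the acquired losses (R_iid)."""
    return sum(losses) / len(losses)


# Illustrative values only. Sampling uniformly without replacement gives
# q(i_m) = 1 / (N - m + 1), so every weight equals 1 and both estimators agree.
n_pool = 10
losses = [0.5, 0.1, 0.4]
uniform_q = [1.0 / (n_pool - m + 1) for m in range(1, len(losses) + 1)]
```

With a non-uniform acquisition distribution the weights differ from 1, which is how the weighted estimator corrects for the sampling bias that the plain average ignores.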

You can also have a look at the data for the first few runs.

In [None]:
vis.plot_data(0)

If you want to look at the convergence of active testing, there are a couple of tools at your disposal.

First, select combinations of acquisition strategy and risk estimator.

In [None]:
acq_risks = [
    ['RandomAcquisition', 'BiasedRiskEstimator'],
    ['GPSurrogateAcquisitionMSE', 'FancyUnbiasedRiskEstimator'],
    ['TrueLossAcquisition', 'FancyUnbiasedRiskEstimator'],
    ]
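
If you would rather build such a list programmatically than type it out, the sketch below enumerates all pairs with `itertools.product` and then filters them. The two name lists are copied from this notebook; whether every pair is actually present in your results depends on the experiment config, so treat that as an assumption.

```python
from itertools import product

# Names as used elsewhere in this notebook.
acquisition_names = [
    'RandomAcquisition',
    'GPSurrogateAcquisitionMSE',
    'TrueLossAcquisition',
]
risk_names = ['BiasedRiskEstimator', 'FancyUnbiasedRiskEstimator']

# Every [acquisition, risk] pair, in the same list-of-lists layout as acq_risks.
all_combinations = [list(pair) for pair in product(acquisition_names, risk_names)]

# Pair random acquisition with the plain estimator and the active
# acquisition strategies with the weighted estimator.
selected = [
    [acq, risk]
    for acq, risk in all_combinations
    if (acq == 'RandomAcquisition') == (risk == 'BiasedRiskEstimator')
]
```

The filter reproduces the hand-written selection above, but it is just one possible rule; swap in whatever comparison you want to plot.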

In [None]:
vis.plot_risks_select_combinations(acq_risks)

In [None]:
fig, ax = vis.plot_log_convergence(acq_risks[:-1])
ax.set_ylim(1e-7, 1e-2)
ax.set_xscale('linear')
ax.set_xlim(0, 30)
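
A convergence plot of this kind typically shows an error of the risk estimate against the number of acquired test points. As a rough sketch of such a quantity, and not necessarily what `plot_log_convergence` computes internally, the snippet below averages the squared error over runs at each step. The helper `mean_squared_error_per_step` and the toy numbers are made up for illustration.

```python
def mean_squared_error_per_step(estimates, true_risk):
    """Average squared error of the risk estimate at each acquisition step.

    estimates: list of runs, each a list of risk estimates per step
        (all runs are assumed to have the same number of steps).
    true_risk: the reference risk, e.g. computed on the full test pool.
    """
    n_runs = len(estimates)
    n_steps = len(estimates[0])
    return [
        sum((run[step] - true_risk) ** 2 for run in estimates) / n_runs
        for step in range(n_steps)
    ]


# Toy example: two runs, three acquisition steps each (made-up values).
estimates = [
    [0.30, 0.24, 0.21],
    [0.10, 0.18, 0.19],
]
errors = mean_squared_error_per_step(estimates, true_risk=0.20)
```

Errors like these shrink over several orders of magnitude as more points are acquired, which is why a logarithmic y-axis is the natural choice for the plot.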

You can also investigate the behaviour of the loss distributions (equivalent to Figure 7 in the paper).

In [None]:
acquisition = 'GPSurrogateAcquisitionMSE'
fig, ax = vis.loss_dist(acquisition, run=0, step=0);

And create animated gifs:

In [None]:
vis.animate_acquisition(acquisition, run=0);
# vis.animate_loss_dist(acquisition, run=0); # does not show data

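You can also plot individual runs. First list the available acquisition strategies, then pass one of them together with a risk estimator to `plot_all_runs`: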

In [None]:
vis.acquisitions

In [None]:
vis.plot_all_runs('GPSurrogateAcquisitionMSE', 'FancyUnbiasedRiskEstimator', break_after=100)

This should have introduced the vast majority of the functionality. Please see the `Visualiser` class itself for further methods, and feel free to reach out with any questions! :)