This notebook shows how to choose which metrics are reported when evaluating an algorithm's performance.

In [1]:
import sys
sys.path.append('../..')

%load_ext autoreload
%autoreload 2

In [2]:
from oab.data.load_dataset import load_dataset
from oab.evaluation import EvaluationObject, all_metrics

from pyod.models.iforest import IForest

In [3]:
# load dataset
forest = load_dataset('forest_cover')

Credits: Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.


In [4]:
# run experiments with iForest
eval_obj = EvaluationObject("iForest")

for (x, y), settings in forest.sample_multiple(n=50, n_steps=5, contamination_rate=0.1):
    iforest = IForest()
    iforest.fit(x)
    # outlier scores of the fitted data (higher means more anomalous)
    pred = iforest.decision_scores_
    eval_obj.add(y, pred, settings)
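
The loop above draws 5 samples of 50 instances each, with 10% anomalies per sample. As a rough illustration of what such a sampling step might look like, here is a minimal sketch in plain Python. It is not the oab implementation: the function `sample_with_contamination` and the toy data are hypothetical, and it assumes the anomaly count is simply the rounded product of `n` and `contamination_rate`.

```python
import random

def sample_with_contamination(normals, anomalies, n, contamination_rate, rng):
    """Draw n instances, a fixed fraction of which are anomalies (label 1).

    Hypothetical helper for illustration only, not part of oab.
    """
    n_anomalies = round(n * contamination_rate)
    n_normals = n - n_anomalies
    x = rng.sample(normals, n_normals) + rng.sample(anomalies, n_anomalies)
    y = [0] * n_normals + [1] * n_anomalies
    # Shuffle instances and labels together so anomalies are not grouped at the end.
    order = list(range(n))
    rng.shuffle(order)
    return [x[i] for i in order], [y[i] for i in order]

rng = random.Random(0)
normals = [[rng.gauss(0.0, 1.0)] for _ in range(500)]   # toy "normal" points
anomalies = [[rng.gauss(6.0, 1.0)] for _ in range(50)]  # toy "anomalous" points

# Five independent samples of 50 instances with 10% anomalies each.
samples = [sample_with_contamination(normals, anomalies, 50, 0.1, rng)
           for _ in range(5)]
```

Each element of `samples` is an `(x, y)` pair analogous to the one unpacked in the loop above, minus the `settings` object.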

In [5]:
# use all metrics for evaluation
_ = eval_obj.evaluate(metrics=all_metrics)

Evaluation on dataset forest_cover with normal labels [2] and anomaly labels [4].
Total of 5 datasets. Per dataset:
50 instances, contamination_rate 0.1.
Mean 	 Std_dev 	 Metric
0.925 	 0.033 		 roc_auc
0.583 	 0.082 		 average_precision
0.536 	 0.091 		 adjusted_average_precision
0.480 	 0.098 		 precision_n
0.422 	 0.109 		 adjusted_precision_n
0.532 	 0.090 		 precision_recall_auc
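
To make the relationship between a metric and its "adjusted" variant more concrete, here is a minimal sketch. It assumes that `precision_n` is the precision among the n highest-scored instances, where n is the number of true anomalies, and that the adjusted variants rescale a value so that a random ranking scores 0 and a perfect ranking scores 1. Both are assumptions about oab, not confirmed by this notebook, and the helper names are hypothetical. The reported means are at least consistent with this reading: (0.480 - 0.1) / (1 - 0.1) is approximately 0.422.

```python
def precision_at_n(y_true, scores):
    """Precision among the n top-scored instances, n = number of true anomalies."""
    n = sum(y_true)
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    return sum(label for _, label in ranked[:n]) / n

def adjust_for_chance(value, contamination_rate):
    """Rescale so the expected value of a random ranking maps to 0 and a perfect one to 1."""
    return (value - contamination_rate) / (1.0 - contamination_rate)

# Toy example: 10 instances, 2 anomalies (label 1).
y_true = [0, 0, 1, 0, 1, 0, 0, 0, 0, 0]
scores = [0.1, 0.2, 0.9, 0.8, 0.3, 0.1, 0.2, 0.4, 0.1, 0.2]

# The two highest scores (0.9 and 0.8) contain one true anomaly.
p_n = precision_at_n(y_true, scores)
adjusted = adjust_for_chance(p_n, contamination_rate=sum(y_true) / len(y_true))
```

The adjustment makes values comparable across samples with different contamination rates, since the unadjusted precision of a random ranking grows with the share of anomalies.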


In [6]:
# to use a subset, first see which ones are available
print(all_metrics)

['roc_auc', 'average_precision', 'adjusted_average_precision', 'precision_n', 'adjusted_precision_n', 'precision_recall_auc']


In [7]:
# select an arbitrary subset
metrics = ['roc_auc', 'precision_recall_auc']
_ = eval_obj.evaluate(metrics=metrics)

Evaluation on dataset forest_cover with normal labels [2] and anomaly labels [4].
Total of 5 datasets. Per dataset:
50 instances, contamination_rate 0.1.
Mean 	 Std_dev 	 Metric
0.925 	 0.033 		 roc_auc
0.532 	 0.090 		 precision_recall_auc
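
When a subset is typed by hand, a misspelled metric name is an easy mistake. Whether `evaluate` checks the names itself is not shown here, so the following is only a small, hypothetical guard. The list `ALL_METRICS` is copied from the printed output of `all_metrics` above, and `check_metrics` is not part of oab.

```python
# Copied from the printed value of all_metrics above.
ALL_METRICS = ['roc_auc', 'average_precision', 'adjusted_average_precision',
               'precision_n', 'adjusted_precision_n', 'precision_recall_auc']

def check_metrics(requested, available=ALL_METRICS):
    """Return the requested names unchanged, or raise if any name is unknown."""
    unknown = [name for name in requested if name not in available]
    if unknown:
        raise ValueError(f"Unknown metrics: {unknown}")
    return list(requested)

selected = check_metrics(['roc_auc', 'precision_recall_auc'])
```

The validated list could then be passed as `metrics=selected` in the call shown above.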
