# Orchestration of prediction experiments with sktime

* Evaluate the predictive performance of one or more strategies on one or more datasets

[GitHub link](https://github.com/alan-turing-institute/sktime/blob/master/examples/experiment_orchestration.ipynb)

In [1]:
from sktime.experiments.orchestrator import Orchestrator
from sktime.experiments.data import DatasetHDD
from sktime.experiments.data import ResultHDD
from sktime.experiments.data import DatasetLoadFromDir
from sktime.experiments.analysis import AnalyseResults
from sktime.experiments.scores import ScoreAccuracy

from sktime.model_selection import PresplitFilesCV
from sktime.highlevel import TSCStrategy
from sktime.highlevel import TSCTask
from sktime.classifiers.ensemble import TimeSeriesForestClassifier

from sklearn.model_selection import KFold
import pandas as pd
import os

In [2]:
# get path to the sktime datasets 
import sktime
repodir = os.path.dirname(sktime.__file__)
datadir = os.path.join(repodir, "datasets/data/")

In [3]:
# create the task and dataset objects manually for each dataset
dts_ArrowHead = DatasetHDD(dataset_loc=os.path.join(datadir, 'ArrowHead'), dataset_name='ArrowHead')
task_ArrowHead = TSCTask(target='target')

dts_Beef = DatasetHDD(dataset_loc=os.path.join(datadir, 'Beef'), dataset_name='Beef')
task_Beef = TSCTask(target='target')

In [4]:
# or create them automatically
dts_loader = DatasetLoadFromDir(root_dir=datadir)
datasets = dts_loader.load_datasets()

selected_datasets = ['ItalyPowerDemand', 'ArrowHead', 'GunPoint']
datasets = [dataset for dataset in datasets if dataset.dataset_name in selected_datasets]

tasks = [TSCTask(target='target') for _ in range(len(datasets))]

In [5]:
# create strategies
clf = TimeSeriesForestClassifier(n_estimators=10)
strategy = TSCStrategy(clf)

# define results output
resultHDD = ResultHDD(results_save_dir=os.path.join(repodir, 'results'),
                      strategies_save_dir=os.path.join(repodir, 'trained_strategies'))

# run orchestrator
orchestrator = Orchestrator(datasets=datasets,
                            tasks=tasks,
                            strategies=[strategy],
                            cv=PresplitFilesCV(),
                            result=resultHDD)

orchestrator.run()
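
Conceptually, an orchestrated run of this kind amounts to a nested loop over datasets and strategies: fit each strategy on the training split, predict on the test split, and store the predictions for later analysis. The sketch below illustrates that idea with plain Python only. It is not sktime's implementation; `MajorityClassStrategy`, `run_experiments`, and the toy data are hypothetical names and values introduced purely for illustration.

```python
from collections import Counter


class MajorityClassStrategy:
    """Hypothetical stand-in strategy: always predicts the most common training label."""

    name = "MajorityClass"

    def fit(self, X, y):
        # remember the most frequent label seen during training
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def run_experiments(datasets, strategies):
    """Fit every strategy on every dataset and collect (y_true, y_pred) pairs.

    `datasets` maps a dataset name to a pre-split tuple
    (X_train, y_train, X_test, y_test).
    """
    results = {}
    for dataset_name, (X_train, y_train, X_test, y_test) in datasets.items():
        for strategy in strategies:
            strategy.fit(X_train, y_train)
            y_pred = strategy.predict(X_test)
            results[(strategy.name, dataset_name)] = (y_test, y_pred)
    return results


# toy, made-up data: three training instances and two test instances
toy_datasets = {
    "toy": ([[0.1], [0.2], [0.3]], ["a", "a", "b"], [[0.4], [0.5]], ["a", "b"]),
}
results = run_experiments(toy_datasets, [MajorityClassStrategy()])
```

Keeping the stored results keyed by strategy and dataset is one simple way to make the later per-dataset scoring straightforward.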

In [6]:
# the list of results can be loaded from the saved CSV files:
results = resultHDD.load()

Having obtained the list of result objects, we can compute the scores and run some statistical tests.

In [7]:
analyse = AnalyseResults(resultHDD)

strategy_dict, losses_df = analyse.prediction_errors(metric=ScoreAccuracy())
pd.DataFrame(strategy_dict, index=selected_datasets)

| Dataset | TimeSeriesForestClassifier |
|---|---|
| ItalyPowerDemand | 0.95428 |
| ArrowHead | 0.683908 |
| GunPoint | 0.926174 |
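
As a quick sanity check, the per-dataset accuracies shown above can be summarised by hand. The following sketch computes their mean and the standard error of the mean using only the standard library. It is an independent calculation and is not claimed to reproduce what `analyse.average_and_std_error` does internally.

```python
from math import sqrt
from statistics import mean, stdev

# accuracies copied from the output above
accuracies = {
    "ItalyPowerDemand": 0.95428,
    "ArrowHead": 0.683908,
    "GunPoint": 0.926174,
}

scores = list(accuracies.values())
mean_accuracy = mean(scores)
# standard error of the mean: sample standard deviation divided by sqrt(n)
std_error = stdev(scores) / sqrt(len(scores))

print(f"mean accuracy: {mean_accuracy:.4f}, standard error: {std_error:.4f}")
```

With only three datasets the standard error is a rough indication at best.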


The `strategy_dict` is then passed to the functions that perform the statistical tests and visualizations. The following functions are currently implemented:

* `analyse.average_and_std_error(strategy_dict)`
* `analyse.plot_boxcharts(strategy_dict)`
* `analyse.ranks(strategy_dict)`
* `analyse.t_test(strategy_dict)`
* `analyse.sign_test(strategy_dict)`
* `analyse.ranksum_test(strategy_dict)`
* `analyse.t_test_with_bonferroni_correction(strategy_dict)`
* `analyse.wilcoxon_test(strategy_dict)`
* `analyse.friedman_test(strategy_dict)`
* `analyse.nemenyi(strategy_dict)`
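
To give a feel for what one of these tests involves, here is a minimal sketch of an exact two-sided sign test on paired per-dataset scores, written with the standard library only. It is not sktime's implementation of `sign_test`. The function name `sign_test_p_value` is an assumption made for this example, and the scores of the second strategy are hypothetical values invented for illustration.

```python
from math import comb


def sign_test_p_value(scores_a, scores_b):
    """Exact two-sided sign test on paired scores; tied pairs are dropped."""
    wins_a = sum(a > b for a, b in zip(scores_a, scores_b))
    wins_b = sum(b > a for a, b in zip(scores_a, scores_b))
    n = wins_a + wins_b
    if n == 0:
        # every pair is tied, so there is no evidence of a difference
        return 1.0
    k = min(wins_a, wins_b)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# accuracies of TimeSeriesForestClassifier from the output above
tsf_scores = [0.95428, 0.683908, 0.926174]
# hypothetical accuracies of a second strategy on the same datasets
other_scores = [0.95, 0.70, 0.91]

p_value = sign_test_p_value(tsf_scores, other_scores)
print(f"sign test p-value: {p_value:.3f}")
```

The sign test only uses the direction of each paired difference, so it needs many datasets before it can detect a difference between two strategies.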