# Orchestration of prediction experiments with sktime

* Evaluate the predictive performance of one or more strategies on one or more datasets

[GitHub link](https://github.com/alan-turing-institute/sktime/blob/master/examples/experiment_orchestration.ipynb)

In [1]:
from sktime.experiments.orchestrator import Orchestrator
from sktime.experiments.data import DatasetHDD
from sktime.experiments.data import ResultHDD
from sktime.experiments.data import DatasetLoadFromDir
from sktime.experiments.analysis import AnalyseResults
from sktime.experiments.scores import ScoreAccuracy

from sktime.model_selection import PresplitFilesCV
from sktime.highlevel.strategies import TSCStrategy
from sktime.highlevel.tasks import TSCTask
from sktime.classifiers.compose.ensemble import TimeSeriesForestClassifier

from sklearn.model_selection import KFold
import pandas as pd
import os

In [2]:
# get path to the sktime datasets 
import sktime
repodir = os.path.dirname(sktime.__file__)
datadir = os.path.join(repodir, "datasets/data/")
resultsdir = 'results'



In [3]:
# create the task and dataset objects manually for each dataset
dts_Italy = DatasetHDD(dataset_loc=os.path.join(datadir, 'ItalyPowerDemand'), dataset_name='ItalyPowerDemand')
task_Italy = TSCTask(target='target')

dts_ArrowHead = DatasetHDD(dataset_loc=os.path.join(datadir, 'ArrowHead'), dataset_name='ArrowHead')
task_ArrowHead = TSCTask(target='target')

dts_GunPoint = DatasetHDD(dataset_loc=os.path.join(datadir, 'GunPoint'), dataset_name='GunPoint')
task_GunPoint = TSCTask(target='target')

datasets=[dts_ArrowHead, dts_Italy, dts_GunPoint]
tasks=[task_ArrowHead, task_Italy, task_GunPoint]

In [4]:

# alternatively, load all datasets found in the data directory and keep only the selected ones
dts_loader = DatasetLoadFromDir(root_dir=datadir)
datasets = dts_loader.load_datasets()

selected_datasets = ['ItalyPowerDemand', 'ArrowHead', 'GunPoint']
datasets = [dataset for dataset in datasets if dataset.dataset_name in selected_datasets]

tasks = [TSCTask(target='target') for _ in range(len(datasets))]

clf = TimeSeriesForestClassifier(n_estimators=10, random_state=1)
strategy = TSCStrategy(clf)

resultHDD = ResultHDD(results_save_dir=os.path.join(datadir, 'results'),
                      strategies_save_dir=os.path.join(datadir, 'trained_strategies'))

In [5]:
# run orchestrator
orchestrator = Orchestrator(datasets=datasets,
                            tasks=tasks,  
                            strategies=[strategy], 
                            cv=PresplitFilesCV(), 
                            result=resultHDD)
 
orchestrator.run()
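
Conceptually, the orchestrator visits every dataset, every strategy, and every train/test split produced by `cv`, fits the strategy on the training part, and stores the predictions for later analysis. The toy sketch below illustrates that loop in plain Python. It is not sktime's implementation: `MajorityStrategy`, `presplit_cv`, `run_experiments`, and the toy data are hypothetical names and values invented for this illustration.

```python
from collections import Counter


class MajorityStrategy:
    """Hypothetical stand-in for a strategy: predicts the most common training label."""

    name = "MajorityStrategy"

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def presplit_cv(data):
    """Yield the single predefined train/test split stored with the dataset."""
    yield data["train"], data["test"]


def run_experiments(datasets, strategies, cv):
    """Fit every strategy on every split of every dataset and collect the predictions."""
    results = []
    for dataset_name, data in datasets.items():
        for strategy in strategies:
            for fold, (train, test) in enumerate(cv(data)):
                X_train, y_train = train
                X_test, y_test = test
                y_pred = strategy.fit(X_train, y_train).predict(X_test)
                results.append({
                    "dataset": dataset_name,
                    "strategy": strategy.name,
                    "fold": fold,
                    "y_true": y_test,
                    "y_pred": y_pred,
                })
    return results


# Toy data (made up): each dataset carries a fixed train/test split of (X, y) pairs.
toy_datasets = {
    "A": {"train": ([[0], [1], [2]], [0, 0, 1]), "test": ([[3], [4]], [0, 1])},
    "B": {"train": ([[0], [1]], [1, 1]), "test": ([[2]], [1])},
}
results = run_experiments(toy_datasets, [MajorityStrategy()], presplit_cv)
```

Each entry in `results` records which dataset, strategy, and fold produced a set of predictions, which is the information an analysis step needs to compute per-dataset losses.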


In [6]:
# The list of results can be loaded from the saved CSV files:
results = resultHDD.load()

analyse = AnalyseResults(resultHDD)

strategy_dict, losses_df = analyse.prediction_errors(metric=ScoreAccuracy())

losses_df['Accuracy'] = 1 - losses_df['loss']
losses_df

| Dataset | Strategy | loss | std_error | Accuracy |
|---|---|---|---|---|
| GunPoint | TimeSeriesForestClassifier | 0.053691 | 0.018466 | 0.946309 |
| ItalyPowerDemand | TimeSeriesForestClassifier | 0.044747 | 0.006448 | 0.955253 |
| ArrowHead | TimeSeriesForestClassifier | 0.275862 | 0.033883 | 0.724138 |
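
The printed values are consistent with a 0/1 misclassification loss averaged over the test cases, with the standard error computed as sqrt(p(1 - p) / n). The short check below reproduces the GunPoint row under that reading. The counts (8 errors out of 149 test cases) are inferred from the printed numbers and are an assumption, not something stated in the source.

```python
import math

# Assumed counts, inferred from the printed GunPoint row (not stated in the source).
n_test = 149
n_errors = 8

loss = n_errors / n_test                            # mean 0/1 loss
std_error = math.sqrt(loss * (1 - loss) / n_test)   # standard error of that mean
accuracy = 1 - loss

print(round(loss, 6), round(std_error, 6), round(accuracy, 6))
# 0.053691 0.018466 0.946309
```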


Having obtained the list of result objects, we can compute scores and run statistical tests.

The `strategy_dict` is passed as an argument to the functions that perform the statistical tests and visualizations. Currently, the following functions are implemented:

* `analyse.average_and_std_error(strategy_dict)`
* `analyse.plot_boxcharts(strategy_dict)`
* `analyse.ranks(strategy_dict)`
* `analyse.t_test(strategy_dict)`
* `analyse.sign_test(strategy_dict)`
* `analyse.ranksum_test(strategy_dict)`
* `analyse.t_test_with_bonferroni_correction(strategy_dict)`
* `analyse.wilcoxon_test(strategy_dict)`
* `analyse.friedman_test(strategy_dict)`
* `analyse.nemenyi(strategy_dict)`
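
To give a sense of what such comparisons compute, here is a standard-library sketch of two of the ideas listed above: average ranks across datasets and an exact two-sided sign test for a pair of strategies. It does not reproduce the `AnalyseResults` implementation or its return types. The loss values are made up, and `average_ranks` and `sign_test_p_value` are hypothetical helper names.

```python
from math import comb

# Made-up per-dataset losses for two hypothetical strategies (one value per dataset).
losses = {
    "strategy_a": [0.05, 0.04, 0.28, 0.10, 0.12],
    "strategy_b": [0.08, 0.06, 0.25, 0.15, 0.20],
}


def average_ranks(losses):
    """Rank strategies on each dataset (1 = lowest loss, ties share the mean rank), then average."""
    names = list(losses)
    n_datasets = len(next(iter(losses.values())))
    totals = dict.fromkeys(names, 0.0)
    for i in range(n_datasets):
        values = [losses[name][i] for name in names]
        for name, value in zip(names, values):
            smaller = sum(v < value for v in values)
            equal = sum(v == value for v in values)
            totals[name] += smaller + (equal + 1) / 2
    return {name: total / n_datasets for name, total in totals.items()}


def sign_test_p_value(a, b):
    """Exact two-sided sign test on paired losses; tied pairs are ignored."""
    a_better = sum(x < y for x, y in zip(a, b))
    b_better = sum(x > y for x, y in zip(a, b))
    n = a_better + b_better
    k = min(a_better, b_better)
    # Probability of k or fewer "wins" under a fair coin, doubled for two sides.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


ranks = average_ranks(losses)
p_value = sign_test_p_value(losses["strategy_a"], losses["strategy_b"])
```

With these toy numbers, `strategy_a` has the lower loss on four of the five datasets, so it receives the better (smaller) average rank, while five datasets are too few for the sign test to flag the difference as significant at conventional levels.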