# Execute a workflow

A workflow quantitatively evaluates an algorithm on a large set of time series. Here we show how to start a workflow from code. All configurations are dictionaries, so instead of a dictionary it is also possible to pass the path to a `.json` file that contains the configuration. More information can be found in the [documentation](https://u0143709.pages.gitlab.kuleuven.be/dtaianomaly/getting_started/experiments.html).
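As a minimal sketch of the `.json` option, the snippet below writes a configuration dictionary to a file and reads it back using only the standard library. The file name `data_configuration.json` and the temporary directory are hypothetical choices for illustration; how the workflow itself resolves such a path is described in the documentation linked above and is not shown here.

```python
import json
import tempfile
from pathlib import Path

# Same structure as the data configuration used later in this notebook.
data_configuration = {
    "select": [
        {"collection_name": "Demo"}
    ]
}

# Hypothetical file name, placed in a temporary directory for illustration.
config_path = Path(tempfile.mkdtemp()) / "data_configuration.json"

# Store the dictionary as JSON.
with open(config_path, "w") as f:
    json.dump(data_configuration, f, indent=2)

# Reading the file back yields an equal dictionary.
with open(config_path) as f:
    loaded_configuration = json.load(f)

print(loaded_configuration == data_configuration)
```

Keeping configurations in files like this makes it easier to rerun the same experiment without editing code.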

In [1]:
from dtaianomaly.workflows import execute_algorithm
from dtaianomaly.data_management import DataManager
from dtaianomaly.anomaly_detection import PyODAnomalyDetector, Windowing

First, we need to specify which time series to use. It is possible to select time series with specific properties (e.g., at least 5 attributes). Here, we select all datasets from the Demo collection. We also need a `DataManager` to read the data.

In [2]:
data_manager = DataManager('../data')
data_configuration = {
    'select': [
        {'collection_name': 'Demo'}
    ]
}

An algorithm configuration is either a dictionary, which is passed to the `load()` function of the corresponding anomaly detector, or a `TimeSeriesAnomalyDetector` object.

In [3]:
anomaly_detector = {
  "name": "iforest_64",
  "anomaly_detector": "PyODAnomalyDetector",
  "pyod_model": "IForest",
  "windowing": {
    "window_size": 64
  }
}

The metric configuration dictates which metrics are computed. If a metric has parameters, these can be passed under the `"metric_parameters"` key. If a metric cannot cope with real-valued anomaly scores, thresholding must be applied first. This is done through the `"thresholding_strategy"` and `"thresholding_parameters"` properties.

In [4]:
metric_configuration = {
    "roc_vus": { },
    "pr_vus": { },
    "fbeta": {
        # We do not need to provide the 'metric_parameters', because the default value for beta is 1
        "thresholding_strategy": "contamination",
        "thresholding_parameters": {
            "contamination": 0.1
        }
    },
    "fbeta_05": {
        "metric_name": "fbeta",
        "metric_parameters": {
            "beta": 0.5
        },
        "thresholding_strategy": "contamination",
        "thresholding_parameters": {
            "contamination": 0.1
        }
    },
    "fbeta_2": {
        "metric_name": "fbeta",
        "metric_parameters": {
            "beta": 2.0
        },
        "thresholding_strategy": "contamination",
        "thresholding_parameters": {
            "contamination": 0.1
        }
    }
}
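To make the role of thresholding concrete, here is a small pure-Python sketch of one way a contamination-based threshold and the F-beta score could work together. It is an illustration under stated assumptions, not the implementation used by `dtaianomaly`: the helper names `threshold_by_contamination` and `fbeta`, the example scores, and the rule "label the top `contamination` fraction of scores as anomalous" are all assumptions made for this example.

```python
def threshold_by_contamination(scores, contamination):
    """Label the `contamination` fraction of highest scores as anomalies (1)."""
    n_anomalies = int(round(contamination * len(scores)))
    labels = [0] * len(scores)
    # Indices sorted by decreasing score; the first n_anomalies get label 1.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    for i in ranked[:n_anomalies]:
        labels[i] = 1
    return labels


def fbeta(y_true, y_pred, beta=1.0):
    """F-beta score computed from binary ground truth and binary predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    beta_sq = beta ** 2
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Made-up real-valued anomaly scores and ground truth, for illustration only.
scores = [0.10, 0.20, 0.15, 0.90, 0.30, 0.80, 0.05, 0.12, 0.11, 0.20]
ground_truth = [0, 0, 0, 1, 0, 1, 0, 0, 0, 0]

# With contamination 0.1, only the single highest score is flagged.
predicted = threshold_by_contamination(scores, contamination=0.1)

for beta in (0.5, 1.0, 2.0):
    print(beta, round(fbeta(ground_truth, predicted, beta), 4))
```

In this toy example precision is 1 and recall is 0.5, so a smaller beta (which weights precision more) gives a higher score than a larger beta (which weights recall more).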

Lastly, an output configuration is required. It does not affect the execution of the algorithm itself, but controls what is reported and saved while the workflow runs.

In [5]:
output_configuration = {
  "directory_path": "test_workflow",
  "verbose": True,

  "trace_time": True,
  "trace_memory": True,

  "print_results": False,
  "save_results": False,
  "results_file": "results.csv",

  "save_anomaly_scores_plot": True,
  "anomaly_scores_directory": "anomaly_score_plots",
  "anomaly_scores_file_format": "svg",
  "show_anomaly_scores": "overlay",
  "show_ground_truth": None,

  "invalid_train_type_raise_error": True
}

Now we can execute the workflow as follows.

In [6]:
execute_algorithm(
    data_manager,
    data_configuration, 
    anomaly_detector,
    metric_configuration,
    output_configuration
)

TypeError: cannot unpack non-iterable PyODAnomalyDetector object

The workflow has been saved, but for now we remove the results to clean up the directory.

In [None]:
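import shutil
# Remove the output directory; ignore_errors avoids a failure if it was never created
shutil.rmtree(output_configuration['directory_path'], ignore_errors=True)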