## Introduction

This notebook demonstrates how to apply the simple baselines proposed in our paper (Section 3.2) to time series anomaly detection and how to evaluate the results with rigorous metrics (Section 3.5). For simplicity, we use the univariate `UCR` time series dataset as a demo. Because the point-adjusted protocol is flawed, we exclude it from this notebook to discourage its continued use within the academic and open-source communities.
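
To make the concern about point adjustment concrete, here is a minimal sketch. It is not part of `tad`; the helper names `point_adjust` and `pointwise_prf` are hypothetical and used only for illustration. It assumes the commonly described form of point adjustment, in which an entire ground-truth anomaly segment counts as detected as soon as any single point in it is flagged.

```python
import numpy as np

def pointwise_prf(labels, preds):
    """Point-wise precision, recall and F1 for boolean label/prediction arrays."""
    labels = np.asarray(labels, dtype=bool)
    preds = np.asarray(preds, dtype=bool)
    tp = np.sum(labels & preds)
    fp = np.sum(~labels & preds)
    fn = np.sum(labels & ~preds)
    precision = float(tp / (tp + fp)) if tp + fp else 0.0
    recall = float(tp / (tp + fn)) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def point_adjust(labels, preds):
    """Mark a whole ground-truth segment as predicted if any point in it is flagged."""
    labels = np.asarray(labels, dtype=bool)
    adjusted = np.asarray(preds, dtype=bool).copy()
    # Segment boundaries: +1 marks a start index, -1 marks an exclusive end index.
    diff = np.diff(np.concatenate(([0], labels.astype(int), [0])))
    for start, end in zip(np.flatnonzero(diff == 1), np.flatnonzero(diff == -1)):
        if adjusted[start:end].any():
            adjusted[start:end] = True
    return adjusted

# Toy data (illustrative only): one 200-point anomaly, one flagged point inside it.
labels = np.zeros(1000, dtype=bool)
labels[400:600] = True
preds = np.zeros(1000, dtype=bool)
preds[500] = True

raw = pointwise_prf(labels, preds)
adjusted = pointwise_prf(labels, point_adjust(labels, preds))
print("point-wise     P, R, F1:", raw)
print("point-adjusted P, R, F1:", adjusted)
```

In this toy case the detector finds 1 of 200 anomalous points, so the point-wise recall is 0.005 and the F1 is about 0.01, while the point-adjusted F1 is 1.0.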

In [1]:
%load_ext autoreload
%autoreload 2

## Import packages

In [2]:
# Python 
import sys
from pathlib import Path
import os
import numpy as np
import pandas as pd
import random


In [3]:
# Ours
from tad import DEFAULT_RESOURCE_DIR
from tad.dataset.reader import GeneralDataset
from tad.eval_baselines import evaluate_datasets

In [4]:
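# User-defined helper for DataFrame.style.apply
def highlight_max(s, props=''):
    """Return `props` for every entry equal to the maximum of `s` (NaNs ignored), '' elsewhere."""
    return np.where(s == np.nanmax(s.values), props, '')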
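
As a quick check of what `highlight_max` returns, the sketch below calls it directly on a small column. The values and the index labels are toy data chosen for illustration, not results from this notebook's pipeline.

```python
import numpy as np
import pandas as pd

def highlight_max(s, props=''):
    return np.where(s == np.nanmax(s.values), props, '')

# Toy column with a NaN and a tie for the maximum.
toy = pd.Series([0.013, 0.786, np.nan, 0.786], index=["a", "b", "c", "d"])

styles = highlight_max(toy, props="color:red")
print(list(styles))
```

Every entry equal to the maximum receives the CSS string, so ties are all highlighted, and NaN entries are never highlighted. When the function is passed to `DataFrame.style.apply(..., axis=0)`, as in the cells below, pandas calls it once per column.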

## Simple baseline evaluation

### Step 1. Load data

As a demo, we use the `tad.dataset.reader.GeneralDataset` object to load a univariate dataset: the InternalBleeding (IB) series from the univariate HexagonML (UCR) datasets. The same method can also load the five commonly used multivariate datasets included in our paper (SWaT, WADI, SMD, SMAP, MSL).

In [5]:
# Datasets to load and evaluate
datasets_to_eval = [GeneralDataset.UCR_IB_16, GeneralDataset.UCR_IB_17,
                    GeneralDataset.UCR_IB_18, GeneralDataset.UCR_IB_19]

### Step 2. Apply simple baselines and evaluate with standard point-wise metrics


In [6]:
_, df_std = evaluate_datasets(
    preprocessing="0-1",                # scaling method; see A.2.1 for an explanation
    eval_method='standard',             # evaluation metrics; `standard`: point-wise metrics (main paper, Section 3.5)
    distance="euclidean",               # distance metric
    skip_useless_datasets=False,
    datasets_ordered=datasets_to_eval,  # list of datasets to evaluate
)

[INFO]: Evaluating 4 datasets
 Evaluation on UCR_IB_16 finished
 Evaluation on UCR_IB_17 finished
 Evaluation on UCR_IB_18 finished
 Evaluation on UCR_IB_19 finished
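
For intuition about two of the baseline families that appear in the result tables (nearest-neighbour distance to the training data and PCA reconstruction error), here is a minimal NumPy sketch. It is not the `tad` implementation: the function names, the choice of one principal component, and the toy data are all assumptions, and how `tad` forms the observation vectors from a univariate series is not shown in this notebook.

```python
import numpy as np

def nn_distance_scores(train, test):
    """Anomaly score = Euclidean distance from each test row to its nearest training row."""
    diffs = test[:, None, :] - train[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=-1)).min(axis=1)

def pca_error_scores(train, test, n_components=1):
    """Anomaly score = reconstruction error after projecting onto the top principal components."""
    mean = train.mean(axis=0)
    # Rows of vt are the principal directions of the centred training data.
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    components = vt[:n_components]
    centered = test - mean
    reconstructed = centered @ components.T @ components
    return np.linalg.norm(centered - reconstructed, axis=1)

# Toy training data (illustrative only): points close to the line y = 2x.
rng = np.random.default_rng(0)
t = rng.uniform(-1, 1, size=200)
train = np.column_stack([t, 2 * t]) + rng.normal(scale=0.01, size=(200, 2))

# First test row lies on the line, second row lies far from it.
test = np.array([[0.5, 1.0], [0.5, -1.0]])

nn_scores = nn_distance_scores(train, test)
pca_scores = pca_error_scores(train, test)
print("1-NN distance scores:", nn_scores)
print("PCA error scores:    ", pca_scores)
```

Both scores are small for the observation that resembles the training data and large for the one that does not. Turning such scores into the metrics reported below still requires a threshold or a threshold-free summary such as AUPRC.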


In [7]:
df_std.round(3).style.apply(highlight_max, props='color:white;background-color:darkblue', axis=0)

| Baseline | UCR_IB_16 F1 | UCR_IB_16 P | UCR_IB_16 R | UCR_IB_16 AUPRC | UCR_IB_17 F1 | UCR_IB_17 P | UCR_IB_17 R | UCR_IB_17 AUPRC | UCR_IB_18 F1 | UCR_IB_18 P | UCR_IB_18 R | UCR_IB_18 AUPRC | UCR_IB_19 F1 | UCR_IB_19 P | UCR_IB_19 R | UCR_IB_19 AUPRC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Random | 0.013 | 0.007 | 0.167 | 0.002 | 0.039 | 0.02 | 0.811 | 0.017 | 0.053 | 0.03 | 0.245 | 0.021 | 0.007 | 0.004 | 0.3 | 0.003 |
| Sensor Range Deviation | 0.004 | 0.002 | 1.0 | 0.001 | 0.085 | 0.2 | 0.054 | 0.136 | 0.038 | 0.02 | 1.0 | 0.01 | 0.004 | 0.002 | 1.0 | 0.001 |
| Distance_to_train_1-NN | 0.786 | 0.688 | 0.917 | 0.471 | 0.973 | 0.965 | 0.982 | 0.992 | 0.889 | 0.876 | 0.902 | 0.961 | 0.87 | 0.769 | 1.0 | 0.788 |
| Distance_to train_avg | 0.004 | 0.002 | 1.0 | 0.001 | 0.051 | 0.028 | 0.324 | 0.02 | 0.047 | 0.026 | 0.255 | 0.018 | 0.013 | 0.006 | 0.6 | 0.004 |
| PCA_Error(median-iqr norm) | 0.75 | 0.6 | 1.0 | 0.737 | 0.974 | 0.949 | 1.0 | 0.996 | 0.98 | 0.98 | 0.98 | 0.998 | 1.0 | 1.0 | 1.0 | 1.0 |
| PCA_Error(mean-std norm) | 0.75 | 0.6 | 1.0 | 0.712 | 0.974 | 0.949 | 1.0 | 0.997 | 0.99 | 0.981 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| PCA_Error(no norm) | 0.786 | 0.688 | 0.917 | 0.55 | 0.982 | 0.965 | 1.0 | 0.986 | 0.967 | 0.936 | 1.0 | 0.987 | 0.889 | 1.0 | 0.8 | 0.931 |
| PCA_Error(no norm with Smoothing) | 0.75 | 0.6 | 1.0 | 0.701 | 0.974 | 0.949 | 1.0 | 0.996 | 0.975 | 0.98 | 0.971 | 0.997 | 1.0 | 1.0 | 1.0 | 1.0 |
| Simple L2_norm | 0.011 | 0.005 | 1.0 | 0.003 | 0.058 | 0.03 | 0.748 | 0.024 | 0.061 | 0.032 | 0.794 | 0.026 | 0.017 | 0.008 | 1.0 | 0.005 |


In [8]:
df_std.round(decimals=3).drop(['P', 'R','AUPRC'], axis=1, level=1).style.apply(highlight_max, props='color:white;background-color:darkblue', axis=0)

| Baseline | UCR_IB_16 F1 | UCR_IB_17 F1 | UCR_IB_18 F1 | UCR_IB_19 F1 |
|---|---|---|---|---|
| Random | 0.013 | 0.039 | 0.053 | 0.007 |
| Sensor Range Deviation | 0.004 | 0.085 | 0.038 | 0.004 |
| Distance_to_train_1-NN | 0.786 | 0.973 | 0.889 | 0.87 |
| Distance_to train_avg | 0.004 | 0.051 | 0.047 | 0.013 |
| PCA_Error(median-iqr norm) | 0.75 | 0.974 | 0.98 | 1.0 |
| PCA_Error(mean-std norm) | 0.75 | 0.974 | 0.99 | 1.0 |
| PCA_Error(no norm) | 0.786 | 0.982 | 0.967 | 0.889 |
| PCA_Error(no norm with Smoothing) | 0.75 | 0.974 | 0.975 | 1.0 |
| Simple L2_norm | 0.011 | 0.058 | 0.061 | 0.017 |
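
To help read the P, R, and F1 columns of Step 2, here is a minimal sketch of point-wise scoring at a fixed threshold. The function name `pointwise_scores`, the thresholds, and the toy data are assumptions for illustration; how `evaluate_datasets` selects its threshold is not shown in this notebook.

```python
import numpy as np

def pointwise_scores(labels, scores, threshold):
    """Flag every point whose score reaches the threshold, then count hits per point."""
    labels = np.asarray(labels, dtype=bool)
    preds = np.asarray(scores) >= threshold
    tp = np.sum(labels & preds)
    fp = np.sum(~labels & preds)
    fn = np.sum(labels & ~preds)
    precision = float(tp / (tp + fp)) if tp + fp else 0.0
    recall = float(tp / (tp + fn)) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Toy data (illustrative only): 10 anomalous points out of 1000.
labels = np.zeros(1000, dtype=bool)
labels[100:110] = True
scores = np.zeros(1000)
scores[100:110] = 1.0   # true anomalies receive a high score
scores[500:505] = 0.6   # a short burst of moderately high scores on normal data

strict = pointwise_scores(labels, scores, threshold=0.8)
loose = pointwise_scores(labels, scores, threshold=0.5)
flag_all = pointwise_scores(labels, scores, threshold=0.0)
print("strict  :", strict)
print("loose   :", loose)
print("flag all:", flag_all)
```

The last case flags every point, which gives a recall of 1.0 with a precision equal to the anomaly rate (0.01 here) and therefore a very low F1. Rows in the tables above with R = 1.0 and P close to zero are consistent with this pattern.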


### Step 3. Apply simple baselines and evaluate with time-series range-wise metrics


In [9]:
_, df_wagner = evaluate_datasets(
    preprocessing="0-1",
    eval_method='wagner2023',           # evaluation metrics; `wagner2023`: time-series range-wise metrics
    distance="euclidean",
    skip_useless_datasets=False,
    datasets_ordered=datasets_to_eval,
)

[INFO]: Evaluating 4 datasets
 Evaluation on UCR_IB_16 finished
 Evaluation on UCR_IB_17 finished
 Evaluation on UCR_IB_18 finished
 Evaluation on UCR_IB_19 finished


In [11]:
df_wagner.round(decimals=3).style.apply(highlight_max, props='color:white;background-color:darkblue', axis=0)

| Baseline | UCR_IB_16 F1 | UCR_IB_16 P | UCR_IB_16 R | UCR_IB_16 AUPRC | UCR_IB_17 F1 | UCR_IB_17 P | UCR_IB_17 R | UCR_IB_17 AUPRC | UCR_IB_18 F1 | UCR_IB_18 P | UCR_IB_18 R | UCR_IB_18 AUPRC | UCR_IB_19 F1 | UCR_IB_19 P | UCR_IB_19 R | UCR_IB_19 AUPRC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Random | 0.01 | 0.005 | 0.306 | 0.004 | 0.123 | 0.066 | 0.839 | 0.067 | 0.084 | 0.044 | 0.869 | 0.036 | 0.017 | 0.009 | 0.81 | 0.006 |
| Sensor Range Deviation | 0.0 | 0.0 | 1.0 | 0.001 | 0.094 | 0.353 | 0.054 | 0.212 | 0.0 | 0.0 | 1.0 | 0.01 | 0.0 | 0.0 | 1.0 | 0.001 |
| Distance_to_train_1-NN | 0.786 | 0.688 | 0.917 | 0.48 | 0.969 | 0.957 | 0.982 | 0.992 | 0.902 | 0.828 | 0.99 | 0.961 | 0.87 | 0.769 | 1.0 | 0.791 |
| Distance_to train_avg | 0.009 | 0.004 | 1.0 | 0.003 | 0.146 | 0.088 | 0.426 | 0.064 | 0.094 | 0.057 | 0.265 | 0.039 | 0.041 | 0.021 | 0.7 | 0.014 |
| PCA_Error(median-iqr norm) | 0.75 | 0.6 | 1.0 | 0.737 | 0.974 | 0.949 | 1.0 | 0.996 | 0.976 | 0.98 | 0.971 | 0.997 | 1.0 | 1.0 | 1.0 | 1.0 |
| PCA_Error(mean-std norm) | 0.75 | 0.6 | 1.0 | 0.708 | 0.974 | 0.949 | 1.0 | 0.997 | 0.99 | 0.981 | 1.0 | 0.999 | 1.0 | 1.0 | 1.0 | 1.0 |
| PCA_Error(no norm) | 0.786 | 0.688 | 0.917 | 0.565 | 0.982 | 0.965 | 1.0 | 0.985 | 0.967 | 0.936 | 1.0 | 0.985 | 0.889 | 1.0 | 0.8 | 0.937 |
| PCA_Error(no norm with Smoothing) | 0.75 | 0.6 | 1.0 | 0.693 | 0.974 | 0.949 | 1.0 | 0.996 | 0.971 | 0.98 | 0.961 | 0.996 | 1.0 | 1.0 | 1.0 | 1.0 |
| Simple L2_norm | 0.021 | 0.01 | 1.0 | 0.006 | 0.164 | 0.092 | 0.734 | 0.076 | 0.123 | 0.067 | 0.794 | 0.054 | 0.05 | 0.026 | 1.0 | 0.016 |


In [12]:
df_wagner.round(decimals=3).drop(['P', 'R','AUPRC'], axis=1, level=1).style.apply(highlight_max, props='color:white;background-color:darkblue', axis=0)

| Baseline | UCR_IB_16 F1 | UCR_IB_17 F1 | UCR_IB_18 F1 | UCR_IB_19 F1 |
|---|---|---|---|---|
| Random | 0.01 | 0.123 | 0.084 | 0.017 |
| Sensor Range Deviation | 0.0 | 0.094 | 0.0 | 0.0 |
| Distance_to_train_1-NN | 0.786 | 0.969 | 0.902 | 0.87 |
| Distance_to train_avg | 0.009 | 0.146 | 0.094 | 0.041 |
| PCA_Error(median-iqr norm) | 0.75 | 0.974 | 0.976 | 1.0 |
| PCA_Error(mean-std norm) | 0.75 | 0.974 | 0.99 | 1.0 |
| PCA_Error(no norm) | 0.786 | 0.982 | 0.967 | 0.889 |
| PCA_Error(no norm with Smoothing) | 0.75 | 0.974 | 0.971 | 1.0 |
| Simple L2_norm | 0.021 | 0.164 | 0.123 | 0.05 |
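
To illustrate how range-wise scoring can differ from point-wise scoring, here is a minimal sketch of one simple range-based variant in which every contiguous segment contributes equally, regardless of its length. It does not reproduce the exact `wagner2023` definition used by `evaluate_datasets`; the helper names and the toy data are assumptions for illustration.

```python
import numpy as np

def segments(mask):
    """Return (start, end) index pairs, end exclusive, for each run of True values."""
    diff = np.diff(np.concatenate(([0], np.asarray(mask, dtype=int), [0])))
    return list(zip(np.flatnonzero(diff == 1), np.flatnonzero(diff == -1)))

def range_precision_recall(labels, preds):
    """Average the per-segment overlap fraction instead of pooling all points."""
    labels = np.asarray(labels, dtype=bool)
    preds = np.asarray(preds, dtype=bool)
    # Recall: for each true segment, the fraction of its points that were flagged.
    recall_terms = [preds[s:e].mean() for s, e in segments(labels)]
    # Precision: for each predicted segment, the fraction of its points that are anomalous.
    precision_terms = [labels[s:e].mean() for s, e in segments(preds)]
    recall = float(np.mean(recall_terms)) if recall_terms else 0.0
    precision = float(np.mean(precision_terms)) if precision_terms else 0.0
    return precision, recall

# Toy data (illustrative only): one long anomaly and one short anomaly.
labels = np.zeros(1000, dtype=bool)
labels[100:300] = True   # 200 points
labels[500:510] = True   # 10 points

# The detector finds the long anomaly completely and misses the short one.
preds = np.zeros(1000, dtype=bool)
preds[100:300] = True

point_recall = float((labels & preds).sum() / labels.sum())
range_precision, range_recall = range_precision_recall(labels, preds)
print("point-wise recall:", round(point_recall, 3))
print("range-wise recall:", range_recall)
```

In this toy case the point-wise recall is about 0.952 because the long anomaly dominates the point count, while the range-wise recall is 0.5 because one of the two anomalous events was missed entirely.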
