# Time Series Classification Algorithms - (3) Kernel Based

<div style="border: 1px solid #007acc; background-color: #e6f4ff; padding: 10px; border-radius: 5px; color: black;">
  <strong>📘 Info:</strong> I do not claim ownership of the content in this notebook. It is based on official <a href="https://www.aeon-toolkit.org/en/latest/examples.html" target="_blank">aeon tutorials</a>, and parts of the code or text may have been copy-pasted or adapted directly from those or other sources for learning purposes.
</div>

In [1]:
import warnings
warnings.filterwarnings('ignore', category=DeprecationWarning)
warnings.filterwarnings('ignore', category=FutureWarning)

In [2]:
import time

from sklearn import metrics

In [3]:
from aeon.datasets import load_classification

In [4]:
from tslearn.svm import TimeSeriesSVC

In [5]:
DATASET_NAMES = [
    "MelbournePedestrian",
    "ArrowHead",
    "Colposcopy"
]

X_train_dict, y_train_dict = {}, {}
X_test_dict, y_test_dict = {}, {}

accuracy_dict, f1_score_dict, duration_dict = {}, {}, {}

for dataset_name in DATASET_NAMES:
    X_train_dict[dataset_name], y_train_dict[dataset_name] = load_classification(
        dataset_name, split="train"
    )
    X_test_dict[dataset_name], y_test_dict[dataset_name] = load_classification(
        dataset_name, split="test"
    )

    # aeon returns arrays of shape (n_cases, n_channels, n_timepoints); these
    # datasets are univariate, so drop the channel axis to get 2D arrays.
    X_train_dict[dataset_name] = X_train_dict[dataset_name].squeeze(1)
    X_test_dict[dataset_name] = X_test_dict[dataset_name].squeeze(1)

    # Per-dataset result containers, keyed by method name later on.
    accuracy_dict[dataset_name] = {}
    f1_score_dict[dataset_name] = {}
    duration_dict[dataset_name] = {}

```bibtex
@inbook{faouzi2024,
  author = {Johann Faouzi},
  title = {Time Series Classification: A Review of Algorithms and Implementations},
  year = {2024},
  month = {March},
  booktitle = {Advances in Time Series Analysis and Forecasting},
  publisher = {IntechOpen},
  isbn = {978-0-85466-053-7},
  doi = {10.5772/intechopen.1004810},
}
```

"Kernel methods are popular machine learning algorithms allowing for nonlinear transformations or decision functions, among which support vector machines are probably the most famous ones and have been successfully used in numerous applications.
Kernel methods rely on a kernel function measuring similarity between any pair of inputs. A key necessary assumption of kernel methods is that the kernel is positive-definite. However, as mentioned in the previous section, DTW is not a distance because it does not satisfy the triangle inequality, implying that DTW cannot be used to define a positive-definite kernel. Although DTW has been used with kernel methods in several publications with some tricks, the fact that the theoretical assumptions are not satisfied is an important limitation.
A true positive-definite kernel for time series, called the global alignment kernel, was proposed. The global alignment kernel is defined as the sum of all the negatively exponentiated costs over all the possible warping paths.
The global alignment kernel has the same computational complexity as DTW, that is, $O(nm)$, because the score between two time series can be computed using a recurrence equation. Constraint regions such as the Sakoe-Chiba band and the Itakura parallelogram can also be used with global alignment kernels." \cite{faouzi2024}
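
The recurrence mentioned in the quote can be sketched in a few lines of plain Python. This is a minimal illustration for univariate series, not the implementation used by tslearn. The function names `gak_unnormalized` and `gak`, the `sigma` bandwidth, the `radius` argument for a Sakoe-Chiba band, and the plain Gaussian local kernel are all assumptions made for this example. Published variants of the kernel may use a different local kernel, and the normalization shown here is one common choice.

```python
import math


def gak_unnormalized(x, y, sigma=1.0, radius=None):
    """Sum of exp(-cost) over all warping paths between x and y.

    Uses the O(n * m) recurrence
        M[i][j] = k(x_i, y_j) * (M[i-1][j] + M[i][j-1] + M[i-1][j-1])
    where k is a local similarity (here a Gaussian, an assumption).
    """
    n, m = len(x), len(y)
    # M[0][0] = 1 seeds the recurrence; the rest of the border stays 0.
    M = [[0.0] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 1.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Optional Sakoe-Chiba band: ignore cells far from the diagonal.
            if radius is not None and abs(i - j) > radius:
                continue
            diff = x[i - 1] - y[j - 1]
            local = math.exp(-(diff * diff) / (2.0 * sigma ** 2))
            M[i][j] = local * (M[i - 1][j] + M[i][j - 1] + M[i - 1][j - 1])
    return M[n][m]


def gak(x, y, sigma=1.0, radius=None):
    """Normalized kernel k(x, y) / sqrt(k(x, x) * k(y, y))."""
    kxy = gak_unnormalized(x, y, sigma, radius)
    kxx = gak_unnormalized(x, x, sigma, radius)
    kyy = gak_unnormalized(y, y, sigma, radius)
    return kxy / math.sqrt(kxx * kyy)


a = [0.0, 1.0, 2.0, 1.0]
b = [0.0, 0.5, 2.0, 2.0, 1.0]
print(gak(a, b))            # similarity over all warping paths
print(gak(a, b, radius=1))  # same, restricted to a band of width 1
```

When every local similarity equals 1, the unnormalized value simply counts the warping paths, which is a convenient sanity check for the recurrence. The raw sums grow quickly with series length, so a practical implementation would typically work in the log domain.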

## 1. Support Vector Machines (SVM) with the Global Alignment Kernel (GAK)

```bibtex
@inproceedings{cuturi2011,
  title = {Fast Global Alignment Kernels},
  author = {Marco Cuturi},
  booktitle = {Proceedings of the 28th International Conference on Machine Learning (ICML'11)},
  year = {2011},
  pages = {929--936},
  publisher = {Omnipress}
}
```

This method was proposed by \cite{cuturi2011}.

"Support vector machines with the global alignment kernel have been shown to yield better predictive performances than with other pseudo kernels based on DTW for several multivariate time series classification tasks." \cite{faouzi2024}

In [6]:
method = "SVM-GAK"
print(method)

for dataset_name in DATASET_NAMES:
    # Time the full fit / predict / score cycle for this dataset.
    start_time = time.perf_counter()

    # SVM with the global alignment kernel; all other hyperparameters keep
    # their tslearn defaults.
    svm_gak = TimeSeriesSVC(kernel='gak')
    svm_gak.fit(X_train_dict[dataset_name], y_train_dict[dataset_name])
    svm_gak_preds = svm_gak.predict(X_test_dict[dataset_name])
    accuracy_dict[dataset_name][method] = metrics.accuracy_score(
        y_test_dict[dataset_name], svm_gak_preds
    )
    # Weighted F1 averages the per-class F1 scores by class support.
    f1_score_dict[dataset_name][method] = metrics.f1_score(
        y_test_dict[dataset_name], svm_gak_preds,
        average="weighted"
    )

    elapsed_time = time.perf_counter() - start_time
    duration_dict[dataset_name][method] = elapsed_time
    print("-------------------------------------------")
    print(f"Dataset: {dataset_name}")
    print(f"Accuracy: {accuracy_dict[dataset_name][method]:.2f}")
    print(f"F1-Score: {f1_score_dict[dataset_name][method]:.2f}")
    print(f"Duration: {duration_dict[dataset_name][method]:.2f} seconds")

SVM-GAK
-------------------------------------------
Dataset: MelbournePedestrian
Accuracy: 0.82
F1-Score: 0.83
Duration: 125.40 seconds
-------------------------------------------
Dataset: ArrowHead
Accuracy: 0.58
F1-Score: 0.56
Duration: 4.76 seconds
-------------------------------------------
Dataset: Colposcopy
Accuracy: 0.46
F1-Score: 0.35
Duration: 5.52 seconds
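
The gap between accuracy and weighted F1 in the output above (for example on Colposcopy) can look surprising, so here is a small hand computation of both metrics. The labels are hypothetical toy data, not taken from any of the datasets, and the helper names `accuracy` and `weighted_f1` are made up for this sketch. It only illustrates one way the two numbers can diverge, namely a classifier that never predicts a minority class. Whether that is what happens in the runs above is not verified here.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to class support."""
    support = Counter(y_true)
    total = 0.0
    for label, count in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # tp + fn equals the class support, so the denominator is never 0.
        f1 = 2 * tp / (2 * tp + fp + fn)
        total += f1 * count / len(y_true)
    return total


# Hypothetical toy labels: the classifier always predicts the majority class.
toy_true = ["a", "a", "a", "a", "b", "b"]
toy_pred = ["a", "a", "a", "a", "a", "a"]

print(f"Accuracy: {accuracy(toy_true, toy_pred):.2f}")
print(f"F1-Score: {weighted_f1(toy_true, toy_pred):.2f}")
```

Class `b` is never predicted, so its F1 is 0 and it pulls the weighted average below the accuracy.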
