# Time series classification with Mr-SEQL

From: https://www.sktime.org/en/latest/examples/mrseql.html

## Overview

Mr-SEQL\[1\] is a univariate time series classifier that trains linear classification models (logistic regression) on features extracted from multiple symbolic representations of time series (SAX, SFA). The features are extracted using SEQL\[2\].

\[1\] T. L. Nguyen, S. Gsponer, I. Ilie, M. O'Reilly and G. Ifrim, “Interpretable Time Series Classification using Linear Models and Multi-resolution Multi-domain Symbolic Representations”, Data Mining and Knowledge Discovery (DAMI), May 2019, https://link.springer.com/article/10.1007/s10618-019-00633-3

\[2\] G. Ifrim and C. Wiuf, “Bounded Coordinate-Descent for Biological Sequence Classification in High Dimensional Predictor Space”, KDD 2011

In this notebook, we will demonstrate how to use Mr-SEQL for univariate time series classification with the ArrowHead dataset.
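
To make the idea of a symbolic representation more concrete, the following is a minimal sketch of SAX-style discretisation in plain Python. It is not the sktime implementation: the function name `sax_word`, the default parameters, and the choice to turn the whole series into a single word are assumptions made for illustration. The sketch z-normalises the series, averages it over equal segments, and maps each segment mean to a letter using breakpoints that split the standard normal distribution into equiprobable bins.

```python
from statistics import NormalDist, mean, pstdev


def sax_word(series, n_segments=4, alphabet="abcd"):
    """Convert a numeric series into a SAX-style word (illustrative sketch only).

    Assumes len(series) >= n_segments so that no segment is empty.
    """
    # 1. z-normalise so the Gaussian breakpoints below are meaningful
    mu, sigma = mean(series), pstdev(series)
    z = [(v - mu) / sigma if sigma > 0 else 0.0 for v in series]

    # 2. Piecewise aggregate approximation: mean of each (near-)equal segment
    n = len(z)
    bounds = [round(i * n / n_segments) for i in range(n_segments + 1)]
    paa = [mean(z[bounds[i]:bounds[i + 1]]) for i in range(n_segments)]

    # 3. Breakpoints that split the standard normal into equiprobable bins
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / k) for j in range(1, k)]

    # 4. Map each segment mean to the letter of the bin it falls into
    return "".join(alphabet[sum(v > b for b in breakpoints)] for v in paa)


# A steadily increasing series walks through the alphabet in order
print(sax_word([0, 1, 2, 3, 4, 5, 6, 7]))  # prints "abcd"
```

A classifier such as Mr-SEQL then works on words like these rather than on the raw numeric values.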

### Imports

In [1]:
from sklearn import metrics
from sklearn.model_selection import train_test_split

from sktime.classification.shapelet_based import MrSEQLClassifier
from sktime.datasets import load_arrow_head, load_basic_motions

### Load data

For more details on the data set, see the [univariate time series classification notebook](https://github.com/alan-turing-institute/sktime/blob/main/examples/02_classification_univariate.ipynb).

In [2]:
X, y = load_arrow_head(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
print(X_train.shape, y_train.shape, X_test.shape, y_test.shape)

(158, 1) (158,) (53, 1) (53,)


### Train and Test

Mr-SEQL can be configured to run in different modes and with different symbolic representations.

`seql_mode` can be either `'clf'` (SEQL as a classifier) or `'fs'` (SEQL as a feature selector). If `'fs'` mode is chosen, a logistic regression classifier is trained on the features extracted by SEQL. `'fs'` mode is generally more accurate.

`symrep` can include either `'sax'` or `'sfa'` or both. Using both usually produces a better result.
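
As a rough illustration of feature-selection mode, the sketch below turns symbolic words into a feature matrix. It assumes that each selected subsequence becomes a binary "occurs in this series" feature. The words, the selected subsequences, and the helper `to_feature_space` are all made up for this example and are not taken from the sktime code.

```python
def to_feature_space(words, selected):
    """Binary presence features: entry [i][j] is 1 if selected[j] occurs in words[i]."""
    return [[1 if sub in word else 0 for sub in selected] for word in words]


# Hypothetical symbolic words (one per time series) and selected subsequences
words = ["abccba", "aabbcc", "cbacba"]
selected = ["ab", "cb", "cc"]

features = to_feature_space(words, selected)
for word, row in zip(words, features):
    print(word, row)
# abccba [1, 1, 1]
# aabbcc [1, 0, 1]
# cbacba [0, 1, 0]
```

In `'fs'` mode, rows like these would be the input to the logistic regression classifier. When both `'sax'` and `'sfa'` are used, one plausible reading is that the features from each representation are placed side by side in the same matrix.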

In [3]:
# Create mrseql object
# use sax by default
ms = MrSEQLClassifier(seql_mode="clf")

# use sfa representations
# ms = MrSEQLClassifier(seql_mode='fs', symrep=['sfa'])

# use sax and sfa representations
# ms = MrSEQLClassifier(seql_mode='fs', symrep=['sax', 'sfa'])

In [4]:
# fit training data
ms.fit(X_train, y_train)

MrSEQLClassifier(seql_mode='clf')

In [5]:
# prediction
predicted = ms.predict(X_test)

In [6]:
# Classification accuracy
print("Accuracy with mr-seql: %2.3f" % metrics.accuracy_score(y_test, predicted))

Accuracy with mr-seql: 0.887
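
For reference, accuracy is the fraction of test series whose predicted label matches the true label. With the 53 test series from the split above, a score of 0.887 corresponds to 47 correct predictions (47/53 ≈ 0.887). The sketch below reproduces that arithmetic with made-up labels; the `accuracy` helper is a plain-Python stand-in written for this example, not the scikit-learn function.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up labels: 53 test series, 47 of them predicted correctly
y_true = ["0"] * 53
y_pred = ["0"] * 47 + ["1"] * 6

print("Accuracy: %2.3f" % accuracy(y_true, y_pred))  # prints "Accuracy: 0.887"
```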


## Train and Test: Multivariate

Mr-SEQL also supports multivariate time series. It extracts features from each dimension of the data independently.
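
The sketch below shows one way to picture per-dimension feature extraction. It assumes that each dimension has already been converted to a symbolic word, that each dimension has its own set of patterns, and that the per-dimension features are concatenated into a single row. The helper names and the data are hypothetical, and the concatenation step is an assumption rather than a description of the sktime internals.

```python
def presence_features(word, patterns):
    """1 if the pattern occurs in the symbolic word, else 0."""
    return [1 if p in word else 0 for p in patterns]


def multivariate_features(sample, patterns_per_dim):
    """Extract features from each dimension independently, then concatenate."""
    row = []
    for word, patterns in zip(sample, patterns_per_dim):
        row.extend(presence_features(word, patterns))
    return row


# Hypothetical sample with 3 dimensions, each already a symbolic word
sample = ["abca", "ccba", "aabb"]
# Hypothetical patterns, chosen separately for each dimension
patterns_per_dim = [["ab", "ca"], ["cb"], ["bb", "ba", "aa"]]

print(multivariate_features(sample, patterns_per_dim))  # prints [1, 1, 1, 1, 0, 1]
```

Because each dimension is processed on its own, a pattern found in one dimension says nothing about the others.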

In [7]:
X, y = load_basic_motions(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
print(X_train.shape, y_train.shape, X_test.shape, y_test.shape)

(60, 6) (60,) (20, 6) (20,)


In [8]:
X_train.head()

| | dim_0 | dim_1 | dim_2 | dim_3 | dim_4 | dim_5 |
|---|---|---|---|---|---|---|
| 9 | 0 -0.407421 1 -0.407421 2 2.355158 3... | 0 1.413374 1 1.413374 2 -3.928032 3... | 0 0.092782 1 0.092782 2 -0.211622 3... | 0 -0.066584 1 -0.066584 2 -3.630177 3... | 0 0.223723 1 0.223723 2 -0.026634 3... | 0 0.135832 1 0.135832 2 -1.946925 3... |
| 24 | 0 0.383922 1 0.383922 2 -0.272575 3... | 0 0.302612 1 0.302612 2 -1.381236 3... | 0 -0.398075 1 -0.398075 2 -0.681258 3... | 0 0.071911 1 0.071911 2 -0.761725 3... | 0 0.175783 1 0.175783 2 -0.114525 3... | 0 -0.087891 1 -0.087891 2 -0.503377 3... |
| 5 | 0 -0.357300 1 -0.357300 2 -0.005055 3... | 0 -0.584885 1 -0.584885 2 0.295037 3... | 0 -0.792751 1 -0.792751 2 0.213664 3... | 0 0.074574 1 0.074574 2 -0.157139 3... | 0 0.159802 1 0.159802 2 -0.306288 3... | 0 0.023970 1 0.023970 2 1.230478 3... |
| 7 | 0 -0.352746 1 -0.352746 2 -1.354561 3... | 0 0.316845 1 0.316845 2 0.490525 3... | 0 -0.473779 1 -0.473779 2 1.454261 3... | 0 -0.327595 1 -0.327595 2 -0.269001 3... | 0 0.106535 1 0.106535 2 0.021307 3... | 0 0.197090 1 0.197090 2 0.460763 3... |
| 34 | 0 0.052231 1 0.052231 2 -0.54804... | 0 -0.730486 1 -0.730486 2 0.70700... | 0 -0.518104 1 -0.518104 2 -1.179430 3... | 0 -0.159802 1 -0.159802 2 -0.239704 3... | 0 -0.045277 1 -0.045277 2 0.023970 3... | 0 -0.029297 1 -0.029297 2 0.29829... |


In [9]:
ms = MrSEQLClassifier()
ms

MrSEQLClassifier()

In [10]:
# fit training data
ms.fit(X_train, y_train)

MrSEQLClassifier()

In [11]:
predicted = ms.predict(X_test)

In [12]:
# Classification accuracy
print("Accuracy with mr-seql: %2.3f" % metrics.accuracy_score(y_test, predicted))

Accuracy with mr-seql: 1.000
