# Multivariate time series classification with sktime

In this notebook, we will use sktime for multivariate time series classification.

For the simpler univariate time series classification setting, take a look at this [notebook](https://github.com/alan-turing-institute/sktime/blob/main/examples/02_classification_univariate.ipynb).

### Preliminaries

In [1]:
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

from sktime.classification.compose import ColumnEnsembleClassifier
from sktime.classification.interval_based import DrCIF
from sktime.classification.kernel_based import RocketClassifier
from sktime.datasets import load_basic_motions
from sktime.transformations.panel.compose import ColumnConcatenator

### Load multivariate time series/panel data

The [data set](http://www.timeseriesclassification.com/description.php?Dataset=BasicMotions) we use in this notebook was generated as part of a student project where four students performed four activities whilst wearing a smart watch. The watch collects 3D accelerometer and 3D gyroscope data. The data set consists of four classes: walking, resting, running and badminton. Participants were required to record motion a total of five times, and the data is sampled once every tenth of a second for a ten-second period.
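As a quick sanity check, the description can be turned into the array shape we should expect from the loader. This is only a back-of-the-envelope sketch: the variable names are illustrative, and it assumes that every student recorded every activity five times.

```python
# Back-of-the-envelope check of the data set dimensions described above.
n_students = 4
n_activities = 4
n_repetitions = 5          # assumption: five recordings per student and activity
n_sensors = 2              # accelerometer and gyroscope
axes_per_sensor = 3        # each sensor reports x, y and z
sampling_interval_s = 0.1  # one sample every tenth of a second
duration_s = 10            # ten-second recordings

n_instances = n_students * n_activities * n_repetitions
n_channels = n_sensors * axes_per_sensor
n_timepoints = round(duration_s / sampling_interval_s)

expected_shape = (n_instances, n_channels, n_timepoints)
print(expected_shape)  # (80, 6, 100)
```

The 80 instances are what the train and test splits should add up to.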

In [2]:
X, y = load_basic_motions(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
print(X_train.shape, y_train.shape, X_test.shape, y_test.shape)

(60, 6, 100) (60,) (20, 6, 100) (20,)


In [4]:
# multivariate input data: first instance, all six channels, first five time points
# (X_train is a 3D numpy array here, so we slice it instead of calling .head())
X_train[0, :, :5]

In [None]:
# multi-class target variable
np.unique(y_train)
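With a random split it can also be worth checking how many instances of each class ended up in the training set. The sketch below uses a small made-up label array (`y_toy` is hypothetical, not the real `y_train`) to show how `np.unique` returns counts alongside the labels.

```python
import numpy as np

# Hypothetical labels standing in for y_train; the real array comes from the loader.
y_toy = np.array(["walking", "running", "walking", "badminton", "running", "walking"])

# return_counts=True yields the sorted unique labels and how often each occurs.
classes, counts = np.unique(y_toy, return_counts=True)
class_counts = {str(label): int(count) for label, count in zip(classes, counts)}
print(class_counts)  # {'badminton': 1, 'running': 2, 'walking': 3}
```

Passing `y_train` instead of `y_toy` should give the class balance of the actual training split.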

## Multivariate classification
sktime offers three main ways of solving multivariate time series classification problems:

1. _Concatenation_ of time series columns into a single long time series column via `ColumnConcatenator`, followed by a classifier applied to the concatenated data,
2. _Column-wise ensembling_ via `ColumnEnsembleClassifier`, in which one classifier is fitted for each time series column and their predictions are aggregated,
3. _Bespoke estimator-specific methods_ for handling multivariate time series data, e.g. finding shapelets in multidimensional spaces (still work in progress).

### Time series concatenation
We can concatenate multivariate time series/panel data into a long univariate time series/panel and then apply a classifier to the univariate data.

In [None]:
steps = [
    ("concatenate", ColumnConcatenator()),
    ("classify", DrCIF(n_estimators=10)),
]
clf = Pipeline(steps)
clf.fit(X_train, y_train)
clf.score(X_test, y_test)
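To make the concatenation step concrete, here is a minimal numpy sketch of the idea on a toy array. It assumes the channels are joined end to end in column order. It illustrates the concept rather than how `ColumnConcatenator` is implemented, and `X_toy` is made-up data.

```python
import numpy as np

# Toy panel: 2 instances, 3 channels, 4 time points (all values are made up).
X_toy = np.arange(2 * 3 * 4).reshape(2, 3, 4)

n_instances, n_channels, n_timepoints = X_toy.shape

# Lay the channels of each instance end to end: (n, d, m) -> (n, 1, d * m).
X_concat = X_toy.reshape(n_instances, 1, n_channels * n_timepoints)

print(X_concat.shape)  # (2, 1, 12)
print(X_concat[0, 0])  # channel 0 of instance 0, then channel 1, then channel 2
```

After this step each instance is a single series of length 12, so any univariate classifier can be applied. The cost is that the classifier no longer knows which time points were recorded simultaneously across channels.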

### Column ensembling
We can also fit one classifier for each time series column and then aggregate their predictions. The interface is similar to the familiar `ColumnTransformer` from sklearn.

In [None]:
from sktime.classification.dictionary_based import TemporalDictionaryEnsemble

clf = ColumnEnsembleClassifier(
    estimators=[
        ("Rocket0", RocketClassifier(), [0]),
        ("TDE3", TemporalDictionaryEnsemble(max_ensemble_size=5), [3]),
    ]
)
clf.fit(X_train, y_train)
clf.score(X_test, y_test)
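The aggregation step can be illustrated with a small numpy sketch. It assumes soft voting, that is, averaging the class probabilities predicted by each per-column classifier and choosing the most likely class. The class labels and probabilities below are hypothetical and only stand in for the output of two fitted classifiers.

```python
import numpy as np

classes = np.array(["badminton", "running", "walking"])  # hypothetical class labels

# Hypothetical class probabilities from two per-column classifiers,
# one row per test instance and one column per class.
proba_col0 = np.array([[0.7, 0.2, 0.1],
                       [0.1, 0.5, 0.4]])
proba_col3 = np.array([[0.5, 0.3, 0.2],
                       [0.2, 0.2, 0.6]])

# Soft voting: average the probabilities, then pick the most likely class.
proba_mean = np.mean([proba_col0, proba_col3], axis=0)
y_pred = classes[np.argmax(proba_mean, axis=1)]
print(y_pred)  # ['badminton' 'walking']
```

For the second instance the two classifiers disagree (`running` versus `walking`), and the average resolves the tie in favour of the more confident one.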

### Bespoke classification algorithms
Another approach is to use bespoke (or classifier-specific) methods for multivariate time series data.
Here, we try out the `RocketClassifier` algorithm, which operates directly in the multidimensional space.

In [None]:
clf = RocketClassifier()
clf.fit(X_train, y_train)
clf.score(X_test, y_test)
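To give some intuition for how a Rocket-style method can use all channels at once, the sketch below applies a single random convolutional kernel across every channel of a toy panel and extracts two summary features, the maximum and the proportion of positive values (PPV). It is a heavily simplified illustration, not sktime's implementation: it omits dilation and padding, uses every channel for the kernel, and the helper `random_kernel_features` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy panel: 5 instances, 6 channels, 100 time points of random noise.
X_toy = rng.normal(size=(5, 6, 100))


def random_kernel_features(X, kernel_length=9, rng=rng):
    """Apply one random multichannel kernel and return (max, PPV) per instance."""
    n_instances, n_channels, n_timepoints = X.shape
    weights = rng.normal(size=(n_channels, kernel_length))
    bias = rng.uniform(-1, 1)

    # Sliding windows over time: shape (n, d, m - k + 1, k).
    windows = np.lib.stride_tricks.sliding_window_view(X, kernel_length, axis=2)
    # Sum over channels and kernel positions, so one kernel sees all channels at once.
    activations = np.einsum("ndtk,dk->nt", windows, weights) + bias

    max_value = activations.max(axis=1)
    ppv = (activations > 0).mean(axis=1)  # proportion of positive values
    return np.column_stack([max_value, ppv])


features = random_kernel_features(X_toy)
print(features.shape)  # (5, 2)
```

A full Rocket transform repeats this with many random kernels and feeds the resulting feature matrix to a linear classifier.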