# Demo of ROCKET transform

From: https://www.sktime.org/en/latest/examples/rocket.html

## Overview

ROCKET \[1\] transforms time series using random convolutional kernels (random length, weights, bias, dilation, and padding). Each kernel produces a feature map, from which ROCKET computes two features: the maximum value (max) and the proportion of positive values (ppv). The transformed features are then used to train a linear classifier.

\[1\] Dempster A, Petitjean F, Webb GI (2019) ROCKET: Exceptionally fast and accurate time series classification using random convolutional kernels. arXiv:1910.13051
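
To make the two features concrete, the following is a minimal pure-Python sketch of applying a single random kernel to one series. It is not sktime's Numba implementation. The sampling choices in `random_kernel` (kernel lengths 7, 9 or 11, mean-centred Gaussian weights, uniform bias, exponentially scaled dilation, padding with probability 0.5) are assumptions taken from the paper \[1\] rather than from this notebook, and both function names are hypothetical.

```python
import math
import random


def apply_kernel(series, weights, bias, dilation, padding):
    """Convolve one dilated kernel over a series and return (ppv, max)."""
    padded = [0.0] * padding + list(series) + [0.0] * padding
    span = (len(weights) - 1) * dilation  # distance covered by the dilated kernel
    outputs = []
    for start in range(len(padded) - span):
        total = bias
        for j, w in enumerate(weights):
            total += w * padded[start + j * dilation]
        outputs.append(total)
    ppv = sum(1 for v in outputs if v > 0) / len(outputs)
    return ppv, max(outputs)


def random_kernel(series_length, rng):
    """Draw kernel parameters (assumed distributions, see the text above)."""
    length = rng.choice([7, 9, 11])
    weights = [rng.gauss(0.0, 1.0) for _ in range(length)]
    mean = sum(weights) / length
    weights = [w - mean for w in weights]  # mean-centre the weights
    bias = rng.uniform(-1.0, 1.0)
    # dilation is sampled on an exponential scale, bounded by the series length
    max_exponent = math.log2((series_length - 1) / (length - 1))
    dilation = int(2 ** rng.uniform(0.0, max_exponent))
    padding = ((length - 1) * dilation) // 2 if rng.random() < 0.5 else 0
    return weights, bias, dilation, padding


rng = random.Random(0)
series = [math.sin(0.2 * t) for t in range(100)]  # toy series
features = []
for _ in range(5):
    weights, bias, dilation, padding = random_kernel(len(series), rng)
    features.extend(apply_kernel(series, weights, bias, dilation, padding))
print(len(features))  # two features per kernel
```

With five kernels the feature vector has ten entries; the real transform does the same with thousands of kernels.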

## Contents

1. Imports
1. Univariate Time Series
1. Multivariate Time Series
1. Pipeline Example

## 1 Imports

Import example data, ROCKET, and a classifier (`RidgeClassifierCV` from scikit-learn), as well as NumPy and `make_pipeline` from scikit-learn.

**Note:** ROCKET compiles (via Numba) on import, which may take a few seconds.

In [1]:
# !pip install --upgrade numba

In [2]:
import numpy as np

from sklearn.linear_model import RidgeClassifierCV
from sklearn.pipeline import make_pipeline

from sktime.datasets import load_arrow_head  # univariate dataset
from sktime.datasets import load_japanese_vowels  # multivariate dataset

from sktime.transformations.panel.rocket import Rocket

## 2 Univariate Time Series

We can transform the data using ROCKET and separately fit a classifier, or we can use ROCKET together with a classifier in a pipeline (section 4, below).

### 2.1 Load the Training Data

For more details on the data set, see the [univariate time series classification notebook](https://github.com/alan-turing-institute/sktime/blob/main/examples/02_classification_univariate.ipynb).

In [3]:
X_train, y_train = load_arrow_head(split="train", return_X_y=True)

In [4]:
X_train.head()

| | dim_0 |
|---|---|
| 0 | 0 -1.9630 1 -1.9578 2 -1.9561 3 ... |
| 1 | 0 -1.7746 1 -1.7740 2 -1.7766 3 ... |
| 2 | 0 -1.8660 1 -1.8420 2 -1.8350 3 ... |
| 3 | 0 -2.0738 1 -2.0733 2 -2.0446 3 ... |
| 4 | 0 -1.7463 1 -1.7413 2 -1.7227 3 ... |


In [5]:
y_train[:5]

array(['0', '1', '2', '0', '1'], dtype='<U1')

### 2.2 Initialise ROCKET and Transform the Training Data

In [6]:
rocket = Rocket()  # by default, ROCKET uses 10,000 kernels
rocket.fit(X_train)
X_train_transform = rocket.transform(X_train)
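
Since each kernel contributes two features, the width of the transformed data follows directly from the number of kernels. A small arithmetic sketch, assuming the default of 10,000 kernels mentioned in the comment above (the variable names are illustrative only):

```python
num_kernels = 10_000      # default number of kernels, per the comment above
features_per_kernel = 2   # ppv and max
n_features = num_kernels * features_per_kernel
print(n_features)  # 20000
```

So every series, whatever its original length, is represented by the same number of columns after the transform.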

### 2.3 Fit a Classifier

We recommend using `RidgeClassifierCV` from scikit-learn for smaller datasets (fewer than approximately 20,000 training examples), and logistic regression trained with stochastic gradient descent for larger datasets.
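
For the larger-dataset case, the idea of logistic regression trained with stochastic gradient descent can be sketched in plain Python as below. This is a toy binary version on made-up data, intended only to show the update rule; the function names and hyperparameters are hypothetical, and in practice a library implementation such as scikit-learn's `SGDClassifier` with a logistic loss would be the more likely choice.

```python
import math
import random


def sigmoid(z):
    # numerically stable logistic function
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic_sgd(X, y, epochs=200, learning_rate=0.1, seed=0):
    """Binary logistic regression, updated one example at a time (plain SGD)."""
    rng = random.Random(seed)
    weights = [0.0] * len(X[0])
    bias = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the examples in a new random order each epoch
        for i in order:
            p = sigmoid(sum(w * x for w, x in zip(weights, X[i])) + bias)
            error = p - y[i]  # gradient of the log loss with respect to the logit
            weights = [w - learning_rate * error * x for w, x in zip(weights, X[i])]
            bias -= learning_rate * error
    return weights, bias


def predict(X, weights, bias):
    return [int(sigmoid(sum(w * x for w, x in zip(weights, row)) + bias) >= 0.5) for row in X]


# toy, linearly separable stand-in for transformed features (made-up data)
X_toy = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
y_toy = [0, 0, 1, 1]
weights, bias = fit_logistic_sgd(X_toy, y_toy)
print(predict(X_toy, weights, bias))
```

Because each update touches a single example, memory use stays flat as the training set grows, which is the motivation for preferring this approach on large datasets.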

In [7]:
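# note: the `normalize` argument was removed in scikit-learn 1.2, so this call needs an older release
classifier = RidgeClassifierCV(alphas=np.logspace(-3, 3, 10), normalize=True)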
classifier.fit(X_train_transform, y_train)

RidgeClassifierCV(alphas=array([1.00000000e-03, 4.64158883e-03, 2.15443469e-02, 1.00000000e-01,
       4.64158883e-01, 2.15443469e+00, 1.00000000e+01, 4.64158883e+01,
       2.15443469e+02, 1.00000000e+03]),
                  normalize=True)
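
The `alphas` array printed above is the candidate grid of regularisation strengths that `RidgeClassifierCV` selects from: ten values evenly spaced on a log scale between 10^-3 and 10^3. As a sketch, the same grid can be reproduced without NumPy:

```python
# ten values evenly spaced on a log10 scale from 1e-3 to 1e3
num = 10
alphas = [10 ** (-3 + 6 * i / (num - 1)) for i in range(num)]
for alpha in alphas:
    print(f"{alpha:.8e}")
```

Each value is roughly 4.64 times the previous one, so the grid covers six orders of magnitude with only ten candidates.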

### 2.4 Load and Transform the Test Data

In [8]:
X_test, y_test = load_arrow_head(split="test", return_X_y=True)
X_test_transform = rocket.transform(X_test)

### 2.5 Classify the Test Data

In [9]:
classifier.score(X_test_transform, y_test)

0.8057142857142857
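
For scikit-learn classifiers, `score` returns the mean accuracy, that is, the fraction of test series whose predicted label matches the true label. A minimal sketch of that calculation, using made-up labels rather than the ArrowHead predictions:

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


y_true = ["0", "1", "2", "0", "1"]  # hypothetical true labels
y_pred = ["0", "1", "2", "1", "1"]  # hypothetical predictions, one mistake
print(accuracy(y_true, y_pred))  # 0.8
```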

## 3 Multivariate Time Series

We can use ROCKET in exactly the same way for multivariate time series.
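
The notebook does not describe how a kernel is applied to several channels at once. One plausible scheme, shown here purely as an assumption, is for each kernel to hold separate weights for a subset of channels and to sum the per-channel dilated convolutions before computing ppv and max. The function name and the toy data are hypothetical, and padding is omitted for brevity.

```python
def apply_kernel_multivariate(channels, weights_per_channel, channel_indices, bias, dilation):
    """Sum dilated convolutions over the selected channels; return (ppv, max)."""
    kernel_length = len(weights_per_channel[0])
    span = (kernel_length - 1) * dilation
    series_length = len(channels[0])
    outputs = []
    for start in range(series_length - span):
        total = bias
        # each selected channel has its own weight vector (assumed scheme)
        for weights, c in zip(weights_per_channel, channel_indices):
            for j, w in enumerate(weights):
                total += w * channels[c][start + j * dilation]
        outputs.append(total)
    ppv = sum(1 for v in outputs if v > 0) / len(outputs)
    return ppv, max(outputs)


channels = [
    [1, 2, 3, 4, 5],
    [5, 4, 3, 2, 1],
    [0, 0, 0, 0, 0],  # this channel is not selected by the kernel below
]
result = apply_kernel_multivariate(
    channels,
    weights_per_channel=[[-1, 0, 1], [1, 0, -1]],
    channel_indices=[0, 1],
    bias=0.0,
    dilation=1,
)
print(result)  # (1.0, 4.0)
```

Under this scheme the output is still two features per kernel, so the downstream classifier is used exactly as in the univariate case.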

### 3.1 Load the Training Data

In [10]:
X_train, y_train = load_japanese_vowels(split="train", return_X_y=True)

In [11]:
X_train.head()

| | dim_0 | dim_1 | dim_2 | dim_3 | dim_4 | dim_5 | dim_6 | dim_7 | dim_8 | dim_9 | dim_10 | dim_11 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 1.860936 1 1.891651 2 1.939205 3... | 0 -0.207383 1 -0.193249 2 -0.239664 3... | 0 0.261557 1 0.235363 2 0.258561 3... | 0 -0.214562 1 -0.249118 2 -0.291458 3... | 0 -0.171253 1 -0.112890 2 -0.041053 3... | 0 -0.118167 1 -0.112238 2 -0.102034 3... | 0 -0.277557 1 -0.311997 2 -0.383300 3... | 0 0.025668 1 -0.027122 2 0.019013 3... | 0 0.126701 1 0.171457 2 0.169510 3... | 0 -0.306756 1 -0.289431 2 -0.314894 3... | 0 -0.213076 1 -0.247722 2 -0.227908 3... | 0 0.088728 1 0.093011 2 0.074638 3... |
| 1 | 0 1.303905 1 1.288280 2 1.332021 3... | 0 0.067256 1 0.018672 2 -0.058744 3... | 0 0.597720 1 0.631579 2 0.601928 3... | 0 -0.271474 1 -0.355112 2 -0.347913 3... | 0 -0.236808 1 -0.119216 2 -0.053463 3... | 0 -0.411125 1 -0.434425 2 -0.421753 3... | 0 -0.014826 1 -0.078036 2 -0.028479 3... | 0 0.113175 1 0.178121 2 0.145073 3... | 0 -0.058230 1 -0.106430 2 -0.159488 3... | 0 -0.173138 1 -0.181910 2 -0.127751 3... | 0 0.093058 1 0.093031 2 0.019092 3... | 0 0.099247 1 0.099183 2 0.113546 3... |
| 2 | 0 1.462484 1 1.309815 2 1.418207 3... | 0 0.174066 1 0.120183 2 0.015721 3... | 0 0.505133 1 0.503046 2 0.589994 3... | 0 -0.374302 1 -0.327562 2 -0.310586 3... | 0 -0.362125 1 -0.356789 2 -0.477019 3... | 0 -0.400335 1 -0.445498 2 -0.367101 3... | 0 -0.137429 1 -0.060423 2 -0.120849 3... | 0 -0.000830 1 -0.007899 2 0.066952 3... | 0 0.053888 1 0.041605 2 -0.023859 3... | 0 -0.237630 1 -0.231087 2 -0.224317 3... | 0 0.120636 1 0.121053 2 0.175298 3... | 0 0.193254 1 0.202386 2 0.156670 3... |
| 3 | 0 1.160837 1 1.217979 2 1.234654 3... | 0 0.078806 1 -0.043693 2 -0.107083 3... | 0 0.237706 1 0.378571 2 0.504189 3... | 0 -0.010878 1 -0.055125 2 -0.151549 3... | 0 -0.393053 1 -0.399601 2 -0.409837 3... | 0 -0.744686 1 -0.756213 2 -0.666554 3... | 0 0.173073 1 0.189754 2 0.176855 3... | 0 -0.012922 1 0.014265 2 0.024257 3... | 0 -0.071948 1 -0.099093 2 -0.085188 3... | 0 0.028707 1 0.038970 2 0.005654 3... | 0 0.074820 1 0.049702 2 -0.007566 3... | 0 0.146297 1 0.164537 2 0.168465 3... |
| 4 | 0 1.665670 1 1.685376 2 1.541171 3... | 0 -0.251224 1 -0.305126 2 -0.238987 3... | 0 0.309710 1 0.339418 2 0.295073 3... | 0 -0.371666 1 -0.455499 2 -0.447638 3... | 0 -0.311727 1 -0.259315 2 -0.200163 3... | 0 -0.520932 1 -0.502600 2 -0.495071 3... | 0 -0.215930 1 -0.195365 2 -0.189373 3... | 0 0.255584 1 0.185427 2 0.123212 3... | 0 0.048732 1 0.076114 2 0.130086 3... | 0 -0.115333 1 -0.106838 2 -0.125721 3... | 0 0.063014 1 -0.036998 2 -0.100226 3... | 0 0.156787 1 0.200715 2 0.232676 3... |


In [12]:
y_train[:5]

array(['0', '0', '0', '0', '0'], dtype='<U1')

In [15]:
np.unique(y_train)

array(['0', '1', '2', '3', '4', '5', '6', '7', '8'], dtype='<U1')

### 3.2 Initialise ROCKET and Transform the Training Data

In [16]:
rocket = Rocket()
rocket.fit(X_train)
X_train_transform = rocket.transform(X_train)

### 3.3 Fit a Classifier

In [17]:
classifier = RidgeClassifierCV(alphas=np.logspace(-3, 3, 10), normalize=True)
classifier.fit(X_train_transform, y_train)

RidgeClassifierCV(alphas=array([1.00000000e-03, 4.64158883e-03, 2.15443469e-02, 1.00000000e-01,
       4.64158883e-01, 2.15443469e+00, 1.00000000e+01, 4.64158883e+01,
       2.15443469e+02, 1.00000000e+03]),
                  normalize=True)

### 3.4 Load and Transform the Test Data

In [18]:
X_test, y_test = load_japanese_vowels(split="test", return_X_y=True)
X_test_transform = rocket.transform(X_test)

### 3.5 Classify the Test Data

In [19]:
classifier.score(X_test_transform, y_test)

1.0

## 4 Pipeline Example

We can use ROCKET together with `RidgeClassifierCV` (or another classifier) in a pipeline. The pipeline then behaves like a self-contained classifier: a single call to `fit` transforms the training data and fits the classifier, and a single call to `score` or `predict` transforms the test data before classifying it.
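
What the pipeline does on our behalf can be sketched with a toy two-step version in plain Python. All three classes below are hypothetical stand-ins written for illustration; they are far simpler than scikit-learn's `Pipeline`, `Rocket`, or `RidgeClassifierCV`, and only mirror the order of the calls.

```python
class MeanTransformer:
    """Hypothetical transformer: replaces each series by its mean."""

    def fit(self, X):
        return self

    def transform(self, X):
        return [[sum(series) / len(series)] for series in X]


class ThresholdClassifier:
    """Hypothetical classifier: splits on the midpoint between the class means."""

    def fit(self, X, y):
        zeros = [row[0] for row, label in zip(X, y) if label == "0"]
        ones = [row[0] for row, label in zip(X, y) if label == "1"]
        self.threshold = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
        return self

    def predict(self, X):
        return ["1" if row[0] > self.threshold else "0" for row in X]


class MiniPipeline:
    """Toy transform-then-classify pipeline."""

    def __init__(self, transformer, classifier):
        self.transformer = transformer
        self.classifier = classifier

    def fit(self, X, y):
        # y is not used by the transform, but the classifier needs it
        self.transformer.fit(X)
        self.classifier.fit(self.transformer.transform(X), y)
        return self

    def predict(self, X):
        # new data goes through the already-fitted transform first
        return self.classifier.predict(self.transformer.transform(X))

    def score(self, X, y):
        predictions = self.predict(X)
        return sum(p == t for p, t in zip(predictions, y)) / len(y)


X_toy_train = [[0, 1, 0], [1, 0, 1], [5, 6, 5], [6, 5, 6]]
y_toy_train = ["0", "0", "1", "1"]
pipeline = MiniPipeline(MeanTransformer(), ThresholdClassifier()).fit(X_toy_train, y_toy_train)
print(pipeline.score([[0, 0, 1], [6, 6, 6]], ["0", "1"]))  # 1.0
```

The transformer is fitted once on the training data and reused unchanged at prediction time, which is the same discipline followed manually in sections 2 and 3.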

### 4.1 Initialise the Pipeline

In [20]:
rocket_pipeline = make_pipeline(
    Rocket(),
    RidgeClassifierCV(alphas=np.logspace(-3, 3, 10), normalize=True),
)

### 4.2 Load and Fit the Training Data

In [21]:
X_train, y_train = load_arrow_head(split="train", return_X_y=True)

In [22]:
# it is necessary to pass y_train to the pipeline
# y_train is not used for the transform, but it is used by the classifier
rocket_pipeline.fit(X_train, y_train)

Pipeline(steps=[('rocket', Rocket()),
                ('ridgeclassifiercv',
                 RidgeClassifierCV(alphas=array([1.00000000e-03, 4.64158883e-03, 2.15443469e-02, 1.00000000e-01,
       4.64158883e-01, 2.15443469e+00, 1.00000000e+01, 4.64158883e+01,
       2.15443469e+02, 1.00000000e+03]),
                                   normalize=True))])

### 4.3 Load and Classify the Test Data

In [23]:
X_test, y_test = load_arrow_head(split="test", return_X_y=True)

rocket_pipeline.score(X_test, y_test)

0.8