# ROCKET

Dempster A, Petitjean F, Webb GI (2019) ROCKET: Exceptionally fast and accurate time series classification using random convolutional kernels. [arXiv:1910.13051](https://arxiv.org/abs/1910.13051)

1. [Overview](#1-Overview)
2. [Requirements](#2-Requirements)
3. [Basic Use](#3-Basic-Use)
4. [Worked Example](#4-Worked-Example)
5. [Reproducing the Experiments](#5-Reproducing-the-Experiments)

# 1 Overview

ROCKET transforms time series using random convolutional kernels (random length, weights, bias, dilation, and padding).  ROCKET computes two features from the resulting feature maps: the max, and the proportion of positive values (or *ppv*).  The transformed features are used to train a linear classifier.
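
The two features can be illustrated with a small NumPy sketch. This is not the Numba implementation in `rocket_functions`: the function name `apply_single_kernel` is hypothetical, the kernel parameters are hand-picked rather than randomly sampled, and padding is assumed to mean zero padding at both ends of the series.

```python
import numpy as np

def apply_single_kernel(x, weights, bias, dilation, padding):
    """Return (ppv, max) for one time series and one dilated kernel."""
    # assumption: zero-pad both ends of the series by `padding` values
    x_padded = np.pad(x, padding)
    # distance between the first and last tap of the dilated kernel
    span = (len(weights) - 1) * dilation
    output_length = len(x_padded) - span
    feature_map = np.empty(output_length)
    for i in range(output_length):
        # taps are spaced `dilation` steps apart
        window = x_padded[i : i + span + 1 : dilation]
        feature_map[i] = np.dot(window, weights) + bias
    # ppv: proportion of positive values in the feature map
    ppv = np.mean(feature_map > 0)
    return ppv, feature_map.max()

# illustrative (made-up) series and kernel
x = np.array([0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0])
weights = np.array([1.0, 0.0, -1.0])

ppv, maximum = apply_single_kernel(x, weights, bias=0.0, dilation=1, padding=0)
print(ppv, maximum)
```

Each kernel therefore contributes two numbers per time series, so a set of `num_kernels` kernels yields `2 * num_kernels` features.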

ROCKET is implemented in Python, using just-in-time compilation via Numba.

# 2 Requirements

To use ROCKET, you will need:

* Python (3.7+);
* Numba (0.45.1+);
* NumPy;
* scikit-learn (or equivalent).

All of these should be ready to go in [Anaconda](https://www.anaconda.com/distribution/).

For `reproduce_experiments_bakeoff.py`, we also use pandas (included in Anaconda).

For `reproduce_experiments_scalability.py`, you will also need [PyTorch](https://pytorch.org/) (1.2+).

# 3 Basic Use

Basic use follows this pattern:

```python
# (1) generate random kernels
kernels = generate_kernels(input_length = X_training.shape[1], num_kernels = 10_000)

# (2) transform the training data and train a classifier
X_training_transform = apply_kernels(X = X_training, kernels = kernels)
classifier.fit(X_training_transform, Y_training)

# (3) transform the test data and use the classifier
X_test_transform = apply_kernels(X = X_test, kernels = kernels)
classifier.predict(X_test_transform)
```

**Note**: Time series should be normalised to have a zero mean and unit standard deviation before using `apply_kernels(...)`.

# 4 Worked Example

## 4.1 Import ROCKET

Import the ROCKET functions `generate_kernels(...)`, to generate the random kernels, and `apply_kernels(...)`, to transform the data using the generated kernels; NumPy, to handle the data; and a classifier (here, we use `RidgeClassifierCV` from scikit-learn).

```python
import numpy as np
from sklearn.linear_model import RidgeClassifierCV

from rocket_functions import generate_kernels, apply_kernels
```

## 4.2 Load the Training Data

Load the training data (here, we use the txt version of the 'coffee' dataset from [timeseriesclassification.com](http://www.timeseriesclassification.com)).

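```python
training_data = np.loadtxt("coffee_TRAIN.txt")
# the first column holds the class labels; the remaining columns hold the time series
# (the `np.int` alias was removed in NumPy 1.24, so we use the built-in `int`)
Y_training, X_training = training_data[:, 0].astype(int), training_data[:, 1:]
```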

## 4.3 (Optional) Precompile ROCKET Functions

Numba provides just-in-time compilation, so a function is compiled the first time it is called.  We might want to compile the ROCKET functions, `generate_kernels(...)` and `apply_kernels(...)`, before using them (for example, so that compilation time is not included when timing the transform).  One way of doing this (i.e., 'forcing' Numba to compile the functions) is as follows.

```python
# generate "dummy" kernels -> compiles *generate_kernels(...)*
_ = generate_kernels(100, 10)

# apply "dummy" kernels to "dummy" data -> compiles *apply_kernels(...)*
_ = apply_kernels(np.zeros_like(training_data)[:, 1:], _)
```

## 4.4 Generate Kernels

Generate random kernels using `generate_kernels(input_length, num_kernels)`, where `input_length` is the length of the time series, and `num_kernels` is the number of random kernels to generate (here, we generate 100; in our experiments, we use 10,000).

```python
kernels = generate_kernels(X_training.shape[1], 100)
```

## 4.5 Transform the Training Data

Transform the training data using `apply_kernels(X, kernels)` where `X` are the time series (NumPy array of shape `[num_examples, input_length]`), and `kernels` are the kernels generated using `generate_kernels(...)`.

```python
X_training_transform = apply_kernels(X_training, kernels)
```

**Note**: Unless already normalised (time series may already be normalised for some datasets), time series should be normalised to have a zero mean and unit standard deviation before using `apply_kernels(...)`.  For example:

```python
# (1) if not already normalised, normalise time series
X_training = (X_training - X_training.mean(axis = 1, keepdims = True)) / X_training.std(axis = 1, keepdims = True)

# (2) then transform the normalised time series
X_training_transform = apply_kernels(X_training, kernels)
```

## 4.6 Train a Classifier

Train a classifier using the transformed features (here, we use `RidgeClassifierCV` from scikit-learn, our suggested classifier for smaller datasets).

```python
# note: the `normalize` parameter was removed in scikit-learn 1.2; on newer versions, omit it
classifier = RidgeClassifierCV(alphas = np.logspace(-3, 3, 10), normalize = True)
classifier.fit(X_training_transform, Y_training)

print(end = "") # suppress print output of classifier.fit(...)
```

## 4.7 Load the Test Data

Load the test data.

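```python
test_data = np.loadtxt("coffee_TEST.txt")
# as for the training data: labels in the first column, time series in the rest
# (the `np.int` alias was removed in NumPy 1.24, so we use the built-in `int`)
Y_test, X_test = test_data[:, 0].astype(int), test_data[:, 1:]
```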

## 4.8 Transform the Test Data

Transform the test data using `apply_kernels(...)`, as for the training data.

```python
X_test_transform = apply_kernels(X_test, kernels)
```

**Note**: As for the training data, unless already normalised, time series should be normalised to have a zero mean and unit standard deviation before using `apply_kernels(...)` (see 4.5, above).
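
Because each time series is normalised using its own mean and standard deviation, the same step can be applied to the training and test data independently. A minimal sketch of a reusable helper follows; the name `normalise` and the `eps` guard for constant series are assumptions added for illustration and are not part of `rocket_functions`.

```python
import numpy as np

def normalise(X, eps=1e-8):
    """Normalise each row (time series) to zero mean and unit standard deviation."""
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, keepdims=True)
    # assumption: a constant series has zero standard deviation, so divide by 1
    # instead, which leaves that series as all zeros rather than NaN
    safe_std = np.where(std < eps, 1.0, std)
    return (X - mean) / safe_std

# illustrative (made-up) data: the second series is constant
X_example = np.array([[1.0, 2.0, 3.0, 4.0],
                      [5.0, 5.0, 5.0, 5.0]])
X_example_normalised = normalise(X_example)
print(X_example_normalised)
```

In the worked example this would be called as `X_test = normalise(X_test)` before `apply_kernels(X_test, kernels)`, and likewise for `X_training`.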

## 4.9 Classify the Test Data

Classify the (transformed) test data using the trained classifier.

```python
predictions = classifier.predict(X_test_transform)

print(f"predictions = {', '.join(predictions.astype(str))}")
print(f"accuracy    = {(predictions == Y_test).mean()}") # or classifier.score(X_test_transform, Y_test)
```

```
predictions = 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
accuracy    = 1.0
```


# 5 Reproducing the Experiments

## UCR Archive

### 'Bake Off' Datasets

`reproduce_experiments_bakeoff.py` is intended to allow for reproduction of the experiments on the 'bake off' datasets (using the txt versions of the 'bake off' datasets from [timeseriesclassification.com](http://www.timeseriesclassification.com)).

The required arguments are:
* `-i` or `--input_path`, the parent directory for the datasets (probably something like `.../Univariate_arff/`); and
* `-o` or `--output_path`, the directory in which to save the results.

The optional arguments are:
* `-n` or `--num_runs`, the number of runs (default 10); and
* `-k` or `--num_kernels`, the number of kernels (default 10,000).
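
The interface described above could be declared with `argparse` roughly as follows. This is a sketch of the documented flags and defaults only, not the source of `reproduce_experiments_bakeoff.py`; the function name `build_parser` is hypothetical, and the script may parse its arguments differently.

```python
import argparse

def build_parser():
    """Declare the documented flags: two required paths and two optional integers."""
    parser = argparse.ArgumentParser(description="ROCKET 'bake off' experiments (sketch)")
    parser.add_argument("-i", "--input_path", required=True,
                        help="parent directory for the datasets")
    parser.add_argument("-o", "--output_path", required=True,
                        help="directory in which to save the results")
    parser.add_argument("-n", "--num_runs", type=int, default=10,
                        help="number of runs")
    parser.add_argument("-k", "--num_kernels", type=int, default=10_000,
                        help="number of kernels")
    return parser

# parse an explicit argument list so that the sketch runs without a command line
args = build_parser().parse_args(["-i", "./Univariate_arff/", "-o", "./", "-n", "1", "-k", "100"])
print(args.input_path, args.output_path, args.num_runs, args.num_kernels)
```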

As ROCKET is nondeterministic, results will differ between runs.  However, any single run should produce representative results.

Examples:

```bash
python reproduce_experiments_bakeoff.py -i ./Univariate_arff/ -o ./
python reproduce_experiments_bakeoff.py -i ./Univariate_arff/ -o ./ -n 1 -k 100
```

### Additional 2018 Datasets

*(Forthcoming...)*

## Scalability

`reproduce_experiments_scalability.py` is intended to:

* allow for reproduction of the scalability experiments (in terms of dataset size); and
* serve as a template for integrating ROCKET with logistic / softmax regression and stochastic gradient descent (or, e.g., Adam) for other large datasets using PyTorch.
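
To illustrate the second point, the following sketch trains softmax regression with minibatch stochastic gradient descent on a feature matrix. It is written in plain NumPy so that it runs without PyTorch, and it is not the code in `reproduce_experiments_scalability.py`: the function names, hyperparameters, and the synthetic features standing in for ROCKET-transformed data are all assumptions made for illustration.

```python
import numpy as np

def train_softmax_regression(X, Y, num_classes, epochs=50, batch_size=32, lr=0.1, seed=0):
    """Fit weights W and biases b by minibatch SGD on the mean cross-entropy loss."""
    rng = np.random.default_rng(seed)
    num_examples, num_features = X.shape
    W = np.zeros((num_features, num_classes))
    b = np.zeros(num_classes)
    for _ in range(epochs):
        # visit the examples in a new random order each epoch
        order = rng.permutation(num_examples)
        for start in range(0, num_examples, batch_size):
            batch = order[start : start + batch_size]
            logits = X[batch] @ W + b
            # subtract the row maximum for numerical stability
            logits -= logits.max(axis=1, keepdims=True)
            probs = np.exp(logits)
            probs /= probs.sum(axis=1, keepdims=True)
            # gradient of the loss w.r.t. the logits: probs - one_hot(Y)
            probs[np.arange(len(batch)), Y[batch]] -= 1.0
            grad = probs / len(batch)
            W -= lr * (X[batch].T @ grad)
            b -= lr * grad.sum(axis=0)
    return W, b

def predict(X, W, b):
    """Return the class with the largest logit for each example."""
    return (X @ W + b).argmax(axis=1)

# synthetic stand-in for transformed features: two well-separated classes
rng = np.random.default_rng(0)
X_features = np.vstack([rng.normal(-2.0, 1.0, (100, 5)), rng.normal(2.0, 1.0, (100, 5))])
Y_labels = np.repeat([0, 1], 100)

W, b = train_softmax_regression(X_features, Y_labels, num_classes=2)
training_accuracy = (predict(X_features, W, b) == Y_labels).mean()
print(training_accuracy)
```

With a large dataset, the features for each minibatch could be computed with `apply_kernels(...)` as the batch is loaded, so that the full transformed dataset never needs to be held in memory at once.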

The required arguments are:

* `-tr` or `--training_path`, the training dataset (csv);
* `-te` or `--test_path`, the test dataset (csv);
* `-o` or `--output_path`, the directory in which to save the results;
* `-k` or `--num_kernels`, the number of kernels.

**Note**: It may be necessary to adapt the code to your dataset in terms of dataset size and structure, regularisation, etc.

Examples:

```bash
python reproduce_experiments_scalability.py -tr training_data.csv -te test_data.csv -o ./ -k 100
python reproduce_experiments_scalability.py -tr training_data.csv -te test_data.csv -o ./ -k 1_000
python reproduce_experiments_scalability.py -tr training_data.csv -te test_data.csv -o ./ -k 10_000
```