# Convolution transforms

Wildboar implements two convolutional transformation methods, `Rocket` and `Hydra`, described by Dempster et al. (2020, 2023). Both algorithms employ random convolutional kernels, but in slightly different manners. In `Rocket`, each kernel is applied to each time series, and the maximum activation value and the proportion of positive activations are recorded. In `Hydra`, the kernels are partitioned into groups, and for each combination of exponentially increasing dilation and padding, each kernel is applied to each time series. Within each group, the number of times each kernel has the highest activation value and the lowest activation value is recorded. The features then correspond to the number of times a kernel had the in-group highest activation and the average of the lowest activation.
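To make the two `Rocket` features concrete, the following is a minimal NumPy sketch for a single kernel and a single time series. It is an illustration only, not Wildboar's implementation: the helper names `dilated_convolution` and `rocket_features`, the fixed kernel weights, and the absence of padding are all assumptions made for this example (in `Rocket`, weights, bias, dilation and padding are drawn at random).

```python
import numpy as np


def dilated_convolution(x, weights, bias=0.0, dilation=1):
    """Apply one 1-d kernel to a series (hypothetical helper, no padding)."""
    x = np.asarray(x, dtype=float)
    weights = np.asarray(weights, dtype=float)
    span = (len(weights) - 1) * dilation  # distance covered by the kernel
    n_out = len(x) - span
    if n_out <= 0:
        raise ValueError("kernel does not fit the series")
    # Row i holds the indices of the samples the kernel sees at position i.
    idx = np.arange(n_out)[:, None] + dilation * np.arange(len(weights))[None, :]
    return x[idx] @ weights + bias


def rocket_features(x, weights, bias=0.0, dilation=1):
    """Return (max activation, proportion of positive activations)."""
    activation = dilated_convolution(x, weights, bias, dilation)
    return activation.max(), np.mean(activation > 0)


x = np.array([0.0, 1.0, 3.0, 2.0, 0.0, -1.0])
# The activations are [3, 1, -3, -3], so the max is 3.0 and
# half of the activations are positive (0.5).
max_value, ppv = rocket_features(x, weights=[-1.0, 0.0, 1.0])
```

Each kernel therefore contributes two columns to the transformed representation, regardless of the length of the time series.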

In [1]:
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import RidgeClassifierCV
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline, make_union
from sklearn.preprocessing import StandardScaler

from wildboar.datasets import load_dataset
from wildboar.datasets.preprocess import SparseScaler
from wildboar.ensemble import ShapeletForestClassifier
from wildboar.transform import DiffTransform, HydraTransform, RocketTransform

For the purpose of this example, we load the `MoteStrain` dataset from the UCR time series archive and split it into two parts: one for fitting the transformation and the classifier, and one for evaluating the predictive performance.

In [2]:
X, y = load_dataset("MoteStrain")
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

## Hydra transform
In Wildboar, we make heavy use of `scikit-learn` functionality and can employ its features directly. Here, we create a pipeline where we first transform each time series to the representation imposed by `Hydra` (with the default parameters `n_groups=64` and `n_kernels=8`). The subsequent steps of the pipeline apply a sparse scaling, which accounts for the sparsity introduced by the transform (remember, we count the number of times a kernel has the highest activation, and in many cases a single kernel never has), and finally apply a standard Ridge classifier to the output.

In [3]:
hydra = make_pipeline(
    HydraTransform(random_state=1, n_jobs=-1),
    SparseScaler(),
    RidgeClassifierCV(),
)

In [4]:
hydra.fit(X_train, y_train)

In [5]:
hydra.score(X_test, y_test)

0.9968553459119497
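The counting that makes the `Hydra` representation sparse can be sketched with plain NumPy. The activations below are made-up values for one hypothetical group of four kernels at five time steps; the sketch only counts in-group winners and losers and is not Wildboar's implementation.

```python
import numpy as np

# Rows are kernels in one group, columns are time steps (made-up values).
activations = np.array(
    [
        [0.2, 1.5, -0.3, 0.9, 0.1],
        [0.7, 0.4, 0.8, -1.2, 0.0],
        [-0.5, 0.1, 0.6, 1.1, -0.4],
        [0.0, 0.3, 0.5, 0.0, 0.05],
    ]
)
n_kernels = activations.shape[0]

# For every time step, find which kernel has the highest and lowest activation.
winners = activations.argmax(axis=0)
losers = activations.argmin(axis=0)

# Count how often each kernel wins or loses within the group.
max_counts = np.bincount(winners, minlength=n_kernels)  # [2, 2, 1, 0]
min_counts = np.bincount(losers, minlength=n_kernels)  # [1, 1, 3, 0]
```

The last kernel never has the highest or the lowest activation, so its counts are zero. With many groups and dilations, such zero entries are common, which is why a scaler that accounts for sparsity is used in the pipeline.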

## Rocket transform
Similarly, we create a pipeline where we use `Rocket` as the first transformation. Instead of the sparse scaler, we use `StandardScaler`, which standardizes each feature of the resulting transformation to zero mean and unit variance.

In [6]:
rocket = make_pipeline(
    RocketTransform(n_kernels=10000, random_state=1, n_jobs=-1),
    StandardScaler(),
    RidgeClassifierCV(),
)

In [7]:
rocket.fit(X_train, y_train)

In [8]:
rocket.score(X_test, y_test)

0.9905660377358491
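As a reminder of what the standardization step in this pipeline does, here is a minimal NumPy sketch of column-wise scaling to zero mean and unit variance. The feature values are made up, and the sketch ignores details such as constant columns, which `StandardScaler` handles separately.

```python
import numpy as np

# Three samples with two features on very different scales (made-up values).
features = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]])

# Statistics are computed per column, as they would be on the training data.
mean = features.mean(axis=0)
std = features.std(axis=0)  # population standard deviation (ddof=0)

standardized = (features - mean) / std
```

After scaling, both columns have mean 0 and standard deviation 1, so the Ridge penalty treats all features on an equal footing.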

## Hydra transform with first order differences

In the paper (Dempster et al., 2023), the `Hydra` transformation is computed not only for the original time series but also for the first order difference. To avoid inflating the resulting feature space, the authors suggest that half the kernels are allocated to the original input and half to the first order differences. Here, we make use of `make_union` from `scikit-learn`, which concatenates two (or more) feature representations, and `DiffTransform` from Wildboar, which transforms time series to the nth order difference. Together, they construct a new feature representation that consists of the `Hydra` transform of both the original time series and the first order differences.

In [9]:
hydra_diff = make_pipeline(
    make_union(
        HydraTransform(n_groups=32, random_state=1, n_jobs=-1),
        make_pipeline(
            DiffTransform(),
            HydraTransform(n_groups=32, random_state=2, n_jobs=-1),
        ),
    ),
    SparseScaler(),
    RidgeClassifierCV(),
)

In [10]:
hydra_diff.fit(X_train, y_train)

In [11]:
hydra_diff.score(X_test, y_test)

0.9968553459119497
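The idea behind the union used above can be illustrated with plain NumPy: compute the first order difference and concatenate the features of both views column-wise. The function `toy_features` is a hypothetical stand-in for a real transform, and using `np.diff` (which shortens each series by one sample) is an assumption about how the difference is formed, not a description of `DiffTransform` internals.

```python
import numpy as np

X = np.array([[1.0, 4.0, 9.0, 16.0], [2.0, 2.0, 5.0, 3.0]])

# First order difference along the time axis: x[t + 1] - x[t].
X_diff = np.diff(X, n=1, axis=1)  # [[3, 5, 7], [0, 3, -2]]


def toy_features(X):
    """Hypothetical stand-in for a transform: per-series max and mean."""
    return np.column_stack([X.max(axis=1), X.mean(axis=1)])


# Concatenate the features of both views column-wise, similar in spirit
# to what a feature union does with its transformers.
features = np.hstack([toy_features(X), toy_features(X_diff)])
```

Each view contributes two columns here, so the combined representation has four columns per time series. Halving the budget for each view, as done with `n_groups=32` above, keeps the total number of features the same as for a single transform.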

## Rocket transform with first order differences

Again, we perform the same transformation but substitute `Rocket` for `Hydra`. As before, we allocate half the kernels to the original time series and the other half to the first order differences.

In [12]:
rocket_diff = make_pipeline(
    make_union(
        RocketTransform(n_kernels=5000, random_state=1, n_jobs=-1),
        make_pipeline(
            DiffTransform(),
            RocketTransform(n_kernels=5000, random_state=2, n_jobs=-1),
        ),
    ),
    StandardScaler(),
    RidgeClassifierCV(),
)

In [13]:
rocket_diff.fit(X_train, y_train)

In [14]:
rocket_diff.score(X_test, y_test)

0.9968553459119497

For this limited example, both `Hydra` and `Rocket` perform very well: all four pipelines reach a test accuracy above 0.99, with no meaningful difference in the resulting predictive performance of the classifier.