# aeon transformers

Transformers are objects that transform data from one representation to another. `aeon`
contains time series specific transformers that can be used in
pipelines in conjunction with other estimators.

Note: in deep learning, the term "transformer" refers to a specific neural network
architecture, which is not the meaning used here. `aeon` transformers follow the
`scikit-learn` design: they have `fit` and `transform` methods, plus a `fit_transform`
method that combines the two.
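
As a minimal sketch of this interface (not an `aeon` class), the hypothetical `MeanCenterer` below learns a mean in `fit`, applies it in `transform`, and chains the two in `fit_transform`. The class name and the `mean_` attribute are assumptions made for illustration only.

```python
from statistics import mean


class MeanCenterer:
    """Hypothetical transformer: subtracts the mean learned in fit."""

    def fit(self, X):
        # learn the parameter from the data and store it on the object
        self.mean_ = mean(X)
        return self

    def transform(self, X):
        # apply the stored parameter without re-learning it
        return [x - self.mean_ for x in X]

    def fit_transform(self, X):
        # convenience method: fit, then transform the same data
        return self.fit(X).transform(X)


train = [1.0, 2.0, 3.0]
centerer = MeanCenterer()
print(centerer.fit_transform(train))   # centred with the mean of train
print(centerer.transform([4.0, 5.0]))  # new data, same stored mean
```

Returning `self` from `fit` is what allows the chained call in `fit_transform`.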

`aeon` distinguishes different types of transformer, depending on the input type of
`fit` and `transform`. The main distinction is whether the input is a single time
series or a collection of time series.

## Series Transformers

Single series transformers (in the package
`aeon/transformations/series`) all extend the base class `BaseTransformer`. They are
mostly used in forecasting and work best with `pd.Series` or `pd.DataFrame` input.
Collections, by contrast, are best stored as numpy arrays or lists
of numpy arrays. See the [data storage](examples/datasets/data_Storage.ipynb)
notebook for details on how best to store data with aeon.

Transformers also differ in their output: some convert a time series into a
(different) time series, while others convert a series into a vector of features.

To illustrate the difference, we compare two series transformers with different output:

* the Box-Cox transformer `BoxCoxTransformer`, which transforms a time series to a
time series using the [Box Cox power transform](https://en.wikipedia.org/wiki/Power_transform#Box%E2%80%93Cox_transformation)
* the summary transformer `SummaryTransformer`, which transforms a time series to a
vector of summary statistics such as the mean and standard deviation.
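
The difference in output shape can be sketched in plain Python. This is only an illustration under stated assumptions, not how `aeon` implements either transformer: `log_transform` and `summarize` are hypothetical helpers, and the log transform stands in for Box-Cox (it is the Box-Cox case with lambda equal to 0). The input is the first five values of the airline series used in this notebook.

```python
import math
from statistics import mean, stdev

series = [112.0, 118.0, 132.0, 129.0, 121.0]


def log_transform(y):
    """Series-to-series: one output value per input time point."""
    return [math.log(v) for v in y]


def summarize(y):
    """Series-to-vector: a fixed set of statistics, whatever the series length."""
    return {"mean": mean(y), "std": stdev(y), "min": min(y), "max": max(y)}


logged = log_transform(series)
summary = summarize(series)
print(len(series), len(logged))  # same length in and out
print(summary)                   # four numbers regardless of input length
```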


In [88]:
from aeon.datasets import load_airline
from aeon.transformations.series.boxcox import BoxCoxTransformer
from aeon.transformations.series.summarize import SummaryTransformer

# airline is a single time series stored in a pd.Series
airline = load_airline()
boxcox_trans = BoxCoxTransformer()
summary_trans = SummaryTransformer()
type(airline)

pandas.core.series.Series

In [89]:
airline[:5]

Period
1949-01    112.0
1949-02    118.0
1949-03    132.0
1949-04    129.0
1949-05    121.0
Freq: M, Name: Number of airline passengers, dtype: float64

In [90]:
# this produces a pandas Series
airline_bc = boxcox_trans.fit_transform(airline)
type(airline_bc)

pandas.core.series.Series

In [91]:
airline_bc[:5]

Period
1949-01    6.827490
1949-02    6.932822
1949-03    7.161892
1949-04    7.114611
1949-05    6.983787
Freq: M, dtype: float64

In [92]:
# this produces a pandas.DataFrame with a single row
airline_summary = summary_trans.fit_transform(airline)
type(airline_summary)

pandas.core.frame.DataFrame

In [87]:
airline_summary[:5]

|   | mean | std | min | max | 0.1 | 0.25 | 0.5 | 0.75 | 0.9 |
|---|------|-----|-----|-----|-----|------|-----|------|-----|
| 0 | 280.298611 | 119.966317 | 104.0 | 622.0 | 135.3 | 180.0 | 265.5 | 360.5 | 453.2 |


You can get a list of all series-to-series and series-to-vector transformers as
follows. Please consult the API reference for details on each.

In [1]:
import warnings

from aeon.registry import all_estimators

warnings.filterwarnings("ignore")
all_estimators(
    "transformer",
    filter_tags={
        "scitype:transform-input": "Series",
        "scitype:transform-output": "Series",
    },
    as_dataframe=True,
)

|    | name | estimator |
|----|------|-----------|
| 0 | Aggregator | `<class 'aeon.transformations.hierarchical.aggr...` |
| 1 | AutoCorrelationTransformer | `<class 'aeon.transformations.series.acf.AutoCo...` |
| 2 | BKFilter | `<class 'aeon.transformations.series.bkfilter.B...` |
| 3 | BoxCoxTransformer | `<class 'aeon.transformations.series.boxcox.Box...` |
| 4 | ClaSPTransformer | `<class 'aeon.transformations.series.clasp.ClaS...` |
| ... | ... | ... |
| 68 | TransformerPipeline | `<class 'aeon.transformations.compose.Transform...` |
| 69 | TruncationTransformer | `<class 'aeon.transformations.collection.trunca...` |
| 70 | WhiteNoiseAugmenter | `<class 'aeon.transformations.series.augmenter....` |
| 71 | WindowSummarizer | `<class 'aeon.transformations.series.summarize....` |


In [2]:
all_estimators(
    "transformer",
    filter_tags={
        "scitype:transform-input": "Series",
        "scitype:transform-output": "Primitives",
    },
    as_dataframe=True,
)

|   | name | estimator |
|---|------|-----------|
| 0 | Catch22Wrapper | `<class 'aeon.transformations.collection.catch2...` |
| 1 | FittedParamExtractor | `<class 'aeon.transformations.collection.summar...` |
| 2 | MatrixProfile | `<class 'aeon.transformations.collection.matrix...` |
| 3 | MiniRocket | `<class 'aeon.transformations.collection.rocket...` |
| 4 | MiniRocketMultivariate | `<class 'aeon.transformations.collection.rocket...` |
| 5 | MiniRocketMultivariateVariable | `<class 'aeon.transformations.collection.rocket...` |
| 6 | MultiRocket | `<class 'aeon.transformations.collection.rocket...` |
| 7 | MultiRocketMultivariate | `<class 'aeon.transformations.collection.rocket...` |
| 8 | RandomDilatedShapeletTransform | `<class 'aeon.transformations.collection.dilate...` |
| 9 | RandomIntervalFeatureExtractor | `<class 'aeon.transformations.collection.summar...` |


If your series is split into training and testing data, you should call `fit` and
`transform` separately, so that any parameters are learned from the training data
only. `BoxCoxTransformer` has a parameter `lambda` that can be learned from the
training data:

In [93]:
from aeon.forecasting.model_selection import temporal_train_test_split

train, test = temporal_train_test_split(airline)
boxcox = BoxCoxTransformer(method="mle")
test[:5]

Period
1958-01    340.0
1958-02    318.0
1958-03    362.0
1958-04    348.0
1958-05    363.0
Freq: M, Name: Number of airline passengers, dtype: float64
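
A temporal split keeps the observations in time order, so the test set is simply the tail of the series. The sketch below illustrates the idea with a hypothetical `temporal_split` helper; it is not `aeon`'s implementation. The 25% test fraction and the 144 monthly observations starting at 1949-01 are assumptions, chosen because they are consistent with the test set above starting at 1958-01.

```python
import math


def temporal_split(y, test_size=0.25):
    """Hypothetical helper: the last `test_size` fraction becomes the test set.

    Assumes 0 < test_size < 1 and a non-empty sequence.
    """
    n_test = math.ceil(len(y) * test_size)
    return y[:-n_test], y[-n_test:]


# monthly labels "1949-01" ... "1960-12" standing in for the airline index
months = [f"{1949 + i // 12}-{i % 12 + 1:02d}" for i in range(144)]
train_idx, test_idx = temporal_split(months)
print(train_idx[-1], test_idx[0])  # last training month, first test month
```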

You can then apply the fitted transformer to new data without refitting lambda by calling just `transform`:

In [95]:
# fit the transformer on the training data
boxcox.fit(train)
# apply to test data
test_new = boxcox.transform(test)
test_new[:5]

Period
1958-01    5.597723
1958-02    5.536036
1958-03    5.655489
1958-04    5.619156
1958-05    5.658029
Freq: M, dtype: float64

Fitted model components of transformers can be found with the `get_fitted_params()`
method:

In [97]:
boxcox.get_fitted_params()
# this is a dict that contains the fitted parameters

{'lambda': -0.01398297802065717}
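
To see how the fitted `lambda` is used, the Box-Cox formula can be applied by hand. This is a sketch of the textbook transform, `(y**lambda - 1) / lambda` for nonzero lambda and `log(y)` for lambda equal to 0, assuming that `BoxCoxTransformer` applies this formula unchanged. The `boxcox` function is a hypothetical helper; the lambda and the inputs are copied from the outputs above.

```python
import math


def boxcox(y, lmbda):
    """Box-Cox power transform of positive values for a fixed lambda."""
    if lmbda == 0:
        return [math.log(v) for v in y]
    return [(v**lmbda - 1.0) / lmbda for v in y]


fitted_lambda = -0.01398297802065717  # value reported by get_fitted_params()
test_values = [340.0, 318.0, 362.0]   # first three test observations
print([round(v, 4) for v in boxcox(test_values, fitted_lambda)])
```

Under that assumption, the printed values should agree with the first rows of `test_new` up to rounding.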

### Forecasting pipeline with transformers

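Forecasters cannot be placed in `sklearn` pipelines, because their `fit` and `predict`
methods are used differently: a forecaster is fitted on the series itself, optionally
with a forecasting horizon `fh`, and `predict` returns future values. `aeon` therefore
provides its own forecasting pipelines: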

In [101]:
from aeon.forecasting.compose import ForecastingPipeline
from aeon.forecasting.naive import NaiveForecaster
from aeon.transformations.series.difference import Differencer

pipe = ForecastingPipeline([Differencer(), NaiveForecaster(strategy="drift")])

# this constructs a ForecastingPipeline, which is also a forecaster
pipe

In [103]:
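# the pipeline is a forecaster with the same interface as NaiveForecaster
# note: ForecastingPipeline applies its transformers to exogenous data X, not to y;
# wrap the steps in TransformedTargetForecaster to difference y and invert afterwards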
pipe.fit(airline, fh=1)
pipe.predict()

1961-01    434.237762
Freq: M, dtype: float64
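
The forecaster interface used above (`fit` on the series with a horizon `fh`, then `predict` with no data argument) can be sketched with a hypothetical `TinyDriftForecaster`. It is not an `aeon` class. It assumes the usual drift rule: extend the straight line through the first and last observations by `fh` steps.

```python
class TinyDriftForecaster:
    """Hypothetical forecaster: fit(y, fh) stores state, predict() uses it."""

    def fit(self, y, fh=1):
        # slope of the line through the first and last observations
        self.slope_ = (y[-1] - y[0]) / (len(y) - 1)
        self.last_ = y[-1]
        self.fh_ = fh
        return self

    def predict(self):
        # extrapolate fh steps beyond the last observation
        return self.last_ + self.fh_ * self.slope_


y = [112.0, 118.0, 132.0, 129.0, 121.0]  # first five airline values
forecaster = TinyDriftForecaster().fit(y, fh=1)
print(forecaster.predict())  # prints 123.25
```

Assuming the full airline series has 144 observations and ends at 432.0, the same rule gives 432 + (432 - 112) / 143, roughly 434.2378, which is consistent with the forecast shown above.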

## Collection Transformers

Collection transformers inherit from `BaseCollectionTransformer`, itself a subclass
of `BaseTransformer`. A `BaseCollectionTransformer` works with the same data structures
used by clusterers, regressors and classifiers: a 3D numpy array of shape `(n_cases,
n_channels, n_timepoints)` for equal length series, or a list of `n_cases` 2D numpy
arrays for unequal length series.
See the [data storage notebook](examples/data_storage.ipynb) for more details.
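
The two collection formats can be sketched with plain numpy. This illustrates the data layout only and is not an `aeon` transformer: `per_case_mean` is a hypothetical helper, and each 2D array in the list is assumed to have shape `(n_channels, n_timepoints)`.

```python
import numpy as np

rng = np.random.default_rng(0)

# equal length: one 3D array of shape (n_cases, n_channels, n_timepoints)
X_equal = rng.normal(size=(4, 2, 10))

# unequal length: a list of n_cases 2D arrays, each (n_channels, n_timepoints)
X_unequal = [rng.normal(size=(2, n)) for n in (8, 10, 12, 9)]


def per_case_mean(X):
    """Hypothetical collection transform: one channel-mean vector per case."""
    return np.array([np.asarray(case).mean(axis=1) for case in X])


features_equal = per_case_mean(X_equal)
features_unequal = per_case_mean(X_unequal)
print(features_equal.shape, features_unequal.shape)
```

Iterating over the first axis works for both formats, which is why a per-case function can handle equal and unequal length collections alike.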

