# Time series resizing with aeon

Suppose we have a collections of time series with different lengths, i.e. different
number of time points. Currently, most of aeon's collection estimators
(classification, clustering or regression) require equal-length time
series. One option is to convert unequal length series into equal length. This can be
 done through padding, truncation or resizing through fitting a function and
 resampling.

In [1]:
from aeon.classification.convolution_based import RocketClassifier
from aeon.datasets import load_basic_motions, load_plaid

## Unequal or equal length collections time series

If a collection contains all equal length series, it will store the data in a 3D
numpy of shape `(n_cases, n_channels, n_timepoints)`. If it is unequal length, it is
stored in a list of 2D numpy arrays:

In [2]:
# Equal length multivariate data
bm_X, bm_y = load_basic_motions()
print(type(bm_X), "\n", bm_X.shape)

<class 'numpy.ndarray'> 
 (80, 6, 100)


In [3]:
# Unequal length univariate data
plaid_X, plaid_y = load_plaid()
print(type(plaid_X), "\n", plaid_X[0].shape, "\n", plaid_X[10].shape)

<class 'list'> 
 (1, 500) 
 (1, 300)


If time series are unequal length, collection estimators will raise an error if they
do not have the capability to handle this characteristic.


In [4]:
rc = RocketClassifier()
try:
    rc.fit(plaid_X, plaid_y)
except ValueError as e:
    print(f"ValueError: {e}")

ValueError: Data seen by instance of RocketClassifier has unequal length series, but RocketClassifier cannot handle unequal length series. 


In [5]:
series_lengths = [array.shape[1] for array in plaid_X]

# Find the minimum and maximum of the second dimensions
min_length = min(series_lengths)
max_length = max(series_lengths)
print(" Min length = ", min_length, " max length = ", max_length)

 Min length =  100  max length =  1344


# Padding, truncating or resizing.

We can pad, truncate or resize. By default, pad adds zeros to make all series the
length of the longest, truncate removes all values beyond the length of the shortest
and resize stretches or shrinks the series.

In [6]:
from aeon.transformations.collection import Padder, Resizer, Truncator

pad = Padder()
truncate = Truncator()
resize = Resizer(length=600)
X2 = pad.fit_transform(plaid_X)
X3 = truncate.fit_transform(plaid_X)
X4 = resize.fit_transform(plaid_X)
print(X2.shape, "\n", X3.shape, "\n", X4.shape)

(1074, 1, 1344) 
 (1074, 1, 100) 
 (1074, 1, 600)


You can put these transformers in a pipeline to apply to both train/test split


In [7]:
from sklearn.metrics import accuracy_score

# Unequal length univariate data
from aeon.pipeline import make_pipeline

train_X, train_y = load_plaid(split="Train")
test_X, test_y = load_plaid(split="Test")
steps = [truncate, rc]
pipe = make_pipeline(steps)
pipe.fit(train_X, train_y)
preds = pipe.predict(test_X)
accuracy_score(train_y, preds)

0.8268156424581006