# Overview of the base class structure

`aeon` uses a core inheritance hierarchy of classes across the toolkit, with specialised subclasses in each module. The basic class hierarchy is shown in the following diagram.

<img src="img/aeon_uml_simple.drawio.png" alt="Basic class hierarchy">


## Scikit-learn `BaseEstimator` and aeon `BaseAeonEstimator`

To make sense of this, we break it down from the top.
Everything inherits from scikit-learn's `BaseEstimator`, which mainly handles the mechanisms for getting and setting parameters through the `get_params` and `set_params` methods. These methods are used when estimators interact with other classes such as [GridSearchCV](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html#sklearn.model_selection.GridSearchCV), and are also used in aeon's `ComposableEstimatorMixin`, which we'll talk about later.
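To make the parameter mechanism more concrete, the sketch below re-implements the core idea in plain Python: parameter names are discovered by inspecting the `__init__` signature, which is why estimators store every constructor argument on an attribute of the same name. `SketchEstimator` and `ToyForest` are hypothetical names used for illustration only; this is a simplified stand-in, not scikit-learn's actual code.

```python
import inspect


class SketchEstimator:
    """Simplified stand-in for the get_params/set_params idea (hypothetical)."""

    def get_params(self):
        # Parameter names are read from the __init__ signature, so every
        # constructor argument must be stored on an attribute of the same name.
        names = [
            name
            for name in inspect.signature(type(self).__init__).parameters
            if name != "self"
        ]
        return {name: getattr(self, name) for name in names}

    def set_params(self, **params):
        valid = self.get_params()
        for name, value in params.items():
            if name not in valid:
                raise ValueError(f"Invalid parameter {name!r}")
            setattr(self, name, value)
        return self


class ToyForest(SketchEstimator):
    """Hypothetical estimator with two constructor parameters."""

    def __init__(self, n_estimators=200, random_state=None):
        self.n_estimators = n_estimators
        self.random_state = random_state


forest = ToyForest()
params_before = forest.get_params()
# A parameter search can reconfigure an estimator without knowing its class.
forest.set_params(n_estimators=2)
params_after = forest.get_params()
print(params_before)
print(params_after)
```

Because both methods only rely on the constructor signature, generic tools can copy or reconfigure any estimator that follows this convention.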

Then we have aeon's `BaseAeonEstimator` class. This class handles the following for all aeon estimators:
- management of tags: setting, getting, interaction with sklearn's tags, etc.
- cloning and resetting of the estimator
- creation of test instances using the test parameters specified by each estimator. For example, this is used to define a fast-running estimator (e.g. a forest classifier with only 2 trees) for the CI/CD pipelines.

#### A word on aeon's estimator tag system
Tags in aeon are used for various purposes, such as displaying estimator capabilities in the documentation and selecting which tests to run based on each estimator's capabilities. You can check [all existing tags in aeon](https://github.com/aeon-toolkit/aeon/blob/main/aeon/utils/tags/_tags.py) and the [developer documentation on the testing framework](https://www.aeon-toolkit.org/en/stable/developer_guide/testing.html#) to learn more about how we use tags.

One of the main uses of tags is the input and output formatting and checking performed by the base classes. Some important tags in this regard are:
- `X_inner_type` tag, used to specify the type of input data (arrays, DataFrames, lists) expected by the implementation. For example, this allows an estimator to take both numpy arrays and pandas DataFrames as input, while the implementation uses numpy arrays only.
- `output_data_type` tag, used to specify the expected type of the output data (e.g. tabular, series, collections). This is mostly used by transformation estimators.
- `capability:multivariate` tag, which indicates whether the estimator can handle multivariate time series data.
- `capability:unequal_length` tag, which indicates whether the estimator can handle collections of unequal length time series.
- `capability:multithreading` tag, which indicates whether the estimator can handle multithreading for parallel processing.

We will give some examples using these tags in the following sections.
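As a rough illustration of how class-level tags can behave under inheritance, the sketch below merges `_tags` dictionaries along the class hierarchy so that a subclass only declares the tags it changes. `SketchBase`, `SketchMultivariate` and `resolve_tags` are hypothetical names; the actual logic lives in `BaseAeonEstimator` and may differ in its details.

```python
class SketchBase:
    """Hypothetical base class illustrating class-level tags with overrides."""

    _tags = {
        "capability:multivariate": False,
        "capability:unequal_length": False,
        "X_inner_type": "numpy3D",
    }

    @classmethod
    def resolve_tags(cls):
        # Walk the inheritance chain from the most generic class to the most
        # specific one, so that subclasses override the defaults of their parents.
        collected = {}
        for parent in reversed(cls.__mro__):
            collected.update(vars(parent).get("_tags", {}))
        return collected


class SketchMultivariate(SketchBase):
    # Only the tags that differ from the parent need to be declared.
    _tags = {"capability:multivariate": True}


tags = SketchMultivariate.resolve_tags()
print(tags)
```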

## `BaseCollectionEstimator` and `BaseSeriesEstimator`

We distinguish between two types of inputs for aeon estimators, series and collections:
- Series represent a single time series in a 2D format `(n_channels, n_timepoints)`. Some estimators can also use a 1D format `(n_timepoints)` when they do not support multivariate series. Series estimators also have an `axis` parameter, which allows the input to be transposed so that the 2D format becomes `(n_timepoints, n_channels)` instead.
- Collections represent a set of time series in a 3D format `(n_samples, n_channels, n_timepoints)`. Again, this can sometimes be represented in a 2D format such as `(n_samples, n_timepoints)` for univariate estimators. Preferably, this should be avoided, as it obscures the meaning of the axes and can be confused with a 2D single series. More information on this problem can be found in [this notebook](series_estimator.ipynb).

For example, if we go back to the base class schema, `BaseClassifier` inherits from `BaseCollectionEstimator`. This means that during `fit` and `predict`, all estimators inheriting from `BaseClassifier` take collections of time series as input.


## Collection base estimators

The `BaseCollectionEstimator` defines methods to check the shape of the input, extract metadata (e.g. whether the collection is multivariate) and check the compatibility of the input against the tags of the estimator. For example, consider the following:

In [None]:
from aeon.classification.dictionary_based import TemporalDictionaryEnsemble
from aeon.testing.data_generation import make_example_3d_numpy_list

# TDE does not support unequal length collections,
# as it sets the tag "capability:unequal_length" to False
X_unequal, y_unequal = make_example_3d_numpy_list()
try:
    TemporalDictionaryEnsemble().fit(X_unequal, y_unequal)
except ValueError as e:
    print(e)

Data seen by instance of TemporalDictionaryEnsemble has unequal length series, but TemporalDictionaryEnsemble cannot handle these characteristics. 


What happens here is that `TemporalDictionaryEnsemble` inherits from `BaseClassifier`, which itself inherits from `BaseCollectionEstimator`. During `fit` and `predict`, `BaseClassifier` calls `_preprocess_collection`, a method defined in `BaseCollectionEstimator`. This method extracts the input metadata (whether it is multivariate, of unequal length, etc.) and compares it against the tags of `TemporalDictionaryEnsemble`. These state that the estimator does not support unequal length collections, and hence an exception is raised.
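The check described above can be sketched in plain Python. This is only an illustration of the idea, assuming a collection stored as nested lists; `collection_metadata` and `check_capabilities` are hypothetical helpers and do not reproduce the actual `_preprocess_collection` implementation.

```python
def collection_metadata(X):
    """Extract simple metadata from a collection stored as nested lists.

    X is indexed as X[case][channel][timepoint].
    """
    lengths = {len(channel) for case in X for channel in case}
    return {
        "multivariate": any(len(case) > 1 for case in X),
        "unequal_length": len(lengths) > 1,
    }


def check_capabilities(X, tags, name):
    """Raise ValueError if the data has a characteristic the tags do not allow."""
    metadata = collection_metadata(X)
    problems = [
        key.replace("_", " ")
        for key, present in metadata.items()
        if present and not tags.get(f"capability:{key}", False)
    ]
    if problems:
        raise ValueError(f"{name} cannot handle {', '.join(problems)} series.")
    return metadata


tags = {"capability:multivariate": True, "capability:unequal_length": False}
equal = [[[1.0, 2.0, 3.0]], [[4.0, 5.0, 6.0]]]
unequal = [[[1.0, 2.0, 3.0]], [[4.0, 5.0]]]

print(check_capabilities(equal, tags, "SketchClassifier"))
try:
    check_capabilities(unequal, tags, "SketchClassifier")
    message = None
except ValueError as e:
    message = str(e)
print(message)
```

Centralising this check in the base class means that individual estimators only declare what they support through tags and never write the validation themselves.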

### `BaseClassifier` (aeon.classification)

This is the base class for all classifiers. It uses the standard `fit`, `predict` and `predict_proba` structure from `sklearn`. `fit` and `predict` call the abstract methods `_fit` and `_predict`, which are implemented in the subclass to define the classification algorithm. All of the common format checking and conversion is done in the final methods `fit` and `predict`, before they call `_fit` and `_predict`.

When implementing a new classifier inheriting from `BaseClassifier`, you thus only have to implement the `__init__`, `_fit` and `_predict` methods that handle the classification logic. You will also need to set the correct tags so that the checks and conversions are done for you. Note that each base class also defines some attributes that are commonly used in the estimators. For example, `BaseClassifier` exposes `classes_`, `n_classes_` and `_class_dictionary`, which we can use in our new classifier:

In [None]:
from numpy.random import default_rng

from aeon.classification import BaseClassifier
from aeon.testing.data_generation import (
    make_example_3d_numpy,
    make_example_dataframe_list,
)


class RandomClassifier(BaseClassifier):
    """A dummy classifier returning random predictions."""

    _tags = {
        "capability:multivariate": True,  # allow multivariate collections
        "capability:unequal_length": True,  # allow unequal length collections
        "X_inner_type": ["np-list", "numpy3D"],  # Specify data format used internally
    }

    def __init__(self, random_state: int = 42):
        self.random_state = random_state
        super().__init__()

    def _fit(self, X, y):
        self.rng = default_rng(self.random_state)
        return self

    def _predict(self, X):
        # draw a random index between 0 and n_classes_ - 1 for each case, then use
        # classes_ to convert each index to its class label (_class_dictionary maps
        # the other way, from class label to index)
        idx = self.rng.integers(low=0, high=self.n_classes_, size=len(X))
        return self.classes_[idx].tolist()


X, y = make_example_3d_numpy(n_channels=2)
print(RandomClassifier().fit_predict(X, y))
X, y = make_example_dataframe_list()
print(RandomClassifier().fit(X, y).predict(X))

[1 0 1 1 0 0 1 0 1 0]
[0, 1, 1, 0, 0, 1, 0, 1, 0, 0]


**Further reading**
- the [classification example notebook](../classification/classification.ipynb).

### `BaseRegressor`, `BaseClusterer` and `BaseCollectionAnomalyDetector` 
These base classes are mostly similar to `BaseClassifier` in how they use the checks and conversion operations from `BaseCollectionEstimator`.

- `BaseRegressor` also defines `fit` and `predict` methods and requires `_fit` and `_predict` to be implemented by child classes. The difference is that it has no `predict_proba` method yet, as we still need to decide how to model probabilistic regression for time series. The checks on `y` are also different, as `y` can contain float values.

- `BaseClusterer` also has `fit` and `predict`, but does not take input `y` as child classes can be unsupervised estimators. It does include `predict_proba`.

- `BaseCollectionAnomalyDetector` also has `fit` and `predict`, but does not take input `y` as child classes can be unsupervised estimators.

**Further reading**
- the [regression example notebook](../regression/regression.ipynb).
- the [clustering example notebook](../clustering/clustering.ipynb).
- the [anomaly detection example notebook](../anomaly_detection/anomaly_detection.ipynb).

### `BaseCollectionTransformer` 

Rather than `fit` and `predict`, the `BaseCollectionTransformer` implements `fit`, `transform` and `fit_transform` methods, which are required since this base class also inherits from the `BaseTransformer` class. It requires child classes to define `_fit` and `_transform` methods. The output type of the transform method is not fixed and should be specified with the `output_data_type` tag.

For example, if the output is another collection of time series (e.g. after using `SAX`), then `output_data_type` must take the `Collection` value (note that this is the default value for all `BaseCollectionTransformer` child classes). If the output is no longer a collection of time series, but rather a 2D array of features extracted from each input time series, such as in `Rocket` or `RandomShapeletTransform`, then `output_data_type` must take the `Tabular` value.
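The difference between the two output types can be illustrated with plain functions on nested lists. This is a sketch only: `halve` and `summarise` are hypothetical functions, not aeon transformers, and they stand in for what a `_transform` method with each `output_data_type` would return.

```python
def halve(X):
    """Collection -> Collection: keep every second timepoint of each channel."""
    return [[channel[::2] for channel in case] for case in X]


def summarise(X):
    """Collection -> Tabular: one row of (mean, max) features per case."""
    rows = []
    for case in X:
        values = [v for channel in case for v in channel]
        rows.append([sum(values) / len(values), max(values)])
    return rows


# Two cases, one channel, four timepoints: X[case][channel][timepoint]
X = [[[1.0, 2.0, 3.0, 4.0]], [[2.0, 2.0, 6.0, 2.0]]]

collection_out = halve(X)  # still 3D: (n_cases, n_channels, n_timepoints)
tabular_out = summarise(X)  # now 2D: (n_cases, n_features)
print(collection_out)
print(tabular_out)
```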

**Further reading**
- the [transformation notebook](../transformations/transformations.ipynb).

## Series base estimators

Series estimators are similar to collection estimators, but they take a single time series as input. They inherit from `BaseSeriesEstimator`, which performs similar operations to `BaseCollectionEstimator`, such as input checks and conversions, but for single time series.

One important difference is the use of the `axis` parameter, which allows each estimator to define whether it works with the `(n_channels, n_timepoints)` or the `(n_timepoints, n_channels)` 2D format. The parameter is needed because the layout of a 2D series cannot be inferred from the data itself: a `(2, 10)` input could be 2 channels of 10 timepoints or 10 channels of 2 timepoints.

To understand its uses, we need to distinguish between the `axis` parameter set during initialisation of the estimator, and the `axis` parameter used in the `fit`, `predict` and other methods:
- `axis` given during initialisation is used to define the internal format used by the estimator,
- `axis` given when calling methods is used to transpose the input time series if needed, to match the format used internally by the estimator.

Note that the `axis` value represents the dimension in which the timepoints are stored: `axis=0` means that the timepoints are stored in the first dimension (i.e. `(n_timepoints, n_channels)`), and `axis=1` means that they are stored in the second dimension (i.e. `(n_channels, n_timepoints)`).
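The interplay between the two `axis` values can be sketched as follows. This is a simplified illustration on nested lists, with a hypothetical `SketchSeriesTransformer` class; it mirrors the behaviour described above rather than the actual `BaseSeriesEstimator` code.

```python
def transpose(X):
    """Swap the two axes of a 2D series stored as a list of lists."""
    return [list(row) for row in zip(*X)]


class SketchSeriesTransformer:
    """Hypothetical transformer working internally on (n_channels, n_timepoints)."""

    def __init__(self):
        self.axis = 1  # internal format: timepoints are in the second dimension

    def transform(self, X, axis=1):
        # `axis` here describes the input; transpose only if it differs from
        # the internal format.
        inner = X if axis == self.axis else transpose(X)
        out = self._transform(inner)
        # Return the result in the layout the caller used.
        return out if axis == self.axis else transpose(out)

    def _transform(self, X):
        # Written once, for the internal layout only: keep every second timepoint.
        return [channel[::2] for channel in X]


# 4 timepoints, 2 channels: shape (n_timepoints, n_channels), i.e. axis=0
X = [[1, 10], [2, 20], [3, 30], [4, 40]]
result = SketchSeriesTransformer().transform(X, axis=0)
print(result)
```

The benefit of this design is that `_transform` is written against a single layout, while callers can pass data in either orientation.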

**Further reading**
- the [series estimators notebook](series_estimator.ipynb).

### `BaseSeriesTransformer`

Let's take the example of `BaseSeriesTransformer`, which is the base class for all series transformers. It implements the `fit`, `transform` and `fit_transform` methods, and requires child classes to implement `_fit` and `_transform`. Let us demonstrate the use of the `axis` parameter with an example:

In [None]:
from aeon.testing.data_generation import make_example_dataframe_series
from aeon.transformations.series import BaseSeriesTransformer


class DummySeriesTransformer(BaseSeriesTransformer):
    """A dummy series transformer that keeps every second timepoint."""

    _tags = {
        "capability:multivariate": True,  # allow multivariate series
        "X_inner_type": "pd.DataFrame",  # Specify data format used internally
        "fit_is_empty": True,  # we don't need to define _fit
    }

    def __init__(self):
        super().__init__(axis=1)  # Set axis to 1 for (n_channels, n_timepoints) format

    def _transform(self, X, y=None):
        print(X.shape)
        X = X.iloc[:, ::2]  # Example transformation: keep every second timepoint
        print(X.shape)
        return X


X = make_example_dataframe_series(n_channels=2, n_timepoints=10).T
print(X.shape)  # Is (n_timepoints, n_channels), which is axis=0
print(DummySeriesTransformer().fit_transform(X, axis=0).shape)

(10, 2)
(2, 10)
(2, 5)
(5, 2)


What we see is that `X` starts with shape `(n_timepoints, n_channels)`, which is why we pass `axis=0` when calling `fit_transform`.

The `DummySeriesTransformer` is initialised with `axis=1`, meaning that internally, when calling `fit` and `transform`, the input is converted to the `(n_channels, n_timepoints)` format before being passed to the `_fit` and `_transform` methods. The output is then converted back to the original format `(n_timepoints, n_channels)` before it is returned.

Note that as we specified `pd.DataFrame` as the `X_inner_type` in the tags, if the input is not a DataFrame, the estimator will convert it to a DataFrame before applying the transformation. This allows you to define and use a single input shape and type in the methods you implement, while still allowing the estimator to handle different input formats.

In [20]:
from aeon.testing.data_generation import make_example_2d_numpy_series

X = make_example_2d_numpy_series(n_channels=1, n_timepoints=10)
transformer = DummySeriesTransformer()
print(transformer.fit_transform(X, axis=1).shape)

(1, 10)
(1, 5)
(1, 5)


### `BaseForecaster`

The `BaseForecaster` class inherits from `BaseSeriesEstimator`, which provides the checks and conversion functions for single time series inputs. Forecasters predict the value of a time series `horizon` steps ahead. The `horizon` parameter is defined during initialisation to specify how many steps ahead the forecaster should predict. The main methods are:

- `fit(y, exog=None)`: Trains the forecaster on input data
- `predict(y, exog=None)`: Makes predictions on new data
- `forecast(y, exog=None)`: Combines fit and predict into one operation

For child classes, the `_fit` and `_predict` methods must be implemented to define the forecasting logic.

It also provides two main forecasting strategies:
- `direct_forecast()`: Makes predictions by training a separate model for each future step. See [the direct forecasting notebook](../forecasting/direct.ipynb) for more information on this strategy.
- `iterative_forecast()`: Makes predictions recursively using a single trained model. See [the iterative forecasting notebook](../forecasting/iterative.ipynb) for more information on this strategy.
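To illustrate the difference between the two strategies, here is a minimal sketch in plain Python. It assumes a deliberately simple, hypothetical model (`fit_drift`, the mean change of the series over a given number of steps); `iterative_sketch` and `direct_sketch` are illustrative functions and not the aeon methods themselves.

```python
def fit_drift(y, horizon):
    """Fit a hypothetical one-parameter model: the mean change over `horizon` steps."""
    diffs = [y[t + horizon] - y[t] for t in range(len(y) - horizon)]
    return sum(diffs) / len(diffs)


def iterative_sketch(y, steps):
    """Train one model for one step ahead and feed its predictions back in."""
    drift = fit_drift(y, 1)
    history = list(y)
    predictions = []
    for _ in range(steps):
        next_value = history[-1] + drift
        predictions.append(next_value)
        history.append(next_value)  # the prediction becomes part of the input
    return predictions


def direct_sketch(y, steps):
    """Train a separate model for each future step h = 1, ..., steps."""
    return [y[-1] + fit_drift(y, h) for h in range(1, steps + 1)]


y = [1.0, 3.0, 2.0, 5.0, 4.0, 7.0]
iterative = iterative_sketch(y, steps=3)
direct = direct_sketch(y, steps=3)
print(iterative)
print(direct)
```

The iterative strategy reuses a single model but feeds its own predictions back as inputs, whereas the direct strategy fits one model per step and always predicts from observed values.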


**Further reading**
- the [forecasting example notebook](../forecasting/forecasting.ipynb).

### `BaseSeriesAnomalyDetector` and `BaseSegmenter`

The `BaseSeriesAnomalyDetector` and `BaseSegmenter` base classes both implement `fit` and `predict`, but only require child classes to implement `_predict`, as most anomaly detectors and segmenters are unsupervised estimators. Both also inherit from `BaseSeriesEstimator`, which provides the checking and conversion functions we have seen before.

**Further reading**
- the [anomaly detection example notebook](../anomaly_detection/anomaly_detection.ipynb).
- the [segmentation example notebook](../segmentation/segmentation.ipynb).