[ENH] New base classes: base series estimator, segmenters and base series transformers #996

Merged
merged 134 commits into from
Feb 2, 2024
Commits
787fe10
switch test example for pipeline
TonyBagnall Oct 21, 2023
174fff5
switch test example for pipeline
TonyBagnall Oct 21, 2023
e4d4b3e
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Oct 22, 2023
bd75ab3
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Oct 22, 2023
dc4d82c
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Oct 23, 2023
60027c5
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Oct 25, 2023
4fbcba1
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Oct 26, 2023
a500a65
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Oct 30, 2023
50195ee
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Nov 1, 2023
ad9cb95
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Nov 2, 2023
ad5686a
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Nov 3, 2023
811f975
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Nov 10, 2023
18ed16a
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Nov 17, 2023
5c8b927
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Nov 19, 2023
c66fe4b
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Nov 22, 2023
3b65ff1
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Nov 23, 2023
287fb8d
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Nov 23, 2023
89aad8b
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Nov 24, 2023
af7ba63
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Nov 25, 2023
2f67762
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Nov 27, 2023
1465a8e
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Dec 1, 2023
e38d3d5
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Dec 3, 2023
774e95f
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Dec 4, 2023
19b1486
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Dec 4, 2023
be9058a
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Dec 4, 2023
4e8b846
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Dec 4, 2023
8d192d7
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Dec 4, 2023
67789fb
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Dec 5, 2023
18ca506
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Dec 5, 2023
5124852
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Dec 5, 2023
d1a620d
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Dec 6, 2023
f48eb2a
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Dec 11, 2023
6e0f7a3
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Dec 11, 2023
3c94403
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Dec 12, 2023
00dca42
BaseSeries
TonyBagnall Dec 15, 2023
4594206
BaseSeries
TonyBagnall Dec 15, 2023
1abc17f
base series
TonyBagnall Dec 15, 2023
ba7996a
MP transformer
TonyBagnall Dec 15, 2023
df54528
collection wrapper
TonyBagnall Dec 15, 2023
7f207d6
Merge branch 'main' into ajb/base_series
TonyBagnall Dec 18, 2023
b91115d
docstring for MP
TonyBagnall Dec 18, 2023
1b730fc
set axis
TonyBagnall Dec 18, 2023
b506ca6
Merge branch 'ajb/base_series' of https://github.com/aeon-toolkit/aeo…
TonyBagnall Dec 18, 2023
3c46f78
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Dec 18, 2023
8ee7572
add to registry
TonyBagnall Dec 18, 2023
426d0e8
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Dec 18, 2023
c05bb35
add tag
TonyBagnall Dec 19, 2023
10e7ec8
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Dec 19, 2023
24f0315
Merge branch 'main' into ajb/base_series
TonyBagnall Dec 19, 2023
44eb428
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Dec 20, 2023
939568b
Merge branch 'main' into ajb/base_series
TonyBagnall Dec 20, 2023
f6bea6b
docstring
TonyBagnall Dec 20, 2023
8565ac6
docstring
TonyBagnall Dec 20, 2023
c3fb39d
docstring
TonyBagnall Dec 20, 2023
7e6447b
docstring
TonyBagnall Dec 20, 2023
d091cfa
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Dec 20, 2023
37263f8
Merge branch 'main' into ajb/base_series
TonyBagnall Dec 20, 2023
ffb61eb
try revert name to MatrixProfileTransformer
TonyBagnall Dec 20, 2023
c45699a
rename series transformer
TonyBagnall Dec 20, 2023
7e8929c
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Dec 29, 2023
af8ba8c
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Dec 30, 2023
4538d69
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Jan 3, 2024
0c0a3ae
Merge branch 'main' into ajb/base_series
TonyBagnall Jan 6, 2024
4a58e5b
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Jan 8, 2024
c5b0f5a
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Jan 8, 2024
a519bbb
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Jan 8, 2024
d3ac259
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Jan 9, 2024
9ffe169
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Jan 11, 2024
219e850
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Jan 12, 2024
b17cac9
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Jan 12, 2024
e2bb953
Merge branch 'main' into ajb/base_series
TonyBagnall Jan 13, 2024
c824ef7
remove reference to segmenter
TonyBagnall Jan 13, 2024
6550a20
inherit from BaseTransformer for tests
TonyBagnall Jan 13, 2024
76c3bb5
revert inheritance
TonyBagnall Jan 13, 2024
ead5134
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Jan 13, 2024
45f98d6
Merge branch 'main' into ajb/base_series
TonyBagnall Jan 15, 2024
3dcd264
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Jan 17, 2024
8d0070d
Merge branch 'main' into ajb/base_series
TonyBagnall Jan 17, 2024
b6e2863
base series estimator
TonyBagnall Jan 17, 2024
553f5b8
base series estimator
TonyBagnall Jan 17, 2024
88dc724
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Jan 17, 2024
3309aeb
add update
TonyBagnall Jan 17, 2024
4c74a02
clasp to series
TonyBagnall Jan 17, 2024
74dd8af
clasp to series
TonyBagnall Jan 18, 2024
2a44bbc
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Jan 18, 2024
df9ece7
undo clasp changes
TonyBagnall Jan 18, 2024
578cda9
refactor base segmenter
TonyBagnall Jan 18, 2024
ef207b4
standardise
TonyBagnall Jan 18, 2024
1ed4796
Merge branch 'main' into ajb/base_series
TonyBagnall Jan 18, 2024
0b1bf1b
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Jan 18, 2024
232c511
tests for base segmenter
TonyBagnall Jan 19, 2024
cd1bced
tests for base segmenter
TonyBagnall Jan 19, 2024
1235e69
tests for matrix profile
TonyBagnall Jan 19, 2024
9174b1b
tests for base transformer
TonyBagnall Jan 19, 2024
8e8b964
move check_y up
TonyBagnall Jan 19, 2024
6cbe93d
exceptions in infer class
TonyBagnall Jan 19, 2024
fe656cf
fix to_classification
TonyBagnall Jan 19, 2024
25083e7
Merge branch 'main' into ajb/base_series
TonyBagnall Jan 19, 2024
3029a47
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Jan 19, 2024
cc64ee3
Merge branch 'main' into ajb/base_series
TonyBagnall Jan 19, 2024
071771f
revised tests for segmenters
TonyBagnall Jan 19, 2024
bd20174
MP test
TonyBagnall Jan 19, 2024
3b8fff2
remove example of private method with softdep
TonyBagnall Jan 20, 2024
6d49e92
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Jan 21, 2024
b9f9864
Merge branch 'main' into ajb/base_series
TonyBagnall Jan 21, 2024
660fb52
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Jan 24, 2024
283946d
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Jan 24, 2024
a558fce
Merge branch 'main' into ajb/base_series
TonyBagnall Jan 24, 2024
7bf7c51
refactor MPTrans until old one deprecated
TonyBagnall Jan 24, 2024
051570e
docs
TonyBagnall Jan 24, 2024
a7d0a4d
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Jan 24, 2024
d6b25e4
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Jan 25, 2024
ef8ed2d
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Jan 26, 2024
928a8ab
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Jan 26, 2024
07b75d6
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Jan 27, 2024
b309aa1
Merge branch 'main' into ajb/base_series
TonyBagnall Jan 27, 2024
10b1ddd
Merge branch 'main' of https://github.com/aeon-toolkit/aeon
TonyBagnall Jan 28, 2024
34a5708
Merge branch 'main' into ajb/base_series
TonyBagnall Jan 28, 2024
531f1f1
[pre-commit.ci lite] apply automatic fixes
pre-commit-ci-lite[bot] Jan 28, 2024
4734206
take private
TonyBagnall Jan 29, 2024
ab98ef6
revert base series
TonyBagnall Jan 29, 2024
748574e
Merge branch 'main' into ajb/base_series
TonyBagnall Jan 29, 2024
6fa6798
[pre-commit.ci lite] apply automatic fixes
pre-commit-ci-lite[bot] Jan 29, 2024
59021c2
remove attrs
TonyBagnall Jan 29, 2024
06377d2
comment rework
TonyBagnall Feb 1, 2024
1aee903
to_classification
TonyBagnall Feb 1, 2024
e5749c0
docs
TonyBagnall Feb 1, 2024
3a21256
MatrixProfileTransformer
TonyBagnall Feb 1, 2024
5b175ae
comment
TonyBagnall Feb 1, 2024
b23f8b1
remove random get_test_params
TonyBagnall Feb 1, 2024
52d56d2
add base to API
TonyBagnall Feb 1, 2024
5ed7523
change base segmenter test parameters
TonyBagnall Feb 1, 2024
d2c962d
docstrings
TonyBagnall Feb 1, 2024
ccac1c5
docstrings
TonyBagnall Feb 1, 2024
2 changes: 2 additions & 0 deletions aeon/base/__init__.py
@@ -5,11 +5,13 @@
     "BaseObject",
     "BaseEstimator",
     "BaseCollectionEstimator",
+    "BaseSeriesEstimator",
     "_HeterogenousMetaEstimator",
     "load",
 ]

 from aeon.base._base import BaseEstimator, BaseObject
 from aeon.base._base_collection import BaseCollectionEstimator
+from aeon.base._base_series import BaseSeriesEstimator
 from aeon.base._meta import _HeterogenousMetaEstimator
 from aeon.base._serialize import load
6 changes: 3 additions & 3 deletions aeon/base/_base_collection.py
@@ -93,9 +93,9 @@ def _preprocess_collection(self, X):
     def _check_X(self, X):
         """Check classifier input X is valid.

-        Check if the input data is a compatible type, and that this classifier is
+        Check if the input data is a compatible type, and that this estimator is
         able to handle the data characteristics. This is done by matching the
-        capabilities of the classifier against the metadata for X for
+        capabilities of the estimator against the metadata for X for
         univariate/multivariate, equal length/unequal length and no missing
         values/missing values.

@@ -123,7 +123,7 @@ def _check_X(self, X):
         >>> import numpy as np
         >>> X = np.random.random(size=(5,3,10)) # X is equal length, multivariate
         >>> hc = HIVECOTEV2()
-        >>> m = hc._check_X(X) # HC2 can handle this
+        >>> meta=hc._check_X(X) # HC2 can handle this
         """
         metadata = self._get_metadata(X)
         # Check classifier capabilities for X
253 changes: 253 additions & 0 deletions aeon/base/_base_series.py
@@ -0,0 +1,253 @@
"""Base class for estimators that fit single (possibly multivariate) time series."""

import numpy as np
import pandas as pd

from aeon.base._base import BaseEstimator
from aeon.utils.validation._dependencies import _check_estimator_deps

# allowed input and internal data types for Series
VALID_INNER_TYPES = [
    "np.ndarray",
    "pd.Series",
    "pd.DataFrame",
]
VALID_INPUT_TYPES = [pd.DataFrame, pd.Series, np.ndarray]


class BaseSeriesEstimator(BaseEstimator):
    """Base class for estimators that use single (possibly multivariate) time series.

    Provides functions that are common to BaseSeriesEstimator objects, including
    BaseSeriesTransformer and BaseSegmenter, for the checking and
    conversion of input to fit, predict and transform, where relevant.

    It also stores the common default tags used by all the subclasses and meta data
    describing the characteristics of time series passed to ``fit``.

    Input and internal data format
        Univariate series:
            Numpy array:
            shape `(m,)`, `(m, 1)` or `(1, m)`. If ``self`` has no multivariate
            capability, i.e. ``self.get_tag("capability:multivariate") == False``,
            all are converted to 1D numpy `(m,)`.
            If ``self`` has multivariate capability, converted to 2D numpy `(m,1)` or
            `(1, m)` depending on axis.
            pandas DataFrame or Series:
            DataFrame single column shape `(m,1)`, `(1,m)` or Series shape `(m,)`.
            If ``self`` has no multivariate capability, all converted to Series `(m,)`.
            If ``self`` has multivariate capability, all converted to pandas DataFrame
            shape `(m,1)`, `(1,m)` depending on axis.

        Multivariate series:
            Numpy array, shape `(m,d)` or `(d,m)`.
            pandas DataFrame `(m,d)` or `(d,m)`.

    Parameters
    ----------
    axis : int, default = 0
        Axis along which to segment if passed a multivariate series (2D input). If axis
        is 0, it is assumed each column is a time series and each row is a
        timepoint, i.e. the shape of the data is ``(n_timepoints, n_channels)``.
        ``axis == 1`` indicates the time series are in rows, i.e. the shape of the data
        is ``(n_channels, n_timepoints)``.
    """

    _tags = {
        "capability:univariate": True,
        "capability:multivariate": False,
        "capability:missing_values": False,
        "X_inner_type": "np.ndarray",  # one of VALID_INNER_TYPES
    }

    def __init__(self, axis=0):
        self.axis = axis
        self.metadata_ = {}  # metadata/properties of data seen in fit
        super().__init__()
        _check_estimator_deps(self)

    def _check_X(self, X):
        """Check input X is valid.

        Check if the input data is a compatible type, and that this estimator is
        able to handle the data characteristics. This is done by matching the
        capabilities of the estimator against the metadata for X for
        univariate/multivariate and no missing values/missing values.

        Parameters
        ----------
        X : np.ndarray, pd.Series or pd.DataFrame
            A single time series stored in one of VALID_INPUT_TYPES.

        Returns
        -------
        dict
            Meta data about X, with flags:
            metadata["missing_values"] : whether X has missing values or not
            metadata["multivariate"] : whether X has more than one channel or not

        See Also
        --------
        _convert_X : function that converts X after it has been checked.
        """
        # Checks: check valid type and axis
        if type(X) not in VALID_INPUT_TYPES:
            raise ValueError(
                f"Error in input type: should be one of "
                f"{VALID_INNER_TYPES}, saw {type(X)}"
            )
        if isinstance(X, np.ndarray):
            # Check valid shape
            if X.ndim > 2:
                raise ValueError("Should be 1D or 2D")
            if not (
                issubclass(X.dtype.type, np.integer)
                or issubclass(X.dtype.type, np.floating)
            ):
                raise ValueError("np.ndarray must contain floats or ints")
        elif isinstance(X, pd.Series):
            if not pd.api.types.is_numeric_dtype(X):
                raise ValueError("pd.Series must be numeric")
        else:
            if not all(pd.api.types.is_numeric_dtype(X[col]) for col in X.columns):
                raise ValueError("pd.DataFrame must be numeric")
        # If X is a single series dataframe, we squeeze it into Series in convert_X
        X = X.squeeze()
        metadata = {}
        metadata["multivariate"] = False
        # Need to differentiate because a 1D series stored in a dataframe will have
        # ndim=2. This case is dealt with in convert through squeezing to 1D
        if X.ndim > 1:
            metadata["multivariate"] = True
        if isinstance(X, np.ndarray):
            metadata["missing_values"] = np.isnan(X).any()
        elif isinstance(X, pd.Series):
            metadata["missing_values"] = X.isna().any()
        elif isinstance(X, pd.DataFrame):
            metadata["missing_values"] = X.isna().any().any()
        allow_multivariate = self.get_tag("capability:multivariate")
        allow_univariate = self.get_tag("capability:univariate")
        allow_missing = self.get_tag("capability:missing_values")
        if metadata["missing_values"] and not allow_missing:
            raise ValueError("Missing values not supported")
        if metadata["multivariate"] and not allow_multivariate:
            raise ValueError("Multivariate data not supported")
        if not metadata["multivariate"] and not allow_univariate:
            raise ValueError("Univariate data not supported")
        return metadata

    def _check_y(self, y: VALID_INPUT_TYPES):
        """Check y specific to segmentation.

        y must be a univariate series
        """
        if type(y) not in VALID_INPUT_TYPES:
            raise ValueError(
                f"Error in input type for y: it should be one of "
                f"{VALID_INPUT_TYPES}, saw {type(y)}"
            )
        if isinstance(y, np.ndarray):
            # Check valid shape
            if y.ndim > 1:
                raise ValueError(
                    "Error in input type for y: y input as np.ndarray should be 1D"
                )
            if not (
                issubclass(y.dtype.type, np.integer)
                or issubclass(y.dtype.type, np.floating)
            ):
                raise ValueError(
                    "Error in input type for y: y input must contain floats or ints"
                )
        elif isinstance(y, pd.Series):
            if not pd.api.types.is_numeric_dtype(y):
                raise ValueError(
                    "Error in input type for y: y input as pd.Series must be numeric"
                )
        else:  # pd.DataFrame
            # More than one column means y is not a univariate series
            if y.shape[1] > 1:
                raise ValueError(
                    "Error in input type for y: y input as pd.DataFrame "
                    "should have a single "
                    "column series"
                )

            if not all(pd.api.types.is_numeric_dtype(y[col]) for col in y.columns):
                raise ValueError(
                    "Error in input type for y: y input as pd.DataFrame "
                    "must be numeric"
                )

    def _convert_X(self, X, axis):
        inner = self.get_tag("X_inner_type").split(".")[-1]
        input = type(X).__name__
        if inner != input:
            if inner == "ndarray":
                X = X.to_numpy()
            elif inner == "Series":
                if input == "ndarray":
                    X = pd.Series(X)
            elif inner == "DataFrame":
                X = pd.DataFrame(X)
            else:
                tag = self.get_tag("X_inner_type")
                raise ValueError(f"Unknown inner type {inner} derived from {tag}")
        if axis > 1 or axis < 0:
            raise ValueError("Axis should be 0 or 1")
        if not self.get_tag("capability:multivariate"):
            X = X.squeeze()
        elif X.ndim == 1:  # np.ndarray case make 2D
            X = X.reshape(1, -1)
        if X.ndim > 1:
            if self.axis != axis:
                X = X.T
        return X

    def _preprocess_series(self, X, axis=None):
        """Preprocess input X prior to call to fit.

        Checks the characteristics of X, stores the metadata, checks that self can
        handle the data, then converts X to X_inner_type.

        Parameters
        ----------
        X : one of VALID_INPUT_TYPES
        axis : int or None
            Axis of the time dimension in X. If None, ``self.axis`` is used.

        Returns
        -------
        Data structure of the type given by the tag ``"X_inner_type"``.

        See Also
        --------
        _check_X : function that checks X is valid before conversion.
        _convert_X : function that converts to inner type.
        """
        if axis is None:
            axis = self.axis
        meta = self._check_X(X)
        if len(self.metadata_) == 0:
            self.metadata_ = meta
        return self._convert_X(X, axis)

    @classmethod
    def get_test_params(cls, parameter_set="default"):
        """
        Return testing parameter settings for the estimator.

        Parameters
        ----------
        parameter_set : str, default="default"

        Returns
        -------
        params : dict or list of dict
            Parameters to create testing instances of the class.
        """
        # default parameters: time points in rows, channels in columns
        return {"axis": 0}
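The capability check at the end of `_check_X` can be looked at in isolation. The block below is a minimal standalone sketch, not the aeon implementation: `DEFAULT_TAGS`, `check_capabilities` and `is_supported` are hypothetical names introduced here, and a plain dict stands in for the estimator's tag lookup.

```python
# Hypothetical stand-in for the default tags of BaseSeriesEstimator.
DEFAULT_TAGS = {
    "capability:univariate": True,
    "capability:multivariate": False,
    "capability:missing_values": False,
}


def check_capabilities(metadata, tags=DEFAULT_TAGS):
    """Raise ValueError if the data needs a capability the tags do not grant."""
    if metadata["missing_values"] and not tags["capability:missing_values"]:
        raise ValueError("Missing values not supported")
    if metadata["multivariate"] and not tags["capability:multivariate"]:
        raise ValueError("Multivariate data not supported")
    if not metadata["multivariate"] and not tags["capability:univariate"]:
        raise ValueError("Univariate data not supported")
    return metadata


def is_supported(metadata, tags=DEFAULT_TAGS):
    """Return True when check_capabilities accepts the metadata."""
    try:
        check_capabilities(metadata, tags)
    except ValueError:
        return False
    return True


# With the default tags, only clean univariate series are accepted.
univariate_ok = is_supported({"multivariate": False, "missing_values": False})
multivariate_ok = is_supported({"multivariate": True, "missing_values": False})
```

Under these assumptions a subclass that handles multivariate input would only need to override the `"capability:multivariate"` tag for the second case to pass.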
2 changes: 1 addition & 1 deletion aeon/base/tests/test_base_collection.py
@@ -97,7 +97,7 @@ def test__convert_X(internal_type, data):


 @pytest.mark.parametrize("data", COLLECTIONS_DATA_TYPES)
-def test_preprocess_fit(data):
+def test_preprocess_collection(data):
     """Test the functionality for preprocessing fit."""
     data = EQUAL_LENGTH_UNIVARIATE[data]
     cls = BaseCollectionEstimator()
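For the series base class added in `aeon/base/_base_series.py`, the shape convention described in the `BaseSeriesEstimator` docstring (`axis=0` means `(n_timepoints, n_channels)`, `axis=1` means `(n_channels, n_timepoints)`) might be illustrated as follows. This is a sketch under stated assumptions and not a copy of `_convert_X`: `to_inner_layout` and its `inner_axis` and `multivariate` parameters are hypothetical, and only NumPy input is covered.

```python
import numpy as np


def to_inner_layout(X, axis, inner_axis=0, multivariate=False):
    """Hypothetical helper: bring a single series into the estimator's layout.

    axis describes the input: 0 means (n_timepoints, n_channels),
    1 means (n_channels, n_timepoints). inner_axis is the layout the
    estimator works with internally.
    """
    if axis not in (0, 1):
        raise ValueError("Axis should be 0 or 1")
    X = np.asarray(X)
    if not multivariate:
        # Drop length-one dimensions so (m, 1) and (1, m) both become (m,).
        return X.squeeze()
    if X.ndim == 1:
        # Promote a 1D series to 2D with a single channel.
        X = X.reshape(-1, 1) if axis == 0 else X.reshape(1, -1)
    if axis != inner_axis:
        # Input and internal layouts differ, so swap time and channel axes.
        X = X.T
    return X


# 6 time points and 2 channels, stored with time points in rows (axis=0).
series = np.arange(12.0).reshape(6, 2)
as_rows = to_inner_layout(series, axis=0, inner_axis=1, multivariate=True)

# A single-column 2D array handled by a univariate-only estimator.
flat = to_inner_layout(np.arange(5.0).reshape(5, 1), axis=0)
```

A test for the series estimator could then assert on the resulting shapes in the same way the collection test above asserts on converted collections.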