[ENH] dunders for time series distances and kernels (#3949)
This PR adds dunder methods for time series distances and kernels (descendants of `BasePairwiseTransformerPanel`). They behave as users would likely expect; details are described below.

It also adds tests for the different combinations of dunders and estimators.

### algebraic operations between time series distances and kernels

* `d = dist1 * dist2` gives `d(X1, X2) == dist1(X1, X2) * dist2(X1, X2)`, for all pairwise transformers `dist1`, `dist2`, with equality holding for every element of the resulting matrix
* `d = dist1 + dist2` gives `d(X1, X2) == dist1(X1, X2) + dist2(X1, X2)`, for all pairwise transformers `dist1`, `dist2`
* for a pairwise distance `dist` and an int or float `const`, `d = dist * const` or `d = const * dist` gives `d(X1, X2) == dist(X1, X2) * const`
* for a pairwise distance `dist` and an int or float `const`, `d = dist + const` or `d = const + dist` gives `d(X1, X2) == dist(X1, X2) + const`
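
To illustrate the intended semantics of the four rules above, here is a minimal pure-Python sketch. It does not use `sktime`: the class `PairwiseDist` and the two toy distances are hypothetical stand-ins, and the actual implementation in this PR builds `CombinedDistance` objects instead of closures.

```python
class PairwiseDist:
    """Toy stand-in for a pairwise transformer; NOT the sktime base class."""

    def __init__(self, func):
        self.func = func  # func(x1, x2) -> number, for two sequences

    def __call__(self, X1, X2):
        # full matrix of pairwise values, rows index X1, columns index X2
        return [[self.func(a, b) for b in X2] for a in X1]

    @staticmethod
    def _coerce(other):
        # ints and floats are treated as constant distances
        if isinstance(other, (int, float)):
            return PairwiseDist(lambda a, b, c=other: c)
        return other if isinstance(other, PairwiseDist) else None

    def _combine(self, other, op):
        other = self._coerce(other)
        if other is None:
            # let Python try the reflected method of `other`
            return NotImplemented
        return PairwiseDist(lambda a, b: op(self.func(a, b), other.func(a, b)))

    def __add__(self, other):
        return self._combine(other, lambda u, v: u + v)

    def __mul__(self, other):
        return self._combine(other, lambda u, v: u * v)

    # number addition and multiplication commute, so the reflected
    # versions (`const + dist`, `const * dist`) can reuse the same logic
    __radd__ = __add__
    __rmul__ = __mul__


abs_sum = PairwiseDist(lambda a, b: sum(abs(u - v) for u, v in zip(a, b)))
max_abs = PairwiseDist(lambda a, b: max(abs(u - v) for u, v in zip(a, b)))

X1 = [[0.0, 1.0, 2.0], [1.0, 1.0, 1.0]]
X2 = [[0.0, 0.0, 0.0]]

prod = abs_sum * max_abs   # elementwise product of the two matrices
shifted = 2 + abs_sum      # constant added to every matrix entry
```

With these inputs, `prod(X1, X2)` agrees entry by entry with the product of `abs_sum(X1, X2)` and `max_abs(X1, X2)`.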

### pipeline concatenation between ordinary transformers and time series distances or kernels

* for a transformer `trafo` and a pairwise `dist`, the estimator `pipe = trafo * dist` is also a pairwise distance, with `pipe(X1, X2) == dist(trafo.fit_transform(X1), trafo.fit_transform(X2))`
* here, `trafo` can be an `sktime` transformer or an `sklearn` transformer (which is coerced by wrapping it in `TabularToSeriesAdaptor`)

This may be especially interesting for users with a research interest in time series classification or clustering, as it makes it easy to obtain common time series distances as compositions of others.

E.g., ddtw (for common definitions of ddtw) is the same as `Differencer() * DtwDist()` (first difference, then dtw distance). Higher order differences or other combinations are also easy to obtain this way.
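
The "transform first, then take the distance" composition can be sketched in plain Python as follows. All class names here are hypothetical toys, not the `sktime` classes: `ToyDifferencer` only takes first differences, and `AbsSumDist` (sum of absolute differences) stands in for DTW to keep the sketch short. The point is the dispatch: the transformer does not define `*`, so Python falls back to `__rmul__` on the distance.

```python
class ToyDifferencer:
    """Toy first-difference transformer; hypothetical, not sktime's Differencer."""

    def fit_transform(self, X):
        return [[b - a for a, b in zip(x, x[1:])] for x in X]


class ToyDist:
    """Toy base class for pairwise distances, providing `trafo * dist`."""

    def __rmul__(self, other):
        # reached for `other * self` when `other` does not handle `*` itself
        if hasattr(other, "fit_transform"):
            return TrafoThenDist(trafo=other, dist=self)
        return NotImplemented


class AbsSumDist(ToyDist):
    """Sum of absolute differences; a simple stand-in for a DTW distance."""

    def __call__(self, X1, X2):
        return [[sum(abs(u - v) for u, v in zip(a, b)) for b in X2] for a in X1]


class TrafoThenDist(ToyDist):
    """Pipeline: apply the transformer to both inputs, then the distance."""

    def __init__(self, trafo, dist):
        self.trafo = trafo
        self.dist = dist

    def __call__(self, X1, X2):
        return self.dist(self.trafo.fit_transform(X1), self.trafo.fit_transform(X2))


X1 = [[0, 1, 3], [2, 2, 2]]
X2 = [[0, 0, 0]]

plain = AbsSumDist()
first_order = ToyDifferencer() * AbsSumDist()   # difference once, then distance
second_order = ToyDifferencer() * first_order   # difference twice, then distance
```

Because the pipeline is itself a distance, it can be composed again, which is how higher order differences arise in this sketch.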
fkiraly committed Dec 27, 2022
1 parent 6441e14 commit 65a5304
Showing 4 changed files with 382 additions and 3 deletions.
26 changes: 23 additions & 3 deletions sktime/base/_meta.py
Expand Up @@ -402,7 +402,13 @@ def _make_strings_unique(self, strlist):
return self._make_strings_unique(uniquestr)

  def _dunder_concat(
- self, other, base_class, composite_class, attr_name="steps", concat_order="left"
+ self,
+ other,
+ base_class,
+ composite_class,
+ attr_name="steps",
+ concat_order="left",
+ composite_params=None,
  ):
"""Concatenate pipelines for dunder parsing, helper function.
Expand All @@ -426,6 +432,9 @@ def _dunder_concat(
concat_order : str, one of "left" and "right", optional, default="left"
if "left", result attr_name will be like self.attr_name + other.attr_name
if "right", result attr_name will be like other.attr_name + self.attr_name
composite_params : dict, optional, default=None; else, pairs strname-value
if not None, parameters of the composite are always set accordingly
i.e., contains key-value pairs, and composite_class has key set to value
Returns
-------
Expand Down Expand Up @@ -488,11 +497,22 @@ def concat(x, y):
else:
return NotImplemented

+ # create the "steps" param for the composite
  # if all the names are equal to class names, we eat them away
  if all(type(x[1]).__name__ == x[0] for x in zip(new_names, new_ests)):
- return composite_class(**{attr_name: list(new_ests)})
+ step_param = {attr_name: list(new_ests)}
  else:
- return composite_class(**{attr_name: list(zip(new_names, new_ests))})
+ step_param = {attr_name: list(zip(new_names, new_ests))}
+
+ # retrieve other parameters, from composite_params attribute
+ if composite_params is None:
+ composite_params = {}
+ else:
+ composite_params = composite_params.copy()
+
+ # construct the composite with both step and additional params
+ composite_params.update(step_param)
+ return composite_class(**composite_params)

def _anytagis(self, tag_name, value, estimators):
"""Return whether any estimator in list has tag `tag_name` of value `value`.
Expand Down
121 changes: 121 additions & 0 deletions sktime/dists_kernels/_base.py
Expand Up @@ -228,6 +228,127 @@ def __call__(self, X, X2=None):
# this just defines __call__ as an alias for transform
return self.transform(X=X, X2=X2)

def __mul__(self, other):
"""Magic * method, return (right) multiplied CombinedDistance.
Implemented for `other` being:
* a pairwise panel transformer, then `CombinedDistance([self, other], "*")`
Parameters
----------
other: one of:
* `sktime` transformer, must inherit from BaseTransformer,
otherwise, `NotImplemented` is returned (leads to further dispatch by rmul)
Returns
-------
CombinedDistance object,
algebraic multiplication of `self` (first) with `other` (last).
not nested, contains only non-CombinedDistance `sktime` transformers
"""
from sktime.dists_kernels.algebra import CombinedDistance
from sktime.dists_kernels.dummy import ConstantPwTrafoPanel

# when other is an integer or float, treat it as constant distance/kernel
if isinstance(other, (int, float)):
other = ConstantPwTrafoPanel(constant=other)

# we wrap self in a CombinedDistance, and concatenate with the other
# the CombinedDistance does the rest, e.g., dispatch on other
if isinstance(other, BasePairwiseTransformerPanel):
if not isinstance(self, CombinedDistance):
self_as_pipeline = CombinedDistance(pw_trafos=[self], operation="*")
else:
self_as_pipeline = self
return self_as_pipeline * other
# otherwise, we let the right operation handle the remaining dispatch
else:
return NotImplemented

def __rmul__(self, other):
"""Magic * method, return (right) PwTrafoPanelPipeline or CombinedDistance.
Implemented for `other` being:
* a transformer, then `PwTrafoPanelPipeline([other, self])` is returned
* sklearn transformers are coerced via TabularToSeriesAdaptor
Parameters
----------
other: `sktime` transformer, must inherit from BaseTransformer
otherwise, `NotImplemented` is returned
Returns
-------
PwTrafoPanelPipeline object,
concatenation of `other` (first) with `self` (last).
not nested, contains only non-TransformerPipeline `sktime` steps
"""
from sktime.dists_kernels.compose import PwTrafoPanelPipeline
from sktime.dists_kernels.dummy import ConstantPwTrafoPanel
from sktime.transformations.base import BaseTransformer
from sktime.transformations.compose import TransformerPipeline
from sktime.transformations.series.adapt import TabularToSeriesAdaptor
from sktime.utils.sklearn import is_sklearn_transformer

# when other is an integer or float, treat it as constant distance/kernel
if isinstance(other, (int, float)):
other = ConstantPwTrafoPanel(constant=other)

# behaviour is implemented only if other inherits from BaseTransformer
# in that case, distinctions arise from whether self or other is a pipeline
# todo: this can probably be simplified further with "zero length" pipelines
if isinstance(other, BaseTransformer):
# PwTrafoPanelPipeline already has the dunder method defined
if isinstance(self, PwTrafoPanelPipeline):
return other * self
# if other is a TransformerPipeline but self is not, first unwrap it
elif isinstance(other, TransformerPipeline):
return PwTrafoPanelPipeline(pw_trafo=self, transformers=other.steps)
# if neither self nor other are a pipeline, construct a PwTrafoPanelPipeline
else:
return PwTrafoPanelPipeline(pw_trafo=self, transformers=[other])
elif is_sklearn_transformer(other):
return TabularToSeriesAdaptor(other) * self
else:
return NotImplemented

def __add__(self, other):
"""Magic + method, return (right) added CombinedDistance.
Implemented for `other` being:
* a pairwise panel transformer, then `CombinedDistance([self, other], "+")`
Parameters
----------
other: one of:
* `sktime` transformer, must inherit from BaseTransformer,
otherwise, `NotImplemented` is returned (leads to further dispatch by radd)
Returns
-------
CombinedDistance object,
algebraic addition of `self` (first) with `other` (last).
not nested, contains only non-CombinedDistance `sktime` transformers
"""
from sktime.dists_kernels.algebra import CombinedDistance
from sktime.dists_kernels.dummy import ConstantPwTrafoPanel

# when other is an integer or float, treat it as constant distance/kernel
if isinstance(other, (int, float)):
other = ConstantPwTrafoPanel(constant=other)

# we wrap self in a CombinedDistance, and concatenate with the other
# the CombinedDistance does the rest, e.g., dispatch on other
if isinstance(other, BasePairwiseTransformerPanel):
if not isinstance(self, CombinedDistance):
self_as_pipeline = CombinedDistance(pw_trafos=[self], operation="+")
else:
self_as_pipeline = self
return self_as_pipeline + other
# otherwise, we let the right operation handle the remaining dispatch
else:
return NotImplemented

def transform(self, X, X2=None):
"""Compute distance/kernel matrix.
Expand Down
71 changes: 71 additions & 0 deletions sktime/dists_kernels/algebra.py
Expand Up @@ -108,6 +108,77 @@ def _pw_trafos(self):
def _pw_trafos(self, value):
self.pw_trafos = value

def _algebra_dunder_concat(self, other, operation):
"""Return (right) concat CombinedDistance, common boilerplate for dunders.
Implemented for `other` being a transformer, otherwise returns `NotImplemented`.
Parameters
----------
other: `sktime` pairwise transformer, must inherit BasePairwiseTransformerPanel
otherwise, `NotImplemented` is returned
operation: operation string used in CombinedDistance for the dunder.
Must be equal to the operation of the dunder, not of self.
Returns
-------
CombinedDistance object, concat of `self` (first) with `other` (last).
does not contain CombinedDistance `sktime` transformers with same operation
(but may nest CombinedDistance with different operations)
"""
if self.operation == operation:
# if other is CombinedDistance but with different operation,
# we need to wrap it, or _dunder_concat would overwrite the operation
if isinstance(other, CombinedDistance) and not other.operation == operation:
other = CombinedDistance([other], operation=operation)
return self._dunder_concat(
other=other,
base_class=BasePairwiseTransformerPanel,
composite_class=CombinedDistance,
attr_name="pw_trafos",
concat_order="left",
composite_params={"operation": operation},
)
elif isinstance(other, BasePairwiseTransformerPanel):
return CombinedDistance([self, other], operation=operation)
else:
return NotImplemented

def __mul__(self, other):
"""Magic * method, return (right) multiplied CombinedDistance.
Implemented for `other` being a transformer, otherwise returns `NotImplemented`.
Parameters
----------
other: `sktime` pairwise transformer, must inherit BasePairwiseTransformerPanel
otherwise, `NotImplemented` is returned
Returns
-------
CombinedDistance object, algebraic * of `self` (first) with `other` (last).
does not contain CombinedDistance `sktime` transformers with same operation
(but may nest CombinedDistance with different operations)
"""
return self._algebra_dunder_concat(other=other, operation="*")

def __add__(self, other):
"""Magic + method, return (right) added CombinedDistance.
Implemented for `other` being a transformer, otherwise returns `NotImplemented`.
Parameters
----------
other: `sktime` pairwise transformer, must inherit BasePairwiseTransformerPanel
otherwise, `NotImplemented` is returned
Returns
-------
CombinedDistance object, algebraic + of `self` (first) with `other` (last).
not nested, contains only non-CombinedDistance `sktime` transformers
"""
return self._algebra_dunder_concat(other=other, operation="+")

def _transform(self, X, X2=None):
"""Compute distance/kernel matrix.
Expand Down
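
The flattening rule of `_algebra_dunder_concat` in `sktime/dists_kernels/algebra.py` (same operation is concatenated flat, a different operation is nested) can be sketched without `sktime` as below. The class `Combined` and its `parts` attribute are hypothetical simplifications, and plain strings stand in for distances; the dictionary merge mirrors how `composite_params` is combined with the step parameter in `_dunder_concat`.

```python
class Combined:
    """Toy composite holding a flat list of parts and one operation string."""

    def __init__(self, parts, operation):
        self.parts = list(parts)
        self.operation = operation

    def _concat(self, other, operation):
        if self.operation == operation:
            if isinstance(other, Combined) and other.operation == operation:
                # same operation on both sides: merge the two flat lists
                other_parts = other.parts
            else:
                # a leaf, or a composite with a different operation,
                # is kept as a single (possibly nested) part
                other_parts = [other]
            # extra constructor parameters are merged with the parts,
            # analogous to composite_params.update(step_param)
            params = {"operation": operation}
            params.update({"parts": self.parts + other_parts})
            return Combined(**params)
        # self has a different operation: nest self and other in a new composite
        return Combined([self, other], operation)

    def __add__(self, other):
        return self._concat(other, "+")

    def __mul__(self, other):
        return self._concat(other, "*")


s = Combined(["a"], "+") + "b" + "c"   # stays flat: one "+" composite
p = s * "d"                            # "*" applied to a "+" composite: nested
q = Combined(["x"], "*") * s           # "+" composite kept as one part
```

In this sketch, `s` holds three parts, while `p` and `q` each hold two parts, one of which is the nested `s`.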
