[DOC] Improve ShapeletTransformClassifier docstring #3737

Merged
merged 2 commits into from Nov 14, 2022
Changes from 1 commit
81 changes: 46 additions & 35 deletions sktime/classification/shapelet_based/_stc.py
@@ -1,5 +1,5 @@
 # -*- coding: utf-8 -*-
-"""Shapelet Transform Classifier.
+"""A shapelet transform pipeline classifier.

 Shapelet transform classifier pipeline that simply performs a (configurable) shapelet
 transform then builds (by default) a rotation forest classifier on the output.
@@ -19,75 +19,86 @@


 class ShapeletTransformClassifier(BaseClassifier):
-"""Shapelet Transform Classifier.
+"""A shapelet transform pipeline classifier.

 Implementation of the binary shapelet transform classifier along the lines
-of [1]_[2]_. Transforms the data using the configurable shapelet transform and then
-builds a rotation forest classifier.
-As some implementations and applications contract the classifier solely, contracting
-is available for the transform only and both classifier and transform.
+of [1][2] but with random shapelet sampling. Transforms the data using the
+configurable `RandomShapeletTransform` and then builds a `RotationForest`
+classifier.
+
+As some implementations and applications contract the transformation solely,
+contracting is available for the transform only and both classifier and transform.

 Parameters
 ----------
 n_shapelet_samples : int, default=10000
 The number of candidate shapelets to be considered for the final transform.
-Filtered down to <= max_shapelets, keeping the shapelets with the most
+Filtered down to ``<= max_shapelets``, keeping the shapelets with the most
 information gain.
 max_shapelets : int or None, default=None
 Max number of shapelets to keep for the final transform. Each class value will
-have its own max, set to n_classes / max_shapelets. If None uses the min between
-10 * n_instances and 1000
+have its own max, set to ``n_classes_ / max_shapelets``. If `None`, uses the
+minimum between ``10 * n_instances_`` and `1000`.
 max_shapelet_length : int or None, default=None
-Lower bound on candidate shapelet lengths for the transform.
+Lower bound on candidate shapelet lengths for the transform. If ``None``, no
+max length is used
 estimator : BaseEstimator or None, default=None
-Base estimator for the ensemble, can be supplied a sklearn BaseEstimator. If
-None a default RotationForest classifier is used.
+Base estimator for the ensemble, can be supplied a sklearn `BaseEstimator`. If
+`None` a default `RotationForest` classifier is used.
 transform_limit_in_minutes : int, default=0
 Time contract to limit transform time in minutes for the shapelet transform,
-overriding n_shapelets. A value of 0 means n_shapelets is used.
+overriding `n_shapelet_samples`. A value of `0` means ``n_shapelet_samples``
+is used.
 time_limit_in_minutes : int, default=0
-Time contract to limit build time in minutes, overriding n_shapelet_samples and
-transform_limit_in_minutes. The estimator will only be contracted if a
-time_limit_in_minutes parameter is present. Default of 0 means n_estimators or
-transform_limit_in_minutes is used.
+Time contract to limit build time in minutes, overriding ``n_shapelet_samples``
+and ``transform_limit_in_minutes``. The ``estimator`` will only be contracted if
+a ``time_limit_in_minutes parameter`` is present. Default of `0` means
+``n_shapelet_samples`` or ``transform_limit_in_minutes`` is used.
 contract_max_n_shapelet_samples : int, default=np.inf
 Max number of shapelets to extract when contracting the transform with
-transform_limit_in_minutes or time_limit_in_minutes.
+``transform_limit_in_minutes`` or ``time_limit_in_minutes``.
 save_transformed_data : bool, default=False
-Save the data transformed in fit for use in _get_train_probs.
+Save the data transformed in fit in ``transformed_data_`` for use in
+``_get_train_probs``.
 n_jobs : int, default=1
-The number of jobs to run in parallel for both `fit` and `predict`.
-``-1`` means using all processors.
+The number of jobs to run in parallel for both ``fit`` and ``predict``.
+`-1` means using all processors.
 batch_size : int or None, default=100
 Number of shapelet candidates processed before being merged into the set of best
 shapelets in the transform.
-random_state : int or None, default=None
-Seed for random number generation.
+random_state : int, RandomState instance or None, default=None
+If `int`, random_state is the seed used by the random number generator;
+If `RandomState` instance, random_state is the random number generator;
+If `None`, the random number generator is the `RandomState` instance used
+by `np.random`.

 Attributes
 ----------
-n_classes : int
-The number of classes.
 classes_ : list
-The classes labels.
+The unique class labels in the training set.
+n_classes_ : int
+The number of unique classes in the training set.
+fit_time_ : int
+The time (in milliseconds) for ``fit`` to run.
 n_instances_ : int
-The number of train cases.
+The number of train cases in the training set.
 n_dims_ : int
-The number of dimensions per case.
+The number of dimensions per case in the training set.
 series_length_ : int
-The length of each series.
+The length of each series in the training set.
 transformed_data_ : list of shape (n_estimators) of ndarray
-The transformed dataset for all classifiers. Only saved when
-save_transformed_data is true.
+The transformed training dataset for all classifiers. Only saved when
+``save_transformed_data`` is `True`.

 See Also
 --------
-RandomShapeletTransform
+RandomShapeletTransform : The randomly sampled shapelet transform.
+RotationForest : The default rotation forest classifier used.

 Notes
 -----
 For the Java version, see
-`TSML <https://github.com/uea-machine-learning/tsml/blob/master/src/main/
+`tsml <https://github.com/uea-machine-learning/tsml/blob/master/src/main/
 java/tsml/classifiers/shapelet_based/ShapeletTransformClassifier.java>`_.

 References
@@ -165,7 +176,7 @@ def __init__(
 super(ShapeletTransformClassifier, self).__init__()

 def _fit(self, X, y):
-"""Fit STC to training data.
+"""Fit ShapeletTransformClassifier to training data.

 Parameters
 ----------
@@ -182,7 +193,7 @@ def _fit(self, X, y):
 Notes
 -----
 Changes state by creating a fitted model that updates attributes
-ending in "_" and sets is_fitted flag to True.
+ending in "_".
 """
 self.n_instances_, self.n_dims_, self.series_length_ = X.shape

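The `max_shapelets` default described in the updated docstring can be illustrated with a small sketch. This is not sktime's implementation: the helper names `resolve_max_shapelets` and `per_class_cap` are hypothetical, and the per-class split assumes the total is shared evenly across classes with a floor of one, which the docstring wording (``n_classes_ / max_shapelets``) does not confirm.

```python
def resolve_max_shapelets(max_shapelets, n_instances):
    """Return the documented default when max_shapelets is None.

    Hypothetical helper: follows the docstring rule
    "minimum between 10 * n_instances_ and 1000".
    """
    if max_shapelets is None:
        return min(10 * n_instances, 1000)
    return max_shapelets


def per_class_cap(max_shapelets, n_classes):
    """Split the total shapelet budget across classes.

    Assumption (not confirmed by the diff): the total is divided
    evenly, and every class keeps at least one shapelet.
    """
    return max(1, max_shapelets // n_classes)


# 40 training cases and 4 classes: the default total is min(400, 1000).
total = resolve_max_shapelets(None, n_instances=40)
print(total, per_class_cap(total, n_classes=4))
```

With larger training sets the `1000` ceiling takes over, so under these assumptions the default budget stops growing once there are 100 or more training cases.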