Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] speed up BaseTransformer checks and conversion boilerplate #5036

Merged
merged 9 commits into from Aug 11, 2023

Conversation

fkiraly
Copy link
Collaborator

@fkiraly fkiraly commented Aug 5, 2023

This PR speeds up BaseTransformer checks and conversion boilerplate:

  • replacing some convert_to calls by more specific convert calls that ensure we do not repeat mtype checks
  • specifying explicitly the needed metadata in an instance where all metadata was returned, to avoid costly metadata queries that are discarded
  • refactoring convert and convert_to, moving logic that determines the target mtype into a separate function, and allowing convert to have list as to_type
  • adding an option to convert_to_scitype to return the mtype of the converted-to output. This is passed as an input to convert in the BaseTransformer boilerplate to avoid checks to obtain this information.

@fkiraly fkiraly added implementing framework Implementing or improving framework for learning tasks, e.g., base class functionality module:transformations transformations module: time series transformation, feature extraction, pre-/post-processing enhancement Adding new functionality labels Aug 5, 2023
@fkiraly fkiraly merged commit 0094f86 into main Aug 11, 2023
24 checks passed
@fkiraly fkiraly deleted the speed-trafo-boilerplate branch August 11, 2023 11:41
fkiraly added a commit that referenced this pull request Aug 14, 2023
This PR speeds up the parameter fitter base class boilerplate by
avoiding to carry out mtype checks twice.

This is achieved by replacing one instance of `convert_to` (which checks
the mtype, redundant to `check_is_mtype`), by an equivalent instance of
`convert`, using the mtype output of `check_is_mtype`.

Related:
#5056
#5036

Depends on #5036 due to the changes
to `convert` being common to both speed-ups.
fkiraly added a commit that referenced this pull request Aug 19, 2023
This PR speeds up the checks and conversions boilerplate in
`BaseSplitter`.

* avoids unnecessary checks to detect mtype, by replacing `convert_to`
by a more specific `convert`
* prevents unnecessary metadata computation in `check_is_scitype` by
adjusting the requested metadata argument to none

To effect this, adds an argument and logic to `convert` and `convert_to`
that allows to return the mtype of the object that was returned - this
is not immediate from the arguments, as the `to_type` can be an
allow-list.

Related, and depends on (for changes to `convert`):
#5036
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Adding new functionality implementing framework Implementing or improving framework for learning tasks, e.g., base class functionality module:transformations transformations module: time series transformation, feature extraction, pre-/post-processing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant