Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] should we have a reset method? #1614

Closed
fkiraly opened this issue Nov 12, 2021 · 3 comments · Fixed by #2531
Closed

[ENH] should we have a reset method? #1614

fkiraly opened this issue Nov 12, 2021 · 3 comments · Fixed by #2531
Labels
API design API design & software architecture implementing framework Implementing or improving framework for learning tasks, e.g., base class functionality

Comments

@fkiraly
Copy link
Collaborator

fkiraly commented Nov 12, 2021

Should sktime estimators have a reset method that sets the estimator's entire state to what it would be after initialization?

The main reason is that this would be a clean way to reset the estimator at the beginning of fit whenever called for the second or any repeated time.

This would prevent bugs such as in #1595 which can lead to unexpected and hard-to-debug behaviour in grid search etc, where fit might be called multiple times.

Of course, a "clean" implementation would use clone, but not all contributors/extenders are aware of the convention.

A more implementer friendly logic is the described behaviour, where fit resets the estimator when called a second time - this is "like" a call to sklearn clone, but it requires no proactivity and advanced knowledge of sklearn internals.

FYI @mloning, @aiwalter, @ltsaprounis, @danbartl, @TonyBagnall, opinions?

@fkiraly fkiraly added implementing framework Implementing or improving framework for learning tasks, e.g., base class functionality API design API design & software architecture labels Nov 12, 2021
@ltsaprounis
Copy link
Contributor

@fkiraly - Should we also raise another issue to add clone in all Cross Validation and Grid Search classes/functions?

@fkiraly
Copy link
Collaborator Author

fkiraly commented Nov 15, 2021

@ltsaprounis, excellent idea! Could you kindly open this and point to the instances where it's not done?

@ltsaprounis
Copy link
Contributor

Will do shortly, need to find where we need to do it and make a checklist.

@fkiraly fkiraly linked a pull request Apr 22, 2022 that will close this issue
fkiraly added a commit that referenced this issue Apr 27, 2022
Provides the functionality discussed in #1614, i.e., a `reset` method in `BaseObject` that resets attributes and internal state of an estimator/object to its post-`__init__` state (while keeping parameters), equivalent to overwriting `self` with a `sklearn.clone`.

This is motivated by the bug in `MultiplexForecaster` which @miraep8's test suite uncovered in combination with #2458, and the observation that:
* boilerplate needs to be copied from `__init__` to `_fit` to fix it, and
* that the same issue possibly exists in a number of other composites which are not tested for subsequent `fit` with different parameters.

This PR also adds `reset` at the start of `BaseForecaster.fit`, addressing a family of potential bugs where not initializing in `_fit` causes unexpected behaviour in a sequence of calls `__init__`, `set_params`, `fit` (the bug present in the original #2458).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API design API design & software architecture implementing framework Implementing or improving framework for learning tasks, e.g., base class functionality
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants