[ENH] ensure that all estimators have two test parameter sets #3429

Open
19 of 97 tasks
fkiraly opened this issue Sep 14, 2022 · 23 comments
Labels
enhancement Adding new functionality good first issue Good for newcomers maintenance Continuous integration, unit testing & package distribution module:tests test framework functionality - only framework, excl specific tests

Comments

@fkiraly
Collaborator

fkiraly commented Sep 14, 2022

We should ensure that all estimators (that have parameters) possess at least two test parameter sets.

The two (or more) parameter sets should:

  • be fast to run together - fit is the bottleneck (so we should not overdo it with too many parameter sets)
  • cover substantially different settings for all the important parameters, i.e., substantially different typical cases and/or important edge cases

Recipe:

  1. search for estimators which have parameters but only a single test parameter set. These are estimators with no get_test_params implemented, or get_test_params returning only a single dictionary instead of a list of two (or more) dictionaries.
  2. post here in this issue which estimator you picked (to avoid duplication of work)
  3. come up with a parameter set satisfying the above constraints and add it to the return value (which should be a list of two or more dictionaries)
  4. make a PR
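
As a rough illustration of step 3, the sketch below shows the return format the recipe asks for: a list of two dictionaries with substantially different but cheap settings. `MyForecaster` and its parameters `window_length` and `strategy` are hypothetical and only stand in for a real estimator; the class does not inherit from any sktime base class so that the snippet runs on its own.

```python
class MyForecaster:
    """Hypothetical estimator, used only to illustrate the return format."""

    def __init__(self, window_length=10, strategy="mean"):
        self.window_length = window_length
        self.strategy = strategy

    @classmethod
    def get_test_params(cls, parameter_set="default"):
        """Return two cheap parameter sets that differ in every parameter."""
        # first set: a small window with the default strategy
        params1 = {"window_length": 3, "strategy": "mean"}
        # second set: an edge case (window of length one) with another strategy
        params2 = {"window_length": 1, "strategy": "last"}
        return [params1, params2]


# every parameter set should be usable to construct an instance
instances = [MyForecaster(**params) for params in MyForecaster.get_test_params()]
```

Both sets keep the values small so that fitting stays fast, and each parameter takes a different value in the two sets.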

An example PR that adds second parameter sets for some estimators can be found here: #3428

Finding some estimators that have only one parameter set can also be done speedily by using this PR #2862 which adds a test for two parameter sets - either run the test suite from the branch locally, or look into the failing CI.

Alternatively, run the following code locally:

from sktime.registry import all_estimators

all_ests = all_estimators()
# names of estimators that have parameters but fewer than two test parameter sets
[name for name, est in all_ests if len(est.get_param_names()) > 0
 and (isinstance(est.get_test_params(), dict) or len(est.get_test_params()) < 2)]

Current output:

  • Aggregator
  • BaggingForecaster
  • CNNNetwork
  • ClaSPTransformer
  • ColumnEnsembleClassifier
  • ColumnTransformer
  • ColumnwiseTransformer
  • ComposableTimeSeriesForestRegressor
  • ConstraintViolation
  • CutoffSplitter
  • DOBIN
  • DWTTransformer
  • DistFromAligner
  • DistanceFeatures
  • DontUpdate
  • DummyRegressor
  • EmpiricalCoverage
  • ExpandingWindowSplitter
  • FCNNetwork
  • FeatureSelection
  • Filter
  • FinancialHolidaysTransformer
  • FittedParamExtractor
  • GeometricMeanAbsoluteError
  • GeometricMeanRelativeAbsoluteError
  • GreedyGaussianSegmentation
  • HCrystalBallAdapter
  • HOG1DTransformer
  • HampelFilter
  • HolidayFeatures
  • InceptionTimeNetwork
  • IndividualBOSS
  • IndividualTDE
  • KNeighborsTimeSeriesRegressor
  • KalmanFilterTransformerFP
  • KalmanFilterTransformerPK
  • LSTMFCNNetwork
  • LogTransformer
  • MACNNNetwork
  • MCDCNNClassifier
  • MCDCNNNetwork
  • MCDCNNRegressor
  • MLPNetwork
  • MatrixProfile
  • MatrixProfileClassifier
  • MatrixProfileTransformer
  • MeanAbsoluteError
  • MeanRelativeAbsoluteError
  • MedianAbsoluteError
  • MedianRelativeAbsoluteError
  • MiniRocket
  • MiniRocketMultivariate
  • MiniRocketMultivariateVariable
  • MultiRocket
  • MultiRocketMultivariate
  • OnlineEnsembleForecaster
  • PAA
  • PCATransformer
  • PaddingTransformer
  • ParamFitterPipeline
  • PlateauFinder
  • PluginParamsForecaster
  • PoissonHMM
  • PyODAnnotator
  • RNNNetwork
  • RandomIntervalFeatureExtractor
  • RandomIntervalSegmenter
  • RandomIntervals
  • RandomSamplesAugmenter
  • ReducerTransform
  • ResNetNetwork
  • Rocket
  • RocketClassifier
  • RocketRegressor
  • STRAY
  • ShapeDTW
  • ShapeletTransform
  • SingleWindowSplitter
  • SlidingWindowSegmenter
  • SlidingWindowSplitter
  • SlopeTransformer
  • StackingForecaster
  • SupervisedTimeSeriesForest
  • TSInterpolator
  • TapNetNetwork
  • TestPlusTrainSplitter
  • ThetaLinesTransformer
  • ThetaModularForecaster
  • TimeBinner
  • TimeSeriesForestClassifier
  • TimeSeriesForestRegressor
  • TimeSeriesKMeansTslearn
  • TimeSeriesLloyds
  • TruncationTransformer
  • UnobservedComponents
  • WhiteNoiseAugmenter
  • YtoX
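
The snippet above has to check for a bare dictionary separately, because `len` of a dictionary counts its keys rather than parameter sets. A minimal sketch of that logic as a helper follows; the name `count_test_param_sets` is made up for illustration and is not part of sktime.

```python
def count_test_param_sets(test_params):
    """Count parameter sets in a get_test_params return value.

    A single dictionary counts as one parameter set, no matter how
    many keys it has; a list counts as one set per element.
    """
    if isinstance(test_params, dict):
        return 1
    return len(test_params)


# a single dict with three keys is still only one parameter set
single = count_test_param_sets({"a": 1, "b": 2, "c": 3})
# a list of two dicts is two parameter sets
double = count_test_param_sets([{"a": 1}, {"a": 2}])
```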
@fkiraly fkiraly added good first issue Good for newcomers maintenance Continuous integration, unit testing & package distribution module:tests test framework functionality - only framework, excl specific tests enhancement Adding new functionality labels Sep 14, 2022
fkiraly added a commit that referenced this issue Sep 21, 2022
Towards #3429. This adds a second parameter set for all estimators checked via `check_estimator` in the `no-softdeps` CI element.

This is generally useful, and also allows #2862 to pass that CI element.

Also fixes a bug discovered through this:
`ExponentTransformer.inverse_transform` breaking if `power` is close to zero. This is now dealt with by a skip and a warning.
@Abelarm
Contributor

Abelarm commented Oct 1, 2022

Hi @fkiraly

I’d like to tackle this, if that’s OK with you.

@fkiraly
Collaborator Author

fkiraly commented Oct 1, 2022

sure, @Abelarm - pick an estimator!

fkiraly added a commit that referenced this issue Dec 22, 2022
…gressorPipeline` (#3857)

`set_params` in `ClassifierPipeline` and `RegressorPipeline` was broken: it would not correctly update parameters.

The failure to detect this is an instance of the known problem #3429

The bug has been fixed, and is now covered by appropriate tests (addition of a second parameter set).

The fix is as follows:

* the issue was in `set_params`, which accidentally had one layer of nesting too many in the param dict indexing, e.g., `classifier__` etc. This would materialize only for doubly nested estimators.
* this was fixed, with a concomitant extension of the dict subset utility in the `_HeterogenousMetaEstimator`.

Depends on #3858 as the `DummyClassifier` is used as one of two param sets.
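
To illustrate why a single parameter set can hide a `set_params` bug, here is a toy sketch. It is not the actual pipeline code or the sktime test: `ToyEstimator` and `set_params_roundtrip_ok` are hypothetical, and the bug shown (new values silently ignored) is simpler than the nesting issue described above.

```python
class ToyEstimator:
    """Hypothetical estimator with a deliberately broken set_params."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def get_params(self):
        return {"alpha": self.alpha}

    def set_params(self, **params):
        # deliberate bug: the new values are silently ignored
        return self


def set_params_roundtrip_ok(cls, params_from, params_to):
    """Construct with params_from, set params_to, compare with get_params."""
    est = cls(**params_from)
    est.set_params(**params_to)
    return est.get_params() == params_to


one_set = [{"alpha": 0.5}]
two_sets = [{"alpha": 0.5}, {"alpha": 2.0}]

# with one set, the only round trip is onto itself, so the bug stays invisible
single_result = set_params_roundtrip_ok(ToyEstimator, one_set[0], one_set[0])
# with two sets, setting the second onto the first exposes the bug
double_result = set_params_roundtrip_ok(ToyEstimator, two_sets[0], two_sets[1])
```

With only one parameter set the check passes despite the bug; the second set is what makes it fail.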
fkiraly added a commit that referenced this issue Jan 11, 2023
…ts per estimator to 2 or larger (#4043)

This PR adds test parameter sets to some estimators which have only one, towards issue #3429.

This ensures that the test suite can properly test set and reset of parameters, and increases test coverage by ensuring that more parameters have more than one value setting in the tests.

Test that detects estimators with only one parameter set:
#2862 (related, not a dep)

Depends on fixes of bugs detected through the new parameter sets:
* #4047
* #4049
* #4057
klam-data pushed a commit to CodeSmithDSMLProjects/sktime that referenced this issue Jan 18, 2023
…ts per estimator to 2 or larger (sktime#4043)

This PR adds test parameter sets to some estimators which have only one, towards issue sktime#3429.

This ensures that the test suite can properly test set and reset of parameters, and increases test coverage by ensuring that more parameters have more than one value setting in the tests.

Test that detects estimators with only one parameter set:
sktime#2862 (related, not a dep)

Depends on fixes of bugs detected through the new parameter sets:
* sktime#4047
* sktime#4049
* sktime#4057
fkiraly added a commit that referenced this issue Jan 23, 2023
Second test parameter set for `ARIMA`, towards #3429.

Split off from #2862 where it must have ended up accidentally.
fkiraly added a commit that referenced this issue Feb 28, 2023
Towards #3429, test parameter sets for performance metrics.
fkiraly added a commit that referenced this issue May 17, 2023
This adds a second test parameter set to `AutoETS`.

Towards #3429

Related: #4587, as the second set has `auto=True`
@janpipek
Contributor

I am picking SARIMAX.

@julia-kraus
Contributor

which estimators are still left?

@fkiraly
Collaborator Author

fkiraly commented Jul 24, 2023

@julia-kraus, the failures in this diagnostic PR #2862 correspond to the estimators that have only one parameter set - it might not be 100% up to date, I'll restart it so it is:

@namita0210

Hi, can I take up this issue for the estimator "TimeSeriesForestClassifier"?
@fkiraly

@fkiraly
Collaborator Author

fkiraly commented Mar 18, 2024

@namita0210, absolutely! All yours!

fkiraly pushed a commit that referenced this issue Mar 19, 2024
#### What does this implement/fix? Explain your changes.

Implemented the standard `get_test_params` class method with the
appropriate docstring and applicable parameters.

Added two test parameter sets for `RNNNetwork`, contributing towards #3429:
one that covers the default settings and another that covers the
`units` parameter.
fkiraly pushed a commit that referenced this issue Mar 21, 2024
Towards #3429

Adds a second test parameter set for `ShapeDTW`
@MMTrooper
Contributor

Hi, I will try to tackle the MatrixProfileClassifier. @fkiraly

@fkiraly
Collaborator Author

fkiraly commented Mar 21, 2024

great, thanks, @MMTrooper!

@shankariraja
Contributor

Hi @fkiraly,

I'm currently working on adding new test parameter sets for the estimators identified in the issue. I'll be focusing on TimeSeriesKMeansTslearn.
I'll create a pull request once I've completed the changes and tests. In the meantime, please let me know if you have any suggestions.

Thanks!

fkiraly pushed a commit that referenced this issue Mar 23, 2024
)

- Introduce two test parameter sets for ``TimeSeriesKMeansTslearn`` in
the ``get_test_params`` function.

- Reference Issues
  Towards : #3429 

- Tests passed: pytest sktime\clustering\tests\test_k_means.py
@KaustubhUp025
Contributor

KaustubhUp025 commented Mar 23, 2024

Hello @fkiraly, I will try to work on the estimator LogTransformer.

@Z-Fran
Contributor

Z-Fran commented Mar 25, 2024

Hi @fkiraly , I will try to work on KNeighborsTimeSeriesRegressor.

fkiraly added a commit that referenced this issue Mar 27, 2024
This PR enforces a stricter condition on `get_test_params`, namely that
it should always run, even if all sensible instances require soft
dependencies.

This is to make the inspection contracts simpler and unconditional as
regards dependencies.

Two instances where this has recently caused problems are the
`TemporianTransformer` in #5956 (FYI @ianspektor, @achoum, @javiber),
and #5880 (FYI @benHeid, @astrogilda).

A breaking `get_test_params` also prevents the code snippet in
the entry issue #3429 from running, which is causing problems for
new contributors, as that issue is presented as a simple entry task.
The code snippet is now covered by guaranteeing that `get_test_params`
always runs.

Includes: fix for `TSBootstrapAdapter`, which was the only non-compliant
estimator.
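
One way such a guarantee could look in an estimator is sketched below. This is an assumption about a possible pattern, not the change made in this PR: `ToyAdapter`, its parameters, and the package name `some_optional_pkg` are placeholders. The point is that `get_test_params` only checks whether the optional package is installed and never imports it, so the call succeeds either way.

```python
from importlib.util import find_spec


class ToyAdapter:
    """Hypothetical estimator whose richer settings need an optional package."""

    def __init__(self, backend="builtin", n_steps=2):
        self.backend = backend
        self.n_steps = n_steps

    @classmethod
    def get_test_params(cls, parameter_set="default"):
        """Return test parameter sets without importing optional packages."""
        # parameter sets that need only the core dependencies
        params = [
            {"backend": "builtin", "n_steps": 2},
            {"backend": "builtin", "n_steps": 5},
        ]
        # "some_optional_pkg" is a placeholder name; for a top-level package
        # that is not installed, find_spec returns None instead of raising
        if find_spec("some_optional_pkg") is not None:
            params.append({"backend": "optional", "n_steps": 2})
        return params


test_params = ToyAdapter.get_test_params()
```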
fkiraly pushed a commit that referenced this issue Apr 2, 2024
… 3429 (#6209)

#### What does this implement/fix? Explain your changes.

Implemented `get_test_params` for both `CNNNetwork` and
`ResNetNetwork` for issue #3429. Fixed a couple of typos inside the
docstring for `CNNNetwork`, changed from `nb_conv_layers` to
`n_conv_layers`.
fkiraly pushed a commit that referenced this issue Apr 3, 2024
… and non-precomputed mode to improve memory efficiency (#6217)

#### Reference Issues/PRs
#3429 (comment)

#### What does this implement/fix? Explain your changes.
adds test parameter sets for `KNeighborsTimeSeriesRegressor`;

adds support for non-brute algorithms and non-precomputed mode, mirroring #5937
@shlok191
Contributor

shlok191 commented Apr 5, 2024

Hello @fkiraly,
Could I please work on the LSTMFCNNetwork?

Thank you!!

@fkiraly
Collaborator Author

fkiraly commented Apr 5, 2024

do you mean, you would like to work on LSTMFCNNetwork, or are you asking me to?

@shlok191
Contributor

shlok191 commented Apr 5, 2024

@fkiraly, Oh I'm sorry about that! I meant if I could work on this! I wrote that while traveling, so sorry again!

@fkiraly
Collaborator Author

fkiraly commented Apr 5, 2024

Sure! No worries, and thanks for contributing!

For this estimator, kindly be aware of:

@shlok191
Contributor

shlok191 commented Apr 5, 2024

@fkiraly, Thank you so much for letting me help out!
I'll keep my eye on the possible failures and I'll remove the test skips as well :)

fkiraly pushed a commit that referenced this issue Apr 8, 2024
#### What does this implement/fix? Explain your changes.

Towards #3429

I decided to add the estimator parameter and set it to the scikit-learn
classifier `KNeighborsClassifier`.

I also added myself as a contributor. Let me know whether this was
appropriate or whether I need to make a better implementation.
@shlok191
Contributor

shlok191 commented Apr 9, 2024

Hello @fkiraly,

I hope that you're having a good start to your week! I wanted to let you know that I added 2 test parameters for the LSTMFCNNetwork here. I also checked tests/_config.py to make sure that this estimator is included in CI tests. ☺️

I am really excited to test out the test parameters and to get your feedback! I can try to test out the changes locally first if that is the preferred protocol. I learned a lot about LSTMs in the context of time series from this. I would love to contribute more after this estimator's parameters are completed, if that is okay.

Thank you so much again!

@fkiraly
Collaborator Author

fkiraly commented Apr 9, 2024

Great, @shlok191!

I recommend you open a pull request, where core developers can discuss your contribution further and possibly merge it!

@shlok191
Contributor

shlok191 commented Apr 9, 2024

@fkiraly, I just added a PR. Thanks a lot again for letting me contribute! 😃

@fkiraly
Collaborator Author

fkiraly commented Apr 9, 2024

sktime is an open project, so everyone can contribute!

Thanks for your contribution!
