Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

&fkiraly [BUG] fix sklearn interface non-conformance for estimators in _proximity_forest.py, add further test parameter sets #3520

Merged
merged 10 commits into from Aug 6, 2023

Conversation

Abelarm
Copy link
Contributor

@Abelarm Abelarm commented Oct 2, 2022

Reference Issues/PRs

#3429

What does this implement/fix? Explain your changes.

Added a second set of params to the testing method get_test_params, changed the signature of the method for the class ProximityStump to reflect the BaseClassifier

This unveiled an unreported bug which is also fixed:
the distance_measure param was overwritten whenever it was None.

Copy link
Collaborator

@fkiraly fkiraly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Can you kindly make sure to:

  • check runtimes with check_estimator and report these
  • ensure that more parameters are covered (logical parameters)
  • that the values for parameters differ across the two scenarios

@fkiraly
Copy link
Collaborator

fkiraly commented Oct 2, 2022

also, your linting is failing - kindly check your linters
here's a tutorial on, let us know if you have any questions!
https://www.sktime.org/en/stable/developer_guide/coding_standards.html
(if you're not sure, you can join one of the Fri sessions to get help with your set-up)

@Abelarm Abelarm force-pushed the Adding_test_cases_to_proximity_forest branch from b970962 to 1d7a2e7 Compare October 2, 2022 19:55
@Abelarm
Copy link
Contributor Author

Abelarm commented Oct 2, 2022

  1. Check with check_estimator

For ProximityStump

FAILED: test_fit_does_not_overwrite_hyper_params[ProximityStump-0-ClassifierFitPredict]
FAILED: test_fit_idempotent[ProximityStump-1-ClassifierFitPredict-predict]
FAILED: test_fit_idempotent[ProximityStump-1-ClassifierFitPredict-predict_proba]
FAILED: test_methods_do_not_change_state[ProximityStump-1-ClassifierFitPredict-predict]
FAILED: test_methods_do_not_change_state[ProximityStump-1-ClassifierFitPredict-predict_proba]
FAILED: test_methods_have_no_side_effects[ProximityStump-1-ClassifierFitPredict-predict]
FAILED: test_methods_have_no_side_effects[ProximityStump-1-ClassifierFitPredict-predict_proba]
FAILED: test_multiprocessing_idempotent[ProximityStump-1-ClassifierFitPredict-predict]
FAILED: test_multiprocessing_idempotent[ProximityStump-1-ClassifierFitPredict-predict_proba]
FAILED: test_persistence_via_pickle[ProximityStump-0-ClassifierFitPredict-predict]
FAILED: test_persistence_via_pickle[ProximityStump-0-ClassifierFitPredict-predict_proba]
FAILED: test_persistence_via_pickle[ProximityStump-1-ClassifierFitPredict-predict]
FAILED: test_persistence_via_pickle[ProximityStump-1-ClassifierFitPredict-predict_proba]
FAILED: test_classifier_output[ProximityStump-1-ClassifierFitPredict]

For ProximityTree

FAILED: test_set_params_sklearn[ProximityTree]
FAILED: test_fit_does_not_overwrite_hyper_params[ProximityTree-0-ClassifierFitPredict]
FAILED: test_fit_does_not_overwrite_hyper_params[ProximityTree-1-ClassifierFitPredict]
FAILED: test_multiprocessing_idempotent[ProximityTree-0-ClassifierFitPredict-predict_proba]
FAILED: test_multiprocessing_idempotent[ProximityTree-1-ClassifierFitPredict-predict_proba]
FAILED: test_persistence_via_pickle[ProximityTree-0-ClassifierFitPredict-predict]
FAILED: test_persistence_via_pickle[ProximityTree-0-ClassifierFitPredict-predict_proba]
FAILED: test_persistence_via_pickle[ProximityTree-1-ClassifierFitPredict-predict]
FAILED: test_persistence_via_pickle[ProximityTree-1-ClassifierFitPredict-predict_proba]

For ProximityForest

FAILED: test_fit_does_not_overwrite_hyper_params[ProximityForest-0-ClassifierFitPredict]
FAILED: test_persistence_via_pickle[ProximityForest-0-ClassifierFitPredict-predict]
FAILED: test_persistence_via_pickle[ProximityForest-0-ClassifierFitPredict-predict_proba]
FAILED: test_persistence_via_pickle[ProximityForest-1-ClassifierFitPredict-predict]
FAILED: test_persistence_via_pickle[ProximityForest-1-ClassifierFitPredict-predict_proba]
  1. What would you suggest? for the doc the only thing that could have changed was the distance, maybe for the forest I can change it a bit more but for the other two?

  2. Updated

  3. (code-style) installed the pre-commit now everything should be ok

@fkiraly
Copy link
Collaborator

fkiraly commented Oct 2, 2022

"test_persistence_via_pickle" and "test_fit_does_not_overwrite_hyper_params" are known failures of the estimators, see sktime.tests._config where they are skipped.

The other failing tests appear to be genuine failures.

What surprises me is that the first parameter set (the one previously used) leads to failure - according to the remote CI, it should not fail?

I will start remote CI and see what is going on, possibly we have found some actual bugs here.

@fkiraly fkiraly added the module:classification classification module: time series classification label Oct 2, 2022
@Abelarm
Copy link
Contributor Author

Abelarm commented Oct 3, 2022

So @fkiraly looks like several errors are raised by the file I modified, I tried to look at them and I am a bit lost :(

@fkiraly fkiraly requested a review from aiwalter as a code owner October 8, 2022 11:20
@fkiraly
Copy link
Collaborator

fkiraly commented Oct 8, 2022

@Abelarm, I've looked at it and thing it is one of two things:

  1. the expected types and values for parameters are not properly documented, and something else is expected for distance_measure (not a string, or a different string). I opened an issue here: [DOC] parameters of proximity forests and trees are not properly documented #3545
  2. genuine bugs

You have done all as would be expected, great!

What you could do next:

  • look into the code whether you can reverse engineer what parameters are expected. This might be tedious and not very fun, so you may not want to do this.
  • look at different estimators that are better documented.

@Abelarm
Copy link
Contributor Author

Abelarm commented Oct 8, 2022

Since I am already "familiar" with this part of code I'll try to check if I am able to understand what's going on 🚀

@fkiraly fkiraly changed the title [ENH] Added second param set for the estimators in _proximity_forest.py &fkiraly [ENH] Added second param set for the estimators in _proximity_forest.py Aug 5, 2023
@fkiraly fkiraly added the bugfix Fixes a known bug or removes unintended behavior label Aug 5, 2023
@fkiraly fkiraly changed the title &fkiraly [ENH] Added second param set for the estimators in _proximity_forest.py &fkiraly [BUG] fix sklearn interface non-conformance for estimators in _proximity_forest.py, add further test parameter sets Aug 5, 2023
Copy link
Collaborator

@fkiraly fkiraly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed the outstanding interface issues.
Actionables re problematic parameters are moved here: #5042

@fkiraly fkiraly merged commit 7525ea5 into sktime:main Aug 6, 2023
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix Fixes a known bug or removes unintended behavior module:classification classification module: time series classification
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants