[BUG] Fix BOSS based classifiers truncating class names to single character length #4096

erjieyong · 2023-01-11T10:56:40Z

Reference Issues/PRs

Fixes #4090, which prevents the list of predicted classes from being populated correctly.

What does this implement/fix? Explain your changes.

When calling np.zeros, change the string dtype to object so that np.zeros will not truncate each class name to only its first character.
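A minimal sketch of the truncation described above, using plain NumPy only. It is not the actual code from _boss.py; the variable names and the example labels are made up for illustration.

```python
import numpy as np

# Hypothetical multi-character class names.
labels = ["Class1", "Class2", "Class1"]

# dtype=str allocates a fixed-width unicode array of width 1,
# so every value written into it is cut to its first character.
truncated = np.zeros(len(labels), dtype=str)
for i, label in enumerate(labels):
    truncated[i] = label

# dtype=object stores references to Python objects,
# so the full strings survive.
preserved = np.zeros(len(labels), dtype=object)
for i, label in enumerate(labels):
    preserved[i] = label

print(truncated.tolist())  # ['C', 'C', 'C']
print(preserved.tolist())  # ['Class1', 'Class2', 'Class1']
```

The object array gives up the compact fixed-width storage, which should not matter much for a vector of predicted labels.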

I've also tested this locally using the example provided in the issue, as well as by running check_estimator(IndividualBOSS). All tests PASSED!

Does your contribution introduce a new dependency? If yes, which one?

No

.all-contributorsrc

fkiraly

Looks good to me. Could you kindly add a test that certifies the fix?

I.e., your example (as simple as possible, with dummy data) that fails before the fix and runs after?

It should go into the folder sktime.classification.dictionary_based, in a new module test_boss.py

Let me know if you don't want to add that and/or need help with pytest. The best starting point is perhaps to follow the pattern in test_tde (same folder).

erjieyong · 2023-01-11T13:40:19Z

Sure! Thanks for guiding me through this. Happy to learn new things! I will try to add the unit test and submit again.

erjieyong · 2023-01-12T02:11:15Z

@fkiraly, I need your help. How do I run the tests after writing them?

Using test_tde.py as an example, when I run the following code, the verbose output does not seem to show that the tests in test_tde have been executed (e.g. test_tde_train_estimate).

from sktime.utils.estimator_checks import check_estimator
from sktime.classification.dictionary_based._tde import TemporalDictionaryEnsemble
check_estimator(TemporalDictionaryEnsemble)

Output as follows

All tests PASSED!
{'test_clone[TemporalDictionaryEnsemble]': 'PASSED',
 'test_constructor[TemporalDictionaryEnsemble]': 'PASSED',
 'test_create_test_instance[TemporalDictionaryEnsemble]': 'PASSED',
 'test_create_test_instances_and_names[TemporalDictionaryEnsemble]': 'PASSED',
 'test_estimator_tags[TemporalDictionaryEnsemble]': 'PASSED',
 'test_get_params[TemporalDictionaryEnsemble]': 'PASSED',
 'test_has_common_interface[TemporalDictionaryEnsemble]': 'PASSED',
 'test_inheritance[TemporalDictionaryEnsemble]': 'PASSED',
 'test_no_between_test_case_side_effects[TemporalDictionaryEnsemble-ClassifierFitPredict-0]': 'PASSED',
 'test_no_between_test_case_side_effects[TemporalDictionaryEnsemble-ClassifierFitPredict-1]': 'PASSED',
 'test_no_between_test_case_side_effects[TemporalDictionaryEnsemble-ClassifierFitPredictMultivariate-0]': 'PASSED',
 'test_no_between_test_case_side_effects[TemporalDictionaryEnsemble-ClassifierFitPredictMultivariate-1]': 'PASSED',
 'test_no_cross_test_side_effects_part1[TemporalDictionaryEnsemble]': 'PASSED',
 'test_no_cross_test_side_effects_part2[TemporalDictionaryEnsemble]': 'PASSED',
 'test_repr[TemporalDictionaryEnsemble]': 'PASSED',
 'test_set_params[TemporalDictionaryEnsemble]': 'PASSED',
 'test_set_params_sklearn[TemporalDictionaryEnsemble]': 'PASSED',
 'test_valid_estimator_class_tags[TemporalDictionaryEnsemble]': 'PASSED',
 'test_valid_estimator_tags[TemporalDictionaryEnsemble]': 'PASSED',
 'test_fit_does_not_overwrite_hyper_params[TemporalDictionaryEnsemble-ClassifierFitPredict]': 'PASSED',
 'test_fit_does_not_overwrite_hyper_params[TemporalDictionaryEnsemble-ClassifierFitPredictMultivariate]': 'PASSED',
 'test_fit_idempotent[TemporalDictionaryEnsemble-ClassifierFitPredict-predict]': 'PASSED',
 'test_fit_idempotent[TemporalDictionaryEnsemble-ClassifierFitPredict-predict_proba]': 'PASSED',
 'test_fit_idempotent[TemporalDictionaryEnsemble-ClassifierFitPredictMultivariate-predict]': 'PASSED',
 'test_fit_idempotent[TemporalDictionaryEnsemble-ClassifierFitPredictMultivariate-predict_proba]': 'PASSED',
 'test_fit_returns_self[TemporalDictionaryEnsemble-ClassifierFitPredict]': 'PASSED',
 'test_fit_returns_self[TemporalDictionaryEnsemble-ClassifierFitPredictMultivariate]': 'PASSED',
 'test_fit_updates_state[TemporalDictionaryEnsemble-ClassifierFitPredict]': 'PASSED',
 'test_fit_updates_state[TemporalDictionaryEnsemble-ClassifierFitPredictMultivariate]': 'PASSED',
 'test_methods_have_no_side_effects[TemporalDictionaryEnsemble-ClassifierFitPredict-predict]': 'PASSED',
 'test_methods_have_no_side_effects[TemporalDictionaryEnsemble-ClassifierFitPredict-predict_proba]': 'PASSED',
 'test_methods_have_no_side_effects[TemporalDictionaryEnsemble-ClassifierFitPredict-get_fitted_params]': 'PASSED',
 'test_methods_have_no_side_effects[TemporalDictionaryEnsemble-ClassifierFitPredictMultivariate-predict]': 'PASSED',
 'test_methods_have_no_side_effects[TemporalDictionaryEnsemble-ClassifierFitPredictMultivariate-predict_proba]': 'PASSED',
 'test_methods_have_no_side_effects[TemporalDictionaryEnsemble-ClassifierFitPredictMultivariate-get_fitted_params]': 'PASSED',
 'test_multiprocessing_idempotent[TemporalDictionaryEnsemble-ClassifierFitPredict-predict]': 'PASSED',
 'test_multiprocessing_idempotent[TemporalDictionaryEnsemble-ClassifierFitPredict-predict_proba]': 'PASSED',
 'test_multiprocessing_idempotent[TemporalDictionaryEnsemble-ClassifierFitPredictMultivariate-predict]': 'PASSED',
 'test_multiprocessing_idempotent[TemporalDictionaryEnsemble-ClassifierFitPredictMultivariate-predict_proba]': 'PASSED',
 'test_non_state_changing_method_contract[TemporalDictionaryEnsemble-ClassifierFitPredict-predict]': 'PASSED',
 'test_non_state_changing_method_contract[TemporalDictionaryEnsemble-ClassifierFitPredict-predict_proba]': 'PASSED',
 'test_non_state_changing_method_contract[TemporalDictionaryEnsemble-ClassifierFitPredict-get_fitted_params]': 'PASSED',
 'test_non_state_changing_method_contract[TemporalDictionaryEnsemble-ClassifierFitPredictMultivariate-predict]': 'PASSED',
 'test_non_state_changing_method_contract[TemporalDictionaryEnsemble-ClassifierFitPredictMultivariate-predict_proba]': 'PASSED',
 'test_non_state_changing_method_contract[TemporalDictionaryEnsemble-ClassifierFitPredictMultivariate-get_fitted_params]': 'PASSED',
 'test_persistence_via_pickle[TemporalDictionaryEnsemble-ClassifierFitPredict-predict]': 'PASSED',
 'test_persistence_via_pickle[TemporalDictionaryEnsemble-ClassifierFitPredict-predict_proba]': 'PASSED',
 'test_persistence_via_pickle[TemporalDictionaryEnsemble-ClassifierFitPredictMultivariate-predict]': 'PASSED',
 'test_persistence_via_pickle[TemporalDictionaryEnsemble-ClassifierFitPredictMultivariate-predict_proba]': 'PASSED',
 'test_raises_not_fitted_error[TemporalDictionaryEnsemble-ClassifierFitPredict-predict]': 'PASSED',
 'test_raises_not_fitted_error[TemporalDictionaryEnsemble-ClassifierFitPredict-predict_proba]': 'PASSED',
 'test_raises_not_fitted_error[TemporalDictionaryEnsemble-ClassifierFitPredict-get_fitted_params]': 'PASSED',
 'test_raises_not_fitted_error[TemporalDictionaryEnsemble-ClassifierFitPredictMultivariate-predict]': 'PASSED',
 'test_raises_not_fitted_error[TemporalDictionaryEnsemble-ClassifierFitPredictMultivariate-predict_proba]': 'PASSED',
 'test_raises_not_fitted_error[TemporalDictionaryEnsemble-ClassifierFitPredictMultivariate-get_fitted_params]': 'PASSED',
 'test_save_estimators_to_file[TemporalDictionaryEnsemble-ClassifierFitPredict-predict]': 'PASSED',
 'test_save_estimators_to_file[TemporalDictionaryEnsemble-ClassifierFitPredict-predict_proba]': 'PASSED',
 'test_save_estimators_to_file[TemporalDictionaryEnsemble-ClassifierFitPredictMultivariate-predict]': 'PASSED',
 'test_save_estimators_to_file[TemporalDictionaryEnsemble-ClassifierFitPredictMultivariate-predict_proba]': 'PASSED',
 'test_classifier_on_basic_motions[TemporalDictionaryEnsemble]': 'PASSED',
 'test_classifier_on_unit_test_data[TemporalDictionaryEnsemble]': 'PASSED',
 'test_classifier_output[TemporalDictionaryEnsemble-ClassifierFitPredict]': 'PASSED',
 'test_classifier_output[TemporalDictionaryEnsemble-ClassifierFitPredictMultivariate]': 'PASSED',
 'test_handles_single_class[TemporalDictionaryEnsemble]': 'PASSED',
 'test_multivariate_input_exception[TemporalDictionaryEnsemble]': 'PASSED'}

Appreciate your advice. Thanks.

fkiraly · 2023-01-12T06:38:55Z

check_estimator only runs the tests that come from TestAllEstimators and TestAllClassifiers etc.

To run the test in, say, test_tde or the new test_boss, you need to use pytest directly.

Two common options are:

- running the test from your developer IDE; here's the tutorial for Visual Studio Code: https://code.visualstudio.com/docs/python/testing
- running pytest via the command line, specifying the module; see here: https://stackoverflow.com/questions/36456920/is-there-a-way-to-specify-which-pytest-tests-to-run-from-a-file
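As a rough sketch of the command-line option, the snippet below builds the pytest invocation for a single module, or for a single test inside it, using pytest's `path::test_name` node ID syntax. The helper `pytest_command` is hypothetical, and the module path is an assumption about the repository layout; adjust it to wherever test_tde.py actually lives in your checkout.

```python
import shlex
import sys


def pytest_command(module_path, test_name=None):
    """Build a command that runs one test module, or one test inside it."""
    target = module_path if test_name is None else f"{module_path}::{test_name}"
    return [sys.executable, "-m", "pytest", target, "-v"]


# Assumed location of the module; adjust to the actual layout.
module = "sktime/classification/dictionary_based/tests/test_tde.py"

# Print the commands rather than running them.
print(shlex.join(pytest_command(module)))
print(shlex.join(pytest_command(module, "test_tde_train_estimate")))
```

To actually execute a command, the returned list can be passed to `subprocess.run`, or the printed line can be pasted into a terminal at the repository root.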

fkiraly

Thanks, looks great!
Will merge once tests pass.

One tip, but not a blocker: the tests are a bit repetitive, and they could be made much less copy-paste by using pytest.mark.parametrize (on pairs of the expected y_pred.dtype and the new_class dict values).

Feel free to try changing that (you can do it in this PR or a separate one once this is merged); as said, it's not necessary for this to go in, imo.
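A hedged sketch of what such a parametrized test could look like. It does not reproduce the tests in test_boss.py: the test name, the example labels, and the object-array round trip standing in for a classifier's predict are all assumptions made for illustration, and the check uses the dtype kind rather than an exact dtype.

```python
import numpy as np
import pytest


@pytest.mark.parametrize(
    "new_class, expected_kind",
    [
        ({"1": "1", "2": "2"}, "U"),  # single-character strings
        ({"1": "Class1", "2": "Class2"}, "U"),  # multi-character strings
        ({"1": 1, "2": 2}, "i"),  # integers
    ],
)
def test_labels_survive_round_trip(new_class, expected_kind):
    """Check that relabelled classes pass through an object buffer unchanged.

    Passes if the output has the expected dtype kind and the same values
    as the input; fails if any label is truncated or converted.
    """
    y = np.array(["1", "2", "1"])
    y_new = np.array([new_class[label] for label in y])

    # Stand-in for predict: collect labels in an object array,
    # then cast back to the dtype of the training labels.
    buffer = np.zeros(len(y_new), dtype=object)
    buffer[:] = y_new
    y_pred = buffer.astype(y_new.dtype)

    assert y_pred.dtype.kind == expected_kind
    assert y_pred.tolist() == y_new.tolist()
```

Each tuple in the list becomes its own test case in the pytest report, so one function body replaces several near-identical copies.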

fkiraly · 2023-01-12T09:48:53Z

Ah, but linting is failing now :-)
Here's the guide: https://www.sktime.org/en/stable/developer_guide/coding_standards.html

fkiraly · 2023-01-12T10:10:03Z

just fixed the linting so we can potentially put this fix in the release

erjieyong · 2023-01-12T10:12:19Z

I see there's another issue. Let me clear it on my end first. I have actually installed Black locally, but I guess it didn't catch the particular portion which failed. Will try and install pre-commit

fkiraly · 2023-01-12T11:14:54Z

> I see there's another issue. Let me clear it on my end first.

Yes, it wants you to write docstrings for the tests. Minimal ones have one line, better ones have an explanation of what precisely the test success and fail conditions are.

erjieyong · 2023-01-12T11:38:31Z

@fkiraly, I've managed to install pre-commit locally and it now passes all the checks. Please review again. Thank you.

fkiraly · 2023-01-12T13:45:15Z

linting passes now! 🎉

erjieyong · 2023-01-13T00:04:03Z

Thanks a lot for your guidance. I'll work on improving the tests soon!

ER JIE YONG added 2 commits January 11, 2023 16:36

Allow BOSS to accept multi char string as classes

e91914b

add erjieyong as contributor

7995613

erjieyong requested a review from fkiraly as a code owner January 11, 2023 10:56

fkiraly changed the title ~~[BUG] Fix boss classes~~ [BUG] Fix BOSS based classifiers truncating class names to single character length Jan 11, 2023

fkiraly added the module:classification (classification module: time series classification) and bugfix (Fixes a known bug or removes unintended behavior) labels Jan 11, 2023

fkiraly reviewed Jan 11, 2023

fkiraly requested changes Jan 11, 2023

added test cases and change validation dtype

9255da1

erjieyong requested a review from fkiraly January 12, 2023 08:53

fkiraly previously approved these changes Jan 12, 2023

linting

845cec4

fkiraly dismissed their stale review via 845cec4 January 12, 2023 10:09

linting

67ebc42

erjieyong added 2 commits January 12, 2023 19:34

change import sequence of Individual Boss and Boss Ensemble

9eebf72

Merge branch 'fix-boss-classes' of github.com:erjieyong/sktime into f…

f071396

…ix-boss-classes

fkiraly merged commit 7fcce15 into sktime:main Jan 12, 2023

erjieyong deleted the fix-boss-classes branch January 13, 2023 09:21

erjieyong mentioned this pull request Jan 13, 2023

[ENH] Reduce repetitive code in test_boss.py and add check for string datatype in _boss.py #4100

Merged
