[ENH] remove private methods from parameters of ProximityForest, ProximityTree, and ProximityStump #6046

Merged
merged 26 commits into from Mar 15, 2024

Conversation

Contributor

@fnhirwa fnhirwa commented Mar 2, 2024

Reference Issues/PRs

Fixes #5042

What does this implement/fix? Explain your changes.

Removed the private methods that were passed in as constructor parameters of the ProximityForest, ProximityTree, and ProximityStump classes, and made them methods of the classes instead.

For the get_gain method, which defaults to gini_gain, I added a dictionary to track the gain functions in case more gain functions are added in the future.
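
A minimal sketch of the dictionary idea, for illustration only. The names `GAIN_FUNCTIONS` and `StumpSketch` and the string-valued `gain` parameter are assumptions made for this example; this is not the code merged into sktime.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a sequence of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def gini_gain(parent, children):
    """Impurity of the parent minus the size-weighted impurity of the children."""
    n = len(parent)
    if n == 0:
        return 0.0
    weighted = sum(len(child) / n * gini(child) for child in children)
    return gini(parent) - weighted


# Hypothetical registry: a new gain function would be added here under a name.
GAIN_FUNCTIONS = {"gini": gini_gain}


class StumpSketch:
    """Illustrative stand-in, not sktime's ProximityStump."""

    def __init__(self, gain="gini"):
        self.gain = gain  # a plain string rather than a function object

    def get_gain(self):
        """Look up the gain function by its registered name."""
        if self.gain not in GAIN_FUNCTIONS:
            raise ValueError(f"unknown gain function: {self.gain!r}")
        return GAIN_FUNCTIONS[self.gain]
```

With a lookup like this, the constructor only stores a name, and supporting another gain function would mean adding one dictionary entry.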

Methods changed

  • setup_distance_measure
  • get_exemplars
  • get_distance_measure
  • find_stump
  • get_gain
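
To illustrate the shape of this change, here is a hedged before/after sketch. The class names, the helper name, and the toy exemplar-picking logic are all hypothetical; only the pattern (helpers passed as constructor parameters versus helpers living on the class) is taken from the description above.

```python
import random


def pick_one_exemplar_per_class(X, y, rng):
    """Toy helper: choose one random instance index per class label."""
    by_class = {}
    for i, label in enumerate(y):
        by_class.setdefault(label, []).append(i)
    return {label: rng.choice(idx) for label, idx in sorted(by_class.items())}


class StumpBefore:
    """Before: the helper is a constructor parameter holding a function."""

    def __init__(self, get_exemplars=pick_one_exemplar_per_class, random_state=0):
        self.get_exemplars = get_exemplars
        self.random_state = random_state

    def fit(self, X, y):
        rng = random.Random(self.random_state)
        self.exemplars_ = self.get_exemplars(X, y, rng)
        return self


class StumpAfter:
    """After: the logic lives in a method, so parameters are plain values."""

    def __init__(self, random_state=0):
        self.random_state = random_state

    def _get_exemplars(self, X, y, rng):
        # Same logic as the module-level helper, moved into the class body.
        by_class = {}
        for i, label in enumerate(y):
            by_class.setdefault(label, []).append(i)
        return {label: rng.choice(idx) for label, idx in sorted(by_class.items())}

    def fit(self, X, y):
        rng = random.Random(self.random_state)
        self.exemplars_ = self._get_exemplars(X, y, rng)
        return self
```

In the "after" version, the instance attributes set in `__init__` contain no function objects, which is the property the PR title refers to.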

Does your contribution introduce a new dependency? If yes, which one?

What should a reviewer concentrate their feedback on?

Whether the refactoring logic is sound

Did you add any tests for the change?

No tests added

Any other comments?

PR checklist

For all contributions
  • I've added myself to the list of contributors with any new badges I've earned :-)
    How to: add yourself to the all-contributors file in the sktime root directory (not the CONTRIBUTORS.md). Common badges: code - fixing a bug, or adding code logic. doc - writing or improving documentation or docstrings. bug - reporting or diagnosing a bug (get this plus code if you also fixed the bug in the PR). maintenance - CI, test framework, release.
    See here for full badge reference
  • Optionally, I've added myself and possibly others to the CODEOWNERS file - do this if you want to become the owner or maintainer of an estimator you added.
    See here for further details on the algorithm maintainer role.
  • The PR title starts with either [ENH], [MNT], [DOC], or [BUG]. [BUG] - bugfix, [MNT] - CI, test framework, [ENH] - adding or improving code, [DOC] - writing or improving documentation or docstrings.
For new estimators
  • I've added the estimator to the API reference - in docs/source/api_reference/taskname.rst, follow the pattern.
  • I've added one or more illustrative usage examples to the docstring, in a pydocstyle compliant Examples section.
  • If the estimator relies on a soft dependency, I've set the python_dependencies tag and ensured
    dependency isolation, see the estimator dependencies guide.

@fnhirwa fnhirwa marked this pull request as ready for review March 2, 2024 06:55
Collaborator

@fkiraly fkiraly left a comment


Nice, thanks!

I was wondering, there are a lot of functions which just dispatch to a function. Should these be moved into the class, or otherwise simplified? (to discuss, not necessarily to action)

@fkiraly
Collaborator

fkiraly commented Mar 2, 2024

The docstring example of ProximityStump seems to fail.

You can also check the estimator by running tests using sktime.utils.check_estimator.

@fnhirwa
Contributor Author

fnhirwa commented Mar 2, 2024

Nice, thanks!

I was wondering, there are a lot of functions which just dispatch to a function. Should these be moved into the class, or otherwise simplified? (to discuss, not necessarily to action)

Well, I think moving these functions into the classes would make the code more cohesive and improve readability and maintainability, even if it comes at the cost of duplicating some functions here 🙄

@fkiraly
Collaborator

fkiraly commented Mar 2, 2024

Well, I think moving these functions into the classes would make the code more cohesive and improve readability and maintainability, even if it comes at the cost of duplicating some functions here

What about moving the entire body to the class then?

@fnhirwa
Contributor Author

fnhirwa commented Mar 2, 2024

Well, I think moving these functions into the classes would make the code more cohesive and improve readability and maintainability, even if it comes at the cost of duplicating some functions here

What about moving the entire body to the class then?


Yeah, if you agree with it, I can start on it right away in this PR.

@fkiraly
Collaborator

fkiraly commented Mar 2, 2024

If you agree that's a good idea, of course. If not, please explain and advise.

…mityStump`

Moved all function logic into the class body and simplified the functions that were only dispatching to other functions. Removed redundant logic; tests are passing locally.
@fnhirwa
Contributor Author

fnhirwa commented Mar 2, 2024

If you agree that's a good idea, of course. If not, please explain and advise.

I agree with it and am working on it. I will push a commit that adds these changes.

@fnhirwa fnhirwa requested a review from fkiraly March 2, 2024 18:29
fnhirwa and others added 8 commits March 3, 2024 09:32
Closes sktime#6035 

This PR tries to show a more reliable coverage badge on the README using
the carryforward flag feature for `complete`.

Ref. https://docs.codecov.com/docs/carryforward-flags
Reverts sktime#6041

After the merge, the badge now appears broken and the URL showed no
image.
sktime#6038)

#### Reference Issues/PRs
Part of sktime#3351. See also
sktime#3365

#### What does this implement/fix? Explain your changes.
Migrated CNTC, InceptionTime, and MACNN regressors to sktime from
sktime-dl
Fixes various minor typos:

* docstring typos
* author credits: `model_evaluation` was missing credits of
@hazrulakmal; module `__init__` should have no authors variable
fkiraly and others added 2 commits March 5, 2024 01:21
…ion templates (sktime#6053)

This PR adds further clarification regarding immutability of
`self`-params in the extension templates, including recipes on handling
defaults and estimators (the latter only in "non-simple" extension
templates).
Sktime already has an adapter for `neuralforecast`, which only implements
the `RNN` model. RNNs suffer from problems like vanishing gradients during
training, so it's good to have an LSTM model implemented, which works
better than RNN for time-series forecasting.

Here is a demo of LSTM implemented in sktime using neuralforecast
[notebook](https://colab.research.google.com/drive/1R9z5HS4uRpzVuVYKMHoB-q3-osWgHyD_?usp=sharing)

---------

Co-authored-by: Anirban Ray <39331844+yarnabrina@users.noreply.github.com>
@fnhirwa
Contributor Author

fnhirwa commented Mar 6, 2024

Hey @fkiraly
I finished refactoring the code and the majority of tests are passing, but 3 CI runs are failing with the error
ValueError: Input contains NaN. Is this related to my implementation, or should I look at the unit tests?
Thanks

@fkiraly
Collaborator

fkiraly commented Mar 6, 2024

@fnhirwa, the failures are related to the classes you changed, and they are not present on main, which strongly suggests they are caused by your changes.

My gut feeling would be that there might be a small mistake somewhere, e.g., a default value that is different from prior state.

I would advise that you isolate a piece of self-contained code that replicates the error, perhaps a few lines; perhaps using check_estimator. Then, compare traceback of runs on main (without your PR) and on your PR, until you've found a difference (probably a function call where args differ).
The Visual Studio Code debugger or step-wise debug execution would help a lot.

Let us know if you need help, e.g., a demo of how this kind of debugging works.
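
One way the "find the call where args differ" step could look, as a rough standard-library sketch. Everything here is hypothetical: `record_calls`, `first_difference`, the toy `distance` function, and the two `run_*` functions standing in for a run on main and a run on the PR branch. In practice the two logs would come from running the same reproducing snippet on each branch.

```python
import functools


def record_calls(func, log):
    """Wrap func so every call appends (name, args, sorted kwargs) to log."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        log.append((func.__name__, args, tuple(sorted(kwargs.items()))))
        return func(*args, **kwargs)

    return wrapper


def first_difference(log_a, log_b):
    """Return (index, entry_a, entry_b) for the first differing call.

    Only the calls present in both logs are compared; returns None if
    those all match.
    """
    for i, (a, b) in enumerate(zip(log_a, log_b)):
        if a != b:
            return i, a, b
    return None


def distance(a, b, window=1.0):
    """Toy function whose calls we want to compare across two runs."""
    return abs(a - b) * window


def run_main(dist):
    """Stand-in for the behaviour on main."""
    return [dist(1.0, 2.0, window=1.0), dist(2.0, 4.0, window=1.0)]


def run_branch(dist):
    """Stand-in for the PR branch: the second call passes a different value."""
    return [dist(1.0, 2.0, window=1.0), dist(2.0, 4.0, window=0.0)]


main_log, branch_log = [], []
run_main(record_calls(distance, main_log))
run_branch(record_calls(distance, branch_log))
diff = first_difference(main_log, branch_log)
print(diff)
```

Here the reported difference is the second call, where the `window` keyword differs between the two runs.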

@fnhirwa
Contributor Author

fnhirwa commented Mar 6, 2024

@fnhirwa, the failures are related to the classes you changed, and they are not present on main, which strongly suggests they are caused by your changes.

My gut feeling would be that there might be a small mistake somewhere, e.g., a default value that is different from prior state.

I would advise that you isolate a piece of self-contained code that replicates the error, perhaps a few lines; perhaps using check_estimator. Then, compare traceback of runs on main (without your PR) and on your PR, until you've found a difference (probably a function call where args differ). The Visual Studio Code debugger or step-wise debug execution would help a lot.

Let us know if you need help, e.g., a demo of how this kind of debugging works.

I completely understand. I'm going to dig in to find the issue and will push a fix soon 🤞

@fnhirwa
Contributor Author

fnhirwa commented Mar 11, 2024

@fkiraly the CI tests seem to pass now; I would like a review from you.
Thanks!

@fkiraly
Collaborator

fkiraly commented Mar 11, 2024

Sure! Will start remote CI and then review.

@fkiraly fkiraly added the module:classification (classification module: time series classification) and enhancement (Adding new functionality) labels Mar 11, 2024
fkiraly previously approved these changes Mar 11, 2024
Collaborator

@fkiraly fkiraly left a comment


Looks good to me! Refactor points seem correctly moved.

Left some comments regarding docstrings, these are not blocking though.

Test failures are unrelated, see #6094

@fnhirwa
Contributor Author

fnhirwa commented Mar 12, 2024

Looks good to me! Refactor points seem correctly moved.

Left some comments regarding docstrings, these are not blocking though.

Test failures are unrelated, see #6094

I just made the requested changes to the docstrings and made gini_gain the default gain function, with a note pointing to the line to change in case a custom gain function is needed.
Thanks

@fnhirwa fnhirwa requested a review from fkiraly March 12, 2024 03:15
fkiraly previously approved these changes Mar 12, 2024
Collaborator

@fkiraly fkiraly left a comment


Thanks!

fkiraly previously approved these changes Mar 14, 2024
Collaborator

@fkiraly fkiraly left a comment


(still approve)

@fkiraly fkiraly merged commit e362733 into sktime:main Mar 15, 2024
50 of 54 checks passed
Labels
enhancement (Adding new functionality), module:classification (classification module: time series classification)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[ENH] remove private methods from parameters of ProximityForest, ProximityTree, ProximityStump
5 participants