Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+1] Rename all occurrences of size_ngh to n_neighbors for consistency with scikit-learn #109

Merged
merged 5 commits into from Oct 26, 2016

Conversation

@chkoar
Copy link
Member

@chkoar chkoar commented Jul 25, 2016

For consistency reasons I think that we should follow scikit-learn conventions in naming the parameters.
I propose to change the size_ngh parameter to n_neighbors. Unfortunately, this change will have impact in the public API. It is an early modification but it will break users code. I don't know if we could merge this change without a deprecation warning.

@dvro
Copy link
Member

@dvro dvro commented Jul 25, 2016

@chkoar we could leave size_ngh and include a warning, allowing n_neighbors in a kwargs.

IMO, the user would pass either the n_neighbors parameter
but we also could allow someone to pass the NearestNeighbors object so that they could
use customized decision metrics, radius, n_neighbors, variable n_neighbors based on cluster
analysis (...) that way, we keep it clean but also allowing users to use complex neighborhood
cleansing.

Thoughts, @glemaitre ?

@glemaitre
Copy link
Member

@glemaitre glemaitre commented Jul 25, 2016

@dvro I think that the solution with kwargs could be used. We could announce for the next release the change in the API.

Regarding passing the estimator itself, we can think about it and what it is implying on the API design. My only concern is that you are giving a lot of freedom to the user and at the end, getting maybe far from the spirit of the original research paper.

What do you think?

@chkoar
Copy link
Member Author

@chkoar chkoar commented Jul 25, 2016

https://github.com/scikit-learn/scikit-learn/blob/51a765a/sklearn/base.py#L162

All estimators should specify all the parameters that can be set
at the class level in their __init__ as explicit keyword
arguments (no *args or **kwargs).

@glemaitre
Copy link
Member

@glemaitre glemaitre commented Jul 26, 2016

So I don't see other solutions than depreciation cycle such that it will change in version 0.4

@glemaitre glemaitre changed the title [MRG] Rename all occurrences of size_ngh to n_neighbors for consistency with scikit-learn [WIP] Rename all occurrences of size_ngh to n_neighbors for consistency with scikit-learn Jul 26, 2016
@glemaitre
Copy link
Member

@glemaitre glemaitre commented Jul 31, 2016

@chkoar @dvro What is the state of that one. Shall we announce the deprecation?

@chkoar
Copy link
Member Author

@chkoar chkoar commented Aug 1, 2016

@glemaitre @dvro yes. We could work with **kwargs for n_neighbors for a version and announce a deprecation.

@dvro
Copy link
Member

@dvro dvro commented Aug 1, 2016

I agree!

@glemaitre
Copy link
Member

@glemaitre glemaitre commented Aug 1, 2016

@chkoar we let you that one?

@chkoar
Copy link
Member Author

@chkoar chkoar commented Aug 1, 2016

Yeap. Actually, we have to consider removing all occurrences of **kwargs also.

@glemaitre
Copy link
Member

@glemaitre glemaitre commented Aug 1, 2016

Yep I am working on the testing now, and so one of them in CenterClustering. Usually we used it when the user would like to pass some parameters for the classifier used.

@chkoar
Copy link
Member Author

@chkoar chkoar commented Aug 17, 2016

Replying to @dvro regarding the NearestNeighbors object.

We could have the n_neighbors parameter expecting an integer value in order to construct the NearestNeighbors object with reasonable default values. We could accept also a NearestNeighbors object instead of an integer value with all the parameters provided by the user. With this approach we could get rid of the **kwargs and let the user to provide his/her parameters for the NearestNeighbors estimator.

@chkoar
Copy link
Member Author

@chkoar chkoar commented Aug 30, 2016

@glemaitre
Copy link
Member

@glemaitre glemaitre commented Aug 30, 2016

We could have the n_neighbors parameter expecting an integer value in order to construct the NearestNeighbors object with reasonable default values. We could accept also a NearestNeighbors object instead of an integer value with all the parameters provided by the user. With this approach we could get rid of the **kwargs and let the user to provide his/her parameters for the NearestNeighbors estimator.

It seems the best solution for me. At least, the back-compatibility is ensured.

@chkoar
Copy link
Member Author

@chkoar chkoar commented Aug 30, 2016

Ok, great. I'll make the changes at the first chance.

@glemaitre glemaitre added this to the 0.2.alpha milestone Sep 1, 2016
@glemaitre
Copy link
Member

@glemaitre glemaitre commented Oct 25, 2016

@chkoar Do you want me to get on that one?

@chkoar
Copy link
Member Author

@chkoar chkoar commented Oct 25, 2016

No problem. You can contribute in my branch or you create a new one and work there.

As you said we must provide both parameters in the constructor

  • Announce the deprecation
  • n_neighbors should be None as default (or vice versa)
  • internally we'll use n_neighbors
  • if both size_ngh and n_neighbors are present we should raise an exception (if this doesn't brake the check_estimator test) or a warning

What do you think @glemaitre ?

@glemaitre
Copy link
Member

@glemaitre glemaitre commented Oct 25, 2016

I think that I already a fork of yours so I will go for that. All you points are correct and I will address them.

@chkoar
Copy link
Member Author

@chkoar chkoar commented Oct 25, 2016

@glemaitre great. There is a need for a rebase, though. So, rebase with the upstream, force push and I will accept all changes.

@glemaitre
Copy link
Member

@glemaitre glemaitre commented Oct 25, 2016

@chkoar ok I am going for it now

@chkoar
Copy link
Member Author

@chkoar chkoar commented Oct 25, 2016

@glemaitre There is already a pending PR in chkoar#3

@glemaitre
Copy link
Member

@glemaitre glemaitre commented Oct 25, 2016

Yep I found that back in my branches ;)

@chkoar chkoar mentioned this pull request Oct 25, 2016
@chkoar chkoar force-pushed the chkoar:n_neighbors branch from d820807 to fc2d1c6 Oct 25, 2016
@coveralls
Copy link

@coveralls coveralls commented Oct 25, 2016

Coverage Status

Coverage decreased (-0.06%) to 98.943% when pulling fc2d1c6 on chkoar:n_neighbors into db85c84 on scikit-learn-contrib:master.

@glemaitre glemaitre changed the title [WIP] Rename all occurrences of size_ngh to n_neighbors for consistency with scikit-learn [MRG] Rename all occurrences of size_ngh to n_neighbors for consistency with scikit-learn Oct 26, 2016
@glemaitre
Copy link
Member

@glemaitre glemaitre commented Oct 26, 2016

@dvro @chkoar LGTM for merging

@chkoar
Copy link
Member Author

@chkoar chkoar commented Oct 26, 2016

@dvro this PR needs a sweet squash and merge commit

@coveralls
Copy link

@coveralls coveralls commented Oct 26, 2016

Coverage Status

Coverage decreased (-0.06%) to 98.943% when pulling a52fd28 on chkoar:n_neighbors into db85c84 on scikit-learn-contrib:master.

@glemaitre glemaitre changed the title [MRG] Rename all occurrences of size_ngh to n_neighbors for consistency with scikit-learn [MRG+1] Rename all occurrences of size_ngh to n_neighbors for consistency with scikit-learn Oct 26, 2016
@glemaitre glemaitre merged commit 3ee7147 into scikit-learn-contrib:master Oct 26, 2016
5 checks passed
5 checks passed
ci/circleci Your tests passed on CircleCI!
Details
code-quality/landscape Code quality decreased by -0.19%
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage decreased (-0.06%) to 98.943%
Details
@chkoar chkoar deleted the chkoar:n_neighbors branch Feb 3, 2017
christophe-rannou pushed a commit to christophe-rannou/imbalanced-learn that referenced this pull request Apr 3, 2017
…ency with scikit-learn (scikit-learn-contrib#109)

* Rename all occurrences of size_ngh to n_neighbors for consistency with scikit-learn.

* Implement deprecation for smote_enn and enn

* Add the changes in documentation

* Make the changes in the base function

* Minor comment fixes
glemaitre added a commit to glemaitre/imbalanced-learn that referenced this pull request Jun 15, 2017
…ency with scikit-learn (scikit-learn-contrib#109)

* Rename all occurrences of size_ngh to n_neighbors for consistency with scikit-learn.

* Implement deprecation for smote_enn and enn

* Add the changes in documentation

* Make the changes in the base function

* Minor comment fixes
glemaitre added a commit to glemaitre/imbalanced-learn that referenced this pull request Jun 15, 2017
…ency with scikit-learn (scikit-learn-contrib#109)

* Rename all occurrences of size_ngh to n_neighbors for consistency with scikit-learn.

* Implement deprecation for smote_enn and enn

* Add the changes in documentation

* Make the changes in the base function

* Minor comment fixes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Linked issues

Successfully merging this pull request may close these issues.

None yet

5 participants