New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+1] Rename all occurrences of size_ngh to n_neighbors for consistency with scikit-learn #109

Merged
merged 5 commits into from Oct 26, 2016

Conversation

Projects
5 participants
@chkoar
Member

chkoar commented Jul 25, 2016

For consistency reasons I think that we should follow scikit-learn conventions in naming the parameters.
I propose to change the size_ngh parameter to n_neighbors. Unfortunately, this change will have impact in the public API. It is an early modification but it will break users code. I don't know if we could merge this change without a deprecation warning.

@dvro

This comment has been minimized.

Member

dvro commented Jul 25, 2016

@chkoar we could leave size_ngh and include a warning, allowing n_neighbors in a kwargs.

IMO, the user would pass either the n_neighbors parameter
but we also could allow someone to pass the NearestNeighbors object so that they could
use customized decision metrics, radius, n_neighbors, variable n_neighbors based on cluster
analysis (...) that way, we keep it clean but also allowing users to use complex neighborhood
cleansing.

Thoughts, @glemaitre ?

@glemaitre

This comment has been minimized.

Member

glemaitre commented Jul 25, 2016

@dvro I think that the solution with kwargs could be used. We could announce for the next release the change in the API.

Regarding passing the estimator itself, we can think about it and what it is implying on the API design. My only concern is that you are giving a lot of freedom to the user and at the end, getting maybe far from the spirit of the original research paper.

What do you think?

@chkoar

This comment has been minimized.

Member

chkoar commented Jul 25, 2016

https://github.com/scikit-learn/scikit-learn/blob/51a765a/sklearn/base.py#L162

All estimators should specify all the parameters that can be set
at the class level in their __init__ as explicit keyword
arguments (no *args or **kwargs).

@glemaitre

This comment has been minimized.

Member

glemaitre commented Jul 26, 2016

So I don't see other solutions than depreciation cycle such that it will change in version 0.4

@glemaitre glemaitre changed the title from [MRG] Rename all occurrences of size_ngh to n_neighbors for consistency with scikit-learn to [WIP] Rename all occurrences of size_ngh to n_neighbors for consistency with scikit-learn Jul 26, 2016

@glemaitre

This comment has been minimized.

Member

glemaitre commented Jul 31, 2016

@chkoar @dvro What is the state of that one. Shall we announce the deprecation?

@chkoar

This comment has been minimized.

Member

chkoar commented Aug 1, 2016

@glemaitre @dvro yes. We could work with **kwargs for n_neighbors for a version and announce a deprecation.

@dvro

This comment has been minimized.

Member

dvro commented Aug 1, 2016

I agree!

@glemaitre

This comment has been minimized.

Member

glemaitre commented Aug 1, 2016

@chkoar we let you that one?

@chkoar

This comment has been minimized.

Member

chkoar commented Aug 1, 2016

Yeap. Actually, we have to consider removing all occurrences of **kwargs also.

@glemaitre

This comment has been minimized.

Member

glemaitre commented Aug 1, 2016

Yep I am working on the testing now, and so one of them in CenterClustering. Usually we used it when the user would like to pass some parameters for the classifier used.

@chkoar

This comment has been minimized.

Member

chkoar commented Aug 17, 2016

Replying to @dvro regarding the NearestNeighbors object.

We could have the n_neighbors parameter expecting an integer value in order to construct the NearestNeighbors object with reasonable default values. We could accept also a NearestNeighbors object instead of an integer value with all the parameters provided by the user. With this approach we could get rid of the **kwargs and let the user to provide his/her parameters for the NearestNeighbors estimator.

@chkoar

This comment has been minimized.

Member

chkoar commented Aug 30, 2016

@glemaitre

This comment has been minimized.

Member

glemaitre commented Aug 30, 2016

We could have the n_neighbors parameter expecting an integer value in order to construct the NearestNeighbors object with reasonable default values. We could accept also a NearestNeighbors object instead of an integer value with all the parameters provided by the user. With this approach we could get rid of the **kwargs and let the user to provide his/her parameters for the NearestNeighbors estimator.

It seems the best solution for me. At least, the back-compatibility is ensured.

@chkoar

This comment has been minimized.

Member

chkoar commented Aug 30, 2016

Ok, great. I'll make the changes at the first chance.

@glemaitre glemaitre added this to the 0.2.alpha milestone Sep 1, 2016

@glemaitre

This comment has been minimized.

Member

glemaitre commented Oct 25, 2016

@chkoar Do you want me to get on that one?

@chkoar

This comment has been minimized.

Member

chkoar commented Oct 25, 2016

No problem. You can contribute in my branch or you create a new one and work there.

As you said we must provide both parameters in the constructor

  • Announce the deprecation
  • n_neighbors should be None as default (or vice versa)
  • internally we'll use n_neighbors
  • if both size_ngh and n_neighbors are present we should raise an exception (if this doesn't brake the check_estimator test) or a warning

What do you think @glemaitre ?

@glemaitre

This comment has been minimized.

Member

glemaitre commented Oct 25, 2016

I think that I already a fork of yours so I will go for that. All you points are correct and I will address them.

@chkoar

This comment has been minimized.

Member

chkoar commented Oct 25, 2016

@glemaitre great. There is a need for a rebase, though. So, rebase with the upstream, force push and I will accept all changes.

@glemaitre

This comment has been minimized.

Member

glemaitre commented Oct 25, 2016

@chkoar ok I am going for it now

@chkoar

This comment has been minimized.

Member

chkoar commented Oct 25, 2016

@glemaitre There is already a pending PR in chkoar#3

@glemaitre

This comment has been minimized.

Member

glemaitre commented Oct 25, 2016

Yep I found that back in my branches ;)

@chkoar chkoar referenced this pull request Oct 25, 2016

Closed

Remove **kwargs #168

@chkoar chkoar force-pushed the chkoar:n_neighbors branch from d820807 to fc2d1c6 Oct 25, 2016

@coveralls

This comment has been minimized.

coveralls commented Oct 25, 2016

Coverage Status

Coverage decreased (-0.06%) to 98.943% when pulling fc2d1c6 on chkoar:n_neighbors into db85c84 on scikit-learn-contrib:master.

@glemaitre glemaitre changed the title from [WIP] Rename all occurrences of size_ngh to n_neighbors for consistency with scikit-learn to [MRG] Rename all occurrences of size_ngh to n_neighbors for consistency with scikit-learn Oct 26, 2016

@glemaitre

This comment has been minimized.

Member

glemaitre commented Oct 26, 2016

@dvro @chkoar LGTM for merging

@chkoar

This comment has been minimized.

Member

chkoar commented Oct 26, 2016

@dvro this PR needs a sweet squash and merge commit

@coveralls

This comment has been minimized.

coveralls commented Oct 26, 2016

Coverage Status

Coverage decreased (-0.06%) to 98.943% when pulling a52fd28 on chkoar:n_neighbors into db85c84 on scikit-learn-contrib:master.

@glemaitre glemaitre changed the title from [MRG] Rename all occurrences of size_ngh to n_neighbors for consistency with scikit-learn to [MRG+1] Rename all occurrences of size_ngh to n_neighbors for consistency with scikit-learn Oct 26, 2016

@glemaitre glemaitre merged commit 3ee7147 into scikit-learn-contrib:master Oct 26, 2016

5 checks passed

ci/circleci Your tests passed on CircleCI!
Details
code-quality/landscape Code quality decreased by -0.19%
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage decreased (-0.06%) to 98.943%
Details

@chkoar chkoar deleted the chkoar:n_neighbors branch Feb 3, 2017

christophe-rannou pushed a commit to christophe-rannou/imbalanced-learn that referenced this pull request Apr 3, 2017

[MRG+1] Rename all occurrences of size_ngh to n_neighbors for consist…
…ency with scikit-learn (scikit-learn-contrib#109)

* Rename all occurrences of size_ngh to n_neighbors for consistency with scikit-learn.

* Implement deprecation for smote_enn and enn

* Add the changes in documentation

* Make the changes in the base function

* Minor comment fixes

glemaitre added a commit to glemaitre/imbalanced-learn that referenced this pull request Jun 15, 2017

[MRG+1] Rename all occurrences of size_ngh to n_neighbors for consist…
…ency with scikit-learn (scikit-learn-contrib#109)

* Rename all occurrences of size_ngh to n_neighbors for consistency with scikit-learn.

* Implement deprecation for smote_enn and enn

* Add the changes in documentation

* Make the changes in the base function

* Minor comment fixes

glemaitre added a commit to glemaitre/imbalanced-learn that referenced this pull request Jun 15, 2017

[MRG+1] Rename all occurrences of size_ngh to n_neighbors for consist…
…ency with scikit-learn (scikit-learn-contrib#109)

* Rename all occurrences of size_ngh to n_neighbors for consistency with scikit-learn.

* Implement deprecation for smote_enn and enn

* Add the changes in documentation

* Make the changes in the base function

* Minor comment fixes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment