Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+1] BUG Fix the shrinkage implementation in NearestCentroid #9219

Merged
merged 8 commits into from Jul 25, 2017
Merged

[MRG+1] BUG Fix the shrinkage implementation in NearestCentroid #9219

merged 8 commits into from Jul 25, 2017

Conversation

@qinhanmin2014
Copy link
Member

@qinhanmin2014 qinhanmin2014 commented Jun 25, 2017

Reference Issue

Fixes #7679

What does this implement/fix? Explain your changes.

Main reference:
(1)the detailed discussion in issue #7679
(2)The Elements of Statistical Learning P652-653
(3)https://github.com/cran/pamr/blob/master/R/nsc.R#L82
(implemented by the author of the original paper)

Current test:
Current tests are based on the prediction result. No test check the shrinking process.
This is why the problem is not detected.

New regression test:
New test is based on R (pamr), which is implemented by the author of the original paper.
old result(wrong):
2017-06-25_230300

new result(right):
2017-06-25_230237

result by R (pamr):
2017-06-25_230326

Any other comments?

Maybe milestone can be changed back to 0.19?

Copy link
Member

@jnothman jnothman left a comment

0.19 is too close to promise. But if you expect to have this included, don't mark it WIP; and add an entry to what's new.

Good work!

@@ -97,6 +97,20 @@ def test_pickle():
" after pickling (classification).")


def test_shrinkage():
# Ensure that the shrinking is correct.
# The corrent result is calculated by R (pamr),

This comment has been minimized.

@jnothman

jnothman Jun 25, 2017
Member

*correct

# Ensure that the shrinking is correct.
# The corrent result is calculated by R (pamr),
# which is implemented by the author of the original paper.
# (One need to modify the code to output the new centroid in pamr.predict)

This comment has been minimized.

@jnothman

jnothman Jun 25, 2017
Member

I think what you mean is "This is a regression test for issue #7679"

This comment has been minimized.

@qinhanmin2014

qinhanmin2014 Jun 25, 2017
Author Member

@jnothman Thanks. This is a regression test. But since it ensures that the shrinking is correct, I choose not to mention the issue number here.

@jnothman jnothman added this to the 0.19 milestone Jun 25, 2017
@jnothman jnothman added the Bug label Jun 25, 2017
@jnothman
Copy link
Member

@jnothman jnothman commented Jun 25, 2017

LGTM

@jnothman jnothman changed the title [WIP] BUG Fix the shrinkage implementation in NearestCentroid [MRG+1] BUG Fix the shrinkage implementation in NearestCentroid Jun 25, 2017

X = np.array([[0, 1], [1, 0], [1, 1], [2, 0], [6, 8]])
y = np.array([1, 1, 2, 2, 2])
clf = NearestCentroid(shrink_threshold=0.1)

This comment has been minimized.

@jnothman

jnothman Jun 25, 2017
Member

Is there value in testing a second threshold?

This comment has been minimized.

@qinhanmin2014

qinhanmin2014 Jun 25, 2017
Author Member

@jnothman Thanks, but do you think it is necessary to test another threshold? Frankly I may not think so(and I have tried several cases on my PC). Or I may misunderstand your meaning.

This comment has been minimized.

@jnothman

jnothman Jun 25, 2017
Member

Probably not

@qinhanmin2014
Copy link
Member Author

@qinhanmin2014 qinhanmin2014 commented Jun 25, 2017

Function name and what's new updated. Thanks for your quick reply :) @jnothman

# Ensure that the shrinking is correct.
# The corrent result is calculated by R (pamr),
# which is implemented by the author of the original paper.
# (One need to modify the code to output the new centroid in pamr.predict)

This comment has been minimized.

@jnothman

jnothman Jun 25, 2017
Member

Ah sorry, I misunderstood this. I get it now. Okay


X = np.array([[0, 1], [1, 0], [1, 1], [2, 0], [6, 8]])
y = np.array([1, 1, 2, 2, 2])
clf = NearestCentroid(shrink_threshold=0.1)

This comment has been minimized.

@jnothman

jnothman Jun 25, 2017
Member

Probably not

@jnothman
Copy link
Member

@jnothman jnothman commented Jun 25, 2017

Please also add this to the list of models where prediction will change, at the top of what's new. We need to get that in order.

@qinhanmin2014
Copy link
Member Author

@qinhanmin2014 qinhanmin2014 commented Jun 26, 2017

@jnothman Thanks. Finished. I also checked the code in the document, user guide and examples, it seems that nothing need to be modified(though the result is sligtly changed). As is pointed out in the issue, in practice when n is large, the result is almost the same.

@jnothman
Copy link
Member

@jnothman jnothman commented Jun 26, 2017

Great, thanks.

@qinhanmin2014
Copy link
Member Author

@qinhanmin2014 qinhanmin2014 commented Jun 27, 2017

ping @amueller

@qinhanmin2014
Copy link
Member Author

@qinhanmin2014 qinhanmin2014 commented Jul 25, 2017

@jnothman @amueller
It has been a month since the last reply. I have fixed the conflict. Could you please help me finish it or ping someone to help? Thanks.

@jnothman
Copy link
Member

@jnothman jnothman commented Jul 25, 2017

Ping @lesteve, perhaps?

@lesteve
Copy link
Member

@lesteve lesteve commented Jul 25, 2017

LGTM. This can be merged once AppVeyor is green.

@qinhanmin2014
Copy link
Member Author

@qinhanmin2014 qinhanmin2014 commented Jul 25, 2017

@jnothman @lesteve
Thanks. CI is green.

@amueller amueller merged commit 90607f1 into scikit-learn:master Jul 25, 2017
5 checks passed
5 checks passed
ci/circleci Your tests passed on CircleCI!
Details
codecov/patch 100% of diff hit (target 96.19%)
Details
codecov/project 96.19% (+<.01%) compared to 511bbc7
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@amueller
Copy link
Member

@amueller amueller commented Jul 25, 2017

Thanks @qinhanmin2014 :)

@qinhanmin2014 qinhanmin2014 deleted the qinhanmin2014:my-feature-4 branch Jul 26, 2017
jnothman added a commit to jnothman/scikit-learn that referenced this pull request Aug 6, 2017
…it-learn#9219)

* fix the shrinkage implementation

* update function name

* update what's new

* update what's new

* spelling

* confict fix

* conflict fix
dmohns added a commit to dmohns/scikit-learn that referenced this pull request Aug 7, 2017
…it-learn#9219)

* fix the shrinkage implementation

* update function name

* update what's new

* update what's new

* spelling

* confict fix

* conflict fix
dmohns added a commit to dmohns/scikit-learn that referenced this pull request Aug 7, 2017
…it-learn#9219)

* fix the shrinkage implementation

* update function name

* update what's new

* update what's new

* spelling

* confict fix

* conflict fix
paulha added a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017
…it-learn#9219)

* fix the shrinkage implementation

* update function name

* update what's new

* update what's new

* spelling

* confict fix

* conflict fix
AishwaryaRK added a commit to AishwaryaRK/scikit-learn that referenced this pull request Aug 29, 2017
…it-learn#9219)

* fix the shrinkage implementation

* update function name

* update what's new

* update what's new

* spelling

* confict fix

* conflict fix
maskani-moh added a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017
…it-learn#9219)

* fix the shrinkage implementation

* update function name

* update what's new

* update what's new

* spelling

* confict fix

* conflict fix
jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this pull request Dec 18, 2017
…it-learn#9219)

* fix the shrinkage implementation

* update function name

* update what's new

* update what's new

* spelling

* confict fix

* conflict fix
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

4 participants