-
-
Notifications
You must be signed in to change notification settings - Fork 26.3k
[MRG+1] fixed OOB_Score bug for bagging classifiers. #8936
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
def test_set_oob_score(): | ||
# Make sure the oob_score doesn't change when the labels change | ||
# See: https://github.com/scikit-learn/scikit-learn/issues/8933 | ||
N = 50 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a non-regression test, I don't see why this is any better than asserting that
BaggingClassifier(oob_score=True).fit([[0], [1]] * 5, ['A', 'B'] * 5).oob_score_
is approximately 1... or, at a minimum, not 0, which the current bug gives.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I figured testing a "failing" label integer set ([-1,0,1]), a "standard" label set ([0,1,2]), and an alphanumeric label set (['a','b,'c']) would sufficiently exhaustive. Whatever your preference.
assert_equal(bagging._max_samples, max_samples) | ||
|
||
|
||
def test_set_oob_score(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps test_set_oob_score_label_encoding
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can change the test name to that if it's preferred
yes, testing all is fine, but no need for random data. make it easy to read
the test
…On 26 May 2017 2:00 pm, "mlewis1729" ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In sklearn/ensemble/tests/test_bagging.py
<#8936 (comment)>
:
> @@ -723,3 +723,37 @@ def test_max_samples_consistency():
max_features=0.5, random_state=1)
bagging.fit(X, y)
assert_equal(bagging._max_samples, max_samples)
+
+
+def test_set_oob_score():
+ # Make sure the oob_score doesn't change when the labels change
+ # See: #8933
+ N = 50
I figured testing a "failing" label integer set ([-1,0,1]), a "standard"
label set ([0,1,2]), and an alphanumeric label set (['a','b,'c']) would
sufficiently exhaustive. Whatever your preference.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#8936 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAEz64Q-B1iH2bDVI_NrLCsIip_-4-wmks5r9k5CgaJpZM4NmxZg>
.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise lgtm
def test_set_oob_score_label_encoding(): | ||
# Make sure the oob_score doesn't change when the labels change | ||
# See: https://github.com/scikit-learn/scikit-learn/issues/8933 | ||
randState = 5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should avoid camel case for such variables
I'll fix in master the CamelCase and add a what's new entry thx @mlewis1729 |
see 6d2e9f2 |
* fixed OOB_Score bug for bagging slassifiers. See: scikit-learn#8933 * Added white space * more white space fixing * Adding test for oob_score validity * removing pandas, replacing with numpy matrices * fixing white space * more white space fixing * white space ... * fixed labels to allow for strings * white space * simplifying test * white space * reformatting test * white space * pressed enter at end of file * removing line at end of file
* fixed OOB_Score bug for bagging slassifiers. See: scikit-learn#8933 * Added white space * more white space fixing * Adding test for oob_score validity * removing pandas, replacing with numpy matrices * fixing white space * more white space fixing * white space ... * fixed labels to allow for strings * white space * simplifying test * white space * reformatting test * white space * pressed enter at end of file * removing line at end of file
* fixed OOB_Score bug for bagging slassifiers. See: scikit-learn#8933 * Added white space * more white space fixing * Adding test for oob_score validity * removing pandas, replacing with numpy matrices * fixing white space * more white space fixing * white space ... * fixed labels to allow for strings * white space * simplifying test * white space * reformatting test * white space * pressed enter at end of file * removing line at end of file
* fixed OOB_Score bug for bagging slassifiers. See: scikit-learn#8933 * Added white space * more white space fixing * Adding test for oob_score validity * removing pandas, replacing with numpy matrices * fixing white space * more white space fixing * white space ... * fixed labels to allow for strings * white space * simplifying test * white space * reformatting test * white space * pressed enter at end of file * removing line at end of file
* fixed OOB_Score bug for bagging slassifiers. See: scikit-learn#8933 * Added white space * more white space fixing * Adding test for oob_score validity * removing pandas, replacing with numpy matrices * fixing white space * more white space fixing * white space ... * fixed labels to allow for strings * white space * simplifying test * white space * reformatting test * white space * pressed enter at end of file * removing line at end of file
* fixed OOB_Score bug for bagging slassifiers. See: scikit-learn#8933 * Added white space * more white space fixing * Adding test for oob_score validity * removing pandas, replacing with numpy matrices * fixing white space * more white space fixing * white space ... * fixed labels to allow for strings * white space * simplifying test * white space * reformatting test * white space * pressed enter at end of file * removing line at end of file
* fixed OOB_Score bug for bagging slassifiers. See: scikit-learn#8933 * Added white space * more white space fixing * Adding test for oob_score validity * removing pandas, replacing with numpy matrices * fixing white space * more white space fixing * white space ... * fixed labels to allow for strings * white space * simplifying test * white space * reformatting test * white space * pressed enter at end of file * removing line at end of file
* fixed OOB_Score bug for bagging slassifiers. See: scikit-learn#8933 * Added white space * more white space fixing * Adding test for oob_score validity * removing pandas, replacing with numpy matrices * fixing white space * more white space fixing * white space ... * fixed labels to allow for strings * white space * simplifying test * white space * reformatting test * white space * pressed enter at end of file * removing line at end of file
Fixes #8933
Reference Issue
What does this implement/fix? Explain your changes.
Any other comments?