[MRG] EHN: Change default n_estimators to 100 for random forest #11542

Merged
merged 14 commits into from Jul 17, 2018

5 participants
@annaayzenshtat
Contributor

annaayzenshtat commented Jul 15, 2018

Reference Issues/PRs

Fixes #11128.

What does this implement/fix? Explain your changes.

Issues a deprecation warning when the default n_estimators value is used with the forest classifiers. Adds a test for the warning message raised when the default is used.
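A minimal sketch of how such a warning test could look, using only the standard library. `resolve_n_estimators` and `collect_warnings` are hypothetical helpers, not code from this PR, and the warning text is illustrative wording rather than the message the PR actually emits. `FutureWarning` is assumed as the category because it is the one named later in this thread.

```python
import warnings


def resolve_n_estimators(n_estimators="warn"):
    """Hypothetical stand-in for the check performed inside fit()."""
    if n_estimators == "warn":
        warnings.warn(
            # Illustrative wording, not quoted from the PR.
            "The default value of n_estimators will change from 10 to 100 "
            "in a future version.",
            FutureWarning,
        )
        return 10  # keep the old default for now
    return n_estimators


def collect_warnings(func, *args, **kwargs):
    """Call func and return (result, list of recorded warnings)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure nothing is suppressed
        result = func(*args, **kwargs)
    return result, caught


# Default used: exactly one FutureWarning that mentions n_estimators.
value, caught = collect_warnings(resolve_n_estimators)
assert value == 10
assert len(caught) == 1
assert issubclass(caught[0].category, FutureWarning)
assert "n_estimators" in str(caught[0].message)

# Explicit value: no warning at all.
value, caught = collect_warnings(resolve_n_estimators, 100)
assert value == 100 and caught == []
```

In a pytest suite the same check is usually written with `pytest.warns(FutureWarning, match=...)` instead of recording warnings by hand.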

Any other comments?

Member

amueller commented Jul 15, 2018

Is this based on #11172? The contributor there seems to have addressed the comments there yesterday...

Member

amueller commented Jul 15, 2018

though it looks like #11172 is still not right...

Member

amueller commented Jul 15, 2018

This looks pretty good. Ideally you'd also catch deprecation warnings if they are raised in the tests now.

@glemaitre

You will also need to add an entry in the what's new file for v0.20 stating the change of behavior in the future.

@@ -242,6 +242,12 @@ def fit(self, X, y, sample_weight=None):
        -------
        self : object
        """
        if self.n_estimators == 'warn':

@glemaitre

glemaitre Jul 15, 2018

Contributor

The check and validation should be done in fit instead of __init__

@annaayzenshtat

annaayzenshtat Jul 15, 2018

Contributor

So should I change back to n_estimators=10 instead of n_estimators='warn', and then change my if conditional check in the fit() method?

@amueller

amueller Jul 15, 2018

Member

No, the 'warn' is good; just the test should be in the other place.

@glemaitre

glemaitre Jul 16, 2018

Contributor

You can refer to: https://github.com/scikit-learn/scikit-learn/pull/11469/files#diff-e6faf37b13574bc591afbf0536128735R864

That PR is still not merged, but we follow this convention: __init__ just assigns the parameters to attributes, and we do checking and validation in the fit method.
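The convention described above can be sketched as follows. `ToyForest` is a hypothetical class, not scikit-learn's implementation; the warning text is illustrative, and `FutureWarning` is assumed as the category. Unlike the diff under review, which is discussed as checking `self.n_estimators` directly, this sketch resolves the sentinel into a local variable so the constructor argument stays untouched.

```python
import warnings


class ToyForest:
    """Hypothetical estimator illustrating the __init__/fit split."""

    def __init__(self, n_estimators="warn"):
        # __init__ only stores the parameter: no validation, no warning.
        self.n_estimators = n_estimators

    def fit(self, X, y):
        # The sentinel check and the validation both happen in fit().
        n_estimators = self.n_estimators
        if n_estimators == "warn":
            warnings.warn(
                # Illustrative wording, not quoted from the PR.
                "The default value of n_estimators will change from 10 "
                "to 100 in a future version.",
                FutureWarning,
            )
            n_estimators = 10  # old default, kept until the switch
        if not isinstance(n_estimators, int) or n_estimators < 1:
            raise ValueError(
                f"n_estimators must be a positive integer, got {n_estimators!r}"
            )
        # Placeholder for real training: one slot per tree.
        self.estimators_ = [None] * n_estimators
        return self


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    model = ToyForest()           # constructing the object stays silent
    assert caught == []
    model.fit([[0], [1]], [0, 1])  # the warning appears only at fit time

assert len(caught) == 1
assert len(model.estimators_) == 10
```

Because the constructor does nothing but store its arguments, an invalid or sentinel value only surfaces when `fit` is called.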

@annaayzenshtat

annaayzenshtat Jul 16, 2018

Contributor

Aren't lines 245 and 246 above inside the fit() method?

@glemaitre

glemaitre Jul 16, 2018

Contributor

Oops, sorry, it is good there. I got confused with another PR :)


@glemaitre glemaitre changed the title from Fix to Issue #11128: Create deprecation warning for default n_estimators in RandomForest to EHN: Change default n_estimators to 100 for random forest Jul 16, 2018

@glemaitre glemaitre changed the title from EHN: Change default n_estimators to 100 for random forest to [MRG] EHN: Change default n_estimators to 100 for random forest Jul 16, 2018

Contributor

glemaitre commented Jul 16, 2018

FYI: I updated the title of this PR.

Contributor

massich commented Jul 16, 2018

@annaayzenshtat this is a blocker for 0.20 (which we are actively working on right now). If you don't have time to address the comments at this moment that's completely fine. Ping me and I'll take over the PR.

Contributor

annaayzenshtat commented Jul 16, 2018

I'm still working on this issue

Contributor

annaayzenshtat commented Jul 17, 2018

I committed the requested changes. Please take a look at these code changes.

Contributor

glemaitre commented Jul 17, 2018

Actually, you need to flag the tests with pytest.mark.filterwarnings to avoid raising the future warning in the tests (typically the ones that do not set n_estimators).
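As a rough sketch of what flagging a test this way can look like: `fit_with_default` is a hypothetical stand-in for an estimator fitted with the sentinel default, and the warning text (and therefore the filter pattern) is illustrative rather than taken from the PR.

```python
import warnings

import pytest


def fit_with_default(n_estimators="warn"):
    """Hypothetical stand-in for fit() on an estimator left at its default."""
    if n_estimators == "warn":
        warnings.warn(
            # Illustrative wording, not quoted from the PR.
            "The default value of n_estimators will change from 10 to 100 "
            "in a future version.",
            FutureWarning,
        )
        n_estimators = 10
    return n_estimators


# The filter string has the form "action:message:category". The message
# part is a regex matched against the start of the warning text.
@pytest.mark.filterwarnings(
    "ignore:The default value of n_estimators:FutureWarning"
)
def test_default_still_builds_ten_trees():
    # Under pytest the FutureWarning is ignored for this test only.
    assert fit_with_default() == 10
```

The marker only affects the decorated test, so tests that exist to check the warning itself can still see it.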

Contributor

annaayzenshtat commented Jul 17, 2018

Ok, I'll change it.

@glemaitre

You can check this PR as an example of how to use pytest:

https://github.com/scikit-learn/scikit-learn/pull/11574/files

Contributor

annaayzenshtat commented Jul 17, 2018

I flagged the test with pytest.mark.filterwarnings.

Contributor

glemaitre commented Jul 17, 2018

@annaayzenshtat I am helping a bit with the failure that you got, and I am filtering the warnings because they seem to be raised in a lot of places.

glemaitre added some commits Jul 17, 2018

@amueller

lgtm if tests pass

Contributor

annaayzenshtat commented Jul 17, 2018

Ok, thank you!

Member

amueller commented Jul 17, 2018

python2.7 test failure :-/

Member

amueller commented Jul 17, 2018

In SAG?!

Contributor

annaayzenshtat commented Jul 17, 2018

Is there something I'm supposed to do to fix the Python 2.7 failure?

Contributor

glemaitre commented Jul 17, 2018

Nope, this is a side effect that was already seen and solved in #11574.

Contributor

annaayzenshtat commented Jul 17, 2018

Ok.

@amueller amueller merged commit 2242c59 into scikit-learn:master Jul 17, 2018

0 of 4 checks passed

continuous-integration/travis-ci/pr The Travis CI build failed
ci/circleci: python2 Your tests are queued behind your running builds
ci/circleci: python3 Your tests are queued behind your running builds
continuous-integration/appveyor/pr Waiting for AppVeyor build to complete
Contributor

glemaitre commented Jul 17, 2018

@annaayzenshtat Thanks a lot for the contribution.
Feel free to take any other issue ;)

@annaayzenshtat annaayzenshtat deleted the annaayzenshtat:fix/n_estimators_100 branch Jul 17, 2018

Contributor

annaayzenshtat commented Jul 17, 2018

Thank you!
