New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+1] Change CV defaults to 5 #11557

Merged
merged 31 commits into from Jul 19, 2018

Conversation

Projects
None yet
7 participants
@aboucaud
Contributor

aboucaud commented Jul 16, 2018

Reference Issues/PRs

Fixes #11129 and takes over stalled PR #11139

What does this implement/fix? Explain your changes.

Add warning for models that do not specify an explicit value forcv or n_splits to prepare for a future deprecation of the default to 3 and an update of the default value to 5.

@glemaitre

This comment has been minimized.

Show comment
Hide comment
@glemaitre

glemaitre Jul 16, 2018

Contributor

I see this is still WIP but it could be worth mentioning that you will have to decorate the tests which use the cv using the @pytest.mark.filterwarning to avoid showing the deprecation warning.

Contributor

glemaitre commented Jul 16, 2018

I see this is still WIP but it could be worth mentioning that you will have to decorate the tests which use the cv using the @pytest.mark.filterwarning to avoid showing the deprecation warning.

@aboucaud aboucaud changed the title from [WIP] Change CV defaults to 5 to [MRG] Change CV defaults to 5 Jul 17, 2018

@GaelVaroquaux

Looks great so far. A couple minor comments.

Show outdated Hide outdated doc/modules/cross_validation.rst
Show outdated Hide outdated doc/whats_new/v0.20.rst
Show outdated Hide outdated sklearn/model_selection/_split.py
@GaelVaroquaux

This comment has been minimized.

Show comment
Hide comment
@GaelVaroquaux

GaelVaroquaux Jul 17, 2018

Member

I canceled the travis build as @aboucaud is pushing a new version soon.

Member

GaelVaroquaux commented Jul 17, 2018

I canceled the travis build as @aboucaud is pushing a new version soon.

@GaelVaroquaux

LGTM.

+1 for merge.

@amueller

This comment has been minimized.

Show comment
Hide comment
@amueller

amueller Jul 17, 2018

Member

looks good apart from None as sentinel vs 'warn'.

Member

amueller commented Jul 17, 2018

looks good apart from None as sentinel vs 'warn'.

@amueller

This comment has been minimized.

Show comment
Hide comment
@amueller

amueller Jul 17, 2018

Member

lgtm

Member

amueller commented Jul 17, 2018

lgtm

@GaelVaroquaux

This comment has been minimized.

Show comment
Hide comment
@GaelVaroquaux

GaelVaroquaux Jul 17, 2018

Member

We'll merge when travis is ready.

Member

GaelVaroquaux commented Jul 17, 2018

We'll merge when travis is ready.

@GaelVaroquaux GaelVaroquaux changed the title from [MRG] Change CV defaults to 5 to [MRG+1] Change CV defaults to 5 Jul 17, 2018

@amueller

This comment has been minimized.

Show comment
Hide comment
@amueller

amueller Jul 17, 2018

Member

test errors :-/

Member

amueller commented Jul 17, 2018

test errors :-/

@aboucaud

This comment has been minimized.

Show comment
Hide comment
@aboucaud

aboucaud Jul 18, 2018

Contributor

Off to bed, will finish this tomorrow. Most of the work should be behind now.

Contributor

aboucaud commented Jul 18, 2018

Off to bed, will finish this tomorrow. Most of the work should be behind now.

aboucaud added some commits Jul 18, 2018

@aboucaud

This comment has been minimized.

Show comment
Hide comment
@aboucaud

aboucaud Jul 18, 2018

Contributor

Green ✌️ !
It was tougher than expected.

I still did not address properly @amueller comment since I only added # 0.22

can you please add a comment that this is about iid and add 0.22 so that we can grep for it once we need to remove it?

The difficulty is to separate the cases in which it was about cv and the others about n_splits since I have two different messages, and I was not brave enough to do that.

I could try to unify the warning messages (below) to catch a better part of the message in filterwarnings

NSPLIT_WARNING = (
    "You should specify a value for 'n_splits' instead of relying on the "
    "default value. The default value will change from 3 to 5 "
    "in version 0.22.")

CV_WARNING = (
    "You should specify a value for 'cv' instead of relying on the "
    "default value. The default value will change from 3 to 5 "
    "in version 0.22.")

WDYT ? @GaelVaroquaux @amueller

Contributor

aboucaud commented Jul 18, 2018

Green ✌️ !
It was tougher than expected.

I still did not address properly @amueller comment since I only added # 0.22

can you please add a comment that this is about iid and add 0.22 so that we can grep for it once we need to remove it?

The difficulty is to separate the cases in which it was about cv and the others about n_splits since I have two different messages, and I was not brave enough to do that.

I could try to unify the warning messages (below) to catch a better part of the message in filterwarnings

NSPLIT_WARNING = (
    "You should specify a value for 'n_splits' instead of relying on the "
    "default value. The default value will change from 3 to 5 "
    "in version 0.22.")

CV_WARNING = (
    "You should specify a value for 'cv' instead of relying on the "
    "default value. The default value will change from 3 to 5 "
    "in version 0.22.")

WDYT ? @GaelVaroquaux @amueller

@jeremiedbb

This comment has been minimized.

Show comment
Hide comment
@jeremiedbb

jeremiedbb Jul 18, 2018

Contributor

Is skipping all the doctests the right way to make travis green ?
I may be interpreting what you did badly.

Contributor

jeremiedbb commented Jul 18, 2018

Is skipping all the doctests the right way to make travis green ?
I may be interpreting what you did badly.

@aboucaud

This comment has been minimized.

Show comment
Hide comment
@aboucaud

aboucaud Jul 18, 2018

Contributor

When I merged master into this branch, I saw that others implemented that workaround since now warning raise errors.

I agree it is probably not good thing.

Many of this failing tests used the default value for cv or n_splits which was set to 3 and will change to 5 but statically setting cv=5 also means increasing the size of the X and y arrays and adapting to the results.

I just end up having so many modifications in this PR that cannot be properly checked or reviewed that I would be in favor of having a following PR address the doctests, using # doctest: +SKIP as an anchor.

Contributor

aboucaud commented Jul 18, 2018

When I merged master into this branch, I saw that others implemented that workaround since now warning raise errors.

I agree it is probably not good thing.

Many of this failing tests used the default value for cv or n_splits which was set to 3 and will change to 5 but statically setting cv=5 also means increasing the size of the X and y arrays and adapting to the results.

I just end up having so many modifications in this PR that cannot be properly checked or reviewed that I would be in favor of having a following PR address the doctests, using # doctest: +SKIP as an anchor.

@qinhanmin2014 qinhanmin2014 added this to the 0.20 milestone Jul 18, 2018

@jorisvandenbossche

This comment has been minimized.

Show comment
Hide comment
@jorisvandenbossche

jorisvandenbossche Jul 18, 2018

Contributor

I just end up having so many modifications in this PR that cannot be properly checked or reviewed that I would be in favor of having a following PR address the doctests, using # doctest: +SKIP as an anchor.

Guillaume: if you set it to 5 manually in the doc examples, is it then still needed to skip?

Contributor

jorisvandenbossche commented Jul 18, 2018

I just end up having so many modifications in this PR that cannot be properly checked or reviewed that I would be in favor of having a following PR address the doctests, using # doctest: +SKIP as an anchor.

Guillaume: if you set it to 5 manually in the doc examples, is it then still needed to skip?

@GaelVaroquaux

This comment has been minimized.

Show comment
Hide comment
@GaelVaroquaux

GaelVaroquaux Jul 18, 2018

Member
Member

GaelVaroquaux commented Jul 18, 2018

@aboucaud

This comment has been minimized.

Show comment
Hide comment
@aboucaud

aboucaud Jul 18, 2018

Contributor

Ok, I'm on it

Contributor

aboucaud commented Jul 18, 2018

Ok, I'm on it

@aboucaud

This comment has been minimized.

Show comment
Hide comment
@aboucaud

aboucaud Jul 18, 2018

Contributor

@GaelVaroquaux can you interrupt the build on the first commit to let the last one build

Contributor

aboucaud commented Jul 18, 2018

@GaelVaroquaux can you interrupt the build on the first commit to let the last one build

@jeremiedbb

This comment has been minimized.

Show comment
Hide comment
@jeremiedbb

jeremiedbb Jul 18, 2018

Contributor

It restarts automatically each time you push

Contributor

jeremiedbb commented Jul 18, 2018

It restarts automatically each time you push

@GaelVaroquaux

I saw a few change in the doctest pragma that didn't look right.

Aside from that, +1 for merge.

Show outdated Hide outdated doc/modules/ensemble.rst
Show outdated Hide outdated doc/modules/learning_curve.rst
@jeremiedbb

This comment has been minimized.

Show comment
Hide comment
@jeremiedbb

jeremiedbb Jul 18, 2018

Contributor

Why did you remove many # doctest : NORMALIZE_WHITESPACE and ELLIPSIS ?

Contributor

jeremiedbb commented Jul 18, 2018

Why did you remove many # doctest : NORMALIZE_WHITESPACE and ELLIPSIS ?

aboucaud added some commits Jul 19, 2018

@jeremiedbb

This comment has been minimized.

Show comment
Hide comment
@jeremiedbb

jeremiedbb Jul 19, 2018

Contributor

@GaelVaroquaux alex made the requested changes. I think it's good to go now.

Contributor

jeremiedbb commented Jul 19, 2018

@GaelVaroquaux alex made the requested changes. I think it's good to go now.

@GaelVaroquaux

This comment has been minimized.

Show comment
Hide comment
@GaelVaroquaux

GaelVaroquaux Jul 19, 2018

Member

LGTM. Merging

Member

GaelVaroquaux commented Jul 19, 2018

LGTM. Merging

@GaelVaroquaux GaelVaroquaux merged commit f158e2d into scikit-learn:master Jul 19, 2018

7 checks passed

ci/circleci: deploy Your tests passed on CircleCI!
Details
ci/circleci: python2 Your tests passed on CircleCI!
Details
ci/circleci: python3 Your tests passed on CircleCI!
Details
codecov/patch 100% of diff hit (target 95.29%)
Details
codecov/project 95.3% (+<.01%) compared to 5140762
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@aboucaud aboucaud deleted the aboucaud:cv-default-5 branch Jul 19, 2018

@amueller

This comment has been minimized.

Show comment
Hide comment
@amueller

amueller Jul 20, 2018

Member

Ohhh yeahhh!

Member

amueller commented Jul 20, 2018

Ohhh yeahhh!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment