Clone estimator for each parameter value in validation_curve #7365

Sundrique · 2016-09-08T14:09:43Z

In the validation_curve function, the estimator is not cloned for each parameter value, as it is in learning_curve or GridSearchCV.fit. As a result, every iteration except the very first receives an already-fitted estimator.
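To illustrate the effect described above, here is a minimal, library-free sketch. The `WarmCounter` class and the `sweep` helper are hypothetical names invented for this example, and `copy.deepcopy` of an unfitted template only stands in for `sklearn.base.clone`; this is not the code from the PR.

```python
import copy


class WarmCounter:
    """Hypothetical toy estimator whose fitted state survives repeated fits."""

    def __init__(self, scale=1.0):
        self.scale = scale

    def set_params(self, **params):
        for name, value in params.items():
            setattr(self, name, value)
        return self

    def fit(self, X, y=None):
        # Any state left by an earlier fit leaks into this one.
        self.n_seen_ = getattr(self, "n_seen_", 0) + len(X)
        return self


def sweep(estimator, values, X, fresh_copy):
    """Fit once per parameter value, optionally on a fresh copy each time."""
    seen = []
    for value in values:
        # deepcopy stands in for sklearn.base.clone here; it only yields an
        # unfitted copy because the template itself is never fitted.
        est = copy.deepcopy(estimator) if fresh_copy else estimator
        est.set_params(scale=value).fit(X)
        seen.append(est.n_seen_)
    return seen


X = [1.0, 2.0, 3.0]
shared = sweep(WarmCounter(), [0.1, 10.0], X, fresh_copy=False)
cloned = sweep(WarmCounter(), [0.1, 10.0], X, fresh_copy=True)
```

With the shared instance the second parameter value is fitted on top of the state from the first (`shared` is `[3, 6]`), whereas the copied estimators each start clean (`cloned` is `[3, 3]`).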

amueller · 2016-09-08T14:18:11Z

thanks for the fix. Do you think you can create a non-regression test?

Sundrique · 2016-09-08T14:24:41Z

I think I can, if it can wait a couple of days.

amueller · 2016-09-08T14:25:54Z

sure, no worries :)

jnothman · 2016-09-21T12:23:18Z

I find your mock estimator a bit of an obtuse way to test this. How about a class inheriting from something already in scikit-learn (even a dummy estimator) that simply raises an error if fit is called on it multiple times?

    def fit(self, X, y):
        # Fail if fit has already been called on this instance.
        assert_false(hasattr(self, 'fit_called_'))
        self.fit_called_ = True
        return super(type(self), self).fit(X, y)
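A self-contained sketch of the fit-once idea, for readers who want to run it without scikit-learn. `MeanRegressor` is a hypothetical stand-in for "something already in scikit-learn", `FitOnceRegressor` is an invented name, and `copy.deepcopy` again substitutes for `sklearn.base.clone`; the test that was eventually merged may look different.

```python
import copy


class MeanRegressor:
    """Hypothetical stand-in for an existing scikit-learn estimator."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self


class FitOnceRegressor(MeanRegressor):
    """Raises AssertionError if fit is called twice on the same instance."""

    def fit(self, X, y):
        assert not hasattr(self, "fit_called_"), "fit called twice"
        self.fit_called_ = True
        # The class is named explicitly: super(type(self), self) would
        # recurse forever as soon as this class is itself subclassed.
        return super(FitOnceRegressor, self).fit(X, y)


X, y = [[0], [1]], [1.0, 3.0]

# A fresh copy per iteration passes the guard every time.
template = FitOnceRegressor()
for _ in range(3):
    copy.deepcopy(template).fit(X, y)

# Reusing a single instance trips the guard on the second call.
reused = FitOnceRegressor()
reused.fit(X, y)
try:
    reused.fit(X, y)
    second_fit_failed = False
except AssertionError:
    second_fit_failed = True
```

Passing such an estimator to a loop that does not clone fails on the second parameter value, which is what makes it usable as a non-regression test.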

ogrisel · 2016-09-21T13:42:40Z

Please also make sure that the lines you change in this PR follow the PEP8 conventions (travis is red because of this):

https://travis-ci.org/scikit-learn/scikit-learn/jobs/160421906#L457

Sundrique · 2016-09-23T09:21:48Z

Thanks @jnothman, @ogrisel. Will fix.

jnothman · 2016-09-26T11:09:51Z

Actually, this change needs to be in model_selection/_validation.py, although I think it's a good idea to backport it to the deprecated module you've currently edited. Otherwise, LGTM. (And I confirm the test fails at master.)

jnothman · 2017-06-14T00:47:38Z

This only needs a little fixing up to be merged and included in the upcoming release. @Sundrique, are you onto it? Please also add an entry to doc/whats_new.rst

Sundrique · 2017-06-14T01:09:06Z

@jnothman Yes, I am. Will add into model_selection/_validation.py, resolve the conflict and add an entry to doc/whats_new.rst. Please, let me know if anything else is needed.

jnothman · 2017-06-14T01:22:29Z

Hopefully after those changes, @ogrisel can give this a final review and merge.

Sundrique · 2017-06-14T05:37:47Z

@jnothman Sorry, I screwed up the branch. I created a new one: #9119.

amueller added this to the 0.19 milestone Oct 27, 2016

mathurinm and others added 19 commits June 14, 2017 11:42

Correct figure number + matplotlib 2 (scikit-learn#8483)

4d98c40

DOC example of extracting true positive, false negative, etc. (scikit-learn#8469)

1645543

DOC correct typo in kneighbors parameter documentation. (scikit-learn#8495)

332ea29

[MRG+1] BUG: fix svd_solver validation in PCA.fit (scikit-learn#8496)

6d75dd7

* BUG: fix svd_solver validation in PCA.fit * TST: add test of pca svd_solver

Added check_X_y to lasso_stability_path() (scikit-learn#7534)

d656977

codecov: disable comments (scikit-learn#8502)

ec91985

turn comments off in codecov

6241b06

add html-noplot and changed help message to make.bat (scikit-learn#8524)

5c92a20

* add html-noplot and help message to make.bat * changed spaces to tab in make.bat help * changed all spaces to tabs in make.bat update

modify disadvantage (scikit-learn#8521)

c5c189b

[MRG+2] modify disadvantage

fix deprecated comparison to string in GP (scikit-learn#8518)

8a1be77

[MRG+2] referred reliability diagrams and added citations (scikit-learn#8527)

1a3d4d2

[MRG] removed download_url from setup.py (scikit-learn#8513)

db5cfce

[MRG+1] Fixes scikit-learn#7578 added check_decision_proba_consistency in estimator_checks (scikit-learn#8253)

19059a3

[MRG] Update joblib to 0.11 (scikit-learn#8492)

5e5318f

Use pip rather than easy_install in copy_joblib.sh. Also need to remove joblib/testing.py to avoid pytest dependency.

[MRG] DOC More detailed pull request and fork instructions (scikit-learn#8530) (scikit-learn#8538)

df962e1

naoyak and others added 26 commits June 14, 2017 11:42

Add logsumexp and comb to utils.fixes (scikit-learn#9046)

6a822ba

[MRG+1] DOC add hyperlink to example (scikit-learn#9097)

9718494

* DOC add hyperlink to example * Remove useless change * DOC fix hyperlink * DOC fix links

[MRG] Deprecate lsh forest (scikit-learn#9078)

32c9489

* Add deprecation message and test. * Adding performance warning and ignore_warnings in test * Add deprecation to whatsnew and remove LSHForest references from docs. Removing benchmark for lsh

[MRG+1] remove n_nonzero_coefs from attr of LassoLarsCV + clean up call hierarchy (scikit-learn#9004)

937c94c

* FIX : remove n_nonzero_coefs from attr of LassoLarsCV + clean up call to Lars._fit * cleanup * fix deprecation warning + clarify warning * add test * pep8 * adddress comments

add doc + refs + what's new entry (scikit-learn#9052)

b34a60c

FIX OvR/OvO classifier decision_function shape fixes (scikit-learn#9100)

7edca6a

* fix OVR classifier edgecase bugs * add regression tests for OVO and OVR decision function shapes

[MRG + 1] FIX/TST revert scikit-learn#5802 and raise error for faulty classifier (scikit-learn#9063)

530c045

* FIX/TST revert scikit-learn#5802 and raise error for faulty classifier * FIX check_estimator take care of the rest

DOCFIX typo & pep8 & shame

bebee73

CIRCLE latexmk is needed to build the pdf doc

821bfef

with sphinx 1.6

DOC: fix links to examples (scikit-learn#9102)

9fb26ba

CIRCLE Revert to sphinx 1.5

841b004

to fix pdf doc generation

Disable pager in git commands

c8f0fe6

Long commit messages can trigger a pager which is not what you want when running flake8_diff.sh in a terminal.

TRAVIS put back all the builds

58ac8ab

now that the sprint is over.

What's new addition and fixes

5006af7

DOC adds auto_ml to related projects (scikit-learn#9042)

1ff8d84

remove identical assert in test_iforest_sparse (scikit-learn#9112)

4a97ebe

FIX improve 'precompute' handling in Lars (scikit-learn#5359)

aef6ff8

Removed force_all_finite array checks in DummyClassifier and DummyRegressor, added 2 tests (scikit-learn#8931)

18ccb52

[MRG+1] fix StratifiedShuffleSplit with 2d y (scikit-learn#9044)

a768da8

* regression test and fix for 2d stratified shuffle split * strengthen non-overlap sss tests * clarify test and comment * remove iter from tests, use str instead of hash

Add test to verify validation_curve clones estimator

d483263

Simplify test and mock estimator

3382353

Reformat to reduce lines length

ee502a3

Restore mock estimator after rebase

d60527c

Apply changes to the model_selection module

ccd39c3

Update whats_new

80690be

Sundrique closed this Jun 14, 2017

Sundrique deleted the fix-clone-validation-curve-estimator branch June 14, 2017 05:03

Sundrique mentioned this pull request Jun 14, 2017

[MRG+1] Clone estimator for each parameter value in validation_curve #9119

Merged
