Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clone estimator for each parameter value in validation_curve #7365

Conversation

Sundrique
Copy link
Contributor

In validation_curve function an estimator is not cloned like in learning_curve or GridSearchCV.fit for each value of a parameter. Thus it comes pre-trained to each iteration but very first.

@amueller
Copy link
Member

amueller commented Sep 8, 2016

thanks for the fix. Do you think you can create a non-regression test?

@Sundrique
Copy link
Contributor Author

I think, I can if it can wait a couple of days.

@amueller
Copy link
Member

amueller commented Sep 8, 2016

sure, no worries :)

@jnothman
Copy link
Member

jnothman commented Sep 21, 2016

I find your mock estimator a bit of an obtuse way to test this. How about a class inheriting from something already in scikit-learn (even a dummy estimator) that simply raises an error if fit is called on it multiple times?

    def fit(self, X, y):
        assert_false(hasattr(self, 'fit_called_'))
        self.fit_called_ = True
        super(type(self), self).fit(X, y)

@ogrisel
Copy link
Member

ogrisel commented Sep 21, 2016

Please also make sure that the lines you change in this PR follow the PEP8 conventions (travis is read because of this):

https://travis-ci.org/scikit-learn/scikit-learn/jobs/160421906#L457

@Sundrique
Copy link
Contributor Author

Thanks @jnothman, @ogrisel. Will fix.

@jnothman
Copy link
Member

Actually, this change needs to be in model_selection/_validation.py, although I think it's a good idea to backport it to the deprecated module you've currently edited. Otherwise, LGTM. (And I confirm the test fails at master.)

@amueller amueller added this to the 0.19 milestone Oct 27, 2016
@jnothman
Copy link
Member

This only needs a little fixing up to be merged and included in the upcoming release. @Sundrique, are you onto it? Please also add an entry to doc/whats_new.rst

@Sundrique
Copy link
Contributor Author

@jnothman Yes, I am. Will add into model_selection/_validation.py, resolve the conflict and add an entry to doc/whats_new.rst. Please, let me know if anything else is needed.

@jnothman
Copy link
Member

jnothman commented Jun 14, 2017 via email

mathurinm and others added 19 commits June 14, 2017 11:42
* BUG: fix svd_solver validation in PCA.fit

* TST: add test of pca svd_solver
* Modify plot_custom_kernel for matplotlib v2 comp

Add `edgecolors` attribute in scatter plot for better visualization
in matplotlib version 2

Issue: scikit-learn#8364

* Modify plot_oneclass.py for matplotlib v2 comp

Add `edgecolors` attribute to scatter plot for better
visualization in matplotlib version 2

Issue: scikit-learn#8364

* Modify plot_rbf_parameters for matplotlib v2

Add `edgecolors` attribute to scatter plot for
better visualization.

Issue: scikit-learn#8364

* Modify plot_separating_hyperplane_unbalanced for matplotlib v2

Add `edgecolors` attribute to scatter plot for better visualization.

Issue: scikit-learn#8364

* Modify plo_svm_kernels for matplotlib v2

Add `edgecolors` attribute to scatter plot for better
visualization.

Issue: scikit-learn#8364

* Modify plot_svm_margin for matplotlib v2 comp

Add `edgecolors` attribute to scatter plot for better
visualization.

Issue: scikit-learn#8364

* Modify plot_svm_nonlinear for matplotlib v2

Add `edgecolors` attribute to scatter plot for matplotlib
version 2 compatibility

Issue: scikit-learn#8364

* Modify file for remove flake8 error

Remove extra white space.

Issue: scikit-learn#8364
* add html-noplot and help message to make.bat

* changed spaces to tab in make.bat help

* changed all spaces to tabs in make.bat update
…8120)

* Add _RepeatedSplits and RepeatedKFold class

* Add RepeatedStratifiedKFold and doc for repeated cvs

* Change default value of n_repeats

* Change input parameters of repeated cv constructor to n_splits, n_repeats, random_state

* Generate random states in split function rather than store it beforehand

* Doc changes, inheriting RepeatedKFold, RepeatedStratifiedKFold from _RepeatedSplits and other review changes

* Remove blank line, put testcases for deterministic split in loop and add StopIteration check in testcase

* Using rng directly as random_state param to create cv instance and added a check for cvargs

* Fix pep8 warnings

* Changing default values for n_splits and n_repeats and add entry in changelog

* Adding name to the feature

* Missing space
[MRG+2] modify disadvantage
…weight_invariance in metrics/tests/test_common.py (scikit-learn#8537)

* Separated tests for regression features in test_sample_weight_invariance

* Fixed pep8

* Removed unecessary check for regression

* Updated regression metrics

* Joel's suggestions
…mentation (scikit-learn#8548)

* clarify role of the function and streamline introduction

* added feature selection methods to see also

* completed see also

* fixed pep related formatting for flake8checks.

* fixed extra whitespace flake8 problems, remaining failure is a copied see all line from another function, the line is over by a period, does not make sense to newline that.

* one more whitespace

* FIX small pep8 error.
Use pip rather than easy_install in copy_joblib.sh. Also need to remove joblib/testing.py to avoid pytest dependency.
naoyak and others added 26 commits June 14, 2017 11:42
* DOC add hyperlink to example

* Remove useless change

* DOC fix hyperlink

* DOC fix links
* Add deprecation message and test.

* Adding performance warning and ignore_warnings in test

* Add deprecation to whatsnew and remove LSHForest references from docs.

Removing benchmark for lsh
…ll hierarchy (scikit-learn#9004)

* FIX : remove n_nonzero_coefs from attr of LassoLarsCV + clean up call to Lars._fit

* cleanup

* fix deprecation warning + clarify warning

* add test

* pep8

* adddress comments
* fix OVR classifier edgecase bugs

* add regression tests for OVO and OVR decision function shapes
… classifier (scikit-learn#9063)

* FIX/TST revert scikit-learn#5802 and raise error for faulty classifier

* FIX check_estimator take care of the rest
* correcting information criterion calculation in least_angle.py

The information criterion calculation is not compatible with the original paper 
Zou, Hui, Trevor Hastie, and Robert Tibshirani. "On the “degrees of freedom” of the lasso." The Annals of Statistics 35.5 (2007): 2173-2192.
APA

* FIX : fix AIC/BIC computation in LassoLarsIC

* update what's new

* fix test

* fix test

* address comments

* DOC comments and docstring on criterion computation
to fix pdf doc generation
Long commit messages can trigger a pager which is not what you want
when running flake8_diff.sh in a terminal.
now that the sprint is over.
* regression test and fix for 2d stratified shuffle split

* strengthen non-overlap sss tests

* clarify test and comment

* remove iter from tests, use str instead of hash
@Sundrique Sundrique closed this Jun 14, 2017
@Sundrique Sundrique deleted the fix-clone-validation-curve-estimator branch June 14, 2017 05:03
@Sundrique
Copy link
Contributor Author

@jnothman Sorry, screwed up the branch. Created a new one #9119.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment