
[MRG+1] Instance level common tests #9019

Merged
merged 37 commits into scikit-learn:master on Jun 9, 2017

Conversation

amueller (Member) commented Jun 6, 2017

This is pulling out another part of #8022: the instance-level common tests.
This doesn't give a huge benefit by itself, but it allows better testing of meta-estimators, because check_estimator is now much more flexible!

Mainly, #8022 is still insanely big and I thought it might be a good idea to split it up.
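For context, here is a minimal sketch of what instance-level checks make possible; the GridSearchCV example is an assumption based on the discussion below, not code from this PR:

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.utils.estimator_checks import check_estimator

# check_estimator used to require a default-constructible class.
# Accepting an instance lets us run the common tests on meta-estimators
# that need constructor arguments.
check_estimator(LogisticRegression)                                  # class, as before
check_estimator(GridSearchCV(LogisticRegression(), {'C': [1, 10]}))  # instance, new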

amueller (Member Author) commented Jun 6, 2017

ping @jnothman @GaelVaroquaux @agramfort @vene ;)

@@ -1608,16 +1634,16 @@ def param_filter(p):
assert_equal(param_value, init_param.default)


def multioutput_estimator_convert_y_2d(name, y):
def multioutput_estimator_convert_y_2d(estimator, y):
Member Author

this one is to make the other patch simpler...
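For reference, a rough sketch of the helper whose signature changes here; the MultiTask-name heuristic is an assumption about its body, not verified against the source:

import numpy as np

def multioutput_estimator_convert_y_2d(estimator, y):
    # Estimators that only accept 2-D (multi-output) targets get a column
    # vector; everything else keeps the original 1-D y.
    if "MultiTask" in type(estimator).__name__:
        return np.reshape(y, (-1, 1))
    return y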

amueller (Member Author) commented Jun 6, 2017

with the tags PR I will simplify test_common a bit and basically just call check_estimator on all estimators, hopefully ;)
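A hedged sketch of what that simplification could look like, assuming the all_estimators helper from sklearn.utils.testing; this is not code from this PR:

from sklearn.utils.testing import all_estimators
from sklearn.utils.estimator_checks import check_estimator

def test_all_estimators():
    # Run the full battery of instance-level checks on every estimator
    # that can be default-constructed.
    for name, Estimator in all_estimators():
        check_estimator(Estimator())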

@amueller amueller changed the title Instance level common tests [MRG] Instance level common tests Jun 6, 2017
vene (Member) left a comment

(note to self: got up to line ~650)

@@ -204,3 +204,4 @@ def __init__(self):
check_no_fit_attributes_set_in_init,
'estimator_name',
NonConformantEstimator)
check_estimators_unfitted("estimator", CorrectNotFittedErrorClassifier())
Member

merge error: line 190 duplicated here


errmsg = "Input contains NaN, infinity or a value too large for " \
"dtype('float64')."
try:
Estimator().fit(X, y)
estimator.fit(X, y)
except ValueError as e:
if str(e) != errmsg:
raise ValueError("Estimator {0} raised warning as expected, but "
Member

(not this PR's fault, but) ambiguous message here: raised a warning or an error?

yield _named_check(check, name), name, Estimator
estimator = Estimator()
# check this on class
yield check_no_fit_attributes_set_in_init, name, Estimator
Member

is this not tested for meta estimators?

Member Author

Right now the tests for meta-estimators are not very strong. They will be much stronger once the estimator tags branch is merged. I tried to keep this PR small; it doesn't increase test coverage much, but it doesn't decrease it anywhere AFAIK.

Member

coverage is a good sanity check for this indeed. Sounds good.

Member

We should still be using _named_check, no?

Member

If we just reframe check_no_fit_attributes_set_in_init as check_no_fit_attributes_set_upon_clone, perhaps we can get away with running it on an instance too.

@@ -244,11 +253,20 @@ def check_estimator(Estimator):
Class to check. Estimator is a class object (not an instance).
Member

Could be an instance now :)

@@ -314,7 +332,7 @@ def set_testing_parameters(estimator):
# of components of the random matrix projection will be probably
# greater than the number of features.
# So we impose a smaller number (avoid "auto" mode)
estimator.set_params(n_components=1)
estimator.set_params(n_components=8)
Member

necessary now that the dictlearning bug is fixed?

Member

Although 1 is very edge-casey here so I'm not opposed to this change even if not strictly necessary. Maybe 2 or so to shave off a few ms

rng = np.random.RandomState(0)
X = rng.rand(40, 10)
X[X < .8] = 0
X_csr = sparse.csr_matrix(X)
y = (4 * rng.rand(40)).astype(np.int)
y = multioutput_estimator_convert_y_2d(estimator, y)
Member

was the absence of this line a bug?

Member Author

I think so. Hm also why do we need the scaler special cases down there... huh...

Member

Scaling/standardizing sparse data when with_mean=True raises an exception rather than densifying the data. I think the decision was that users would ignore a warning and blow up their memory...

Really the tags for this estimator will be tricky too. Technically with the default args it does not support sparse inputs.

Member Author

Yeah, I thought we had deprecated that behavior when we introduced MaxAbsScaler, but that was introduced as an alternative to MinMaxScaler, not StandardScaler; my bad.
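A small self-contained example of the behaviour being discussed (not code from the PR):

from scipy import sparse
from sklearn.preprocessing import StandardScaler

X = sparse.random(20, 5, density=0.3, format="csr", random_state=0)

# Scaling without centering works on sparse input...
StandardScaler(with_mean=False).fit(X)

# ...but centering would densify the data, so StandardScaler refuses it.
try:
    StandardScaler(with_mean=True).fit(X)
except ValueError as exc:
    print("Raised as expected:", exc)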

# to not check deprecated classes
return
rnd = np.random.RandomState(0)
X = 3 * rnd.uniform(size=(20, 3))
y = X[:, 0].astype(np.int)
y = multioutput_estimator_convert_y_2d(name, y)
estimator = Estimator()
Member

not cloned here

Member

fixed

jnothman (Member) commented Jun 7, 2017 via email

if args[0] == "self":
# if_delegate_has_method makes methods into functions
# with an explicit "self", so need to shift arguments
args = args[1:]
Member

Wow, that was pretty subtle! For reference I made this gist.

Member

What if some clever ass names the first argument something other than self?

It's bad style, and we would probably complain reviewing that code, so I don't think that we should cater for it.

Member

Since this only happens for methods that are passed through @if_delegate_has_method, maybe we could make that decorator raise an error unless its first argument is called "self", just to avoid subtle breakage in the future?

Member Author

Open issue?
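A generic illustration of the signature detail discussed above; the helper name is made up and this is only a sketch of the idea, not the PR's code:

import inspect

def _call_arg_names(func):
    # Functions exposed through a delegation decorator keep an explicit
    # "self" in their signature even though they are used like plain
    # functions, so drop it before matching names to call arguments.
    args = inspect.getfullargspec(func).args
    if args and args[0] == "self":
        args = args[1:]
    return args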

vene (Member) commented Jun 7, 2017

Most of this PR consists of changes of the form:

def f(Estimator):
    with ignore_warnings(category=DeprecationWarning):
        estimator = Estimator()
        ...

into:

@ignore_warnings(category=DeprecationWarning)
def f(estimator):
    estimator = clone(estimator)
    ...

I checked and we cannot remove the ignore_warnings outright, because cloning a deprecated estimator re-raises the deprecation warning.

However, this has the consequence of possibly ignoring more deprecations, because the decorator now covers the whole check. This way we might miss it if, in the future, we deprecate e.g. StandardScaler or something that is still used in the common tests.

set_testing_parameters(classifier)
# try to fit
try:
classifier.fit(X_train, y)
except ValueError as e:
if 'class' not in repr(e):
print(error_string_fit, Classifier, e)
print(error_string_fit, classifier, e)
Member

Should this print classifier.__class__ or similar, or is this ok?

Member Author

I think that's fine, but I can also change it if you like.

Member

It's fine; the printout might be more useful the new way.

@@ -1635,13 +1662,13 @@ def check_non_transformer_estimators_n_iter(name, Estimator):

# LassoLars stops early for the default alpha=1.0 on the iris dataset.
if name == 'LassoLars':
estimator = Estimator(alpha=0.)
Member

(not the fault of this PR, but) Just curious: why can't we check that n_iter_ is at least 0 instead? LassoLars is doing the right thing: on some datasets there is no work to be done.

Member Author

feel free to open an issue ;)

return

else:
e = estimator()
Member

I assume this logic will move to the caller of this code instead. Nice!

# Check whether an estimator having both decision_function and
# predict_proba methods has outputs with perfect rank correlation.

centers = [(2, 2), (4, 4)]
X, y = make_blobs(n_samples=100, random_state=0, n_features=4,
centers=centers, cluster_std=1.0, shuffle=True)
X_test = np.random.randn(20, 2) + 4
estimator = Estimator()
Member

estimator is not cloned here

Member

fixed

amueller (Member Author) commented Jun 7, 2017

However, this has the consequence of possibly ignoring more deprecations, because the decorator now covers the whole check. This way we might miss it if, in the future, we deprecate e.g. StandardScaler or something that is still used in the common tests.

We haven't been consistent with how we ignore warnings in the common tests, and some common tests outright ignore all warnings. I think we should not worry about deprecations in the common tests for now.

vene (Member) commented Jun 7, 2017

re:deprecations, sounds good, and we shouldn't let it slow this PR down.

Other than the couple of missing clones and the few questions I left, this LGTM.

rng = np.random.RandomState(0)
X = rng.rand(40, 10)
X[X < .8] = 0
X_csr = sparse.csr_matrix(X)
y = (4 * rng.rand(40)).astype(np.int)
y = multioutput_estimator_convert_y_2d(estimator, y)
for sparse_format in ['csr', 'csc', 'dok', 'lil', 'coo', 'dia', 'bsr']:
X = X_csr.asformat(sparse_format)
# catch deprecation warnings
with ignore_warnings(category=DeprecationWarning):
Member

also here you didn't move the ignore_warnings to the top

Member Author

yeah, but should be fine, right?

Member

yep just triggered my mindless pattern-matching review process :P

amueller (Member Author) commented Jun 7, 2017

I think I addressed everything. I renamed the check parameters to estimator_org to make sure it's cloned every time.

amueller (Member Author) commented Jun 7, 2017

@jnothman this should actually be "easy" to review ;)

vene (Member) commented Jun 7, 2017

I'm happy now.

@vene vene changed the title [MRG] Instance level common tests [MRG+1] Instance level common tests Jun 7, 2017
jnothman (Member) left a comment

Is there any difference in runtime due to all the cloning?

You need to add tests to ensure that check_estimator still works on both class and instance.

You should also add a test to check that the object passed into check_estimator is completely unchanged afterwards. This will help to ensure that all current and any future checks include a clone.

I'm sorry if you were hoping for me to give you a green light!
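A hedged sketch of the kind of test being requested here; the later commit log mentions using joblib.hash for the comparison, but the test name and estimator below are assumptions:

from sklearn.externals import joblib
from sklearn.linear_model import LogisticRegression
from sklearn.utils.estimator_checks import check_estimator

def test_check_estimator_does_not_modify_instance():
    # The instance handed to check_estimator should be untouched afterwards:
    # every individual check is expected to operate on a clone.
    est = LogisticRegression()
    before = joblib.hash(est)
    check_estimator(est)
    assert joblib.hash(est) == before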


@@ -319,6 +319,11 @@ API changes summary
now only have ``self.estimators_`` available after ``fit``.
:issue:`7464` by `Lars Buitinck`_ and `Loic Esteve`_.

- All checks in ``utils.estimator_checks``, in particular
:func:`utils.estimator_checks.check_estimator` now accept estimator
instances. Checks other than ``check_estimator`` do not accept
Member

This is currently untrue for check_no_fit_attributes_set_in_init.

name = Estimator.__name__
check_parameters_default_constructible(name, Estimator)
for check in _yield_all_checks(name, Estimator):
if isinstance(Estimator, type):
Member

I'm not entirely certain this is the right check, but my attempts to break it have failed so far.

Member

According to my knowledge and to a quick Google, this is the right way to do it.
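For context, a simplified sketch of the class-vs-instance dispatch under review; the helper name below is hypothetical, and the real check_estimator does more than this:

from sklearn.base import clone

def _name_and_instance(Estimator):
    # Accept either a class or an instance, and always hand the checks a
    # fresh object so the caller's estimator is never modified.
    if isinstance(Estimator, type):
        # a class: must be default-constructible
        return Estimator.__name__, Estimator()
    # an instance: work on a clone
    return type(Estimator).__name__, clone(Estimator)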

def _yield_non_meta_checks(name, Estimator):
def assert_almost_equal_dense_sparse(x, y, decimal=6, err_msg=''):
if sparse.issparse(x):
assert_array_almost_equal(x.data, y.data,
Member

(fwiw, assert_allclose may be preferable...)

Member Author

why?

Member

name's shorter? (i.e. not obscenely verbose) :D

More seriously, because it gives control over both absolute and relative tolerance, which lets you compare very large numbers, very small numbers, etc., whereas a fixed number of decimals ignores the scale of the values being compared.

Member

TBH, I've seen it said somewhere and I can't recall the arguments put forth, but I like the brief name, I like that we can use rtol when appropriate, and it's good practice to think about whether absolute or relative tolerance is the issue for a particular case.
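A hedged sketch of the dense/sparse comparison helper this thread converges on; the commit log says the utility eventually added is called assert_allclose_dense_sparse, but the body below is a simplified assumption:

from scipy import sparse
from numpy.testing import assert_allclose, assert_array_equal

def assert_allclose_dense_sparse(x, y, rtol=1e-07, atol=0, err_msg=''):
    # Compare two arrays that are either both sparse or both dense, using
    # relative/absolute tolerances instead of a fixed decimal count.
    if sparse.issparse(x) and sparse.issparse(y):
        x, y = x.tocsr(), y.tocsr()
        x.sum_duplicates()
        y.sum_duplicates()
        assert_array_equal(x.indices, y.indices, err_msg=err_msg)
        assert_array_equal(x.indptr, y.indptr, err_msg=err_msg)
        assert_allclose(x.data, y.data, rtol=rtol, atol=atol, err_msg=err_msg)
    elif not sparse.issparse(x) and not sparse.issparse(y):
        assert_allclose(x, y, rtol=rtol, atol=atol, err_msg=err_msg)
    else:
        raise ValueError("Can only compare two sparse or two dense matrices, "
                         "not a mix of the two.")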

@@ -314,7 +336,7 @@ def set_testing_parameters(estimator):
# of components of the random matrix projection will be probably
# greater than the number of features.
# So we impose a smaller number (avoid "auto" mode)
estimator.set_params(n_components=1)
estimator.set_params(n_components=2)
Member

set_testing_params becomes problematic now. We will potentially hide bugs by overriding explicit tests of parameter settings. We should only override these if they are currently at their default values.

Thanks for incidentally changing something in this function so that I was reminded that it was an issue!

Member

Why does it become problematic now? Wasn't it problematic to begin with?

Member

No, because before we were changing defaults from what was most useful for a user to what is most practical in testing. Now if a developer explicitly passes a non-default instantiation, we're ignoring their wishes.

Member Author

Do you have a suggestion for how to resolve that? I hadn't thought about it, but I agree.

vene (Member), Jun 7, 2017

Great point. Only override a param if it is not different from the default?

Member

Yes, putting this in test_classifiers or whatever might make sense. Hopefully we don't break too many downstream libraries using check_estimator this way.

Member Author

or we use an ugly hack, like checking if the estimator comes from sklearn ;) but that wouldn't allow us to specify parameters safely.

Member Author

that requires adding a use_fast_parameters to all of the checks, right?
Alternatively, we could get rid of the function, and have each estimator declare its testing parameters.

Member Author

no wait, I got it...
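For illustration, a hedged sketch of the "only override a parameter if it is still at its default" idea; this is one possible reading of the suggestion above, not the solution the PR settled on (the commit log says the checks simply stopped calling set_testing_parameters):

import inspect

def set_params_if_default(estimator, **overrides):
    # Apply a testing override only when the parameter is still at its
    # default value, so explicit non-default settings are respected.
    init_params = inspect.signature(type(estimator).__init__).parameters
    defaults = {k: p.default for k, p in init_params.items()
                if p.default is not inspect.Parameter.empty}
    current = estimator.get_params(deep=False)
    for key, value in overrides.items():
        if key in current and current[key] == defaults.get(key):
            estimator.set_params(**{key: value})
    return estimator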

@@ -337,20 +359,24 @@ def _is_32bit():
return struct.calcsize('P') * 8 == 32


def check_estimator_sparse_data(name, Estimator):
def check_estimator_sparse_data(name, estimator_org):
Member

As someone who has worked too much on Named Entity Recognition, I can't read "org" as anything but "organisation". Can we use estimator_orig or base_estimator?

Member Author

base_estimator means something different to me; orig is fine for me.

Member

+1 for orig



def check_estimators_partial_fit_n_features(name, Alg):
@ignore_warnings(category=DeprecationWarning)
def check_estimators_partial_fit_n_features(name, alg_org):
Member

rename alg to estimator?

def check_no_fit_attributes_set_in_init(name, Estimator):
"""Check that Estimator.__init__ doesn't set trailing-_ attributes."""
# STILL ON CLASSES
Member

Once the world of scikit-learn <= 0.18 is forgotten, this comment will look pretty obscure. How about "Takes a class rather than an estimator instance as input"?

Member Author

Makes sense

Member

also it's nice to have fewer comments that yell at the reader 😛
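For reference, a rough sketch of the class-level check being discussed; simplified, and the attribute filtering is an assumption:

def check_no_fit_attributes_set_in_init(name, Estimator):
    """Check that Estimator.__init__ doesn't set trailing-_ attributes."""
    # Takes a class rather than an estimator instance as input.
    estimator = Estimator()
    fitted_like = [attr for attr in vars(estimator)
                   if attr.endswith('_') and not attr.startswith('__')]
    if fitted_like:
        raise AssertionError(
            "Estimator %s should not set attributes that look like fitted "
            "results in __init__; found %r." % (name, fitted_like))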

@@ -1547,6 +1584,7 @@ def check_estimators_data_not_an_array(name, Estimator, X, y):


def check_parameters_default_constructible(name, Estimator):
# THIS ONE IS STILL ON CLASSES
Member

Should we be doing type(estimator)() instead?

Member Author

Well, but I want to support non-default constructible estimators in check_estimator, like GridSearchCV.

Member

Okay, I guess. At some point in the future, I think we will want a way to specify testing parameters for non-default-constructables.

amueller (Member Author) commented Jun 7, 2017

@jnothman I was hoping for you to nag as much as possible ;) There are some tests on instances and some on classes, but I should make that more explicit.

amueller (Member Author) commented Jun 9, 2017

should be good now.

GaelVaroquaux (Member)

LGTM. +1 for merge.

Does anybody else want to comment, or should we merge?

vene (Member) commented Jun 9, 2017

I should have either tested locally on old scipy, or you should set up CI for PRs onto your fork 😛

In my defence, I did check with modern scipy.

e = Estimator()
set_testing_parameters(e)
def check_estimators_empty_data_messages(name, estimator_orig):
e = clone(estimator_orig)
Member

e -> estimator?

Member

meh, I just saw that there is also est all over the place. Nevermind.

vene (Member) commented Jun 9, 2017

genuine travis failure on py27 ubuntu; RandomForest fails the pickle test it seems

@vene vene mentioned this pull request Jun 9, 2017
fix flake8 error and shorten
amueller (Member Author) commented Jun 9, 2017

can't reproduce :-/

amueller (Member Author) commented Jun 9, 2017

(screenshot of Travis test durations) is what's happening on travis (top before, bottom after)

ogrisel (Member) left a comment

LGTM

@@ -314,7 +336,7 @@ def set_testing_parameters(estimator):
# of components of the random matrix projection will be probably
# greater than the number of features.
# So we impose a smaller number (avoid "auto" mode)
estimator.set_params(n_components=1)
estimator.set_params(n_components=2)
@@ -870,23 +883,22 @@ def check_estimators_nan_inf(name, Estimator):
for X_train in [X_train_nan, X_train_inf]:
# catch deprecation warnings
with ignore_warnings(category=DeprecationWarning):
Member

This is now caught at the check function level.

@amueller amueller merged commit d788dcb into scikit-learn:master Jun 9, 2017
jnothman (Member) commented Jun 10, 2017 via email

amueller (Member Author)

@jnothman thanks and yeah, for sure. I'd like to get #8022 in, so we can actually give contrib devs useful advice. Right now they still might not be able to test estimators.

Sundrique pushed a commit to Sundrique/scikit-learn that referenced this pull request Jun 14, 2017
* start work on separating instance-level tests

* minor refactoring / fixes to work without tags

* add clone into check_supervised_y_2d estimator check (which made other checks fail)

* remove duplicate check_estimator_unfitted assert

* add issue reference to whatsnew entry

* added some clones, minor fixes from vene's review

* rename estimator arg to estimator_org to make a visible distinction before and after cloning.

* more renaming for more explicit clones

* org -> orig

* allclose, fix orig stuff

* don't use set_testing_parameters in the checks!

* minor fixes for allclose

* fix some test, add more tests on classes

* added the test using pickles.

* move assert_almost_equal_dense_sparse to utils.testing, rename to assert_allclose_sparse_dense, test it

* make assert_allclose_dense_sparse more stringent

* more allclose fixes

* run test_check_estimator on all estimators

* rename set_testing_parameters to set_checking_parameters so nose doesn't think it's a tests (and I don't want to import stuff from nose as we want to remove it)

* fix in set_checking_parameters so that common tests pass

* more fixes to assert_allclose_dense_sparse

* rename alg to clusterer, don't scream even though I really want to

* ok this is not a pretty strict test that runs check_estimator with and without fitting on an instance. I also check if ``fit`` is called on the instance that is passed.

* simplify test as they didn't help at all

* it works!!! omfg

* run check_estimator clone test only on one of the configs, don't run locally by default

* Add `slow_test` decorator and documentation

* run test_check_estimator only on some estimators

* fix diags in test for older scipy

* fix pep8 and shorten

* use joblib.hash for inequality check because the pickle state machine is weird
dmohns pushed a commit to dmohns/scikit-learn that referenced this pull request Aug 7, 2017
dmohns pushed a commit to dmohns/scikit-learn that referenced this pull request Aug 7, 2017
NelleV pushed a commit to NelleV/scikit-learn that referenced this pull request Aug 11, 2017
paulha pushed a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017
AishwaryaRK pushed a commit to AishwaryaRK/scikit-learn that referenced this pull request Aug 29, 2017
maskani-moh pushed a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017
jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this pull request Dec 18, 2017