[MRG + 1] Revive "don't set trailing _ attrs in __init__" PR #7464

Merged
merged 9 commits into scikit-learn:master from lesteve:larsmans-test-fit-attributes-in-init on Dec 12, 2016

Conversation

@lesteve
Member

lesteve commented Sep 21, 2016

Reference Issue

This PR revives and fixes #3937.


@raghavrv

Member

raghavrv commented Sep 21, 2016

Thanks a lot for reviving the PR!

@@ -120,6 +121,62 @@ def test_class_weight_balanced_linear_classifiers():
                 name), name, Classifier
+# These are the known trailing-_ properties on current estimators.
+#
+# Such properties cause hasattr(estimator, "param_") to return True even if

@jnothman

jnothman Sep 21, 2016

Member

This is not true. As long as the property call raises an error, hasattr will return false. You should be able to remove most of these exceptions and fix most others.

Besides which, I think these exceptions would be better identified by inspection than hard-coded list.
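
For illustration, here is a minimal sketch (not code from the PR) of how a property whose getter raises AttributeError keeps hasattr False before fit:

import numpy as np


class ToyEstimator(object):
    @property
    def coef_(self):
        # The raising getter is what makes hasattr(est, 'coef_') evaluate
        # to False before fit; after fit the underlying attribute exists.
        if not hasattr(self, '_coef'):
            raise AttributeError("coef_ is only available after fit")
        return self._coef

    def fit(self, X, y):
        self._coef = np.zeros(X.shape[1])
        return self


est = ToyEstimator()
print(hasattr(est, 'coef_'))    # False: the getter raised AttributeError
est.fit(np.ones((3, 2)), [0, 1, 0])
print(hasattr(est, 'coef_'))    # True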

@lesteve

lesteve Sep 22, 2016

Member

OK it was done like this in the original PR. I'll have a closer look.

@lesteve lesteve changed the title from [WIP] Revive "don't set trailing _ attrs in __init__" PR to [MRG] Revive "don't set trailing _ attrs in __init__" PR Sep 22, 2016

@@ -482,8 +482,8 @@ def _validate_for_predict(self, X):
     @property
     def coef_(self):
         if self.kernel != 'linear':
-            raise ValueError('coef_ is only available when using a '
-                             'linear kernel')
+            raise AttributeError('coef_ is only available when using a '

@lesteve

lesteve Sep 22, 2016

Member

This is one of the potentially controversial change.

From the comment in #3937 (comment):

The exception is where it would be breaking backwards compatibility to, for example, change an existing error from a ValueError to an AttributeError (although perhaps we could define a custom BackwardsCompatibleAttributeError that extends ValueError and AttributeError!).

Should we do that for this one case? Also, just curious: if we do that, is there a clever deprecation mechanism that allows turning it into an AttributeError after two major releases?

@jnothman

jnothman Sep 26, 2016

Member

For similar reasons we have defined class NotFittedError(ValueError, AttributeError)!

I suspect this is not a major concern as we don't expect this to be handled programmatically as a ValueError... but of course users' expectations may differ...
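
As a quick sketch of what the multiple inheritance buys (scikit-learn does define NotFittedError this way; the toy usage below is only illustrative):

class NotFittedError(ValueError, AttributeError):
    """Raised when an estimator is used before fitting."""


err = NotFittedError("This estimator is not fitted yet.")
# Old user code catching ValueError and new code catching AttributeError
# both handle it, which is the backwards-compatibility trick.
print(isinstance(err, ValueError))       # True
print(isinstance(err, AttributeError))   # True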

@GaelVaroquaux

GaelVaroquaux Sep 27, 2016

Member

Should we define a class ParameterError(ValueError, AttributeError)?

It would solve the issue of forward compatibility. I am a bit worried that too many exception classes will lead to these not being used. They need to be documented well.

@lesteve

lesteve Sep 27, 2016

Member

Note this is the only case that I can see where ParameterError would be used. As @jnothman said above, I wouldn't expect this particular error (trying to access coef_ when kernel != 'linear') to be handled programmatically, and if you do, it's straightforward to adapt your code in a non-invasive manner by using except (ValueError, AttributeError).
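
A minimal sketch of that adaptation in hypothetical user code (the SVC usage here is just an example, not code from the PR):

from sklearn.svm import SVC

svc = SVC(kernel='rbf').fit([[0.0], [1.0]], [0, 1])
try:
    coef = svc.coef_
except (ValueError, AttributeError):
    # Works both before the change (ValueError) and after it (AttributeError).
    coef = None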

@jnothman

jnothman Sep 27, 2016

Member

I thought I had suggested ParamDependentAttributeError (which makes sense in terms of being a parameter-dependent AttributeError). Still, it seems like too much red tape.

@amueller

amueller Sep 29, 2016

Member

I think the change is fine as-is.

@lesteve

Member

lesteve commented Sep 26, 2016

@jnothman another round of review if you have some spare bandwidth?

@jnothman

Member

jnothman commented Sep 26, 2016

I'm okay with this as a test and as fixes for tests (plus some additional check_is_fitteds). Some user code may have depended on these initialisations, and we need to be a little thoughtful to that case. One option is to merely document this change clearly in what's new. Another option is to use descriptors which raise a deprecation warning when the attribute is retrieved or hasattred prior to being set. I don't think we've ever been this careful before, but I also see no reason not to be.

@lesteve

Member

lesteve commented Sep 27, 2016

Some user code may have depended on these initialisations, and we need to be a little thoughtful to that case.

Out of curiosity, do you have a use case in mind or is it just a theoretical possibility for now?

One option is to merely document this change clearly in what's new.

That's the option that would have my vote. The descriptor option seems a bit like overkill to me, especially since there is no precise use case we can think of where the attribute would be accessed or modified prior to fit.

@jnothman

Member

jnothman commented Sep 27, 2016

Just a possibility, e.g. someone checking if an estimator is fitted by looking for an attribute valued None.
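
For instance, hypothetical user code relying on the old behaviour might look like the toy example below (the class stands in for an estimator that used to set coef_ = None in __init__):

class OldStyleEstimator(object):
    def __init__(self):
        self.coef_ = None            # the pattern this PR removes

    def fit(self, X, y):
        self.coef_ = [0.0] * len(X[0])
        return self


est = OldStyleEstimator()
if est.coef_ is None:                # "is it fitted?" check that would break
    est.fit([[1.0, 2.0]], [0])       # once coef_ is no longer set in __init__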

@jnothman

Member

jnothman commented Sep 27, 2016

That approval was accidental, sorry.

@lesteve

Member

lesteve commented Sep 27, 2016

Just a possibility, e.g. someone checking if an estimator is fitted by looking for an attribute valued None.

Although certainly possible, it doesn't seem that likely, and there is an easy fix in case people do rely on it. I am just not entirely convinced this edge case is worth the descriptor solution. For completeness though, what would the descriptor solution look like? Something like this?

def __init__(...):
    ...
    self._param_ = None

@property
def param_(self):
    try:
        check_is_fitted(self, 'blessed_param_')
    except NotFittedError:
        # deprecation warning saying do not access 'param_' before fit.
        # That will raise an error in 0.21.
        warnings.warn(...)
    return self._param_

@param_.setter
def param_(self, value):
    try:
        check_is_fitted(self, 'blessed_param_')
    except NotFittedError:
        # deprecation warning
        warnings.warn(...)
    self._param_ = value

The setter would need some tweaking since we are setting the attribute prior to the fit sometimes in our code (maybe we can do the setting through the private ._param_ in this particular case).

@lesteve

Member

lesteve commented Sep 27, 2016

Also thinking about it, we would need to exclude these special parameters from the tests ...

@jnothman

Member

jnothman commented Sep 27, 2016

how about something like

class DeprecatedInitializer(object):
    def __init__(self, name, default=None):
        self.name = name
        self.default = default

    def __get__(self, obj, objtype=None):
        if not hasattr(obj, '_val_' + self.name):
            warnings.warn(...)
            return self.default
        return getattr(obj, '_val_' + self.name)

    def __set__(self, obj, val):
        setattr(obj, '_val_' + self.name, val)


class MyEstimator(object):
    attr_ = DeprecatedInitializer('attr_', default=None)

?

Still not sure it's the best idea, but just so you can see what it might look like. Why does it need excluding from tests?

@amueller amueller added this to the 0.19 milestone Sep 29, 2016

@amueller

Member

amueller commented Sep 29, 2016

Looks good apart from minor comments. I think the logic in SGDClassifier is too complicated right now.

             self._allocate_parameter_mem(n_classes, n_features,
                                          coef_init, intercept_init)
         elif n_features != self.coef_.shape[-1]:
             raise ValueError("Number of features %d does not match previous "
                              "data %d." % (n_features, self.coef_.shape[-1]))
         self.loss_function = self._get_loss_function(loss)
-        if self.t_ is None:
+        if getattr(self, "t_", None) is None:

@amueller

amueller Sep 29, 2016

Member

When would self.t_ be None? Shouldn't this just be hasattr?

@@ -383,7 +373,7 @@ def _partial_fit(self, X, y, alpha, C,
     def _fit(self, X, y, alpha, C, loss, learning_rate, coef_init=None,
              intercept_init=None, sample_weight=None):
-        if hasattr(self, "classes_"):
+        if getattr(self, "classes_", None) is not None:

@amueller

amueller Sep 29, 2016

Member

I don't understand this. We're trying to reset, right? Why not

if hasattr(self, "classes_"):
    del self.classes_
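
A toy illustration of that reset idiom (outside scikit-learn, assuming classes_ is only ever set by fit):

class Toy(object):
    def fit(self, y):
        self.classes_ = sorted(set(y))
        return self


est = Toy().fit([0, 1, 1])
if hasattr(est, "classes_"):
    del est.classes_                 # drop the fitted state before re-fitting
print(hasattr(est, "classes_"))      # False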

@amueller

amueller Oct 5, 2016

Member

did you reply to this and github swallowed it?

@lesteve

lesteve Oct 5, 2016

Member

Nope haven't replied to this one (yet).

@@ -393,7 +383,7 @@ def _fit(self, X, y, alpha, C, loss, learning_rate, coef_init=None,
         # np.unique sorts in asc order; largest class id is positive class
         classes = np.unique(y)
-        if self.warm_start and self.coef_ is not None:
+        if self.warm_start and getattr(self, "coef_", None) is not None:

@amueller

amueller Sep 29, 2016

Member

why would self.coef_ be None?

@@ -482,8 +482,8 @@ def _validate_for_predict(self, X):
     @property
     def coef_(self):
         if self.kernel != 'linear':
-            raise ValueError('coef_ is only available when using a '
-                             'linear kernel')
+            raise AttributeError('coef_ is only available when using a '

@amueller

amueller Sep 29, 2016

Member

I think the change is fine as-is.

@@ -120,6 +121,13 @@ def test_class_weight_balanced_linear_classifiers():
                 name), name, Classifier
+def test_no_fit_attributes_set_in_init():

@amueller

amueller Sep 29, 2016

Member

I don't think this is the right way. You should add it to the standard tests.

@lesteve

lesteve Oct 3, 2016

Member

Will do, this is how it was done in the original PR.

@@ -1347,6 +1348,21 @@ def check_estimators_overwrite_params(name, Estimator):
                          % (name, param_name, original_value, new_value))
+def check_no_fit_attributes_set_in_init(name, Estimator):

@amueller

amueller Sep 29, 2016

Member

add this to _yield_all_checks
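
For context, a rough sketch of what such a common check might do (simplified; not the exact code that was merged into estimator_checks.py):

def check_no_fit_attributes_set_in_init(name, Estimator):
    """Instantiating an estimator must not create fitted (trailing-_) attributes."""
    estimator = Estimator()
    for attr in dir(estimator):
        if attr.endswith('_') and not attr.startswith('__'):
            # hasattr is False when a property getter raises AttributeError.
            assert not hasattr(estimator, attr), (
                "Estimator %s should not set %r in __init__" % (name, attr))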

@ogrisel

ogrisel approved these changes Oct 6, 2016

Besides the following two comments, this LGTM.

@@ -692,7 +690,7 @@ def fit(self, X, y, check_input=True):
         if self.selection not in ['cyclic', 'random']:
             raise ValueError("selection should be either random or cyclic.")
-        if not self.warm_start or self.coef_ is None:
+        if not self.warm_start or getattr(self, "coef_", None) is None:

@ogrisel

ogrisel Oct 6, 2016

Member

Why not just hasattr(self, 'coef_') instead?

@lesteve

lesteve Nov 5, 2016

Member

There is a failure in sklearn/decomposition/tests/test_dict_learning.py if I do that, which I haven't fully understood.

@jnothman

jnothman Nov 8, 2016

Member

https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/decomposition/dict_learning.py#L125 is the problem. It sets coef_ directly before fitting. You can either fix that to only set coef_ if init is not None, or continue with the approach used here.
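
A hedged sketch of the first option, i.e. only pre-seeding the fitted attribute when a warm-start value is actually supplied (Lasso and the init variable here are stand-ins, not the exact dict_learning patch):

import numpy as np
from sklearn.linear_model import Lasso

X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]])
y = np.array([1.0, 0.0, 1.0, 0.0])
init = None                          # optional warm-start coefficients

clf = Lasso(alpha=0.1, warm_start=True)
if init is not None:
    clf.coef_ = init                 # only set coef_ when there is a value
clf.fit(X, y)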

@jnothman

jnothman Nov 8, 2016

Member

I don't think in general we should allow coef_ = None to be set before a warm_start fit; better that it explicitly be set to zeroes. But there may be user code that does so, copying the dict learning code.

@lesteve

lesteve Nov 17, 2016

Member

You can either fix that to only set coef_ if init is not None, or continue with the approach used here.

I chose the former, i.e. set coef_ only if init is not None

@@ -1029,7 +1019,7 @@ def _fit_regressor(self, X, y, alpha, C, loss, learning_rate,
         penalty_type = self._get_penalty_type(self.penalty)
         learning_rate_type = self._get_learning_rate_type(learning_rate)
-        if self.t_ is None:
+        if getattr(self, "t_", None) is None:

@ogrisel

ogrisel Oct 6, 2016

Member

Why not use not hasattr(self, 't_') instead? If this is to support performing a warm-started fit on an estimator unpickled after being trained with a past version of scikit-learn, we should add a comment to make it explicit.

@ogrisel

ogrisel Oct 6, 2016

Member

I don't think we should worry about that though.

@jnothman

Member

jnothman commented Nov 1, 2016

It looks like this is close to merge, @lesteve. Let's not leave it lying around too long :)

@lesteve

Member

lesteve commented Nov 2, 2016

Yeah sorry, I'll do my best to get it merged in the next few days.

@lesteve

Member

lesteve commented Nov 15, 2016

I think I tackled all the comments in this one. @amueller, @jnothman and @ogrisel, maybe you can have a look?

@jnothman

Otherwise LGTM. Please write a what's new entry.

# as the property getter raises an AttributeError
assert_false(
hasattr(estimator, attr),
"Fitted attributes ending with '_' should not be initialized "

@jnothman

jnothman Nov 15, 2016

Member

This is not accurate for attributes starting with _.

@lesteve

lesteve Nov 17, 2016

Member

I slightly tweaked the error message and the check. We exclude special methods like __bla__ from the checks, and I am not sure this is worth mentioning in the error message.

@@ -88,8 +88,7 @@ def __init__(self, base_estimator, n_estimators=10,
         # Don't instantiate estimators now! Parameters of base_estimator might
         # still change. Eg., when grid-searching with the nested object syntax.
         # This needs to be filled by the derived classes.
-        self.estimators_ = []

@jnothman

jnothman Nov 15, 2016

Member

I wonder if we need to deprecate this, since it is mutable and some ensembles support a warm_start that could be interpreted as pre-populating this list. However, doing the deprecation right could be messy. I vote for documenting this change in particular in what's new instead.

@lesteve

lesteve Nov 17, 2016

Member

I have added an entry in whats_new, let me know if that doesn't meet your expectations, or even edit the file directly in this PR.

@amueller

Looks good apart from minor comments :)

in the constructor but only in the ``fit`` method. Most notably
if you implemented your own estimator deriving from
:class:`ensemble.BaseEnsemble` you may be affected by this if you
were using `self.estimators_` in your constructor.

@amueller

amueller Nov 17, 2016

Member

maybe "are using" ? also double backticks, right?

@@ -355,15 +345,15 @@ def _partial_fit(self, X, y, alpha, C,
                                                          self.classes_, y)
         sample_weight = self._validate_sample_weight(sample_weight, n_samples)
-        if self.coef_ is None or coef_init is not None:
+        if getattr(self, "coef_", None) is None or coef_init is not None:

@amueller

amueller Nov 17, 2016

Member

Can it still happen that coef_ is None? Or is it always "not present"? Then you could use hasattr... I find attributes with trailing underscores that are None somewhat odd.

@lesteve

lesteve Nov 17, 2016

Member

I tried to replace this with hasattr, but there are multiple places in the scikit-learn code that set coef_ to None. I can try again to see how hard it is.

@jnothman

jnothman Nov 17, 2016

Member

This is sometimes how we have internally initialised warm_start models.

@lesteve

lesteve Dec 9, 2016

Member

I can try again to see how hard it is.

I have to say I did not manage to find time to look at that more closely. Would it be worth merging this PR anyway?

@jnothman

jnothman Dec 10, 2016

Member

Yes, I'm happy to merge it. It's got 3 +1s

@amueller amueller changed the title from [MRG] Revive "don't set trailing _ attrs in __init__" PR to [MRG + 1] Revive "don't set trailing _ attrs in __init__" PR Nov 17, 2016

in the constructor but only in the ``fit`` method. Most notably
if you implemented your own estimator deriving from
:class:`ensemble.BaseEnsemble` you may be affected by this if you
are using ``self.estimators_`` in your constructor.

@jnothman

jnothman Nov 17, 2016

Member

or even in fit, or in arbitrary user code. This can also be true of someone selecting trees from a forest and putting them together into a new ensemble by extending a clean forest...

@jnothman

jnothman Dec 10, 2016

Member

I wouldn't mind if this were a bit more precise before merge, though

@lesteve

lesteve Dec 12, 2016

Member

Feel free to edit this the way you want!

larsmans and others added some commits Dec 4, 2014

MAINT/TST: don't set trailing "_" attrs in __init__
NOTE: code uses getattr(self, "someattr_", None) is None instead
of hasattr, in an attempt to be backwards compatible with pickled
estimators that have these attributes.
Require that hasattr(estimator, something_) is always False before fit
In particular for properties ending with an underscore, that means that
their getter must raise an AttributeError.

Use check_is_fitted where appropriate.
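
A brief sketch of the distinction that NOTE is making (toy object, not scikit-learn code):

class Toy(object):
    pass


old = Toy()
old.coef_ = None                     # an old pickle may carry coef_ = None

new = Toy()                          # attribute simply absent on new objects

# hasattr distinguishes the two; the getattr idiom treats both as "not fitted".
print(hasattr(old, 'coef_'))                    # True
print(getattr(old, 'coef_', None) is None)      # True
print(getattr(new, 'coef_', None) is None)      # True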

@jnothman jnothman merged commit e542efa into scikit-learn:master Dec 12, 2016

0 of 2 checks passed

continuous-integration/appveyor/pr Waiting for AppVeyor build to complete
continuous-integration/travis-ci/pr The Travis CI build is in progress
@jnothman

Member

jnothman commented Dec 12, 2016

Thanks, @lesteve!

@lesteve lesteve deleted the lesteve:larsmans-test-fit-attributes-in-init branch Dec 12, 2016

@lesteve

Member

lesteve commented Dec 12, 2016

Great, I am glad I got this one merged. Thanks a lot for the reviews @jnothman, @amueller and @ogrisel!

YHYbetterlife added a commit to YHYbetterlife/scikit-learn that referenced this pull request Dec 12, 2016

'estimated from data in scikit-learn. Consequently they '
'should not be initialized in the constructor of an '
'estimator but in the fit method. Attribute {0!r} '
'was found in estimator {1}'.format(estimator, attr))

@amueller

amueller Dec 12, 2016

Member

these are the wrong way around ;) Wanna fix it real quick @lesteve ?

@lesteve

lesteve Dec 13, 2016

Member

Oops, sorry about that. I'll fix that today.

@lesteve

lesteve Dec 13, 2016

Member

Fixed in 5007e02. I added a test while I was at it.

@amueller

amueller Dec 13, 2016

Member

sweet, no worries :)

sergeyf added a commit to sergeyf/scikit-learn that referenced this pull request Feb 28, 2017

Sundrique added a commit to Sundrique/scikit-learn that referenced this pull request Jun 14, 2017

NelleV added a commit to NelleV/scikit-learn that referenced this pull request Aug 11, 2017

paulha added a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017

maskani-moh added a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017
