Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Breakage in User Code #8316

Closed
HolgerPeters opened this issue Feb 8, 2017 · 3 comments
Closed

Breakage in User Code #8316

HolgerPeters opened this issue Feb 8, 2017 · 3 comments
Labels
Bug

Comments

@HolgerPeters
Copy link
Contributor

@HolgerPeters HolgerPeters commented Feb 8, 2017

With this ticket, I would like to report some breakage in code involving scikit learn.

In commit b4872fe you introduce __getstate__ for the base estimator, in order to annotate it with a scikit-learn version. Problem is, that I have estimator classes that inherit from the base estimator, and also overwrite the __getstate__ method in which some temporary data is excluded from the data being pickled. The definition of the alternate __getstate__ is done via a mixin class, from which my estimator inherits. This estimator also inherits from scikit learns BaseEstimator. Now with b4872fe BaseEstimator defines __getstate__ which overwrites the logic in my variant of __getstate__.

So as a rough sketch, the situation is as follows:

class MyMixin(object):
    # ...
    def __getstate__(self):
        values_for_serialization = self.__dict__.copy()
        values_for_serialization['expert_'] = None
        return values_for_serialization

class MyEstimator(BaseEstimator, MyMixin):
    # ...

Of course I can fix this in my codebase (and as such, I do not request an upstream fix), but I wanted to communicate what kind of subtleties overwriting standard methods can bring.

@jnothman
Copy link
Member

@jnothman jnothman commented Feb 8, 2017

We tried to limit its effect with the
if type(self).__module__.startswith('sklearn.'):
but I think we should have also used return super(...).__getstate__() instead of return dict(self.__dict__.items()) to proceed up the MRO in the case of multiple inheritance. Would you be able to confirm that this fixes your case? Would you like to submit a pull request?

@jnothman jnothman added the Bug label Feb 8, 2017
@jnothman
Copy link
Member

@jnothman jnothman commented Feb 8, 2017

And you may not request an upstream fix, but I consider this a bug. I think we are not often enough aware of multiple inheritance/MRO issues, not just involved in overwriting standard methods.

@HolgerPeters
Copy link
Contributor Author

@HolgerPeters HolgerPeters commented Feb 8, 2017

I will prepare a pull request. Glad to help.

@jnothman jnothman closed this Feb 20, 2017
jnothman added a commit that referenced this issue Feb 20, 2017
…8324)

Fixes #8316


* Don't use test classes to group tests

* only use formatting for parts of the string that change

* Flake 8 column limit

* Make the modification of the estimator more explicit in the tests

* As suggested in code review, prefer formatting over two literals

* Also assert, that __setstate__ overwriting works in mixin

* Remove cache property

* Use assertion functions from sklearn.utils.testing

* remove the protocol argument in tests

* Rename attributes to better convey their purpose

* Revert change of module in TreeNoVersion

* Adhere to column-limit

* changelog entry

* Fix commit message
sergeyf added a commit to sergeyf/scikit-learn that referenced this issue Feb 28, 2017
…cikit-learn#8324)

Fixes scikit-learn#8316


* Don't use test classes to group tests

* only use formatting for parts of the string that change

* Flake 8 column limit

* Make the modification of the estimator more explicit in the tests

* As suggested in code review, prefer formatting over two literals

* Also assert, that __setstate__ overwriting works in mixin

* Remove cache property

* Use assertion functions from sklearn.utils.testing

* remove the protocol argument in tests

* Rename attributes to better convey their purpose

* Revert change of module in TreeNoVersion

* Adhere to column-limit

* changelog entry

* Fix commit message
Sundrique added a commit to Sundrique/scikit-learn that referenced this issue Jun 14, 2017
…cikit-learn#8324)

Fixes scikit-learn#8316


* Don't use test classes to group tests

* only use formatting for parts of the string that change

* Flake 8 column limit

* Make the modification of the estimator more explicit in the tests

* As suggested in code review, prefer formatting over two literals

* Also assert, that __setstate__ overwriting works in mixin

* Remove cache property

* Use assertion functions from sklearn.utils.testing

* remove the protocol argument in tests

* Rename attributes to better convey their purpose

* Revert change of module in TreeNoVersion

* Adhere to column-limit

* changelog entry

* Fix commit message
NelleV added a commit to NelleV/scikit-learn that referenced this issue Aug 11, 2017
…cikit-learn#8324)

Fixes scikit-learn#8316


* Don't use test classes to group tests

* only use formatting for parts of the string that change

* Flake 8 column limit

* Make the modification of the estimator more explicit in the tests

* As suggested in code review, prefer formatting over two literals

* Also assert, that __setstate__ overwriting works in mixin

* Remove cache property

* Use assertion functions from sklearn.utils.testing

* remove the protocol argument in tests

* Rename attributes to better convey their purpose

* Revert change of module in TreeNoVersion

* Adhere to column-limit

* changelog entry

* Fix commit message
paulha added a commit to paulha/scikit-learn that referenced this issue Aug 19, 2017
…cikit-learn#8324)

Fixes scikit-learn#8316


* Don't use test classes to group tests

* only use formatting for parts of the string that change

* Flake 8 column limit

* Make the modification of the estimator more explicit in the tests

* As suggested in code review, prefer formatting over two literals

* Also assert, that __setstate__ overwriting works in mixin

* Remove cache property

* Use assertion functions from sklearn.utils.testing

* remove the protocol argument in tests

* Rename attributes to better convey their purpose

* Revert change of module in TreeNoVersion

* Adhere to column-limit

* changelog entry

* Fix commit message
maskani-moh added a commit to maskani-moh/scikit-learn that referenced this issue Nov 15, 2017
…cikit-learn#8324)

Fixes scikit-learn#8316


* Don't use test classes to group tests

* only use formatting for parts of the string that change

* Flake 8 column limit

* Make the modification of the estimator more explicit in the tests

* As suggested in code review, prefer formatting over two literals

* Also assert, that __setstate__ overwriting works in mixin

* Remove cache property

* Use assertion functions from sklearn.utils.testing

* remove the protocol argument in tests

* Rename attributes to better convey their purpose

* Revert change of module in TreeNoVersion

* Adhere to column-limit

* changelog entry

* Fix commit message
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

2 participants
You can’t perform that action at this time.