Support pickling slots in subclasses of common classes #70766
Comments
Pickling and copying an instance of a subclass of certain basic classes also pickles and copies its instance attributes. Example:
>>> import copy, pickle
>>> class BA(bytearray):
...     pass
...
>>> b = BA(b'abc')
>>> b.x = 10
>>> c = copy.copy(b)
>>> c.x
10
>>> c = pickle.loads(pickle.dumps(b))
>>> c.x
10
But this doesn't work if the attributes are stored in slots rather than in the instance dictionary.
>>> class BA(bytearray):
...     __slots__ = ('x',)
...
>>> b = BA(b'abc')
>>> b.x = 10
>>> c = copy.copy(b)
>>> c.x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: x
>>> c = pickle.loads(pickle.dumps(b))
>>> c.x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: x
Since using __slots__ is an implementation detail, this failure can be considered a bug.
The proposed patch adds support for pickling and copying slots in subclasses of all classes that already support pickling and copying non-slot attributes. It is backward compatible: classes with slots can be unpickled on older Python versions without slots. Affected classes: bytearray, set, frozenset, weakref.WeakSet, collections.OrderedDict, collections.deque, datetime.tzinfo.
The patch adds the copyreg._getstate() function for Python classes and exposes the _PyObject_GetState() function for extension classes.
An alternative (and simpler for the end user) solution would be to add a default implementation as object.__getstate__(). But in that case it is not easy to reject non-pickleable classes (bpo-22995), since __getstate__ is looked up as an instance attribute, unlike other special methods.
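For interpreters that lack this change, the failure described above can be worked around by hand. The following is only a sketch, not the proposed patch: it assumes that overriding __reduce_ex__ is acceptable for the subclass, and it relies on the two-item (dict_state, slot_state) state tuple that pickle and copy already know how to restore when no __setstate__ is defined. The class name SlottedBA is hypothetical.

```python
import copy
import pickle


class SlottedBA(bytearray):
    """Hypothetical bytearray subclass that stores its attribute in a slot."""

    __slots__ = ('x',)

    def __reduce_ex__(self, protocol):
        # Collect only the slots that currently hold a value.
        # Assumption: only the slots declared on this class matter; slots
        # declared on intermediate base classes are not collected here.
        slot_state = {
            name: getattr(self, name)
            for name in self.__slots__
            if hasattr(self, name)
        }
        # State is (dict_state, slot_state); there is no instance __dict__,
        # so the first item is None.
        return (type(self), (bytes(self),), (None, slot_state))


if __name__ == "__main__":
    b = SlottedBA(b'abc')
    b.x = 10
    print(copy.copy(b).x)
    print(pickle.loads(pickle.dumps(b)).x)
```

With the override in place, both copy.copy() and a pickle round trip should carry the slot value along with the byte content.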
Synchronized with current sources.
An alternative way is to expose a default state as object.__getstate__(). It is more efficient since it is implemented in C. The following patch implements this approach.
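To illustrate what a default object.__getstate__() hands back, here is a small sketch. It assumes the behaviour documented for Python 3.11 and later in the pickle documentation; the three class names are hypothetical, and on older interpreters the guard simply skips the calls.

```python
class Empty:
    """No instance attributes at all."""


class WithDict:
    """Attributes live in the instance __dict__."""

    def __init__(self):
        self.x = 10


class SlotsOnly:
    """Attributes live in slots; there is no instance __dict__."""

    __slots__ = ('x',)

    def __init__(self):
        self.x = 10


# object.__getstate__ exists only on Python 3.11 and later.
HAS_DEFAULT_GETSTATE = hasattr(object, "__getstate__")

if HAS_DEFAULT_GETSTATE:
    print(Empty().__getstate__())      # None: nothing to save
    print(WithDict().__getstate__())   # the instance dict: {'x': 10}
    print(SlotsOnly().__getstate__())  # a tuple: (None, {'x': 10})
else:
    print("no default __getstate__ on this interpreter")
```

The three shapes (None, a dict, or a two-item tuple) are what callers of __getstate__ need to be prepared for.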
This is needed after the changes in python/cpython#70766. Since object now has a __getstate__ method, we need to make sure not to call it with any arguments. The __getstate__ methods can probably be simplified (or removed?) in light of the CPython changes, but for now this is a quick fix to restore the previous behaviour.
Since Python 3.11, objects have a __getstate__ method by default: python/cpython#70766. Therefore, the exception in BaseEstimator.__getstate__ is no longer raised, and the code no longer falls back on the object's __dict__: https://github.com/scikit-learn/scikit-learn/blob/dc580a8ef5ee2a8aea80498388690e2213118efd/sklearn/base.py#L274-L280 If the instance dict of the object is empty, however, the return value is None, so the line below that calls state.items() raises an error. This bugfix checks whether the state is None and, if so, uses the object's __dict__ (which should always be empty). Not addressed in this PR is how to deal with slots (see also the discussion in scikit-learn#10079). When there are __slots__, __getstate__ actually returns a tuple, as documented here: https://docs.python.org/3/library/pickle.html#object.__getstate__ The user would thus still get a nondescriptive error message.
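The None check described above, together with the tuple case that the fix leaves open, could be handled by a helper along these lines. This is a sketch under stated assumptions, not the scikit-learn code: the function name state_as_dict is hypothetical, and it assumes that any two-item tuple returned by __getstate__ is the default (dict_state, slot_state) pair rather than a custom state format.

```python
def state_as_dict(obj):
    """Return the state of obj as a plain dict (hypothetical helper).

    Handles the three shapes a default __getstate__ may return:
    None, a dict, or a (dict_state, slot_state) tuple.
    """
    getstate = getattr(obj, "__getstate__", None)
    if getstate is None:
        # Python < 3.11 and no custom __getstate__: fall back to __dict__.
        return dict(getattr(obj, "__dict__", {}))

    state = getstate()
    if state is None:
        # Empty instance dict on Python >= 3.11.
        return dict(getattr(obj, "__dict__", {}))
    if isinstance(state, tuple) and len(state) == 2:
        # Assumption: this is the default (dict_state, slot_state) pair.
        dict_state, slot_state = state
        merged = dict(dict_state or {})
        merged.update(slot_state or {})
        return merged
    return dict(state)


class Plain:
    pass


class Slotted:
    __slots__ = ('x',)


if __name__ == "__main__":
    p = Plain()
    print(state_as_dict(p))
    p.a = 1
    print(state_as_dict(p))
```

Because the result is always a dict, code that calls state.items() on it no longer depends on which shape the default implementation produced.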
This caused It'd be nice if the 3.11 What's New documentation covered this API change. https://docs.python.org/3/whatsnew/3.11.html#other-language-changes only extols the change's virtues and does not mention the consequences of the behavior change or the workarounds at the Python and C API levels that people may need to adopt to retain compatibility.
I don't know what can be suggested as a general workaround. Each problem requires an individual solution, depending on why that code was broken by that change.
This also creates a bug in pyyaml and ruamel.yaml. It seems that for an empty class, the new __getstate__ method returns None. This causes the serialization to crash. @serhiy-storchaka: Is there a good reason why __getstate__ returns None for an empty class instead of an empty dictionary? Minimal code:
prints: |
Because Add |
Accounting for this in PyYAML shouldn't be a big deal- IIUC |
pythonGH-108379) (cherry picked from commit b6be188) Co-authored-by: Gregory P. Smith <greg@krypto.org>