Pickling of methodcaller, attrgetter, and itemgetter #67144
Comments
methodcaller and attrgetter objects seem to be picklable, but in fact the pickling is erroneous:
>>> import operator, pickle
>>> pickle.loads(pickle.dumps(operator.methodcaller("foo")))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: methodcaller needs at least one argument, the method name
>>> pickle.loads(pickle.dumps(operator.attrgetter("foo")))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: attrgetter expected 1 arguments, got 0
When looking at the pickle disassembly, it seems that the argument to the constructor is indeed not pickled.
>>> import pickletools; pickletools.dis(pickle.dumps(operator.methodcaller("foo")))
0: \x80 PROTO 3
2: c GLOBAL 'operator methodcaller'
25: q BINPUT 0
27: ) EMPTY_TUPLE
28: \x81 NEWOBJ
29: q BINPUT 1
31: . STOP
highest protocol among opcodes = 2 |
I think this issue needs different solutions for 3.5 and for maintained releases. We can implement the pickling of methodcaller, attrgetter and itemgetter in 3.5 (I agree this is a good idea). In maintained releases, it would be good if pickling these types raised an exception. |
Note that pickling of the pure Python version of methodcaller works as expected:
Python 3.4.2 (default, Nov 20 2014, 12:40:10)
[GCC 4.8.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.modules['_operator'] = None
>>> import operator
>>> import pickle
>>> pickle.loads(pickle.dumps(operator.methodcaller('foo')))
<operator.methodcaller object at 0x7ff869945898>
The pure Python attrgetter and itemgetter don't work due to using functions defined in __init__(). 2.7 already raises TypeError on attempts to pickle any of the three. |
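The difference between the two designs can be sketched as follows. This is only an illustration, not the actual operator.py code: `ClosureGetter`, `ReducibleGetter` and `can_pickle` are hypothetical names. The first class keeps its logic in a function defined inside `__init__()`, which pickle cannot look up by name; the second stores its argument and defines `__reduce__`.

```python
import pickle


class ClosureGetter:
    """Hypothetical stand-in for a getter built around a local function."""

    def __init__(self, attr):
        # The lookup logic lives in a function defined inside __init__,
        # so pickle cannot find it by its qualified name.
        def func(obj):
            return getattr(obj, attr)
        self._call = func

    def __call__(self, obj):
        return self._call(obj)


class ReducibleGetter:
    """Hypothetical variant that stores its argument and defines __reduce__."""

    def __init__(self, attr):
        self._attr = attr

    def __call__(self, obj):
        return getattr(obj, self._attr)

    def __reduce__(self):
        # Ask pickle to rebuild the object by calling the class again
        # with the original constructor argument.
        return self.__class__, (self._attr,)


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds (hypothetical helper)."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


closure_ok = can_pickle(ClosureGetter("real"))
restored = pickle.loads(pickle.dumps(ReducibleGetter("real")))
```

The exact exception raised for the local function varies between Python versions, which is why the helper catches several types.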
+1 for adding pickling support to Python 3.5. I don't see much of a need for any revision to 3.4. |
I've made a patch that I believe should cover all three cases, including tests. In addition to the pickling behavior, I've made two other changes:
Anyone care to review? |
Don't bother reviewing just yet. There is an issue with attrgetter's pickling (which the unit tests caught), and I need to update the pure Python modules to match. |
Okay, this one passes the tests for the built-in module. I'm not sure what's going wrong with the pure Python module. I'm getting the error:
once for each of the three objects. Anyone recognize this? Is this some weird artifact of the multiple imports required to test both pure Python and C versions of the module that I need to work around, or did I make a mistake somewhere else? |
Ah, solved it (I think). The bootstrapper used to import the Python and C versions of the module leaves sys.modules unpopulated (might pickle itself populate it when it finds no module of that name?). I added a setUp method to the unittest class for operator that explicitly sets sys.modules['operator'] to whichever version is being tested at the time, so pickle's lookup works as expected. Is that the right solution? New patch uploaded with that change. |
I'd prefer to just reimplement itemgetter and attrgetter to make them picklable rather than adding pickling methods to them; see attached patch. I also posted a few comments, but I just went ahead and addressed them myself in this patch. I'm not qualified to give the _operator.c changes a proper review, but they look good enough to me if others agree that __reduce__ is the best approach in C. |
operator.methodcaller is similar to functools.partial, which is picklable and can be used as a model. In the C implementation, some code can be shared between the __repr__ and __reduce__ methods. As for tests, different protocols should be tested. Compatibility between the C and Python implementations should also be tested: instances pickled with one implementation should unpickle correctly with the other. Move the pickle tests into a new test class. If __repr__ methods are added, they need tests. The restriction on the type of the method name should be tested too. |
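The per-protocol part of that suggestion could look roughly like the sketch below. It assumes Python 3.5 or later, where these objects are picklable, and `round_trip_all_protocols` is a hypothetical helper rather than anything from the actual test suite. It exercises only whichever operator implementation is currently imported.

```python
import operator
import pickle


def round_trip_all_protocols(obj):
    """Pickle and unpickle obj once per supported protocol (hypothetical helper)."""
    return [
        pickle.loads(pickle.dumps(obj, proto))
        for proto in range(pickle.HIGHEST_PROTOCOL + 1)
    ]


# One object of each kind, including a methodcaller with a keyword argument.
caller_copies = round_trip_all_protocols(operator.methodcaller("split", ",", maxsplit=1))
attr_copies = round_trip_all_protocols(operator.attrgetter("real", "imag"))
item_copies = round_trip_all_protocols(operator.itemgetter(0, 2))

# Each restored copy should behave like the original.
caller_results = [copy("a,b,c") for copy in caller_copies]
attr_results = [copy(3 + 4j) for copy in attr_copies]
item_results = [copy("abc") for copy in item_copies]
```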
That isn't the usual approach. The pickling methods are there for a reason. I prefer to leave the existing code in a stable state and avoid unnecessary code churn or risk introducing bugs into code that is working correctly and as designed. |
Please remember that a potential new pickling feature is the least important part of the design of methodcaller, itemgetter, and attrgetter. Pickle support should be driven by the design rather than become a predominant consideration. One other note: the OP's original concern has very little to do with these particular objects. Instead, it is the pickling and unpickling tools themselves that tend to have crummy error messages when presented with objects that weren't specially designed with pickle support. |
See bpo-22995 about this. |
Serhiy: functools.partial is a somewhat less than ideal comparison. The pure-Python version is not picklable, the Python and C versions return different things (the Python version is a function returning a function, the C version is a regular class and returns an instance). Also, both versions make their necessary attributes public anyway, unlike methodcaller. Raymond: Not necessarily the usual approach, no. However, I think my reimplementations of the pure-Python itemgetter and attrgetter have a few benefits, namely:
Note that I have no intention of reimplementing the C versions: those are much more mature than the Python versions, and would likely require pickling methods anyway. All that said, I'm not going to fight about it; if I'm overruled, I'm overruled. Josh: Serhiy's points about needing more tests stand; would you like to add them? You can use your patch or mine as a base, depending on how you feel about reimplementing the pure-Python (item|attr)getter. If you use yours, please remember to look through my comments on it. |
It looks like the Python version of functools.partial() needs a fix. Reimplementing the pure-Python itemgetter and attrgetter as automatically picklable Python classes has a disadvantage: it makes the pickling format incompatible between the Python and C versions. This means that an itemgetter pickled in CPython could not be unpickled on a Python implementation that doesn't use the C accelerator, and vice versa. |
Serhiy Storchaka added the comment:
That's a very good point that I hadn't thought about. Consider my |
Here is a revised version of Josh's patch. I added tests for consistency between both implementations and fixed inconsistencies and bugs. I am still unsure about the pickling format of methodcaller. First, there is an asymmetry between positional and keyword arguments. Second, methodcaller cannot currently be subclassed, but if that changes in the future (as with functools.partial), it would be harder to extend the pickling format to support instance attributes. |
A methodcaller with keyword arguments pickled with pickle_getter_and_caller3.patch needs Python 3.5 to unpickle. The following patch pickles it in a backward-compatible way. |
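To see how a methodcaller with keyword arguments describes itself to pickle, one can call `__reduce__()` directly and rebuild the object by hand, which is essentially what unpickling does. This is a sketch assuming Python 3.5 or later; the exact callable returned is an implementation detail and is deliberately not asserted here.

```python
import operator

caller = operator.methodcaller("split", ",", maxsplit=1)

# __reduce__ returns a callable plus the positional arguments to pass to it.
# The keyword arguments have to travel inside the callable itself, which is
# the positional/keyword asymmetry mentioned in the discussion.
rebuild, args = caller.__reduce__()[:2]

# Unpickling performs the equivalent of this call.
clone = rebuild(*args)
result = clone("a,b,c")
```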
New changeset 435bc22f39e3 by Serhiy Storchaka in branch 'default': |
New changeset c93e5ba1cc20 by Serhiy Storchaka in branch 'default': |
New changeset 2688655e431a by Serhiy Storchaka in branch 'default': |
This is still an issue with operator.attrgetter in 3.4.3, even after clearing sys.modules['_operator']:
$ python3
Python 3.4.3 (default, Oct 14 2015, 20:28:29)
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.modules['_operator'] = None
>>> import operator
>>> import pickle
>>> pickle.loads(pickle.dumps(operator.attrgetter("foo")))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
_pickle.PicklingError: Can't pickle <function attrgetter.__init__.<locals>.func at 0x7f25728d5bf8>: attribute lookup func on operator failed |
This is a new feature in 3.5. |
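On Python 3.5 or later, a quick check along the lines of the original report can confirm that the constructor argument now ends up in the pickle. This is a sketch under that version assumption; it inspects the opcode stream with `pickletools.genops` instead of printing the full disassembly.

```python
import operator
import pickle
import pickletools

data = pickle.dumps(operator.methodcaller("foo"))

# Collect every string argument that appears in the opcode stream.
string_args = [
    arg for opcode, arg, pos in pickletools.genops(data) if isinstance(arg, str)
]

# The method name should now be present, unlike in the 3.4 disassembly.
name_is_pickled = "foo" in string_args

# A round trip should give back a working callable.
upper = pickle.loads(pickle.dumps(operator.methodcaller("upper")))
upper_result = upper("abc")
```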
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.