Add pure Python operator module #60898
Comments
(Brett, I've made you nosy due to the relation to bpo-16651.) Here is a pure Python implementation of the operator module, or at least a first draft thereof :). I'm attaching the module itself, as well as a patch to integrate it. Any and all review is quite welcome. I'm confident in the fact that the module as it stands passes all current tests, but how it gets there is entirely up for debate (namely, the attrgetter, itemgetter, and methodcaller classes, as well as length_hint(), countOf(), and indexOf()). Note that there's also a change to hmac.py; _compare_digest() in operator.c doesn't seem to have any relation to the rest of the module (see bpo-15061 discussion) and is private anyway, so operator.py doesn't go near it. hmac.py has to import directly from _operator. Thanks, Zach Ware |
Here is a functional (and more effective) equivalent of attrgetter:

    def attrgetter(attr, *attrs):
        """
        Return a callable object that fetches the given attribute(s) from its operand.
        After f=attrgetter('name'), the call f(r) returns r.name.
        After g=attrgetter('name', 'date'), the call g(r) returns (r.name, r.date).
        After h=attrgetter('name.first', 'name.last'), the call h(r) returns
        (r.name.first, r.name.last).
        """
        if not attrs:
            if not isinstance(attr, str):
                raise TypeError('attribute name must be a string')
            names = attr.split('.')
            def func(obj):
                for name in names:
                    obj = getattr(obj, name)
                return obj
            return func
        else:
            getters = tuple(map(attrgetter, (attr,) + attrs))
            def func(obj):
                return tuple(getter(obj) for getter in getters)
            return func |
Perhaps Modules/operator.c should be renamed to Modules/_operator.c. Also note that error messages in the Python and C implementations sometimes differ. |
Sorry to have disappeared on this, other things took priority... Thank you for the comments, Serhiy. v2 of the patch renames Modules/operator.c to Modules/_operator.c, and changes that name every place I could find it. I also tried to tidy up some of the error message mismatches. I didn't bother with the ones regarding missing arguments, as that would mean checking args and throwing an exception in each and every function. I do like the functional attrgetter better than the object version I wrote. The main reason I went with an object version in the first place was because that's what the C implementation used. Is there any reason not to break with the C implementation and use a function instead? The updated patch takes a rather ugly hack to try to use the functional version in an object. length_hint() was horrible and has been rewritten. It should be less horrible now :). It should also follow the C implementation quite a bit better. |
Considering what a huge headache it was to get my own patch to apply at home on Linux rather than at work on Windows, here's a new version of the patch that straightens out the line ending nightmare present in v2. No other changes made. |
Sorry, I forgot to push the "Publish All My Drafts" button. Please consider my other comments on the first patch. I have also added new comments about length_hint(). Your implementation of attrgetter() looks good. One possible disadvantage of a purely functional approach is that attrgetter() will not be a class. It is unlikely that anyone subclasses attrgetter, but it can be used in an isinstance() check. You solved this issue. The same approach can be applied to itemgetter(). |
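One way the functional version might be kept inside a class, so that isinstance() checks keep working, is to build the closure once in `__init__` and delegate to it from `__call__`. This is a minimal sketch and not necessarily the code in the patch; the `_call` attribute name is an assumption made for illustration.

```python
class attrgetter:
    """Class-based attrgetter that builds its lookup closure once."""

    def __init__(self, attr, *attrs):
        if not attrs:
            if not isinstance(attr, str):
                raise TypeError('attribute name must be a string')
            names = attr.split('.')

            def func(obj):
                # Walk the dotted path one attribute at a time.
                for name in names:
                    obj = getattr(obj, name)
                return obj
        else:
            # One single-attribute getter per requested name.
            getters = tuple(map(attrgetter, (attr,) + attrs))

            def func(obj):
                return tuple(getter(obj) for getter in getters)

        # Hypothetical attribute name; the patch may store this differently.
        self._call = func

    def __call__(self, obj):
        return self._call(obj)
```

Because attrgetter is a real class here, `isinstance(attrgetter('name'), attrgetter)` is true, while the per-call work is still done by the closure.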
Here's v4, addressing Serhiy's comments on Rietveld. |
About length_hint(): I meant something like this (even an explicit getattr() is not needed): try:

This is a little faster because there is only one attribute lookup instead of two. It is also a little safer because there is slightly less chance of a race in which an attribute changes between the two lookups (this is improbable enough that it doesn't matter). type(obj) is used here because the C code uses _PyObject_LookupSpecial(), which doesn't honor instance attributes and looks only at class attributes.

About concat() and iconcat(): I think only the first argument can be checked. If the arguments are not concatenable, the '+'/'+=' operator will raise an exception. I'm not sure. Does anyone have any thoughts about this?

About methodcaller(): Here is a catch. With this implementation you can't use

    def __init__(*args, **kwargs):
        self = args[0]
        self._name = args[1]
        self._args = args[2:]
        self._kwargs = kwargs

(You can add code for better error reporting.) I have added smaller comments on Rietveld. |
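A short sketch of the `*args` signature suggested above, with the error check filled in. The motivation shown here is an assumption about what the catch refers to: if `__init__` declared named parameters such as `self` and `name`, a caller could not forward keyword arguments with those same names to the target method. The `Greeter` class is hypothetical and exists only for the demonstration.

```python
class methodcaller:
    """Call a named method on the operand with stored arguments."""

    def __init__(*args, **kwargs):
        # args[0] is the instance itself; args[1] must be the method name.
        if len(args) < 2:
            raise TypeError('methodcaller needs at least one argument, '
                            'the method name')
        self = args[0]
        self._name = args[1]
        self._args = args[2:]
        self._kwargs = kwargs

    def __call__(self, obj):
        return getattr(obj, self._name)(*self._args, **self._kwargs)


class Greeter:
    # Hypothetical class whose method takes a keyword called 'name'.
    def greet(self, name):
        return 'hello ' + name


# 'name' is forwarded to greet() instead of colliding with a parameter
# of methodcaller.__init__.
greeting = methodcaller('greet', name='Zach')(Greeter())
```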
Here's another new version. Changes include:
On concat and iconcat: Looking at the glossary, a sequence should actually have both __getitem__ and __len__. The test class in the test case for iconcat only defines __getitem__, though. Should we check only for __getitem__ on the first argument, or check for both __getitem__ and __len__, and add __len__ to the test class? Requiring __len__ may cause breakage for anyone using the Python implementation with a class they defined and used with the C implementation with only __getitem__, so I'm leaning towards only checking for __getitem__. I can't really tell what the C implementation really looks for as I don't speak C, but it almost looks to me like it may be only checking for __getitem__. Latest patch only checks argument 'a' for __getitem__. |
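The choice described above, checking only the first argument for `__getitem__` and leaving the rest to the `+` operator, could look roughly like this. It is a sketch under that assumption; the exact error message wording is illustrative.

```python
def concat(a, b):
    """Same as a + b, for a and b sequences."""
    # Only the first argument is checked; a non-concatenable second
    # argument is left for the '+' operator itself to reject.
    if not hasattr(a, '__getitem__'):
        msg = "'%s' object can't be concatenated" % type(a).__name__
        raise TypeError(msg)
    return a + b
```

With this check, `concat(1, 2)` raises TypeError instead of returning 3, while a user-defined class that only provides `__getitem__` and `__add__` is still accepted.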
Good work, Zachary. I have no more nitpicks for you. ;) LGTM. |
One comment to a committer. Don't forget to run |
Nits are no fun; thank you for picking them, Serhiy ;) |
FYI Mercurial can use the extended diff format invented by git, which supports renames, changes to file permissions, etc. |
The base test class should not inherit from TestCase: it will be picked up by test discovery and then will break, as self.module will be None. Typical usage:

    class OperatorTestsMixin:
        module = None

    class COperatorTests(OperatorTestsMixin, unittest.TestCase):
        module = _operator |
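A slightly fuller sketch of the mixin pattern, runnable on its own. The `test_add` method and the `PyOperatorTests` name are illustrative, and the plain `operator` import stands in for however the real test module loads the pure Python version; both are assumptions.

```python
import unittest
import operator as py_operator  # stand-in for the pure Python module

try:
    import _operator as c_operator
except ImportError:  # implementations without the C accelerator
    c_operator = None


class OperatorTestsMixin:
    # Not a TestCase, so test discovery never collects it directly.
    module = None

    def test_add(self):
        self.assertEqual(self.module.add(3, 4), 7)


class PyOperatorTests(OperatorTestsMixin, unittest.TestCase):
    module = py_operator


@unittest.skipUnless(c_operator, 'requires _operator')
class COperatorTests(OperatorTestsMixin, unittest.TestCase):
    module = c_operator
```

Only the two concrete classes are collected by `python -m unittest`, so each shared test runs once per implementation and never with `module` set to None.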
Did not know that about test discovery, thank you Éric. Fixed in v6. A few other test modules may need the same fix; I based my changes to Lib/test/test_operator.py on Lib/test/test_heapq.py which has the same issue. I'll open a new report for it and any others I find. Also, this patch was created with |
I don't understand what the difference is between v5 and v6. |
Sorry, I misunderstood Éric's suggestions regarding the tests; v6 is useless. v7 forthcoming. |
Ok, I believe the attached v7 properly addresses Éric's concerns about test discovery, and has no other changes unrelated to that compared to v5. Thank you very much to Ezio for directing me towards the json tests for an example to work from. |
v8 LGTM (except for some trailing whitespace). |
Note to self: learn to run patchcheck.py before posting. Whitespace issues fixed in v9. |
If no one objects I will commit this next week. |
Since the older Windows project files were removed, v10 removes the patches to them. Everything else still applies cleanly. Also, in the spirit of what Brett said in 16651 about not re-implementing blindly, I did just look up what Jython, IronPython, and PyPy do for the operator module. The first two implement it in their VM language, and PyPy uses a very specialized version that didn't look easy to adapt to CPython, at least at a glance. It was fun for me to write any way about it, though :) |
Zachary, I suppose Modules/_operator.c is a rename of Modules/operator.c. See also http://docs.python.org/devguide/committing.html#minimal-configuration |
Of course; I thought I already had, but apparently I messed that up a bit. v11 is in the proper format. In it, you can actually see what was changed in Modules/operator.c, which is the necessary s/operator/_operator/ changes, and a few extra commas removed from a couple of docstrings (to match the docstrings in the new Python versions).
Thank you for that link! I had read through this some time ago, but either missed the part about the diff section, or it just didn't sink in or something. That is now added to my hg config file :) |
Thank you! |
Here's another new version of the patch, addressing Ezio's review comments and a few things I found after giving operator.py a closer look myself. Things changed in operator.py in this version:
Also, after submitting this patch, I'm going to try to clean up the files list on this issue a bit. I'll clear the nosy list while I do so to avoid spamming everybody with messages about it. (At least, I assume I can do so, I haven't tried this before :). If I can't clear the nosy list, I won't bother with cleaning up the files, again to avoid spamming) |
A change that I mentioned in a Rietveld comment on v10, but not in my last message: __all__ in operator.py no longer includes all of the __func__s, as currently doing "from operator import *" does not import all of the __func__s. |
I think Antoine is more appropriate for committing this patch. I waited so long on this because I did not dare to take the responsibility myself (it's almost like adding a new module). |
I would like to spend some time with this before it goes forward (especially the attrgetter, itemgetter, methodcaller group). Right now, it looks like a nice effort but I don't see how it makes Python any better for adding it. The odds are that this code will add bloat but not benefit any user (it won't get called at all). |
Raymond: it's not for the benefit of CPython. |
[David]
IIRC, all the other implementations of Python already have this code passing tests, so it isn't really for their benefit either. |
If a pure python operator module were a part of the stdlib, we (PyPy) would probably delete most (if not all) of our own operator module. |
I reviewed the attrgetter(), methodcaller(), and itemgetter() code in py_operator.v12.diff. The code looks clean and correct. |
Now we can remove all __func__s from _operator.c. |
Thank you for the review, Raymond. Since Serhiy agrees that the _operator __func__s are unnecessary, here's a v13 that removes them. Again, I'm not a native C speaker, so these new changes in _operator.c deserve a bit of extra scrutiny. Everything builds and still passes the test suite, though. Also changed in this patch, test_pow and test_inplace remove explicit testing of __func__s. Those tests are useless, as they are merely rerunning already run tests on the same function with a different name, which is confirmed by test_dunder_is_original. I can extend that test with an explicit list of funcs which should have a __func__ if anyone thinks it's worth it. |
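A check in the spirit of test_dunder_is_original can be sketched as a plain function: for every public name, if a double-underscore variant exists it must be the very same object. The helper name `check_dunder_is_original` is hypothetical, and this is not the test from the patch.

```python
import operator


def check_dunder_is_original(module):
    """Assert that each __name__ alias is the same object as name."""
    for name in dir(module):
        if name.startswith('_'):
            continue
        orig = getattr(module, name)
        # 'and_' -> '__and__', 'add' -> '__add__', and so on.
        dunder = getattr(module, '__' + name.strip('_') + '__', None)
        if dunder is not None:
            assert dunder is orig, name


check_dunder_is_original(operator)
```

Since the aliases are identical objects, running the full behavioural tests again under the dunder names adds no coverage, which is the argument made above for dropping them.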
length_hint() looks ok as well. |
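The class-level lookup discussed earlier in the thread for length_hint() can be sketched as follows. It assumes the behaviour described there (a single lookup on type(obj), ignoring instance attributes); the error messages are illustrative and the committed code may differ in detail.

```python
def length_hint(obj, default=0):
    """Return an estimate of the number of items in obj."""
    if not isinstance(default, int):
        raise TypeError("'%s' object cannot be interpreted as an integer"
                        % type(default).__name__)
    try:
        return len(obj)
    except TypeError:
        pass
    try:
        # Single lookup on the class, so instance attributes are ignored.
        hint = type(obj).__length_hint__
    except AttributeError:
        return default
    try:
        val = hint(obj)
    except TypeError:
        return default
    if val is NotImplemented:
        return default
    if not isinstance(val, int):
        raise TypeError('__length_hint__ must be integer, not %s'
                        % type(val).__name__)
    if val < 0:
        raise ValueError('__length_hint__() should return >= 0')
    return val
```

An object that merely has a `__length_hint__` attribute set on the instance falls through to `default`, which mirrors how special methods are looked up on the type rather than on the instance.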
New changeset 97834382c6cc by Antoine Pitrou in branch 'default': |
I've now committed the latest patch. Thank you very much, Zachary! |
New changeset 4b3238923b01 by Raymond Hettinger in branch 'default': |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.