-
-
Notifications
You must be signed in to change notification settings - Fork 29.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implicit ABCs have no means of "anti-registration" #70146
Comments
Serhiy Storchaka raised an issue (http://bugs.python.org/msg256910) with static type hints on bpo-25864 (http://bugs.python.org/issue25864), which I believe also applies to runtime ABCs. Consider this class, which uses class Thing:
def __getitem__(self, specialization):
return type(self)._specialize(specialization)(self)
def __len__(self):
return len(self._specializations) Because this type looks like it supports the old-style sequence protocol, calling either __iter__ = None
__reversed__ = None ... or this: def __iter__(self): raise TypeError('not iterable')
def __reversed__(self): raise TypeError('not iterable') Unfortunately, doing either means that There are a few possible solutions here:
Possible argument for #1: The counter-argument is that static typehinting definitely does have this problem (python/typing#170), and, depending on how that's solved, it may well make sense to use the same solution here. If we do need a solution, #2 seems better than #3 (or anything else I could think up). The only problem there is that |
As Guido pointed out on -ideas, hashing already uses the convention of Meanwhile, setting So, maybe that should be documented as the standard way to unimplement |
As Serhiy pointed out on -ideas, there's no reason |
Hashable and Awaitable already treat None (actually any falsey value) as not implementing the special method, and blocking any superclass implementation, in their __subclasshook__. (This only blocks implicit subclassing--anything that actually directly or indirectly inherits from or registers with Hashable is still Hashable even if it sets __hash__ = None.) Coroutine, AsyncIterable, AsyncIterator, Iterable, Iterator, Generator, Sized, Container, and Callable treat None as an implementation. http://bugs.python.org/file41440/cpython-iter-patch.diff for bpo-25864 changes Iterable to work like Hashable, but doesn't touch any of the other types. To fix this in general, just apply the same change to the other implicit ABCs (and probably factor out the logic instead of repeating it that many more times). But I'm not sure a general fix is needed here. The special methods for those other types don't have a base class and/or fallback implementation to block the way __hash__, __iter__, and __reversed__ do (except for __call__ on metaclasses, but I can't imagine why you'd want a metaclass that isn't callable), so there's no need to set them to None to block that, so the problem should never come up. |
I don’t think you need to define __len__() to get an iterable, only __getitem__(). Also, if I understand your problem, Container would also be susceptible in theory, because membership testing via “in” and “not in” falls back to iteration. If you only define __getitem__(), your class is both iterable and a container, but it is neither an Iterable nor a Container. :) |
Absolutely you do not need to define __len__ to get an iterable. Length is not a property of an iterable (iterables can be indefinite in length or infinite in length). |
The "old-style sequence protocol" means having a __getitem__ that works for values from 0 to __len__() and raises IndexError at __len__(). You don't need to be a complete old-style sequence to be iterable; just having __getitem__ makes you iterable (without being a collections.abc.Iterable or a typing.Iterable), and having __getitem__ and __len__ makes you reversible (without being a typing.Reversible). At any rate, this bug isn't about avoiding false negatives for the implicit ABCs, but false positives: defining __iter__ = None blocks the old-style sequence protocol, but makes isinstance(Iterable) true. |
You're right, but not in the details. Being iterable (whether via __iter__ or via the old-style sequence protocol) makes you a container. But, again, false negatives for Container aren't a problem. But blocking that by setting __contains__ = None makes you a Container but not a container, the same kind of false positive as bpo-25864. That's exactly why I split off this bug from that one--that one only fixes __iter__ and __reversed__, but it's possible that a more general solution is needed. (Or, of course, maybe we don't need anything more general, we just need to expand it to __iter__, __reversed__, and __contains__.) |
I propose to solve the narrow problem by indeed supporting the setting of certain special methods to None similar to __hash__. This should be limited to those methods for which this adds value, e.g. where the complete absence of the method causes a fall-back to a different API (e.g. __getitem__+len in the case of __reversed__) -- the None value should just block the fallback. This will require some changes in the ABCs that test for the presence of a given method (Iterable/iter, Container/contains, Reversible/reversed -- the latter ABC should be added) and in implementations such as reversed(), iter() and 'in'. Maybe we should just do this for every ABC in collections.abc (plus Reversible) that has a __subclasshook__ that tests for the presence of the special method. Oh, probably some more, like Iterator/next, Sized/len and Callable/call. But for those there isn't a fallback in the corresponding bytecode, so I'm only +0 for those -- it doesn't seem to harm anything to be consistent with those for which it does matter, but it doesn't cost much either, and the consistency is slightly useful -- it provides a pattern to follow for future ABCs. |
I believe that currently, except for the minor problem in bpo-25864, you can set any special method to None and it _will_ block any fallback (and also block inheritance, of course, the reason it's used with __hash__):
In all cases, if the primary method exists but is None, the implementation will try to call it, and let the TypeError from NoneType.__call__ escape, which blocks the fallback. (And, of course, it blocks the MRO search for an inherited method.) Of course the message is the useless and cryptic "'NoneType' object is not callable" rather than the nice one you get from hash. But technically, it's all already working as desired. We probably need to document that setting special methods to None blocks any fallback implementation (both for people who write such classes, and to ensure that other Pythons do the same thing as CPython). Practically, I think we need to improve the error message for the ones that are likely to come up, but probably not for all of them. I think this means either just reversed, or reversed+iter, or maybe reversed+iter+in. (Plus hash, of course, but that code is already there.) Also, I suppose we should have tests for this behavior.
Now that I think about it, I'm +0.5 on changing all of them. Imagine, as a reader, trying to guess why some do the check and some don't. Also, I think they should test for None instead of falsiness like Hashable does. (Hopefully nobody ever writes something that's falsey but callable, but why tempt fate?) --- I'll write a patch that expands the bpo-25864 patch:
|
I like all of that. Thanks! |
Is an hg export patch usable? If not, let me know and I'll flatten it. Anyway, the attached patch fixes bpo-25987 and bpo-25864 as well as this one, as follows:
(Note that this patch is not yet fully tested, because my Windows and Cygwin builds are for some reason taking forever. But if I don't post an update tomorrow, that will mean they passed the overnight tests. I doubt anyone was going to commit this tonight anyway, but, just in case...) |
A Mercurial export patch should work with the Reitveld review thing if it is a single revision (and based on a public revision). Or you could try getting a plain diff from the base revision. Or for changes that are independent, separate patches based off a common public revision. Please at least fold your fixup commits into the original commits; that would make it a lot easier to review. You have a stray print() call, probably in Hashable.__subclasshook__(). For a the do-nothing __reversed__() implementation, do you think “yield from ()” would be clearer than “while False”? Or even “return iter(())”? What’s the advantage of having Reversed inherit Iterable? How does the subclass hook interact with classes that implement __reversed__() but not __iter__()? I don’t see how the self.assertFalse(issubclass(K, Reversible)) test could pass. Should the new Reversed class be excluded from collections.__all__, or is it not worth it? I find the lambda generator a bit quirky in test_Reversible(). Maybe it would be better as an ordinary non-lambda generator with a yield statement. It would be good to have a versionadded notice for Reversible. I think there should also be one for allowing None for all special methods. Instead of self.assertEqual(list(reversed(R())), list(reversed([]))), why not check it is equal to an empty list directly? In test_contains.py, I would either write lambda: 0 in bc, or use the “with self.assertRaises()” form. Finally, if setting a binary operator method to None means that operation is not available, I find it a bit surprising that doing this prevent the other operand’s method from being tried. I guess you don’t want to change the implementation, but perhaps the documentation of the binary operator methods could be clarified. |
Well, it's 7 separate commits on a work branch, so I could check in each piece and test it separately, and then I just compared the branch to its parent (default) for the export. But, as I said, I can flatten it into a single diff if the review system can't handle this way. Sorry for making the first patch hard to review, and thanks for reviewing it anyway; I'll fix that with the next one.
Thanks; fixed for the next patch.
Maybe, but this way is identical to the definition for the do-nothing Iterable.__iter__.
This was discussed briefly... somewhere, maybe the -ideas thread? Pro: If they were independent types, more people would probably register with/inherit from Reversible but not Iterable in error than intentionally. Pro: One less thing for Sequence to inherit from. This was mentioned in the context of adding both Reversible and HalfSequence (or whatever it ends up being called): "class Sequence(Iterable, Reversible, HalfSequence, Sized, Container) seems a little much; "class Sequence(Reversible, HalfSequence, Sized, Container)" is slightly better. Con: It's not completely impossible that a type could be reversible but not iterable. For example, you might have a reversible, and even indexable-with-negative-indicies-only, infinite collection of all the negative numbers. (But nobody could think of an actual _useful_ example of such a type.) Those are all pretty trivial points. When Guido said someone needs to work out where Reversible fits into the hierarchy, and there'd been enough talk without a patch. So I looked at where the discussion seemed to be leaning, couldn't find any additional problems with it when I looked into it further, and did it that way. Guido said he liked it on the Reversible bug, so I did it the same way here. So, no strong arguments or even feelings from anyone; if you've got one, definitely chime in.
Exactly the same way Iterator's handles a class that defines __next__ but not __iter__. I'll double-check that it actually is implemented right before the next patch (and verify that the tests run), but it should be "return _check_methods(C, "__reversed__", "__iter__").
It's Reversible, not Reversed. It's in __all__, and I'm pretty sure it should be there--if it's not worth exporting, it's not worth creating in the first place, or documenting, right?
This is taken from an identical test in test_Iterable, in the same file.
Definitely; thanks for the catch. I'll add that for the next version of the patch.
I'm assuming you mean in collections.abc, not in the data model, right? The problem is where to put it. The collections.abc docs don't even mention that some of the ABCs have subclass hooks that detect "implicit subclasses", much less tell you which ones do and don't, much less tell you which among those that do treat None specially. Until we add that documentation, there's really nothing to contrast with. Maybe this means we need another bug where we rewrite the collections.abc docs to include all that information, and in the new documentation we can note what prior versions of Python did (or maybe prior versions of CPython--I doubt any major implementations were different, but without checking them all I wouldn't want to guarantee that)?
This might be a case of following the parallelism with test_Iterable too far; I can change it.
That makes sense; I'll do it.
I definitely don't want to change the implementation unless there's a very good reason. Also, if you think it through, from either angle, the way it's always worked makes perfect sense. None (in fact, anything that raises a TypeError when called) always blocks fallback, so it blocks fallback to other.__radd__. Alternatively, the rules for when + looks for __radd__ are very specific and reasonably simple, and TypeError is not one of the things that triggers it. Making None trigger __radd__ would require making one of those two rules more complicated. And what would the benefit be? If you want to trigger fallback, there's already a simple, well-documented, readable way to do that; we don't need another one. And I think the note would be more likely to confuse someone who didn't need it (and maybe make him think he should be defining __add__ = None where he really shouldn't) than to help someone who did. Actual reasonable use cases for __add__ = None are rare enough that I think it's OK that someone has to understand what they're doing and be able to think through the rules and/or know what combinations to test via trial and error before they can use it. But maybe a footnote would work? For example, in the sentence "These functions are only called if the left operand does not support the corresponding operation and the operands are of different types. [2]", we could add another footnote after "does not support the corresponding operation", something like this:
Again, thanks for all the notes. |
I'm regenerating the patch in the hope that it will trigger the code review hook. |
Note that setting some special methods (such as __copy__, __deepcopy__, __reduce_ex__, __reduce__, __setstate__, etc) to None has different meaning. The value None is just ignored and corresponding operation is fall back to other methods. For example copy.deepcopy() uses __reduce_ex__ if __deepcopy__ is None, __reduce__ if __reduce_ex__ is None. May be this should be changed. |
FWIW, Martin's review was much more extensive. I'm looking forward to the flat version of your next patch, Andrew! |
In response to Serhiy's comment regarding __copy__ etc.: while the |
There is already a precedent for None, but it would have been nicer to use NotImplemented. |
The idea of using NotImplemented was already discussed (IIR in |
Needed tests for Hashable, Awaitable, Coroutine, AsyncIterable, AsyncIterator, Iterator, Generator, Sized, Container, Callable. The patch adds some tests for __iadd__ and __eq__. I think needed tests for a number of other special methods. In particular should be tested a classes without __iadd__, but with __add__ but to None; with __radd__, but with __add__ set to None; without __add__, but with __radd__ set to None; with __eq__, but with __ne__ set to None, etc. Added other comments on Rietveld. |
What if use None and NotImplemented as different signals: "not defined here so fall back on either a superclass or a different protocol", and "not defined here so fail without falling back"? |
No. Simply No. Nobody will be able to remember which means which. |
True, even I were not sure which should mean which. ;)
I think this triggered one of Victor's guards, added to catch such sort of errors. In this case the function both raises an exception and returns a result. If not catch such sort of error, it can cause unexpected errors or very strange results during executing unrelated code, so crashing as early as possible is lesser evil. |
The second patch takes into account all the issues raised by Martin and Guido, as well as some other changes that didn't make it into the first patch because Windows hates me. And it should be flattened into a single commit, and therefore should hopefully work with Rietveld out of the box. It passes the test suite on Windows and Cygwin (and on OS X, but that's a slightly older version of the changes than what's in this flattened patch). I think it's still an open question whether Reversible should inherit Iterable. In the current patch, as in the first, it does. I'll go over Serhiy's Reitveld comments to see if there's anything I missed, and, if so, address it in a third attempt. On Serhiy's test suggestions:
|
Without tests you can't be sure that there is no special case in some abstract classes or operators, or that it can't be introduced in future. To decrease the copy-pasting you can use code generation for testing. But I don't know what is the best place for tests. Tests for special methods are scattered though different files: test_binop, test_class, test_compare, test_descr, test_richcmp, test_augassign, test_contains, test_iter, etc. |
The attached patch fixes the one wording change from the patch2 review, and makes all the error messages consistent as suggested by Serhiy, and also adds a lot more tests: every way that a special method can fall back should now be tested. (Of course it would be good if someone else went through and made sure I didn't miss anything.) However:
Yes, but there's clearly a cost to maintaining and running 13 copies of each __ispam__ test, and 20 copies of each __rspam__ test. And the benefit is minuscule--if all 13 methods share the same code; it's far more likely that someone will break the __isub__ test than that they'll break the __isub__ code.
Yeah, they pretty much have to be scattered around the code. As they are in the attached diff. But I don't think that's a huge problem, except for the poor sap who has to review the patch. :) |
Uploading a flattened version of patch3.diff. |
FWIW, it looks fine to me -- but I'm hoping to get Serhiy's agreement first. Serhiy: feel free to commit when you're happy. |
Added few stylistic nitpicks and asked few questions on Rietveld. |
I don’t have strong opinions about the Reversible class because I don’t imagine needing it. My instinct was __reverse__() is independent of __iter__(), so it should not be a subclass. But I don’t really mind either way. I did actually mean a version changed notice for the data model change. I see this as a small expansion of the Python object API. Previously, __reversed__() had to be a function, now you are also allowed to set it to None. The collections ABCs are just catching up with the API change. Imagine someone using an Orderable virtual base class that tested for __gt__() etc. If you need, you can write repetitive tests without copying and pasting or generated code: for [param, result1, result2] in parameters:
with self.subTest(param=param):
... Maybe it is okay to add a test to ABCTestCase.validate_isinstance() to check that None cancels isinstance(). I also added some Reitveld comments for patch4a. |
Style changes based on Martin's review |
Are you suggesting that in Python 3.5, it's not defined what happens if you set __reversed__ = None on a sequence-like object and then call reversed()? I think any implementation has to raise a TypeError; the only difference is that we're now documenting the behavior directly, rather than forcing you to infer it from reading a half-dozen different parts of the docs and tracing through the implied behavior. It's already the best way to do it in 3.5, or even 2.5; the problem is that it isn't _obviously_ the best way. And I think a version-changed would send the wrong message: someone might think, "Oh, I can't use None here because I can't require 3.6, so what should I do? Write a method that raises TypeError when called?" (Which will work, but will then make it very hard to write a Reversible implicit ABC if they later want to.)
That's part of what we're trying to solve. Do you test that with hasattr, like Sized, or do you test for getattr with default None, like Hashable? |
This is not really my area of expertise, but I would have thought if you defined a __special__ method to something illegal (non-callable, or wrong signature) it would be reasonable for Python to raise an error at class definition (or assignment) time, not just later when you try to use it. Somewhere I think the documentation says you are only allowed to use these names as documented. |
As Guido pointed out on the -ideas thread, defining __spam__ = None to block inheritance of a superclass implementation has long been the standard way to mark a class unhashable. So, any Python that raised such an error would break a lot of code.
I can't find anything that says that. Any idea where to look? That might be worth adding, but if we add it at the same time as (or after) we explicitly document the None behavior, that's not a problem. :) By the way, did you not review my last patch because you didn't get an email for it? I think Rietveld #439 (http://psf.upfronthosting.co.za/roundup/meta/issue439) may be causing issues to get stalled in the patch stage because people are expecting to get flagged when a new patch goes up but never see it (unless they're on the same mail domain as the patcher). |
No, that's not the intention.
Indeed, but it's not enforced. What it means is that when the next (And please don't go suggesting that we start enforcing it.) |
The documentation about double-underscore names: <https://docs.python.org/3/reference/lexical_analysis.html#reserved-classes-of-identifiers\>. It says any undocumented usage may be broken. The technique with __hash__() is already documented: <https://docs.python.org/3/reference/datamodel.html#object.\_\_hash__\>. I have seen your 5th patch, and do appreciate the new improvements. (I think I did see an email for it. But a while ago G Mail started treating a lot of stuff as spam until I made a special rule for bug tracker emails.) |
What holds back this issue now? It looks like Andrew did a great job. Is any help needed here? |
I have manually "rebased" the patch taking into account that part of the work has been done in http://bugs.python.org/issue25987 Serhiy, Martin could you please review the latest patch and check that everything is OK? Andrew did a really good job and I would like this to land in 3.6 |
Oops, sorry, forgot one import, all tests pass with the corrected patch. |
The patch LGTM. Do all the tests pass? Who should be attributed in the commit message and the Misc/NEWS item? |
New changeset 72b9f195569c by Guido van Rossum in branch 'default': |
Pushed, let's see what the buildbots say. |
On my machine 8 tests are always skipped: The main part of the work is by Andrew Barnert, I only introduced small changes due to previous changes in Reversible and few minor things in response to previous comments. Guido, thank you for applying the patch! |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: