-
-
Notifications
You must be signed in to change notification settings - Fork 31.7k
PEP 479: Change StopIteration handling inside generators #67095
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
See PEP for full details. Attached is POC patch: behaviour is altered globally (rather than respecting a __future__ directive), and minimal changes are made elsewhere to make the test suite mostly pass (test_generators does not - it'll need more comprehensive edits). Note that PEP-8 is deliberately not followed here; some of the edits ought to involve indenting slabs of code, but instead, "half-level" indentation is used, to keep the patch visibly minimal. |
New changeset dd19add74b21 by Guido van Rossum in branch 'default': |
AFAIS this would break all existing code for yield-based coroutine schedulers (Tornado, Twisted, Trollius, monocle, ...) when a coroutine is exited with And this seems like a lot, a quick GitHub code search gives various examples, e.g.
|
Marc, are those all cases where the "raise StopIteration" is actually inside a generator? If so, it can be trivially replaced with "return". Yes, it'll break that way of spelling it, but it's a completely mechanical transformation, and making the change won't break your code for previous versions of Python. Personally, I would recommend using "return" there anyway, regardless of this proposal. |
Yes, all yield-based coroutines are generators. I know that there is a backward compatible upgrade path, but this might have a huge impact on existing code. Interestingly, I didn't know before researching this PEP that you can actually use So I'm not even against this proposal and using |
Yep, the question is whether any of the "raise StopIteration" lines are actually non-local flow control. If they're local, then it's easy: mechanical replacement with "return" and it becomes compatible with all versions (unless it has a value attached to it, as "return x" doesn't work in Py2). But if they're non-local, some refactoring will need to be done. In any case, my line of argument is: A generator function is not an iterator's __next__ method, ergo iterator protocol does not apply. Use of StopIteration is a hack that happens to work because of how generator functions are implemented (a thin wrapper around an iterator), but it's not part of the *concept* of a generator function. |
If all examples were just using "raise StopIteration" instead of "return" It's sad that apparently this use of return hasn't been better advertised On Fri, Nov 21, 2014 at 9:44 AM, Chris Angelico <report@bugs.python.org>
|
Sadly, I don't know of a way to check if that's the case, other than by manually going through and eyeballing the code - if there's "raise StopIteration", see if there's also "yield" in the same function. The three cited examples are (I checked those straight away), but frankly, I am not volunteering to go manually through all of github in search of examples :| |
I suggest to apply right now those changes which are compatible with current behavior and don't make code more cumbersome. E.g.
+ yield from line_pair_iterator and
+ return To me these changes make code even better and are worth to be applied even if PEP-479 will be rejected. |
Known issues with the current patch, if anyone feels like playing with this who better knows the code:
Any others? |
Here's a doc patch for itertools. |
FYI: I applied these two changes right after Guido pronounced on PEP-479: https://mail.python.org/pipermail/python-checkins/2014-November/133252.html https://mail.python.org/pipermail/python-checkins/2014-November/133253.html Also, I'm submitting a patch to fix the code in Django that will be broken by PEP-479. |
Extract of emails: changeset: 93542:9eb0d0eb0992 PEP-479: Don't let StopIteration bubble out of calls to next() inside a generator. changeset: 93543:e8b3083bb148 PEP-479: Use the return-keyword instead of raising StopIteration inside a generators. |
Regarding the patch below, isn't most of this redundant? ISTM that simply calling PyErr_SetString(...) should do all of this, including the exception chaining. diff -r 23ab1197df0b Objects/genobject.c
--- a/Objects/genobject.c Wed Nov 19 13:21:40 2014 +0200
+++ b/Objects/genobject.c Thu Nov 20 16:47:59 2014 +1100
@@ -130,6 +130,23 @@
}
Py_CLEAR(result);
}
+ else if (!result)
+ {
+ if (PyErr_ExceptionMatches(PyExc_StopIteration))
+ {
+ PyObject *exc, *val, *val2, *tb;
+ PyErr_Fetch(&exc, &val, &tb);
+ PyErr_NormalizeException(&exc, &val, &tb);
+ Py_DECREF(exc);
+ Py_XDECREF(tb);
+ PyErr_SetString(PyExc_RuntimeError,
+ "generator raised StopIteration");
+ PyErr_Fetch(&exc, &val2, &tb);
+ PyErr_NormalizeException(&exc, &val2, &tb);
+ PyException_SetContext(val2, val);
+ PyErr_Restore(exc, val2, tb);
+ }
+ }
if (!result || f->f_stacktop == NULL) {
/* generator can't be rerun, so release the frame */ |
Stefan, I'm not sure - I don't know the details of the C API here. But I tried commenting out everything but that one line, and while it does result in RuntimeError, it doesn't do the exception chaining. Currently, I believe the exception isn't being caught at all; but I don't know what it would take to do the full catching properly. The current patch doesn't always have a traceback on the original StopIteration, either, so it's definitely not quite right yet. |
Ah, right - chaining only happens automatically when the exception has already been caught and moved to sys.exc_info. There's a _PyErr_ChainExceptions(), though, which does it for you. You should be able to say PyErr_Fetch(&x,&y,&z)
PyErr_SetString()
_PyErr_ChainExceptions(x,y,z) (does pretty much what your code does as well) |
Yeah, I saw that. Since that function begins with an underscore, I thought it best to replicate its behaviour rather than to call it. Either way ought to work though. |
Public underscore C functions are there for exactly the purpose of not duplicating functionality across *internal* core runtime code, without making them an official part of the C-API for *external* code. (I understand that it's a POC patch, though, so whoever applies that change eventually will rework it anyway.) |
FYI, here's the set of tests that I've written for my implementation in Cython: |
As Stefan noted, the leading underscore on "_PyErr_ChainExceptions" is just a warning to third party users that we don't yet offer any API stability guarantees for it. Using it in CPython itself is entirely appropriate - that's the whole reason for exposing it to the linker at all. It would be good to have a public API for that functionality though, so I filed bpo-23188 to suggest making it public. The main thing missing from the patch at this point is the __future__ import. To learn what's involved in that, one of the useful tricks is to look at the annotated __future__ module: https://hg.python.org/cpython/annotate/bbf16fd024df/Lib/__future__.py The commits adding new flags there highlight the various places that tend to be affected when a new __future__ import is added. For this particular case, the closest comparison is likely with the "absolute import" feature. One interesting aspect of this particular flag though is that it doesn't really affect compilation at all, aside from setting the flag on the code objects to indicate the future statement was in effect. The main impact will be at runtime, where it will make the exception replacement in Chris's PoC patch conditional on the presence of the flag on the code object. Chris, will you have time in the near future to rework the patch to add the __future__ support? It would be good to get this into 3.5a1 to give it the longest possible settling in period. |
Heh, I guess that should have been "The main thing missing other than docs and tests is ..." :) |
I can have a poke at the __future__ import tonight, but my main concern is memory management - I'm not sufficiently familiar with the exception handling calls to be sure that I'm neither leaking nor over-freeing anything. There's also a secondary concern that the tracebacks aren't quite right at the moment, possibly caused by a muck-up in the exception chaining. >>> def f(): raise StopIteration
>>> def g(): yield f()
>>> next(g())
StopIteration
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RuntimeError: generator raised StopIteration There's no indication of which function/line caused the exception, which would be _extremely_ helpful. If someone else can look into that at some point, I'd appreciate the assistance. |
Looking more closely at the patch:
|
PyErr_Restore doesn't seem to trigger exception chaining. But thanks for the tip about explicitly setting the traceback; not sure how I missed that, but now the StopIteration traceback is visible. Minor point: The previous patch was setting the __context__ of the RuntimeError, whereas it ought to have been setting __cause__. Have corrected that. So here, before I move further forward, is a new POC patch; I've removed the patches that rhettinger applied already, and fixed up tracebacks. So now it's a better-behaved POC, at least. |
Nick, any particular reason for pointing to https://hg.python.org/cpython/annotate/bbf16fd024df/Lib/__future__.py rather than https://hg.python.org/cpython/annotate/tip/Lib/__future__.py ? I'm looking at both, anyhow. |
Okay! I think I have something here. DEFINITELY needs more eyeballs, but all tests pass, including a new one that tests StopIteration leakage both with and without the __future__ directive. Some docs changes have been made (I grepped for 'stopiteration' and 'generator' case insensitively, and changed anything that caught my eye). It'd be nice if the entire test suite and standard library could be guaranteed to work in a post-3.7 world, but I don't know of an easy way to do that, short of peppering the code with unnecessary __future__ directives. |
Looking for code reviewers! |
Yury's patch mostly looks good to me, except:
|
Thanks!
Done. I've also added one test for correct handling of StopIteration without PEP-479.
Forgot to attach it to the first patch! Nick, please take a look at the new patch (attached). |
The comment was general because I honestly had no idea what was needed still. All I knew was that the patch seemed to work for me, all tests passing (including the new one). Thanks for uploading the new patch; it compiles happily, and I'm running tests now, although that probably won't prove anything new. |
A minor comment about the __future__ changes: 3.5.0a1 should probably be 3.5.0b1. |
A couple of minor comments:
|
Nick, Berker, |
Latest patch LGTM, although I believe the new chaining behaviour checks (I don't see a need to request a new review for that particular change, |
In Lib/future.py: +generator_stop = _Feature((3, 5, 0, "alpha", 1), "alpha" needs to be changed to "beta". |
New changeset 36a8d935c322 by Yury Selivanov in branch 'default': |
Thanks Nick and Berker for the reviews! |
Thanks everyone for all the help getting this to land! This is going to be a part of my active python3 binary from now on :) |
Buildbots are not happy: [ 63/393] test_contextlib Current thread 0x00000200002c2500 (most recent call first): http://buildbot.python.org/all/builders/System%20Z%20Linux%203.x/builds/3162/steps/test/logs/stdio |
Strange, the test suite was running just fine on my machine. I'll take a closer look later today. |
Weird. Tests ran fine on my machine too. Interestingly, that number is 0xdbdbdbdbdbdbdbda - does that mean anything? (It's negative 0x2424242424242426, for what that's worth.) |
I think it crashes in debug mode or something. Somewhere we did too many decrefs. |
New changeset d15c26085591 by Yury Selivanov in branch 'default': |
New changeset 5d8bc813d270 by Yury Selivanov in branch 'default': |
Berker, buildbots should be happy now. |
Thanks! |
Since it took me a moment to figure out why the extra incref was needed:
This does suggest the new incref is conceptually in the wrong spot - it should be before the call to PyException_SetCause, such that this block of code *always* possesses a valid reference while accessing "val". At the moment, we technically still don't have an active reference when passing "val" to PyException_SetContext. |
New changeset 787cc3d1d3af by Yury Selivanov in branch 'default': |
We missed the deprecation warning part of the PEP (for when the future import is *not* in effect, but the default behaviour will change in 3.7), but rather than reopening this one, I filed a new issue: bpo-24237 |
New changeset 2771a0ac806b by Yury Selivanov in branch 'default': |
New changeset c8a3e49f35e7 by Yury Selivanov in branch 'default': |
Consider the following generator function similar to the one that started this issue: from __future__ import generator_stop
def myzip(*seqs):
its = (iter(seq) for seq in seqs)
while True:
try:
yield tuple(next(it) for it in its)
except RuntimeError:
return
g = myzip('abc', 'xyz')
print([next(g) for i in range(5)]) # prints: [('a', 'x'), (), (), (), ()] A print(list(g)) would cause a hang. |
Please ignore my previous comment - now I understood what is going on. The Still learning about generators... :) |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: