Make yield_ and yield_from_ interoperate with native async generators again #16
Instead of trying to explicitly construct AsyncGenValueWrapper objects, trick the interpreter into doing the wrapping and unwrapping for us. There's still some ctypes hackery involved, but it's much less finicky: just changing the type of an async generator object.
@@ Coverage Diff @@
##          master    #16    +/-  ##
=====================================
  Coverage    100%   100%
=====================================
  Files          7      7
  Lines        972    992    +20
  Branches      77     79     +2
=====================================
+ Hits         972    992    +20
@njsmith ping? I'm interested in getting this merged and a new async_generator version released with it and #15 so I can depend on that version in other projects (e.g. make trio do something sane with the GC hooks). I'd be happy to get #17 in too, but that's a larger change and I don't want to tie this one's fate to that one. :-)
So first of all, I want to say again how incredibly impressive this work is on the technical level! I am in awe.
If I understand correctly, the pitch for this PR is "you can use
If I have that right, then I don't think this PR is worth it. It's not that hard to write
OTOH, #17 is somewhat attractive, if we can make it essentially transparent, so people don't write their code any differently but just magically get better runtime behavior. The tracebacks are really bad. I guess I am particularly annoyed at them because of how trio's nursery implementation uses
It's such an intrusive change that it does make me nervous –
Thanks for the review!
I would say the pitch of #16 is "you can use
(And if you leave off the
And, of course, the other half of the pitch for #16 is that it's required in order for #17 to work. I think I agree with you that we shouldn't commit #16 if we don't intend to commit some form of #17. I think that to the extent there's a case to be made for committing both of them, it goes like so:
If you're writing an async application that targets Python 3.6+ only, you're probably going to use native async generators, because they take less typing to write / are faster to execute / don't add lots of traceback frames / are arguably syntactically clearer. If your application uses an async library that wants to support 3.5, that library is probably going to use
Regarding PyPy support: Apparently PyPy has reasonably solid 3.6 support on nightly these days, including async generators. I played around with its async generators and, via a similar type-punning scheme to the one used in this PR for CPython, got my hands on something that appears to be an AsyncGenValueWrapper object. Unfortunately, any attempt to use it beyond id() segfaults. My understanding of the PyPy object memory layout is based on a blog post from 2009, so it's not unreasonable that these difficulties might be resolved with a less cursory effort. (Very much not guaranteed to work out, though.)
I think #17 can be modified to make it transparent without much trouble. Currently it is very loud and conservative about the one tiny difference it's aware of (the ag_running thing), but I think it would be reasonable for us to just declare that if "ag_running is False while an asend is active but suspended" is good enough for CPython, it's good enough for us. As dicey as bytecode introspection can be in the general case, this one is looking for an extremely simple pattern with no potential for false positives -- maybe some wacky stack management could make us fail to detect an asyncgen that is truly safe, but we'll never think something is safe (returns only None) when it has non-None return paths. And the mechanics of the code object reconstruction are exactly the same thing that the stdlib
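To make the "returns only None" check concrete, here is a rough sketch of what a conservative bytecode scan of that kind could look like. It is not the code from #17: the function name `returns_only_none` and the example functions are hypothetical, and the opcode names it looks for (`LOAD_CONST`, `RETURN_VALUE`, `RETURN_CONST`) are CPython details that vary between releases. The point is only that the scan can fail to recognize a safe function but should never accept one with a non-None return path.

```python
import dis


def returns_only_none(func):
    """Conservatively report whether every return in func returns the constant None.

    Hypothetical helper for illustration; it may answer False for a function
    that is actually safe, but it should not answer True for an unsafe one.
    """
    previous = None
    for instr in dis.get_instructions(func):
        if instr.opname == "RETURN_CONST":
            # Some CPython releases fuse "load a constant, then return it"
            # into this single opcode.
            if instr.argval is not None:
                return False
        elif instr.opname == "RETURN_VALUE":
            loads_none = (
                previous is not None
                and previous.opname == "LOAD_CONST"
                and previous.argval is None
            )
            # If a jump lands directly on the return, the value on the stack
            # may come from another path, so refuse to call it safe.
            if not loads_none or instr.is_jump_target:
                return False
        previous = instr
    return True


def only_none(flag):
    if flag:
        return
    print("still running")


def returns_number():
    return 42


def sometimes_none(flag):
    return 1 if flag else None
```

With these definitions, `returns_only_none(only_none)` should be true, while `returns_number` and `sometimes_none` should both be rejected. Nested functions are not scanned, since only the code object of the function passed in is inspected.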
Any thoughts? I'm inclined to push a little more on seeing whether I can make this native integration strategy work with PyPy, and will feel less bullish about these diffs if I can't.
PyPy's inclusion of alpha support for 3.6 in their 7.0.0 release made me revisit this. I found a way to create native wrapped values with no memory hackery at all:
This works because if you create a function whose code object has both
(And this works on CPython too!)
The nontrivial downside: the PyPy version of AsyncGenValueWrapper has no application-level type. If you try to perform any operation on it directly, like printing it or getting its
I guess I was being extremely unimaginative: if someone calls
We could try to convince the PyPy folks to make AsyncGenValueWrapper into a real type, like it is in CPython, but I think it might be more likely that they'd decide the mechanism I was using to get them is a bug, and close that "hole" instead.
Ah well. A fun experiment at any rate.