Do not always use exceptions to return result from coroutine #85922
Currently async functions are more expensive to use than their sync counterparts. A simple microbenchmark shows that the difference can be quite significant:
Results from master on my machine:
Sync functions: 2.8642687797546387 s
NOTE: Due to the viral nature of async functions, their number in a codebase can become quite large, so having hundreds of them in a single call stack is not uncommon.
One of the reasons for this performance gap is that async functions always return their results by raising a StopIteration exception, which is not cheap. This can be avoided if in addition to
Sync functions: 2.8861589431762695 s |
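The benchmark code itself did not survive in this copy of the thread. As a rough sketch of the kind of measurement described above (not the original script), one could compare a chain of nested sync calls with the same chain of nested awaits. The function names, depth and round counts below are illustrative assumptions, and the timings will differ by machine and Python version.

```python
import time


def sync_chain(depth):
    # Plain nested calls: the result comes back as an ordinary return value.
    if depth == 0:
        return 0
    return 1 + sync_chain(depth - 1)


async def async_chain(depth):
    # Same shape, but every level is a coroutine awaited by its caller.
    if depth == 0:
        return 0
    return 1 + await async_chain(depth - 1)


def run_coroutine(coro):
    # Drive a coroutine that never suspends. At the Python level its
    # result is delivered through StopIteration.value.
    try:
        coro.send(None)
    except StopIteration as stop:
        return stop.value
    raise RuntimeError("coroutine suspended unexpectedly")


def bench(depth=50, rounds=2000):
    # Returns (sync_seconds, async_seconds) for the same amount of work.
    start = time.perf_counter()
    for _ in range(rounds):
        sync_chain(depth)
    sync_seconds = time.perf_counter() - start

    start = time.perf_counter()
    for _ in range(rounds):
        run_coroutine(async_chain(depth))
    async_seconds = time.perf_counter() - start
    return sync_seconds, async_seconds


if __name__ == "__main__":
    sync_seconds, async_seconds = bench()
    print(f"Sync functions:  {sync_seconds} s")
    print(f"Async functions: {async_seconds} s")
```

Both chains compute the same value, so any gap comes from the call and return machinery rather than from the work done.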
Big +1 from me. This is something I always wanted to do myself (since the time of PEP-492 & 525 implementations) and I think this is a necessary change. It's great that this isn't just a C API UX improvement but also yields a big perf improvement. |
Big +1 from me, too, for the same reasons Yury gave. |
Copying some of the design discussion from the PR here (https://github.com/python/cpython/pull/22196/files#r486730457), because it belongs in the ticket. Yury Selivanov proposed to add a new C-API function for this (naming changes by me):
PyGenSendStatus PyGen_Send(PyGenObject *gen, PyObject *arg, PyObject **result);
Mark Shannon and I agreed that the status code should be the return value, with some confusion whether "PyGen_" or "PyCoro_" would be appropriate prefixes.
Mark Shannon wrote: "I don't think [the C-API function] should be public, as a possible further improvement is to stop passing exceptions through a side channel, but in result. Maybe we don't want to do that, but let's not add to the (already rather large) C-API."
However, I think this will be demanded and used by extensions, including Cython implemented ones, so it seems better to make them use a public function than a private one. Let's continue these lines of discussion here. |
If I understand the proposed shape of the API correctly, it was not supposed to return an exception via "result", so the contract for the new
Return value | result | Comment |
Yeah, we can add it as a "private" function, I'm not entirely opposed to that. But... it would be great if Cython and C code could still depend on it and use it. And then... why should it be private? The corresponding Python API "gen.send()" and "gen.throw()" is public, why can't the C API be public too? We will not fundamentally change generators (it would be a major backwards incompatible change), so committing to a good C API sounds reasonable.
Maybe we should call it
Correct. |
Mark, Stefan, I don't want this to go stale, so I propose to move forward with my suggestions:
|
As for the return value, I propose to return -1 in case of error, 1 for a yielded value and 0 for a returned value (i.e. define PYGEN_RETURN = 0, PYGEN_YIELD = 1 and PYGEN_ERROR = -1, but without exposing public names). This would be consistent with the rest of the C API: many functions return -1 on error (if they return int and can fail), and PyDict_Next() and _PySet_NextEntry() return 1 for every yielded item and 0 when the iteration has finished. |
Sure, that works. |
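To make the proposed three-state convention concrete, here is a small Python model of it. This is only an illustration of the status/result contract, not the C implementation: gen_send and drain are hypothetical helper names, and unlike the C function this model still relies on StopIteration internally.

```python
# Status values as proposed in the comment above.
PYGEN_RETURN = 0
PYGEN_YIELD = 1
PYGEN_ERROR = -1  # In C: status -1, NULL result, exception set.


def gen_send(gen, arg):
    # Python-level model: report "yielded" vs "returned" as a status
    # instead of making the caller catch StopIteration.
    try:
        value = gen.send(arg)
    except StopIteration as stop:
        return PYGEN_RETURN, stop.value
    # Any other exception simply propagates here; that is the case the
    # C function would report as PYGEN_ERROR.
    return PYGEN_YIELD, value


def countdown(n):
    while n > 0:
        yield n
        n -= 1
    return "done"


def drain(gen):
    # Collect every yielded value, then the final return value.
    yielded = []
    status, result = gen_send(gen, None)
    while status == PYGEN_YIELD:
        yielded.append(result)
        status, result = gen_send(gen, None)
    return yielded, result
```

The caller's loop tests a status code rather than handling an exception, which is the shape the C callers in ceval.c and _asynciomodule.c would get from such a function.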
I'm happy to see this moving forward. Not convinced of the "PyIter_Send()" name, though. I don't consider this part of the iterator protocol. It's specific to generators and coroutines. Cython would probably guard its usage by "PyGen_CheckExact()" or "PyCoro_CheckExact()", and not use it for arbitrary iterators. Since coroutines inherit the generator protocol more or less, I think "PyGen_Send()" is a more suitable name, better than "PyCoro_Send()". |
Also, should it be specific to generators/coroutines and accept PyGenObject*, or should it try to handle multiple cases and expose the result for them in a uniform way, i.e.
|
OK, +1. PyGen_Send it is.
I think it should be specific to generators and coroutines. Calling |
To add to my point: typically higher-level APIs go under the |
There are other abstract object APIs: PyNumber, PySequence, PyMapping, etc. In particular, PyIter_Next() works with the iterator protocol; there is no single iterator class. It seems the PyGen_* API is tied to a concrete class, but we can introduce a new namespace for the generator protocol. |
I agree with Serhiy, that
Unless a function takes generator objects and only generator objects, then it shouldn't have a
The API function is the C equivalent of obj.send(val). Coroutines do not inherit from generators. Regardless of how this is implemented internally, any API function's signature should reflect the equivalent Python code. I would suggest: |
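The point that coroutines are not a subclass of generators, yet expose the same send() entry point, can be checked from Python. The snippet below is a small illustration; describe is a hypothetical helper introduced only for this example.

```python
import types


def native_generator():
    yield 1


async def native_coroutine():
    return 1


def describe(obj):
    # Report which native type the object is and whether it has send().
    return {
        "is_generator": isinstance(obj, types.GeneratorType),
        "is_coroutine": isinstance(obj, types.CoroutineType),
        "has_send": hasattr(obj, "send"),
    }


gen = native_generator()
coro = native_coroutine()
gen_info = describe(gen)
coro_info = describe(coro)

# Close both objects so the unstarted coroutine does not trigger a
# "never awaited" warning when it is garbage collected.
gen.close()
coro.close()
```

Both objects answer to send(), but neither type check passes for the other, which is the argument for naming the function after the operation rather than after the generator type.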
I guess |
I like |
so to summarize: Proposed function signature:
For generators/coroutines, the function will delegate to a specialized implementation that does not raise a StopIteration exception. Regardless of the case, the function will not raise StopIteration and will always return a status/result pair. Does that sound correct? |
It does, but given the amount of back and forth on this, I'd wait for Serhiy and Stefan to confirm if they're OK. IMO the |
I would be fine even with a generator-specific API as a first step, for simplicity. But the end goal is to support all generator-like objects. It is much more useful for end users and does not have drawbacks. Enum for result seems not necessary (other functions with three-state result just return -1, 0 or 1), but if other core developers want it -- OK. |
I would also have preferred a more type specific function, but yeah, as long as the types for which the function would normally be used are special cased early enough in the implementation, it makes no big difference. Fine with me, too. |
BTW, just to give a ballpark figure, I remember having measured a performance improvement of up to 70% at some point when switching from "generators always raise a StopIteration at the end" to "generators just return NULL" in Cython. For short-running generators and coroutines, this can make a big difference. |
We introduced _PyObject_LookupAttr() and _PyObject_GetMethod() for similar purposes. And they have similar signatures. Although they all are private. |
Maybe add two API funcs: PyGen_Send (specific to generators & coroutines) and PyIter_Send? |
Sounds like a good middleground to start: add |
Yes, it seems that everybody agreed on that. I can give the PR another review -- is it ready? |
PR 22330 refactors gen_send_ex(), making it similar to the new PyGen_Send(). |
Vladimir, could you please submit a PR to update 3.10/whatsnew? Need to mention both the new C API and the new perf boost in relevant sections. |
Yury,
Why was the PR merged with a new API function
I explicitly said that any new API function should not start with
If you disagree with me, please say why, don't just merge the PR.
The name
Would you revert the PR, please. |
Apologies, Mark. I didn't intend to merge something bypassing your opinion; just missed your comment between reviewing multiple PRs in a few unrelated repos. I'm sorry. On the actual naming subject, you proposed:
The problem with using this name is that ideally we should also support non-native coroutine and generator implementations (i.e. resolve the "send" attribute and call it using Python calling convention). Ideally we should have two C APIs: one low-level supporting only native objects and a high level one, supporting all kinds of them. Can we perhaps add both
Since this is in 3.10/master that nobody uses right now except us (Python core devs), can we just issue a follow up PR to fix whatever is there to fix? I'd like to avoid the churn of reverting, and again, I apologize for pushing this a bit hastily. Let me know if you actually want a revert and I'll do that. |
I agree that we should add PyIter_Send() (the name is debatable, but PyIter_Send LGTM), which should support iterators and objects with the send() method. It would be much more useful for users, and can replace PyGen_Send() in the interpreter core code. But merging the PR with PyGen_Send() was not a mistake. It was good to split the changes into smaller steps that are easier to review. Vladimir, would you mind creating a new PR for PyIter_Send()? I saw its implementation in one of the intermediate versions of your PR. |
Serhiy, AFAIR PyIter_Send in my PR appear only as a rename from placeholder |
No, I meant a function which combines PyGen_Send, tp_iternext and _PyGen_FetchStopIterationValue. Wasn't it in your PR? |
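As a rough Python model of such a combined function (a sketch of the idea discussed here, not the code that was merged): use send() when the object provides it, fall back to plain iteration when it does not and the value is None, and turn StopIteration into a "returned" status. The names iter_send, RunningTotal and the two status constants are assumptions made for illustration.

```python
RETURNED = 0  # The object finished; result is its return value.
YIELDED = 1   # The object produced a value and can be resumed.


def iter_send(obj, arg):
    # Uniform (status, result) interface over generators, coroutines,
    # objects with a send() method, and plain iterators.
    try:
        if arg is None and not hasattr(obj, "send"):
            # Plain iterator: the Python-level analogue of tp_iternext.
            value = next(obj)
        else:
            value = obj.send(arg)
    except StopIteration as stop:
        # Analogue of fetching the StopIteration value.
        return RETURNED, stop.value
    return YIELDED, value


class RunningTotal:
    # Hypothetical non-native object that implements send() by hand.
    def __init__(self, limit):
        self.total = 0
        self.limit = limit

    def send(self, value):
        self.total += value
        if self.total >= self.limit:
            raise StopIteration(self.total)
        return self.total


def echo_once():
    received = yield "ready"
    return received
```

A caller written against iter_send does not need to know which of the three kinds of object it was handed, which is the usability argument made for the higher-level function in this thread.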
No, I don't think so but I can definitely make one. A few questions first:
|
Vladimir, Don't forget to remove PyGen_Send() :) |
That's the function that allows sending data into a generator. It's also used internally by "PyIter_Send()". Are you really suggesting to remove it, or to make it underscore-private again? (As it was before.) |
What's the difference between removing it and making it properly private (i.e. static)? It's an irrelevant detail whether the code is inlined or in a helper function. |
With the latest PR now merged this issue can be closed. Please reopen if there are any other action items left. |
A few things I forgot about. The new C API function should be exported in PC/python3dll.c. Also, in Include/abstract.h it should only be available for limited C API >= 3.10:
#if !defined(Py_LIMITED_API) || Py_LIMITED_API+0 >= 0x030A0000
...
#endif
We should now discuss the behavior of PyIter_Send() if value is NULL. It is not documented, the current behavior differs from the behavior for a non-NULL value, and it is used in the ceval loop. We should document this case explicitly and maybe change the behavior if that would be more appropriate. |
Vladimir, could you please submit a PR?
IMO: I'd keep the behavior and just document it. |
Can we close the issue? |
Hello! Adding new values to enums might break the stable ABI in some (admittedly rare) cases; so usually it's better to avoid enums and stick with int and #ifdef'd values. See pbo-44727 for details. |
that is, bpo-44727 |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.