I've been looking at the code for fastcall and vectorcall functions, and one issue seems to be that for a function like:
def f(a, *args, **kwds):
    # something...
the wrapper function immediately generates a tuple for args and a dict for kwds, which presumably misses most of the benefits of the more efficient calling method.
What I was considering is creating some types cython.fastcalltuple and cython.fastcalldict that args and kwds could be typed with. These would ultimately reduce to a basic, non-refcounted structure, and so could be used directly and more efficiently. However, they would only support very limited operations.
For example cython.fastcalltuple could support:
integer indexing
very simple slicing (integer start and end, positive step)
unpacking assignment into multiple values (e.g. a, b, c = args)
Explicit conversion to a Python sequence (e.g. tuple(args))
Maybe calling other functions with them under very limited circumstances (i.e. when they can be fed directly into a fastcall/vectorcall)
Anything else would fail at compile-time, so it couldn't auto-coerce to a Python object. cython.fastcalldict would be similarly restricted (although I've given it less thought).
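To make the proposed restrictions concrete, here is a rough pure-Python model of the operations listed above. It is only a sketch: `FastCallTupleModel` is a hypothetical name, it is not a real Cython type, and a run-time `TypeError` stands in for what would be a compile-time error in the proposal. Iteration is allowed because unpacking and `tuple(args)` both need it.

```python
class FastCallTupleModel:
    """Hypothetical model of the restricted operations proposed for
    cython.fastcalltuple. Illustration only, not a Cython API."""

    def __init__(self, *items):
        # The real proposal would wrap a non-refcounted C structure;
        # a plain tuple is used here purely for modelling.
        self._items = items

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        # Needed for `a, b, c = args` and for `tuple(args)`.
        return iter(self._items)

    def __getitem__(self, index):
        if isinstance(index, int):
            return self._items[index]  # integer indexing
        if isinstance(index, slice):
            for bound in (index.start, index.stop):
                if bound is not None and not isinstance(bound, int):
                    raise TypeError("slice bounds must be integers")
            step = index.step
            if step is not None and (not isinstance(step, int) or step <= 0):
                raise TypeError("slice step must be a positive integer")
            return FastCallTupleModel(*self._items[index])  # simple slicing
        raise TypeError("only integer indexing and simple slicing are supported")


args = FastCallTupleModel(1, 2, 3)
first = args[0]          # integer indexing
rest = args[1:]          # simple slicing gives another restricted view
a, b, c = args           # unpacking assignment
as_tuple = tuple(args)   # explicit conversion to a Python sequence
```

Operations outside the list, such as `args[::-1]` or `args + (4,)`, raise `TypeError` in this model.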
The idea is that for a lot of functions these operations are all that's ever done to the star/starstar args, so this provides a fairly easy user-selectable optimization. (A future step might be to apply the optimization automatically when possible, but that's obviously a bit harder.)
It's possible that I've misunderstood something and this wouldn't provide real benefits, hence I'm creating an issue in advance of doing anything so that I can be told why I'm wrong.
Good idea. It wouldn't have to fail at compile time, though. Instead, we could infer these types for the two arguments automatically and then have them auto-coerce to tuple and/or dict as needed (each one separately) when we detect unsafe usages. We already do this for potentially overflowing arithmetic, which turns inferred C integer variables into Python object variables.
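The inference idea could be sketched roughly as a usage check: assume the fast type, and fall back to a real tuple as soon as one use is not on the safe list. The sketch below uses Python's standard `ast` module as a stand-in for Cython's own tree and type inference, so it shows the shape of the check rather than how Cython would implement it. The function name `star_args_stay_fast` is hypothetical, and the check is deliberately simplified (it does not validate slice bounds or handle closures).

```python
import ast


def star_args_stay_fast(source):
    """Return True if every use of *args in the first function of `source`
    is indexing/slicing, tuple unpacking, or tuple(args).

    False means the argument would have to be coerced to a real tuple.
    """
    func = ast.parse(source).body[0]
    if not isinstance(func, ast.FunctionDef) or func.args.vararg is None:
        raise ValueError("expected a function definition with a *args parameter")
    name = func.args.vararg.arg

    def is_args(node):
        return isinstance(node, ast.Name) and node.id == name

    safe_uses = 0
    for node in ast.walk(func):
        if (isinstance(node, ast.Subscript) and is_args(node.value)
                and isinstance(node.ctx, ast.Load)):
            safe_uses += 1  # args[i] or args[i:j], read-only
        elif (isinstance(node, ast.Assign) and is_args(node.value)
                and all(isinstance(t, ast.Tuple) for t in node.targets)):
            safe_uses += 1  # a, b, c = args
        elif (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id == "tuple" and not node.keywords
                and len(node.args) == 1 and is_args(node.args[0])):
            safe_uses += 1  # tuple(args)

    # Every reference to the name must be accounted for by a safe pattern.
    total_uses = sum(1 for node in ast.walk(func) if is_args(node))
    return safe_uses == total_uses


safe_source = (
    "def f(a, *args, **kwds):\n"
    "    x = args[0]\n"
    "    b, c = args[1:3]\n"
    "    return tuple(args)\n"
)
unsafe_source = (
    "def g(*args):\n"
    "    return len(args)\n"
)

print(star_args_stay_fast(safe_source))    # True
print(star_args_stay_fast(unsafe_source))  # False
```

In this model `len(args)` counts as unsafe only because it is not on the list above; a real implementation could presumably support it cheaply.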