-
-
Notifications
You must be signed in to change notification settings - Fork 31.7k
Improve the C code for calling Python code: _PyEval_EvalCode() #87156
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Currently, to make a call to Python (modules, classes, etc, not just functions) from C has to use the monster that is
In outline _PyEval_EvalCode performs the following actions:
As _PyEval_EvalCode or its earlier equivalents have gained arguments, they have a left of list of legacy functions. It is not clear what is the "correct" function to use in C extensions. Hopefully extension code uses the To improve this situation, I propose: A new C struct, the "frame descriptor" which contains the code object, defaults, globals, and names that describe the code to be executed. Split the Eval functions into vector and tuple/dict forms, both forms taking a "frame descriptor", as well as the arguments. Mark the older forms of the Eval functions as "legacy", creating a local "frame descriptor" and transforming the arguments in vector or tuple/dict forms, in the legacy functions. Factor out the frame creation part of the Eval functions. The above changes will be necessary for PEP-651, but I think would be a worthwhile improvement even if PEP-651 is not accepted. |
PyObject * Hum no, only 16 arguments, everything is fine! :-D More seriously, I already considered to rework this code. The pseudo code is: if (...) return <new generator>; I like the idea of splitting these two parts: f = create_frame_or_gen(...);
if (<is generator>) return f;
retval = _PyEval_EvalFrame(tstate, f, 0);
_PyObject_GC_TRACK(f);
return retval; I see one advantage: stack memory consumation, we don't have to hold the 16 arguments on the stack, only 3 parameters (tstate, f, 0).
Hum, it seems like you have a different idea how to refactor the code. Would it be worth it to have more specialized create_frame_or_gen() functions for the common cases? -- I would also be interested by the ability to not create a frame at all, but I guess that it's a way larger refactoring. |
Rather than: f = create_frame_or_gen(...);
if (<is generator>) return f;
retval = _PyEval_EvalFrame(tstate, f, 0);
_PyObject_GC_TRACK(f);
return retval; I was thinking: f = create_frame(...);
if (<is generator>) return make_gen(f);
retval = _PyEval_EvalFrame(tstate, f, 0);
_PyObject_GC_TRACK(f);
return retval; The complicated part is create_frame(...), so I want to clean that up first. |
+1 on exposing f,builtins. Of course, the thing I'd really want is a way to state that all references to builtins are meant to have the exact semantics of those builtins, so the compiler can translate e.g. len(x) into a new opcode that just calls PyObject_Size(). (I can dream, can't I?) Another dream: assume that globals that refer to modules, classes or functions don't change, so they can be cached more aggressively. I suppose enough checking of dict version tags can get us there, or at least close enough. |
This change is related to bpo-43228 "Regression in function builtins": cloudpickle is broken by this issue (new PyFunctionObject.func_builtins member).
I tried to implement such optimization in my old https://faster-cpython.readthedocs.io/fat_python.html project. I implemented guards to de-optimize the code if a builtin is overriden.
The dict version comes from my FAT Python project: PEP-509. It is used to optimize method calls. See also PEP-510 and PEP-511 ;-) |
Well maybe I'll be picking up those ideas again... Thanks for documenting |
FWIW the globals opcode cache handles all of this now. There's no point in specifically optimizing the builtins lookup since we optimize all global lookups for a code object that's hot enough. |
So you think that even a dedicated "LEN" opcode would not be any faster? (This is getting in Paul Sokolovsky territory -- IIRC he has a variant of Python that doesn't allow overriding builtins.) |
I reopen the issue since there is bpo-43228 regression, caused by this issue, which is still under discussion, and Mark also proposed to add a new builtins parameter to the function constructor (FunctionType). I wrote PR 24564 to help fixing bpo-43228 regression: with this change, functions now inherit the current builtins if the globals namespace is overriden, but the new globals has no "__builtins__" key. This change is backward incompatible on purpose. If someone really wants to run a function in a different builtins namespace, globals['__builtins__'] must be set explicitly. Once PR 24564 will be merged, I plan to write a 3rd PR to add an optional builtins keyword-only parameter to the function constructor (FunctionType). |
Yeah, a dedicated LEN opcode could only be faster if it would not be possible to shadow builtins (or if there was a "len" operator in Python). If that's not the case, this hypothetical LEN opcode would still have to check if "len" was shadowed or not, and that's slower than the optimized LOAD_GLOBAL we have now. |
Victor
You say it's on purpose, what's the purpose? Aren't you worried this is going to break stuff? And why is this necessary given the LOAD_GLOBAL cache? Yury
It could use the same check though? Just check the version tags. |
There is a subtle behavior difference between Python 3.9 and Python 3.10. func_builtins2.py of bpo-43228 works on Python 3.9 but fails on Python 3.10. With my PR 24564, func_builtins2.py works again on Python 3.10. See bpo-43228 for the details.
My PR 24564 is not related to LOAD_GLOBAL, but how a frame fills its f_builtins member from a function. LOAD_GLOBAL uses f_globals and f_builtins members of a frame. |
I rephrased PR 24564 to clarify the scope of the incompatible change: in practice, only the types.FunctionType constructor changes. Defining functions in Python using "def function(...): ...", eval(code, {}) and exec(code, {}) are not affected. eval() and exec() already inherit the current builtins if globals does not contain "__builtins__" key. |
Updated PR documentation: Defining a function with "def function(...): ..." in Python is not This PR makes FunctionType makes more consistent with other Python functions. Also, it doesn't prevent people to attempt building a "sandbox", it remains possible to override __builtins__ in FunctionType, eval(), exec(), etc. Usally, such sandbox pass a modified builtins namespace to eval() and exec() and the functions simply inherit it, functions defines with "def function(...): ..." and functions created with types.FunctionType constructor: my PR only impacts a very specific case, when types.FunctionType is called with a different globals dictionary which has no "__builtins__" key. |
Thanks, that's clearer. I'm still worried about the change in semantics where globals["__builtins__"] is assigned a different dict after the function object has been created (similar to https://bugs.python.org/file49816/func_builtins2.py). I.e. def foo(): return len("abc")
code = foo.__code__
g = {"__builtins__": {"len": len}}
f = FunctionType(code, g)
f() # Succeeds
g["__builtins__"] = {}
f() # Fails in 3.9 and before, passes in 3.10 Assuming code uses len, does f() succeed or fail? I realize this is a pretty esoteric, but it does show the change in semantics (from later to earlier binding). Should we care? I like early binding because it allows more optimizations[1], but traditionally Python's semantics use late binding. [1] Not in this case, the user could still change the meaning of len() with e.g. g["__builtins__"]["len"] = lambda x: return 42 |
In Python 3.9 the binding is more late-ish binding, than true late binding. Because globals['__builtins__'] is cached for each function activation, executing functions don't see updates. Example: >>> def f():
... print(len("test"))
... bltns = f.__globals__["__builtins__"]
... if hasattr(bltns, "__dict__"):
... bltns = bltns.__dict__
... new = bltns.copy()
... new["len"] = lambda x : 7
... f.__globals__["__builtins__"] = new
... print(len("test"))
...
>>>
>>> f()
4
4
>>> f()
7
7 True late binding would print: >>> f()
4
7
>>> f()
7
7 |
Guido: "I'm still worried about the change in semantics where globals["__builtins__"] is assigned a different dict after the function object has been created (...)" Well, there is a semantics change of Python 3.10 documented at: "Functions have a new __builtins__ attribute which is used to look for builtin symbols when a function is executed, instead of looking into __globals__['__builtins__']. (Contributed by Mark Shannon in bpo-42990.)" And the function __builtins__ attribute is read-only. Your example is not affected by PR 24564 because the globals has the "__builtins__" key. In Python 3.10, you can modify func.__builtins__ (new attribute): def foo(s): return len(s)
code = foo.__code__
FunctionType = type(foo)
f = FunctionType(code, {"__builtins__": {"len": len}})
print(f("abc"))
f.__builtins__.clear()
print(f("abc")) Output: 3
Traceback (most recent call last):
(...)
NameError: name 'len' is not defined Mark: "Because globals['__builtins__'] is cached for each function activation, executing functions don't see updates." In Python 3.10, if someone wants to hack builtins while the function is running, modifying the builtins namespace in-place works as expected: def f():
print(len("test"))
builtins_ns = f.__globals__['__builtins__'].__dict__
#builtins_ns = f.__builtins__
builtins_ns['len'] = lambda x: 7
print(len("test"))
f() Output: It also works with "builtins_ns = f.__builtins__". Guido: "I realize this is a pretty esoteric, but it does show the change in semantics (from later to earlier binding). Should we care? I like early binding because it allows more optimizations[1], but traditionally Python's semantics use late binding." Modifying built-in functions/types is commonly done in tests. Example: import unittest.mock
def func():
with unittest.mock.patch('builtins.chr', return_value='mock'):
return chr(65)
print(func()) The expected output is: "mock". Overriding an attribute of the builtins module immediately updates func.__builtins__. It works because func.__builtins__ is builtins.__dict__. In FAT Python, I implemented an optimization which copies builtin functions to constants, replace LOAD_GLOBAL with LOAD_CONST: This optimization breaks this Python semantics, it is no longer possible to override builtin functions in tests: |
Thanks for the clarifications, Victor! So I now understand: the *identity* of the builtins dict used by a function's code is captured when the function object is created. And I understand why: it's so that in the fast path for the LOAD_GLOBAL opcode we won't have to do a dict lookup in globals to find the builtins. Regarding the text in What's New 3.10 about this at https://docs.python.org/dev/whatsnew/3.10.html#other-language-changes, I recommend adding there that func.__builtins__ is initialized from globals["__builtins__"], if it exists, else from the frame's builtins, when the function object is created; like you state in #24564. Or perhaps make one of these paragraphs refer to the other for details, since they are duplicate mentions of the same behavior change (once the latter PR lands). Also, thanks to you and Mark and everyone else who has worked on optimizations like the globals cache and the method cache. |
Good idea, I updated my PR.
IMO it's useful to have two different paragraphs. One about the new attribute which is not really a semantics change, and one about the semantics changes when globals["__builtins__"] doesn't exist. For people who only care about Python 3.10 incompatible changes, they can simply read the Porting to Python 3.10 > Changes in the Python API section ;-) |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: