-
-
Notifications
You must be signed in to change notification settings - Fork 31.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
"Zero cost" exception handling #84403
Comments
C++ and Java support what is known as "zero cost" exception handling. The basic principle is that the compiler generates tables indicating where control should be transferred to when an exception is raised. When no exception is raised, there is no runtime overhead. (C)Python should support "zero cost" exceptions. Now that the bytecodes for exception handling are regular (meaning that their stack effect can be statically determined) it is possible for the bytecode compiler to emit exception handling tables. Doing so would have two main benefits.
|
+1! I was going to implement this, but first I wanted to implement support of line number ranges instead of just line numbers (co_lineno). We need to design some compact portable format for address to address mapping (or address range to address mapping if it is more efficient). Are you already working on this Mark? I would be glad to make a review. |
This is an exciting prospect. Am looking forward to it :-) |
+1 |
To clarify, would there be any observable difference in behavior aside from speed? And would there be any limitations in when the speedup can be applied? |
The only observable changes will be changes in the code object: new attributes and constructor parameters, changed .pyc format, dis output, etc. |
The changes to pyc format aren't user visible so shouldn't matter, Consider this program: def f():
try:
1/0
except:
return "fail" Currently it compiles to: 3 2 LOAD_CONST 1 (1) 4 >> 16 POP_TOP 5 22 POP_EXCEPT With zero-cost exception handling, it will compile to something like: None 14 PUSH_EXCEPT 4 16 POP_TOP 5 22 POP_EXCEPT (There are additional optimizations that should be applied, but those are a separate issue) The problem is that the exception handling flow is no longer visible. |
We can add a new column for the offset or the index of the error handler. Or add pseudo-instructions (which do not correspond to any bytecode) at boundaries of the code with some error handler. |
I like Serhiy’s idea. BTW, what are the three POP_TOP op codes in a row popping? |
When exceptions are pushed to the stack, they are pushed as a triple: (exc, type, traceback) |
Responding to Serhiy's suggestions: |
I've played around with a few formats, and what I've ended up with is this:
This has all the information, without too much visual clutter. The function `f` above looks like this:
>>> dis.dis(f)
2 0 NOP 3 2 LOAD_CONST 1 (1) 4 20 POP_TOP 5 26 NOP The 'lasti' field indicates that the offset of the last instruction is pushed to the stack, which is needed for cleanup-then-reraise code. |
This seems to have broken the address sanitizer buildbot: https://buildbot.python.org/all/#/builders/582/builds/116/steps/5/logs/stdio Example error: ================================================================ |
To reproduce with a modern gcc: % export ASAN_OPTIONS=detect_leaks=0:allocator_may_return_null=1:handle_segv=0 |
Seconded, also seeing the same ASAN failure on the fuzzers with a blame for this commit. |
I tried some debugging code: diff --git a/Python/ceval.c b/Python/ceval.c
index f745067069..a8668dbac2 100644
--- a/Python/ceval.c
+++ b/Python/ceval.c
@@ -4864,6 +4864,18 @@ get_exception_handler(PyCodeObject *code, int index)
return res;
}
scan = skip_to_next_entry(scan);
+ if (scan
+ >= (unsigned char *)PyBytes_AS_STRING(code->co_exceptiontable)
+ + PyBytes_GET_SIZE(code->co_exceptiontable))
+ {
+ printf("co_name: --------------------------\n");
+ _PyObject_Dump(code->co_name);
+ printf("co_filename: ----------------------\n");
+ _PyObject_Dump(code->co_filename);
+ printf("co_exceptiontable: -------------\n");
+ _PyObject_Dump(code->co_exceptiontable);
+ printf("\n\n\n\n\n");
+ }
}
res.b_handler = -1;
return res; It output this: Python 3.11.0a0 (heads/main-dirty:092f9ddb5e, May 9 2021, 18:45:56) [MSC v.1927 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from test.test_statistics import *
co_name: object address : 00000254B63EFB80 >>> unittest.main()
............................................................................................................................................................................................................................................................................................................................................................................ Ran 364 tests in 24.409s OK Here is the disassembly of the offending function: >>> from dis import dis
>>> from importlib._bootstrap import _find_and_load
>>> dis(_find_and_load)
1024 0 LOAD_GLOBAL 0 (_ModuleLockManager)
2 LOAD_FAST 0 (name)
4 CALL_FUNCTION 1
6 BEFORE_WITH
8 POP_TOP 1025 10 LOAD_GLOBAL 1 (sys) 1026 24 LOAD_FAST 2 (module) 1027 32 LOAD_GLOBAL 5 (find_and_load_unlocked) 1024 40 ROT_TWO 1027 52 RETURN_VALUE 1026 >> 54 NOP 1024 56 LOAD_CONST 1 (None) 1029 >> 90 LOAD_FAST 2 (module) 1030 98 LOAD_CONST 2 ('import of {} halted; None in sys.modules') 1031 100 LOAD_METHOD 6 (format) 1030 106 STORE_FAST 3 (message) 1032 108 LOAD_GLOBAL 7 (ModuleNotFoundError) 1034 >> 120 LOAD_GLOBAL 8 (_lock_unlock_module) 1035 128 LOAD_FAST 2 (module) I don't know whether there just needs to be a sentinel 128 appended to all co_exceptiontable, or if there is a more subtle bug. |
Thanks everyone for the triaging and fixing. |
It seems that we have broken the stable ABI of PyCode_NewWithPosOnlyArgs as it has now another parameter. We need to either create a new private constructor or a new public constructor, but the ABI cannot change. |
I know PyCode_NewWithPosOnlyArgs is declared as "PyAPI_FUNC" but that can't make it part of the ABI unless it has stable behavior. Passing the same arguments to PyCode_NewWithPosOnlyArgs for both 3.9 and 3.10 will cause one or other version to crash (interpreter crash, not just program crash). We need to stop adding "PyAPI_FUNC" to everything. The only sane ways to construct a code object are to load it from disk, to compile an AST, or to use I can revert the API changes and add a new function, but I think that is dangerously misleading. A compilation error is preferable to an interpreter crash. |
I agree with you but we already went through this when I added positional-only arguments and everyone complained that Cython and other projects were broken and we changed a stable API function so I am just mentioning that we are here again. Honestly, I think what you describe makes sense and this constructor should never be stable (as the Python one is not stable). If we add column offsets that's another parameter more than we would need to add. But in any case that's just my opinion and we should reach some conclusion collectively, that's why I mentioned this here. |
In any case, if we decide to let it stay, at the very least this behaviour (that these functions are not stable) needs to be documented everywhere: what's new, C-API docs...etc And probably we need to somehow add it to the future deprecations of 3.10 for visibility. As I mentioned, I simphatise with your argument and I think it makes sense, but we cannot just do it in a BPO issue, I'm afraid. |
PyCode_NewWithPosOnlyArgs is not part of the stable ABI. It is OK to break its ABI in a minor version (i.e. 3.11). The PyAPI_FUNC makes it part of the public *API*. It needs to be source- compatible; the number of arguments can't change. Could yo u add a new function? I wouldn't remove PyCode_NewWithPosOnlyArgs from the public C API, which can be CPython-specific and used by projects like Cython that need some low-level access for performance. But PEP-387 applies, so if it is deprecated in 3.11, it can be removed in 3.13. |
Unfortunately, no: new functions cannot be added easily because the new field that receives is needed and is a complicated field created by the compiler. The old API is not enough anymore and making a compatibility layer is a huge complexity. |
Then, according to PEP-387, "The steering council may grant exceptions to this policy." I think API breaks like this do need coordination at the project level. |
Small note, as I mentioned in my correction email (https://mail.python.org/archives/list/python-dev@python.org/message/GBD57CUUU4K5NMQDTEZXNCX76YISEIGQ/), this is a release blocker for 3.11 (it was not marked in the issue what Python version was associated, I am doing it with this message) so this doesn't block the 3.10 release. |
Ah, okay. So we're not on the hook to decide this today. :-) |
I'd like to close this, as the exception handling is all done and working correctly. Is there a separate issue for how we are handling CodeType()? |
No, that's why this is marked as release blocker, because this is the first issue where CodeType was changed. If you wish to close this one, please open a new issue explaining the situation and mark that one as release blocker. |
I propose we declare all APIs for code objects *unstable*, liable to change each (feature) release. I want to get rid of PyCode_NewWithPosArgs() and just have PyCode_New(). All callers to either one must be changed anyways. (And we’re not done changing this in 3.11 either.) |
That as added because of PEP-387 and unfortunately removing it is backwards incompatible.
I agree that we should do this, but this needs at least a discussion in python-dev because currently these APIs are protected by PEP-387 so changing them is backwards incompatible |
Is changing the signature allowed? Because it *must* be changed (at the very least to accommodate the exceptiontable, but there are several others too -- your PEP-657 touched it last to add endlinetable and columntable). I think this was a mistake in PEP-387 and we just need to retract that. Perhaps it could be left as a dummy that always returns an error?
Yeah that's the crux. :-( |
I've started a thread on python-dev. |
Can this be closed? |
Yes. |
See bpo-47185: code.replace(co_code=new_code) no longer catch exceptions on Python 3.11. |
Surely the bigger issue is that the contents of new_code itself must be totally different? Also there are other tables that need to be adjusted if you really do change co_code, e.g. the debugging tables. |
I created bpo-47236 "Document types.CodeType.replace() changes about co_exceptiontable". |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: