Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test_sys_settrace -R 3:3 does crash #109052

Closed
vstinner opened this issue Sep 7, 2023 · 7 comments
Closed

test_sys_settrace -R 3:3 does crash #109052

vstinner opened this issue Sep 7, 2023 · 7 comments

Comments

@vstinner
Copy link
Member

vstinner commented Sep 7, 2023

$ ./python -m test test_sys_settrace -R 3:3 -m test_line_event_raises_before_opcode_event -m test_no_jump_backwards_into_for_block 
0:00:00 load avg: 0.77 Run tests sequentially
0:00:00 load avg: 0.77 [1/1] test_sys_settrace
beginning 6 repetitions
123456
.Fatal Python error: Segmentation fault

Current thread 0x00007f8a40b15740 (most recent call first):
  File "/home/vstinner/python/main/Lib/test/test_sys_settrace.py", line 1883 in trace
  File "/home/vstinner/python/main/Lib/test/test_sys_settrace.py", line 2433 in test_no_jump_backwards_into_for_block
(...)

Regression introduced by: commit 6f3c138

commit 6f3c138dfa868b32d3288898923bbfa388f2fa5d
Author: Serhiy Storchaka <storchaka@gmail.com>
Date:   Wed Sep 6 23:55:42 2023 +0300

    gh-108751: Add copy.replace() function (GH-108752)
    
    It creates a modified copy of an object by calling the object's
    __replace__() method.
    
    It is a generalization of dataclasses.replace(), named tuple's _replace()
    method and replace() methods in various classes, and supports all these
    stdlib classes.

Linked PRs

@vstinner
Copy link
Member Author

vstinner commented Sep 7, 2023

Reverting the codeobject.c is enough to workaround the regression:

diff --git a/Objects/codeobject.c b/Objects/codeobject.c
index 70a0c2ebd6..58306075ca 100644
--- a/Objects/codeobject.c
+++ b/Objects/codeobject.c
@@ -2145,6 +2145,7 @@ static struct PyMethodDef code_methods[] = {
     {"co_positions", (PyCFunction)code_positionsiterator, METH_NOARGS},
     CODE_REPLACE_METHODDEF
     CODE__VARNAME_FROM_OPARG_METHODDEF
+    {"__replace__", _PyCFunction_CAST(code_replace), METH_FASTCALL|METH_KEYWORDS},
     {NULL, NULL}                /* sentinel */
 };
 

With this revert:

$ make && ./python -m test test_sys_settrace -R 3:3
(...)
Total duration: 18.8 sec
Total tests: run=439 skipped=1
Total test files: run=1/1
Result: SUCCESS

cc @serhiy-storchaka

@serhiy-storchaka
Copy link
Member

I do not understand how adding an alias of an existing method, which is not even used here, can cause a crash.

cc @gaogaotiantian

@gaogaotiantian
Copy link
Member

I'll take a look at it tomorrow if Mark did not come up with a quick reasoning/solution. Confirmed that the fix for #108976 does not help.

@gaogaotiantian
Copy link
Member

The ultimate issue for this lives in code_richcompare where it compares the opcodes one by one of the code objects. However, when the code object is instrumented, the opcodes would be INSTRUMENTED_LINE or INSTRUMENTED_INSTRUCTION, so the comparasion would be invalid. Moreover, the opcode to be compared should be the main opcode (not the cache entries), which relies on a correct calculation of cache entry number of the current opcode. As the opcode becomes INSTRUMENTED_INSTRUCTION, that number to skip the cache is wrong too, which caused the SegFault.

The crash requires a specific pattern, in this case, a cache entry with the opcode field that happened to have ENTER_EXECUTOR. Obviously no executors would be there and there's a NULL dereference. So the bisected commit did not introduce the issue, it might just be the trigger for this crash. The issue was there for a long time.

There's an interesting test to run:

import sys

code1 = compile("x is x", "example.py", "eval")
code2 = compile("x in x", "example.py", "eval")

sys._getframe().f_trace_opcodes = True
sys.settrace(lambda *args: None)
exec(code1, {"x": []})
exec(code2, {"x": []})
print(code1 == code2)
sys.settrace(None)

You'll find that code1 == code2, even though they are obviously not the same thing. Because the instructions are instrumented and are considered the same.

For now, I directly stripped the instrumentation for the opcode, which resolves the crashing issue and the case above (regression test added in test_code). However, this implementation means we do not care whether the code is instrumented or not. So for two "same" code objects, if one of them is instrumented and the other is not, we still consider them "the same".

I don't know the correct answer to this, are they the same? We can easily add a check for the instrumented instructions, it's not difficult, but we probably need to determine the "correct answer" for that first.

@brandtbucher

@vstinner
Copy link
Member Author

vstinner commented Sep 8, 2023

Is it possible to compare the original bytecode on code1==code2, rather than the instrumented bytecode?

I think @markshannon proposed an alternative, remove code comparison operator. I recall that i worked on that operator for a strange statement building two code objects on the same line. The regression test should be somewhere in test_compile, I suppose.

@corona10
Copy link
Member

corona10 commented Sep 8, 2023

Is it possible to compare the original bytecode on code1==code2, rather than the instrumented bytecode?

It's possible.
see : #108188
The approach of #109107 looks good to me.

@vstinner
Copy link
Member Author

corona10 closed this as completed

I confirm that the bug is fixed:

$ ./python -m test test_sys_settrace -R 3:3
(...)
Result: SUCCESS

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants