-
-
Notifications
You must be signed in to change notification settings - Fork 32.2k
gh-135700: Fix instructions in __annotate__ have incorrect code positions #135814
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Misc/NEWS.d/next/Core_and_Builtins/2025-06-22-10-47-27.gh-issue-135700.0qdtCl.rst
Outdated
Show resolved
Hide resolved
…e-135700.0qdtCl.rst Co-authored-by: sobolevn <mail@sobolevn.me>
Good catch, thanks! Would you be able to add a unit test? Also test a class with annotations so we can see how it works in a class-level |
9045ebe
to
aed06b7
Compare
Lib/test/test_dis.py
Outdated
|
||
for case_name, annotate_code in test_cases: | ||
with self.subTest(case=case_name): | ||
instructions = list(dis.Bytecode(annotate_code)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use dis.findlinestarts(annotate_code)
instead.
Co-authored-by: Irit Katriel <1055913+iritkatriel@users.noreply.github.com>
Co-authored-by: Irit Katriel <1055913+iritkatriel@users.noreply.github.com>
), | ||
] | ||
|
||
for case_name, annotate_code in test_cases: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert here that annotate_code
is a code object and that annotate_code.co_name
is '__annotate__'
.
for instr in dis.get_instructions(annotate_code) | ||
if setup_scope_begin <= instr.offset < setup_scope_end | ||
and instr.positions | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this need to be so complicated? Don't we just need to check that all the instructions in this code object have the same line number (excluding Nones)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the important thing is to make sure the instructions in setup_annotations_scope
are consistent with the end_col_offset
of RESUME, not just the lineno
no fix:
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) RESUME
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=2) LOAD_FAST_BORROW # incorrect
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=2) LOAD_SMALL_INT # incorrect
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=2) COMPARE_OP # incorrect
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=2) POP_JUMP_IF_FALSE # incorrect
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=2) NOT_TAKEN # incorrect
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=2) LOAD_COMMON_CONSTANT # incorrect
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=2) RAISE_VARARGS # incorrect
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=2) BUILD_MAP # incorrect
fix:
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) RESUME
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) LOAD_FAST_BORROW
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) LOAD_SMALL_INT
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) COMPARE_OP
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) POP_JUMP_IF_FALSE
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) NOT_TAKEN
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) LOAD_COMMON_CONSTANT
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) RAISE_VARARGS
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) BUILD_MAP
And not all annotate lines are the same, because some opcodes need to indicate the "code" where the annotation occurs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so we need to check they all have the same end_col_offset as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And not all annotate lines are the same, because some opcodes need to indicate the "code" where the annotation occurs.
The test doesn't need to work for all inputs. Only the two inputs it runs on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And not all annotate lines are the same, because some opcodes need to indicate the "code" where the annotation occurs.
The test doesn't need to work for all inputs. Only the two inputs it runs on.
Just to be clear, instructions within code object can correspond to different line numbers. So I made an offset restriction here, limiting the check to between RESUME and BUILD_MAP (setup_annotations_scope)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe @JelleZijlstra has an idea how to write a meaningful test for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have much of an intuition on how line numbers should work for synthetic code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@15r10nk are you able to suggest a meaningful unit test for this? Alternatively, can you confirm that the PR resolved the issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can verify that this fixes the issue which I have reported 👍 . I also found out that the reported bug affected debugging with pdb.
example.py
import example_mod
print(example_mod.__annotations__)
example_mod.py
22
class ClassVar():
pass
__dataclass_fields__: ClassVar
./python -m pdb example.py
> /home/orbit/projects/cpython/example.py(1)<module>()
-> import example_mod
(Pdb) b example_mod.py:2
Breakpoint 1 at /home/orbit/projects/cpython/example_mod.py:2
(Pdb) c
> /home/orbit/projects/cpython/example_mod.py(2)<module>() # first hit which is expected
-> 22
(Pdb) c
> /home/orbit/projects/cpython/example_mod.py(2)__annotate__() # pdb should not break here
-> 22
(Pdb) l
1
2 B-> 22
3
4 class ClassVar():
5 pass
6
7 __dataclass_fields__: ClassVar
8
[EOF]
(Pdb) bt
/home/orbit/projects/cpython/Lib/bdb.py(899)run()
-> exec(cmd, globals, locals)
<string>(1)<module>()
/home/orbit/projects/cpython/example.py(4)<module>()
-> print(example_mod.__annotations__)
> /home/orbit/projects/cpython/example_mod.py(2)__annotate__()
-> 22
(Pdb)
The breakpoint is hit a second time when the synthetic code is executed which uses the incorrect source positions.
I remember that I found multiple different problems with similar synthetic codes like this in the past (super()
,except
related and maybe more) but I was always able to work around this issues and I did not know that they can also lead to problems when you debug with pdb. I will write a tool to find these problems in the next days.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thank you!
The pdb example can probably be used for a unit test (in test_pdb). I'd remove the test that is currently in this PR.
output