Skip to content

gh-135700: Fix instructions in __annotate__ have incorrect code positions #135814

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 212 commits into from

Conversation

AndPuQing
Copy link
Contributor

@AndPuQing AndPuQing commented Jun 22, 2025

from dis import Bytecode, dis

src = """\
22
333
__dataclass_fields__: ClassVar
"""

code = compile(src, "<string>", "exec")
print(code.co_consts)
print(dis(src))
annotate_code = code.co_consts[1]

for bc in Bytecode(annotate_code):
    print(bc.positions, bc.opname)

output

(0, <code object __annotate__ at 0x7b19f4095f20, file "<string>", line 1>, None)
  --           MAKE_CELL                0 (__conditional_annotations__)

   0           RESUME                   0

   1           LOAD_CONST               1 (<code object __annotate__ at 0x7b19f4096c30, file "<dis>", line 1>)
               MAKE_FUNCTION
               STORE_NAME               1 (__annotate__)
               BUILD_SET                0
               STORE_NAME               0 (__conditional_annotations__)

   2           NOP

   3           LOAD_NAME                0 (__conditional_annotations__)
               LOAD_SMALL_INT           0
               SET_ADD                  1
               POP_TOP
               LOAD_CONST               2 (None)
               RETURN_VALUE

Disassembly of <code object __annotate__ at 0x7b19f4096c30, file "<dis>", line 1>:
  1           RESUME                   0
              LOAD_FAST_BORROW         0 (format)
              LOAD_SMALL_INT           2
              COMPARE_OP             132 (>)
              POP_JUMP_IF_FALSE        3 (to L1)
              NOT_TAKEN
              LOAD_COMMON_CONSTANT     1 (NotImplementedError)
              RAISE_VARARGS            1
      L1:     BUILD_MAP                0

  3           LOAD_SMALL_INT           0
              LOAD_GLOBAL              0 (__conditional_annotations__)
              CONTAINS_OP              0 (in)
              POP_JUMP_IF_FALSE       10 (to L2)
              NOT_TAKEN
              LOAD_GLOBAL              2 (ClassVar)
              COPY                     2
              LOAD_CONST               1 ('__dataclass_fields__')

  1           STORE_SUBSCR
      L2:     RETURN_VALUE
None
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) RESUME
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) LOAD_FAST_BORROW
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) LOAD_SMALL_INT
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) COMPARE_OP
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) POP_JUMP_IF_FALSE
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) NOT_TAKEN
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) LOAD_COMMON_CONSTANT
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) RAISE_VARARGS
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) BUILD_MAP
Positions(lineno=3, end_lineno=3, col_offset=0, end_col_offset=30) LOAD_SMALL_INT
Positions(lineno=3, end_lineno=3, col_offset=0, end_col_offset=30) LOAD_GLOBAL
Positions(lineno=3, end_lineno=3, col_offset=0, end_col_offset=30) CONTAINS_OP
Positions(lineno=3, end_lineno=3, col_offset=0, end_col_offset=30) POP_JUMP_IF_FALSE
Positions(lineno=3, end_lineno=3, col_offset=0, end_col_offset=30) NOT_TAKEN
Positions(lineno=3, end_lineno=3, col_offset=22, end_col_offset=30) LOAD_GLOBAL
Positions(lineno=3, end_lineno=3, col_offset=0, end_col_offset=30) COPY
Positions(lineno=3, end_lineno=3, col_offset=0, end_col_offset=30) LOAD_CONST
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=2) STORE_SUBSCR
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=2) RETURN_VALUE

…e-135700.0qdtCl.rst

Co-authored-by: sobolevn <mail@sobolevn.me>
@JelleZijlstra
Copy link
Member

Good catch, thanks! Would you be able to add a unit test? Also test a class with annotations so we can see how it works in a class-level __annotate__ function.

@JelleZijlstra JelleZijlstra added the needs backport to 3.14 bugs and security fixes label Jun 22, 2025

for case_name, annotate_code in test_cases:
with self.subTest(case=case_name):
instructions = list(dis.Bytecode(annotate_code))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use dis.findlinestarts(annotate_code) instead.

AndPuQing and others added 3 commits June 24, 2025 14:34
Co-authored-by: Irit Katriel <1055913+iritkatriel@users.noreply.github.com>
Co-authored-by: Irit Katriel <1055913+iritkatriel@users.noreply.github.com>
),
]

for case_name, annotate_code in test_cases:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert here that annotate_code is a code object and that annotate_code.co_name is '__annotate__'.

for instr in dis.get_instructions(annotate_code)
if setup_scope_begin <= instr.offset < setup_scope_end
and instr.positions
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this need to be so complicated? Don't we just need to check that all the instructions in this code object have the same line number (excluding Nones)?

Copy link
Contributor Author

@AndPuQing AndPuQing Jun 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the important thing is to make sure the instructions in setup_annotations_scope are consistent with the end_col_offset of RESUME, not just the lineno

no fix:

Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) RESUME       
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=2) LOAD_FAST_BORROW         # incorrect
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=2) LOAD_SMALL_INT           # incorrect
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=2) COMPARE_OP               # incorrect
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=2) POP_JUMP_IF_FALSE        # incorrect
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=2) NOT_TAKEN                # incorrect
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=2) LOAD_COMMON_CONSTANT     # incorrect
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=2) RAISE_VARARGS            # incorrect
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=2) BUILD_MAP                # incorrect

fix:

Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) RESUME
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) LOAD_FAST_BORROW
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) LOAD_SMALL_INT
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) COMPARE_OP
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) POP_JUMP_IF_FALSE
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) NOT_TAKEN
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) LOAD_COMMON_CONSTANT
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) RAISE_VARARGS
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) BUILD_MAP

And not all annotate lines are the same, because some opcodes need to indicate the "code" where the annotation occurs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so we need to check they all have the same end_col_offset as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And not all annotate lines are the same, because some opcodes need to indicate the "code" where the annotation occurs.

The test doesn't need to work for all inputs. Only the two inputs it runs on.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And not all annotate lines are the same, because some opcodes need to indicate the "code" where the annotation occurs.

The test doesn't need to work for all inputs. Only the two inputs it runs on.

Just to be clear, instructions within code object can correspond to different line numbers. So I made an offset restriction here, limiting the check to between RESUME and BUILD_MAP (setup_annotations_scope)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have much of an intuition on how line numbers should work for synthetic code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@15r10nk are you able to suggest a meaningful unit test for this? Alternatively, can you confirm that the PR resolved the issue?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can verify that this fixes the issue which I have reported 👍 . I also found out that the reported bug affected debugging with pdb.

example.py

import example_mod
print(example_mod.__annotations__)

example_mod.py

22

class ClassVar():
    pass

__dataclass_fields__: ClassVar
./python -m pdb example.py 
> /home/orbit/projects/cpython/example.py(1)<module>()
-> import example_mod
(Pdb) b example_mod.py:2
Breakpoint 1 at /home/orbit/projects/cpython/example_mod.py:2
(Pdb) c
> /home/orbit/projects/cpython/example_mod.py(2)<module>()      # first hit which is expected
-> 22
(Pdb) c
> /home/orbit/projects/cpython/example_mod.py(2)__annotate__()  # pdb should not break here
-> 22
(Pdb) l
  1  	
  2 B->	22
  3  	
  4  	class ClassVar():
  5  	    pass
  6  	
  7  	__dataclass_fields__: ClassVar
  8  	
[EOF]
(Pdb) bt
  /home/orbit/projects/cpython/Lib/bdb.py(899)run()
-> exec(cmd, globals, locals)
  <string>(1)<module>()
  /home/orbit/projects/cpython/example.py(4)<module>()
-> print(example_mod.__annotations__)
> /home/orbit/projects/cpython/example_mod.py(2)__annotate__()
-> 22
(Pdb) 

The breakpoint is hit a second time when the synthetic code is executed which uses the incorrect source positions.

I remember that I found multiple different problems with similar synthetic codes like this in the past (super(),except related and maybe more) but I was always able to work around this issues and I did not know that they can also lead to problems when you debug with pdb. I will write a tool to find these problems in the next days.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thank you!

The pdb example can probably be used for a unit test (in test_pdb). I'd remove the test that is currently in this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi @iritkatriel @AndPuQing , I wrote a test with test_pdb.py maybe that can be used for this pr.

class PdbTestCase(unittest.TestCase):
    
    ...
    
    def test_issue135700(self):
        """ https://github.com/python/cpython/issues/135700 """
        module_code = """\
22

class ClassVar:
    pass
__dataclass_fields__: ClassVar
"""
        with open('testmod.py', 'w') as f:
            f.write(module_code)
        self.addCleanup(os_helper.unlink, "testmod.py")

        script = """
import testmod
print(testmod.__annotations__)
"""
        commands = """
b testmod.py:1
c
c
"""
        result = self.run_pdb_script(script, commands)
        self.assertNotIn("(1)__annotate__()", result[0])

StanFromIreland and others added 29 commits July 11, 2025 19:48
…ry (pythonGH-136146)

---------

Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
Co-authored-by: Éric <merwok@netwok.org>
…ython#136430)

Adds a test skip on platforms where hard links are not available (which includes Android).
…136253)

Adds zstd to the Android build process.

---------

Co-authored-by: Malcolm Smith <smith@chaquo.com>
…nts (python#135585)

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Victor Stinner <vstinner@python.org>
…#135998)

Implement a statistical sampling profiler that can profile external
Python processes by PID. Uses the _remote_debugging module and converts
the results to pstats-compatible format for analysis.


Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
…le (python#136519)

Omit perf_jit_trampoline from "JIT" codeowners
…honGH-132273)

* Fix flag mask inversion when unnamed flags exist.

For example:

    class Flag(enum.Flag):
        A = 0x01
        B = 0x02
        MASK = 0xff

    ~Flag.MASK is Flag(0)

* EJECT and KEEP flags (IntEnum is KEEP) use direct value.

* correct Flag inversion to only flip flag bits

IntFlag will flip all bits -- this only makes a difference in flag sets with
missing values.

* correct negative assigned values in flags

negative values are no longer used as-is, but become inverted; i.e.

    class Y(self.enum_type):
        A = auto()
        B = auto()
        C = ~A        # aka ~1 aka 0b1 110 (from enum.bin()) aka 6
        D = auto()

    assert Y.C. is Y.B|Y.D
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.