Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dis.dis() reports line numbers where there are none. #107932

Closed
markshannon opened this issue Aug 14, 2023 · 7 comments · Fixed by #107988
Closed

dis.dis() reports line numbers where there are none. #107932

markshannon opened this issue Aug 14, 2023 · 7 comments · Fixed by #107988
Labels
type-bug An unexpected behavior, bug, or error

Comments

@markshannon
Copy link
Member

markshannon commented Aug 14, 2023

Take this example function from #107901:

def f():
   for i in range(30000000):
      if not i%1000000:
          pass

dis.dis produces the following output:

  1           0 RESUME                   0

  2           2 LOAD_GLOBAL              1 (range + NULL)
             12 LOAD_CONST               1 (30000000)
             14 CALL                     1
             22 GET_ITER
        >>   24 FOR_ITER                14 (to 56)
             28 STORE_FAST               0 (i)

  3          30 LOAD_FAST                0 (i)
             32 LOAD_CONST               2 (1000000)
             34 BINARY_OP                6 (%)
             38 TO_BOOL
             46 POP_JUMP_IF_FALSE        2 (to 52)
             48 JUMP_BACKWARD           14 (to 24)

  4     >>   52 JUMP_BACKWARD           16 (to 24)

  2     >>   56 END_FOR
             58 RETURN_CONST             0 (None)

Which has incorrect line numbers.
The issue is not that the line numbers are wrong, but that you can't tell from the dis output.
The whole point of dis is show what is going on at the bytecode level, so it is failing if it gives wrong line numbers.
The actual line numbers can be see from co_lines()

>>> list(f.__code__.co_lines())
[(0, 2, 1), (2, 30, 2), (30, 48, 3), (48, 52, None), (52, 56, 4), (56, 60, 2)]

The correct output should be:

   1           0 RESUME                   0

   2           2 LOAD_GLOBAL              1 (range + NULL)
              12 LOAD_CONST               1 (30000000)
              14 CALL                     1
              22 GET_ITER
         >>   24 FOR_ITER                14 (to 56)
              28 STORE_FAST               0 (i)

   3          30 LOAD_FAST                0 (i)
              32 LOAD_CONST               2 (1000000)
              34 BINARY_OP                6 (%)
              38 TO_BOOL
              46 POP_JUMP_IF_FALSE        2 (to 52)
None          48 JUMP_BACKWARD           14 (to 24)

   4     >>   52 JUMP_BACKWARD           16 (to 24)

   2     >>   56 END_FOR
              58 RETURN_CONST             0 (None)

Linked PRs

@markshannon markshannon added the type-bug An unexpected behavior, bug, or error label Aug 14, 2023
@markshannon
Copy link
Member Author

I've submitted this as a "bug", but it could be argued it is a behavior change, so I think it shouldn't be backported.

@iritkatriel thoughts?

@iritkatriel
Copy link
Member

I think I'd call it a bug and backport.

@CorvinM
Copy link
Contributor

CorvinM commented Aug 14, 2023

Are bytecodes that do not map back to any source line intended behavior? Older versions of python seem to not generate the bytecode that has no source line for this fragment (tested on 3.10)

Was working on a PR for this but most of dis.py makes the assumption of every bytecode has a source line so the behavior change is getting vague and messy quickly. For example findlinestarts() ignores line=None so dis() (and many of the other exported functions that depend on it) group these lines in with the previous block of bytecodes.

I'm happy to keep working on the patch but looking for some assurance the actual "bug" is disassembly related rather than in the bytecode generator before continuing further.

@carljm
Copy link
Member

carljm commented Aug 15, 2023

It is definitely intentional that the compiler can generate instructions that are "synthetic" and don't map directly to any line number in the source. And yes, this is a change in recent Python versions, that didn't used to be possible. So it's not too surprising that dis.py may have "every instruction has a line number" assumptions built into it, and it may not be a trivial change (and may require some design decisions) to untangle those assumptions correctly.

CorvinM added a commit to CorvinM/cpython that referenced this issue Aug 18, 2023
erlend-aasland added a commit to erlend-aasland/cpython that referenced this issue Aug 25, 2023
Adjust expected instruction trace
@serhiy-storchaka
Copy link
Member

It broke the tests.

@AlexWaygood
Copy link
Member

It broke the tests.

They will be fixed by #108478 @serhiy-storchaka

markshannon added a commit that referenced this issue Aug 25, 2023
@AlexWaygood
Copy link
Member

It broke the tests.

They will be fixed by #108478 @serhiy-storchaka

now merged

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants