Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't use EXTENDED_ARG_QUICK in unquickened code #95113

Closed
brandtbucher opened this issue Jul 21, 2022 · 6 comments
Closed

Don't use EXTENDED_ARG_QUICK in unquickened code #95113

brandtbucher opened this issue Jul 21, 2022 · 6 comments
Labels
3.11 only security fixes 3.12 bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs)

Comments

@brandtbucher
Copy link
Member

EXTENDED_ARG_QUICK is a problematic opcode, and I think we should expose it less than we currently do. It is the only "quickened" opcode emitted by the compiler and found in code.co_code, which requires us to jump through strange hoops in several places (I'm looking at you, _PyOpcode_Original). Even weirder, it traces as EXTENDED_ARG at runtime.

  • it's not documented anywhere in the dis documentation
  • it's not even a "real" opcode exposed by any of the helpers in the opcode module (which still defines a misleadingly useless EXTENDED_ARG constant)
  • it's less than HAVE_ARGUMENT, even though it (obviously) uses its argument
  • it's not understood by things like stack_effect

Yet consumers of the bytecode must handle it correctly. We've already seen several bugs in CPython where we wrongly thought that we were handling extended arguments, and code that's been unchanged for years silently stopped working correctly. Let's avoid putting user code through the same pain.

I propose that we only emit the normal EXTENDED_ARG in the compiler, like we always have, and use _PyOpcode_Deopt everywhere that _PyOpcode_Original is being used now. We'll quicken EXTENDED_ARG to EXTENDED_ARG_QUICK just like all of the other *_QUICK opcode variants. No need to change the magic number or anything, since old code will continue to work fine. Although EXTENDED_ARG is in theory a tiny bit slower (it deopts the next opcode using _PyOpcode_Deopt), I doubt this is actually measurable.

Thoughts? I really think we should try to do this in 3.11, since it's a relatively simple change.

CC @markshannon, @sweeneyde, @pablogsal

@markshannon
Copy link
Member

Yes, I think we should do this. It is how it should have been done all along.

It is worth re-reading the discussion in #91945 first.
I think that the obvious approach of emitting EXTENDED_ARG in the compiler and quickening to EXTENDED_ARG_QUICK wouldn't have worked then due to the (slightly broken) way we handled extended args when tracing.
I think that is fixed now as we preserve the oparg when tracing, so it should work OK now.

Assuming I understand correctly:

  • The compiler would emit EXTENDED_ARG
  • Quickening will convert EXTENDED_ARG to EXTENDED_ARG_QUICK
  • _PyOpcode_Deopt[EXTENDED_ARG_QUICK] == EXTENDED_ARG

I think we will need to change

        TARGET(EXTENDED_ARG) {
            oparg <<= 8;
            oparg |= _Py_OPARG(*next_instr);
            opcode = _PyOpcode_Deopt[_Py_OPCODE(*next_instr)];
            PRE_DISPATCH_GOTO();
            DISPATCH_GOTO();
        }

to:

        TARGET(EXTENDED_ARG) {
            int old_oparg = oparg;
            NEXTOPARG()
            oparg |= old_oparg <<= 8;
            PRE_DISPATCH_GOTO();
            opcode |= cframe.use_tracing;
            DISPATCH_GOTO();
        }

As an aside, I prefer _PyOpcode_Original to _PyOpcode_Deopt as an name (but it's not important).

@brandtbucher
Copy link
Member Author

I think we will need to change

        TARGET(EXTENDED_ARG) {
            oparg <<= 8;
            oparg |= _Py_OPARG(*next_instr);
            opcode = _PyOpcode_Deopt[_Py_OPCODE(*next_instr)];
            PRE_DISPATCH_GOTO();
            DISPATCH_GOTO();
        }

to:

        TARGET(EXTENDED_ARG) {
            int old_oparg = oparg;
            NEXTOPARG()
            oparg |= old_oparg <<= 8;
            PRE_DISPATCH_GOTO();
            opcode |= cframe.use_tracing;
            DISPATCH_GOTO();
        }

It's a bit trickier than that. We would also need to teach TRACING_NEXTOPARG to handle EXTENDED_ARG arguments too, which I worry could hurt tracing performance further.

Thankfully, for all recent CPython versions (3.7 was the furthest back I checked), we haven't ever traced any opcode following an EXTENDED_ARG, even with frame.f_trace_opcodes = True. I brought this up on discord last week, and @gvanrossum made a convincing case that it's likely intentional (since the EXTENDED_ARGs, the true instruction, and the following CACHEs are conceptually a single "unit"). I could go either way on the issue (tracing the following opcode feels right, even if it's trickier, and no tests fail if you fix it), but it's been that way for years and nobody's complained about it yet. 🤷🏼‍♂️

At any rate, the PR I opened for this (#95121) doesn't change anything about the tracing behavior. The only difference between EXTENDED_ARG and EXTENDED_ARG_QUICK is still that EXTENDED_ARG deopts the next instruction so that tracing invariants aren't broken.

@markshannon
Copy link
Member

We would also need to teach TRACING_NEXTOPARG to handle EXTENDED_ARG arguments too.

I thought we already had fixed tracing to not stomp on the oparg. Apparently not. It is easy enough to do that.

which I worry could hurt tracing performance further.

It shouldn't.

I brought this up on discord last week, and @gvanrossum made a convincing case that it's likely intentional

I agree. One event per instruction makes more sense than one per code-unit.

@brandtbucher
Copy link
Member Author

brandtbucher commented Jul 22, 2022

Sorry, can you clarify? (I'm still stuck between the "instruction" and "code-unit" terminology.) Do you think my PR's approach is good, or do you also want to "fix" the tracing behavior (which skips tracing instructions after an EXTENDED_ARG) at the same time?

@brandtbucher
Copy link
Member Author

Never mind, just saw your comment on the PR.

@markshannon
Copy link
Member

Sorry, can you clarify?

#94437

brandtbucher added a commit to brandtbucher/cpython that referenced this issue Jul 22, 2022
…de (pythonGH-95121).

(cherry picked from commit e402b26)

Co-authored-by: Brandt Bucher <brandtbucher@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.11 only security fixes 3.12 bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs)
Projects
None yet
Development

No branches or pull requests

2 participants