@Fidget-Spinner Fidget-Spinner commented Nov 14, 2025

This was responsible for the pretty big perf regression in the final benchmarks over the old JIT. It got missed in the flurry of commits and reviews at the end.

# Inner loop warms up first.
# Outer loop warms up later, linking to the inner one.
# Therefore, at least two executors.
self.assertGreaterEqual(len(get_all_executors(f)), 2)
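The warm-up order described in the comment above can be illustrated with a toy counter model. This is only a sketch: `warmup_order`, `THRESHOLD`, and the loop sizes are hypothetical names and values chosen for illustration, not CPython's real warm-up machinery or the body of the test function `f`.

```python
# Toy model only: THRESHOLD and the loop sizes are hypothetical,
# not CPython's actual warm-up settings.
THRESHOLD = 16


def warmup_order(outer_iters, inner_iters, threshold=THRESHOLD):
    """Return loop names in the order their back-edge counters hit the threshold."""
    counts = {"inner": 0, "outer": 0}
    order = []
    for _ in range(outer_iters):
        for _ in range(inner_iters):
            # One back-edge of the inner loop.
            counts["inner"] += 1
            if counts["inner"] == threshold:
                order.append("inner")
        # One back-edge of the outer loop.
        counts["outer"] += 1
        if counts["outer"] == threshold:
            order.append("outer")
    return order


order = warmup_order(outer_iters=32, inner_iters=32)
```

In this model the inner back-edge runs far more often, so it becomes hot first and the outer loop follows later, which is why the test expects at least two executors.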
Member

Could you please explain why this test checks the _DEOPT vs _EXIT_TRACE pair?
On my machine it passes both on this PR and on the main branch.
Building with --enable-experimental-jit=interpreter or --enable-experimental-jit=yes makes no difference either.

Member Author

Hmmm it shouldn't pass on main. Let me strengthen the test.

@efimov-mikhail efimov-mikhail Nov 15, 2025

This version of the test fails on main with --enable-experimental-jit=interpreter or with --enable-experimental-jit=yes, as expected.

efimov-mikhail commented Nov 15, 2025

Also, I've noticed a related warning:

Python/optimizer.c: In function ‘_PyJit_translate_single_bytecode_to_trace’:
Python/optimizer.c:600:11: warning: operand of ‘?:’ changes signedness from ‘int’ to ‘unsigned int’ due to unsignedness of other operand [-Wsign-compare]
  600 |         ? (int)(target_instr - _Py_INTERPRETER_TRAMPOLINE_INSTRUCTIONS_PTR)
      |           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fidget-Spinner commented Nov 15, 2025

I'm going to merge this as it's a minor but important fix and I have an approval. Sorry Mark if you want to review it. We can make any further changes in your PR that comes with the switch-case for trace recording.

efimov-mikhail commented Nov 15, 2025

JFTR, I don't know much about the performance implications of this PR.
But I'm sure that it's a clean fix with a correct test.

@Fidget-Spinner
Member Author

Well, I don't know the exact perf implications either. However, the mechanism is the following:

  1. Currently, when we hit another ENTER_EXECUTOR (i.e. another executor is seen), we end the trace with _DEOPT.
  2. _DEOPT goes back to the interpreter, and does not link executors.

Meanwhile, _EXIT_TRACE links executors. So if we see an ENTER_EXECUTOR and end with an _EXIT_TRACE, we end up linking from the first executor to the executor at the exit.

For jitted code, the benefit of staying in jitted code, instead of going jit -> interpreter -> back to jit, is enormous. That round trip costs a function call with lots of register moves.
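To make the difference concrete, here is a minimal toy model of the two trace endings. It is a sketch under stated assumptions, not CPython's implementation: the `Executor` class, the `count_round_trips` helper, and the choice to link on the first exit are all hypothetical simplifications.

```python
class Executor:
    """Toy stand-in for a JIT executor (hypothetical, not CPython's struct)."""

    def __init__(self, name, ends_with):
        self.name = name
        self.ends_with = ends_with  # "_DEOPT" or "_EXIT_TRACE"
        self.linked_to = None       # filled in only by an _EXIT_TRACE ending


def count_round_trips(first, target, passes):
    """Count how often control falls back to the interpreter between
    `first` and `target` over `passes` executions of `first`."""
    round_trips = 0
    for _ in range(passes):
        if first.ends_with == "_EXIT_TRACE":
            if first.linked_to is None:
                # Assumption: the first exit goes through the interpreter
                # once, after which the exit is linked to the target.
                round_trips += 1
                first.linked_to = target
            # Otherwise control jumps straight to the linked executor.
        else:
            # _DEOPT always returns to the interpreter and never links.
            round_trips += 1
    return round_trips


inner = Executor("inner_loop", ends_with="_EXIT_TRACE")
deopt_trips = count_round_trips(Executor("outer", "_DEOPT"), inner, passes=100)
linked = Executor("outer", "_EXIT_TRACE")
exit_trace_trips = count_round_trips(linked, inner, passes=100)
```

In this model the _DEOPT ending pays the interpreter round trip on every pass, while the _EXIT_TRACE ending pays it once and then stays in jitted code.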

@efimov-mikhail
Member

Yes, I understand the purpose of this change. Meanwhile, thanks for the detailed clarification.

@Fidget-Spinner Fidget-Spinner merged commit ed73c90 into python:main Nov 15, 2025
97 of 99 checks passed
@Fidget-Spinner Fidget-Spinner deleted the fix_deopt branch November 15, 2025 20:20