ashm-dev
Contributor

@ashm-dev ashm-dev commented Oct 11, 2025

There is a memory leak in make_executor_from_uops (Python/optimizer.c) when JIT compilation fails. The executor object is created via _PyObject_GC_NewVar but _PyObject_GC_TRACK is called after the JIT compilation attempt. If _PyJIT_Compile fails and returns an error, the code calls Py_DECREF(executor) before the object has been tracked by the garbage collector.

@picnixz
Member

picnixz commented Oct 11, 2025

Note that this wouldn't solve the cold executor issue. It would only prevent an assertion failure in case we branch here (but if we branch here, we're already in a very bad state).

@picnixz picnixz changed the title from "gh-139540: Fix memory leak in make_executor_from_uops when JIT compilation fails" to "gh-139540: Fix executor deallocation crash when JIT compilation fails" Oct 11, 2025
@ashm-dev
Contributor Author

@picnixz Thanks for the context regarding the "cold executor issue".
Just to clarify the scope of this PR: my understanding is that this is a larger, separate problem to be handled in a follow-up. Is that correct, or should I look into addressing it here as well?

@picnixz
Member

picnixz commented Oct 11, 2025

It's a larger and more complex issue. The fact that we have an immortal cold executor seems to make normal executors leak. I don't know whether it's because we reset the executors in side exits to the cold executor, which then makes that specific executor not garbage-collectable (this is concerning). Your PR would still be fine, but I don't know whether it will eventually have to be changed again because of the fix we need on the other side.

@Fidget-Spinner
Member

Fidget-Spinner commented Oct 11, 2025

Sorry, but I have had the same PR up since July of this year: https://github.com/python/cpython/pull/137016/files

The executor object wasn't decremented if Tier 2 code returned with an
exception set but not a fatal error.

This change moves the Py_DECREF call inside the TIER1_TO_TIER2 macro.
This ensures the executor's reference count is always decremented.
@picnixz
Member

picnixz commented Oct 11, 2025

Considering this is a duplicate of gh-137016, I'm closing this PR. Sorry!

@picnixz picnixz closed this Oct 11, 2025
@picnixz
Member

picnixz commented Oct 11, 2025

For the other leaks, please open a separate PR. Please wait for Mark and other JIT experts to give their opinion, because we can't jump on the task without understanding how to properly solve it.

if (cold == NULL) {
    Py_FatalError("Cannot allocate core JIT code");
}
_PyObject_GC_TRACK(cold);
Member

@picnixz picnixz Oct 11, 2025

This is incorrect. The executor is immortal for now.

No, this was correct, my bad. I didn't read the function correctly. The problem is that we're setting it immortal afterwards, and I'm not sure it will be happy.

Contributor Author

Good point, thanks for the correction.

The intent was to track it for consistency immediately after allocation. My assumption is the GC correctly handles a tracked object that later becomes immortal.

Is that a safe assumption, or should we explicitly _PyObject_GC_UNTRACK it just before immortalization?

Member

That I don't know. The problem here is whether we want an immortal cold executor or not. That's why I wanted to wait for Mark's input here, and haven't written a PR already.

@picnixz
Member

picnixz commented Oct 11, 2025

@Fidget-Spinner Do you mind incorporating some observations from this PR in yours? I don't know whether the change is correct here when changing current_executor etc.

@ashm-dev
Contributor Author

Hi @Fidget-Spinner,

Thanks for pointing that out and linking your PR. I've taken a close look at both.

While our changes overlap in optimizer.c, they actually address two different underlying issues. Your PR #137016 correctly fixes a GC assertion failure by moving _PyObject_GC_TRACK to an earlier point. It's a great catch!

However, this PR primarily addresses a critical reference counting leak that was detectable with ASan. The root cause was that the executor's refcount was not decremented when Tier 2 code exited with a Python exception set (but didn't fatally error out by returning NULL). The main fix here is adding Py_DECREF(EXECUTOR) inside the TIER1_TO_TIER2 macro in ceval_macros.h to ensure the reference is always released.

My PR also includes the same _PyObject_GC_TRACK move that you proposed, so it appears to fix both the memory leak from issue #139540 and the GC assertion issue that your PR addresses.

I believe this PR provides a more complete solution for the stability issues in this code path.

@ashm-dev
Contributor Author

I don't know whether the change is correct here when changing current_executor etc

These changes are correct and necessary. tstate->current_executor acts as a temporary channel to pass the executor object from the _ENTER_EXECUTOR instruction to the new Tier 2 frame, which is set up in _PyEval_EvalFrameDefault.

The intended lifecycle is:

  1. _ENTER_EXECUTOR (in bytecodes.c) sets tstate->current_executor.
  2. _PyEval_EvalFrameDefault consumes it at the very start of the new frame, creates a PyStackRef for it, and immediately sets tstate->current_executor back to NULL.

The other NULL/CLEAR additions in ceval.c and pystate.c are for robustness, ensuring we don't leak the executor if a frame or thread is torn down unexpectedly. This appears to be the established design for this handover.

@picnixz
Member

picnixz commented Oct 11, 2025

The root cause was that the executor's refcount was not decremented when Tier 2 code exited with a Python exception set (but didn't fatally error out by returning NULL). The main fix here is adding Py_DECREF(EXECUTOR) inside the TIER1_TO_TIER2 macro in ceval_macros.h to ensure the reference is always released.

Does it fix the leak as in the issue? That is, does it fix the refcounting? I can't check this now, but please check whether the reproducer (without ASAN) is fixed when you run ./python -X showrefcount repro.py.

@picnixz picnixz reopened this Oct 11, 2025
#define TIER1_TO_TIER2(EXECUTOR) \
#define TIER1_TO_TIER2(EXECUTOR) \
do { \
OPT_STAT_INC(traces_executed); \
Member

Why was this changed? Please revert.

/* Tier-switching macros. */

#define TIER1_TO_TIER2(EXECUTOR) \
#define TIER1_TO_TIER2(EXECUTOR) \
Member

Why was this changed? Please revert.

Comment on lines +7671 to +7673
tstate->current_executor = (PyObject *)executor;
_PyFrame_SetStackPointer(frame, stack_pointer);
stack_pointer = _PyFrame_GetStackPointer(frame);
Member

This seems to be redundant.

Member

We are setting the stack pointer and re-querying it while doing nothing in between.

next_instr = _Py_jit_entry((EXECUTOR), frame, stack_pointer, tstate); \
frame = tstate->current_frame; \
stack_pointer = _PyFrame_GetStackPointer(frame); \
Py_DECREF(EXECUTOR); \
Member

Why should we decref it here?

assert(tstate->current_executor == NULL);
assert(executor != tstate->interp->cold_executor);
tstate->jit_exit = NULL;
tstate->current_executor = (PyObject *)executor;
Member

Is it just to hold a reference to the executor?

@ashm-dev ashm-dev marked this pull request as draft October 11, 2025 19:22