@yihong0618 yihong0618 commented Sep 4, 2025

debug process:

First step, of course, was trying to reproduce it. On 3.13, running the statement from the issue in the REPL hangs. On 3.14, it doesn’t. So it’s a 3.13-only problem.

  • Running ./python xxx.py directly? Works fine
  • Using pty? Works fine
  • Using subprocess? Works fine
  • Only in the REPL? Forcing the old REPL with PYTHON_BASIC_REPL=1 ./python is also fine!
  • Is it a C-level bug or a Python-level bug? Since it only happens in the new REPL, at first I thought it was Python-level.
  • I wrapped every possible memory-error spot in except: — nothing worked.
  • After more print-debugging, I found the culprit: exec(code, self.locals) in console.py.
  • OK, so the bug simplifies to: running exec("_testcapi.set_nomemory(0)") in the new REPL.
  • Maybe it’s not actually 3.13-only?
  • Fired up a git branch build — turns out I was right! 3.14’s very first release still had the bug.
  • Since branch search works, I suddenly realized I could use git bisect.
  • Bisecting found the fixing commit: 5fc6bb2754a25157575efc0b37da78c629fea46e.
  • Cherry-pick it? Nope, the changes were way too big.
  • GDB showed that, with no memory available, the error handling recurses into itself; that is what this fix addresses.
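The manual checks in the list above all come down to one question: does a given command hang or does it terminate? A minimal sketch of that check follows, assuming a hypothetical helper named `hangs` and an arbitrary timeout; neither is part of the patch or of CPython's test helpers.

```python
import subprocess
import sys


def hangs(argv, stdin_text="", timeout=10.0):
    """Return True if the command is still running after `timeout` seconds.

    `hangs` and the 10-second default are illustrative choices only.
    """
    try:
        subprocess.run(
            argv,
            input=stdin_text,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before re-raising the timeout.
        return True
    return False


if __name__ == "__main__":
    # A command that finishes immediately is not reported as a hang.
    print(hangs([sys.executable, "-c", "pass"], timeout=30.0))
```

A wrapper script for `git bisect run` could call a helper like this against the freshly built `./python` and translate the result into an exit status.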

cc @ZeroIntensity you can try this patch in 3.13

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
Member

@ZeroIntensity ZeroIntensity left a comment

Cool, I didn't think anyone would ever end up taking a stab at this. First and foremost, please add a test case (more on that in a second) and blurb entry, as this is user-facing.

Now, I am a little surprised to see this in the eval loop and not somewhere in _pyrepl (or perhaps Py_Main). That should mean that there's a reproducer available without the REPL -- would you mind finding that, and then using it for a test case?

yihong0618 and others added 2 commits September 4, 2025 11:52
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
@yihong0618
Contributor Author

yihong0618 commented Sep 4, 2025

With help from GPT, I wrote a Python script that reproduces it:

def simulate_repl_exec():
    print("=" * 60)
    print("Reproducing issue #134163 outside REPL")
    print("Simulating interactive interpreter exec() behavior")
    print("=" * 60)
    print()

    # First, import and set up the memory failure condition
    print("Step 1: Setting up memory allocation failure...")
    import _testcapi

    print("Step 2: Preparing code that will trigger exception handling...")
    # Create a code object that will cause exception handling
    # This simulates what happens when REPL executes user input
    test_code = """
# This code will trigger the problematic code path
# by causing an exception during execution when memory is constrained
import _testcapi
_testcapi.set_nomemory(0)  # This line triggers the hang condition

# Any subsequent operation that might need memory allocation
# will trigger the exception handling path where the bug occurs
x = [1, 2, 3]  # This should trigger MemoryError
"""
    print("Step 3: Compiling test code...")
    try:
        compiled_code = compile(test_code, "<reproduce_script>", "exec")
    except Exception as e:
        print(f"Compilation failed: {e}")
        return
    print("Step 4: Executing code that triggers the hang condition...")
    print("BEFORE FIX: This would hang indefinitely")
    print("AFTER FIX: This should exit gracefully")
    print()
    try:
        exec(compiled_code, {"__name__": "__console__"})
        print("Code executed successfully (unexpected)")
    except SystemExit:
        print("SystemExit caught - re-raising")
        raise
    except Exception as e:
        print(f"Exception caught during exec(): {type(e).__name__}: {e}")
        print("This is the expected path - exception handling should work normally")
        # The showtraceback() equivalent would be called here in real REPL
        import traceback

        traceback.print_exc()
if __name__ == "__main__":
    print("This script reproduces the hang condition outside of REPL")
    simulate_repl_exec()

Note that the print calls cannot be dropped.

This is very strange: if I drop some of the print calls, I can no longer reproduce it.

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
@yihong0618
Contributor Author

Test added.

On 3.13 it hangs:
[image]

On my branch it passes:
[image]

        """)
        p = spawn_repl()
        with SuppressCrashReport():
            p.stdin.write(user_input)
Member

This is still just running it in the REPL. For now, let's switch to the simple repro of import _testcapi; _testcapi.set_nomemory(0).

Contributor Author

That does not seem right. With that input it will not hang on 3.13, and the test ends up the same as the existing one:

    @unittest.skipIf(support.Py_TRACE_REFS, 'cannot test Py_TRACE_REFS build')
    def test_no_memory(self):
        import_module("_testcapi")
        # Issue #30696: Fix the interactive interpreter looping endlessly when
        # no memory. Check also that the fix does not break the interactive
        # loop when an exception is raised.
        user_input = """
            import sys, _testcapi
            1/0
            print('After the exception.')
            _testcapi.set_nomemory(0)
            sys.exit(0)
        """
        user_input = dedent(user_input)
        p = spawn_repl()
        with SuppressCrashReport():
            p.stdin.write(user_input)
        output = kill_python(p)
        self.assertIn('After the exception.', output)
        # Exit code 120: Py_FinalizeEx() failed to flush stdout and stderr.
        self.assertIn(p.returncode, (1, 120))

If the test uses only
import _testcapi; _testcapi.set_nomemory(0)
it always passes, with or without the patch.

Member

Sorry, I meant that we don't need spawn_repl here. subprocess.run([sys.executable, '-c', user_input]) should work as a test -- there's no need to involve the REPL.
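A minimal sketch of what such a REPL-free test could look like, assuming a hypothetical helper `run_source` and a hypothetical test name; the test that was actually merged may differ. The input shown is the simple one from the review comment, which, as noted earlier in the thread, does not by itself trigger the hang.

```python
import subprocess
import sys
import textwrap
import unittest


def run_source(source, timeout=60.0):
    """Run `source` with `python -c` in a child process and return the result."""
    return subprocess.run(
        [sys.executable, "-c", textwrap.dedent(source)],
        capture_output=True,
        text=True,
        timeout=timeout,  # a hang surfaces as subprocess.TimeoutExpired
    )


class NoMemoryWithoutReplSketch(unittest.TestCase):
    # Hypothetical test; name and input are illustrative only.
    def test_set_nomemory_terminates(self):
        try:
            import _testcapi  # noqa: F401  (only present in test-enabled builds)
        except ImportError:
            self.skipTest("_testcapi is not available")
        user_input = """
            import _testcapi
            _testcapi.set_nomemory(0)
        """
        proc = run_source(user_input)
        # Reaching this line at all means the child terminated in time.
        self.assertIsNotNone(proc.returncode)
```

Because the child is started with `-c`, no interactive console code is involved, so any hang would have to come from the interpreter itself.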

@yihong0618
Contributor Author

The reason is this:

Without the four print statements, the bytecode around the exec() call is as follows:
$ ./python -m dis -O xxx.py (as above)
...
 36   L3:   196  LOAD_GLOBAL      9 (exec + NULL)
            206  LOAD_FAST        2 (compiled_code)
            208  LOAD_CONST      12 ('__name__')
            210  LOAD_CONST      13 ('__console__')
            212  BUILD_MAP        1
            214  CALL             2
            222  POP_TOP
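Offsets like the ones in this listing can also be read programmatically with the standard `dis` module. The sketch below uses a hypothetical helper name, `instruction_offsets`; the exact opcodes and offsets it prints vary between CPython versions, so no particular values are claimed here.

```python
import dis


def instruction_offsets(source, filename="<sketch>"):
    """Compile `source` and return (offset, opname) pairs for the module body."""
    code = compile(source, filename, "exec")
    return [(ins.offset, ins.opname) for ins in dis.get_instructions(code)]


if __name__ == "__main__":
    # The sample only gets compiled and disassembled, never executed.
    sample = 'exec("pass", {"__name__": "__console__"})\n'
    for offset, opname in instruction_offsets(sample):
        print(f"{offset:4d} {opname}")
```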

So the instruction just before set_nomemory(0) runs is at an offset of around 208. This is the code I watched in GDB:

int frame_lasti = _PyInterpreterFrame_LASTI(frame);
PyObject *lasti = PyLong_FromLong(frame_lasti);
if (lasti == NULL) {
    goto exception_unwind;
}

The frame_lasti here is 208, which is a small integer. PyLong_FromLong therefore returns a cached object without calling malloc, so no MemoryError is triggered.

However, once print statements are added, the Python instructions become:

 36   L3:   304  LOAD_GLOBAL      9 (exec + NULL)
            314  LOAD_FAST        2 (compiled_code)
            316  LOAD_CONST      15 ('__name__')
            318  LOAD_CONST      16 ('__console__')
            320  BUILD_MAP        1
            322  CALL             2
            330  POP_TOP

The offset increases from 208 to around 316, which is past the end of CPython's small-integer cache (the cache covers non-negative values below _PY_NSMALLPOSINTS, that is, up to 256):

#define _PY_NSMALLPOSINTS           257

At this point, PyLong_FromLong(frame_lasti) triggers malloc, causing a memory error and subsequently leading to an infinite loop.

Well, it's not exactly coincidental, but the general idea is that adding print statements increases the value of frame_lasti. When this value exceeds the small integer limit, malloc is called, triggering an error and resulting in an infinite loop.

The Python REPL has a complex implementation. It likely runs a series of instructions during initialization. When I checked with GDB, frame_lasti was 409, which exceeds the small integer limit.

So, this should indeed be the root cause.
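The small-integer cache that this explanation relies on can be observed from Python. The sketch below is only an illustration of that CPython implementation detail, with a hypothetical helper name; it does not touch the eval loop code discussed above, and the cache bounds may differ in other versions or implementations.

```python
def is_cached_int(n):
    """Return True if CPython hands back the same object for `n` twice.

    int(str(n)) builds the value at run time, so the compiler cannot
    merge the two results into a single constant.
    """
    return int(str(n)) is int(str(n))


if __name__ == "__main__":
    # 208, 316 and 409 are the frame_lasti values mentioned in the thread.
    for value in (208, 316, 409):
        print(value, is_cached_int(value))
```

A value that comes from the cache needs no allocation, which matches the observation that PyLong_FromLong(208) survives set_nomemory(0) while a larger offset does not.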

So a fake REPL does not hang, because it never gets into the recursive path.

If the test is allowed to use spawn_repl, the version below already hangs:
[image]

cc @ZeroIntensity

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
@yihong0618
Contributor Author

By the way, I tested 3.12 and it has no such problem; the bug only exists from 3.13 through 3.14a2.

@ZeroIntensity
Member

Ah, that makes sense. Sounds like we can definitely repro without the prints, though.

@yihong0618
Contributor Author

> Ah, that makes sense. Sounds like we can definitely repro without the prints, though.

yes, going to try

@yihong0618
Contributor Author

yihong0618 commented Sep 4, 2025

A simpler way you can try is below. It performs 14 allocations first; the range 1000-2000 is used so that the values are not served from the small-integer cache.

a = list(range(1000, 2000))
a = list(range(1000, 2000))
a = list(range(1000, 2000))
a = list(range(1000, 2000))
a = list(range(1000, 2000))
a = list(range(1000, 2000))
a = list(range(1000, 2000))
a = list(range(1000, 2000))
a = list(range(1000, 2000))
a = list(range(1000, 2000))
a = list(range(1000, 2000))
a = list(range(1000, 2000))
a = list(range(1000, 2000))
a = list(range(1000, 2000))
try:
    import _testcapi
    _testcapi.set_nomemory(0)
    b = list(range(1000, 2000))  # one more allocation
except Exception as e:
    import traceback

    traceback.print_exc()

Update: list(range(0, 1)) can also trigger the bug.
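To experiment with how many warm-up statements are really needed, the reproducer text can be generated instead of copy-pasted. This is only a sketch: `build_reproducer` is a hypothetical helper, and the default of 14 is taken from the comment above rather than derived independently.

```python
def build_reproducer(warmup_lines=14):
    """Return the reproducer source with `warmup_lines` warm-up statements.

    The helper only builds text; it never imports or calls _testcapi.
    """
    warmup = "a = list(range(1000, 2000))\n" * warmup_lines
    tail = (
        "try:\n"
        "    import _testcapi\n"
        "    _testcapi.set_nomemory(0)\n"
        "    b = list(range(1000, 2000))\n"
        "except Exception as e:\n"
        "    import traceback\n"
        "\n"
        "    traceback.print_exc()\n"
    )
    return warmup + tail


if __name__ == "__main__":
    source = build_reproducer()
    compile(source, "<reproducer>", "exec")  # syntax check only, nothing is run
    print(source)
```

The generated text could then be written to a file and run with the interpreter under test, varying `warmup_lines` to see where the behavior changes.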

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
@yihong0618
Contributor Author

yihong0618 commented Sep 5, 2025

@ZeroIntensity
Done. I could not find a way to reduce the number of allocations; in my test it takes at least 14 plus 1 more.
Also, this is not a REPL-only bug.

If you have a better way to shorten the code, feel free to push further commits.

I verified that the test hangs on 3.13.

On this branch it passes.

yihong0618 and others added 3 commits September 5, 2025 09:24
…e-134163.EqKyn8.rst

Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
@yihong0618
Contributor Author

Addressed, thank you.

Member

@ZeroIntensity ZeroIntensity left a comment

Getting close!

@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @ZeroIntensity for commit 52310f0 🤖

Results will be shown at:

https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F138491%2Fmerge

The command will test the builders whose names match following regular expression: tracerefs

The builders matched are:

  • AMD64 Arch Linux TraceRefs PR

@yihong0618
Contributor Author

yihong0618 commented Sep 9, 2025

What... we set the memory error here and it still fails...

@yihong0618
Contributor Author

Reproduced. The build must be configured with:
./configure --with-pydebug --with-trace-refs

>>> _testcapi.set_nomemory(0)
Fatal Python error: _PyRefchain_Trace: _Py_hashtable_set() memory allocation failed
Python runtime state: initialized

Current thread 0x0000ffff9c188020 (most recent call first):
  File "/home/hyi/prs/cpython/Lib/code.py", line 203 in resetbuffer
  File "/home/hyi/prs/cpython/Lib/code.py", line 316 in push
  File "/home/hyi/prs/cpython/Lib/_pyrepl/simple_interact.py", line 154 in run_multiline_interactive_console
  File "/home/hyi/prs/cpython/Lib/_pyrepl/main.py", line 59 in interactive_console
  File "/home/hyi/prs/cpython/Lib/_pyrepl/__main__.py", line 10 in <module>
  File "/home/hyi/prs/cpython/Lib/runpy.py", line 88 in _run_code
  File "/home/hyi/prs/cpython/Lib/runpy.py", line 198 in _run_module_as_main

Extension modules: _testcapi (total: 1)
Aborted

@yihong0618
Contributor Author

yihong0618 commented Sep 9, 2025

Now everything is clear, cc @ZeroIntensity.

The same failure also happens on 3.13, 3.14, and even main, so we need to add -6 to the accepted return codes.
Python 3.13.7+ (heads/3.13:9a6137ad4b1, Sep 9 2025, 09:42:29) [GCC 11.5.0 20240719 (Red Hat 11.5.0-5)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import _testcapi
>>> _testcapi.set_nomemory(0)
Fatal Python error: _PyRefchain_Trace: _Py_hashtable_set() memory allocation failed
Python runtime state: initialized

Current thread 0x0000ffff9c52e020 (most recent call first):
  File "/home/hyi/cpython/Lib/code.py", line 203 in resetbuffer
  File "/home/hyi/cpython/Lib/code.py", line 316 in push
  File "/home/hyi/cpython/Lib/_pyrepl/simple_interact.py", line 154 in run_multiline_interactive_console
  File "/home/hyi/cpython/Lib/_pyrepl/main.py", line 59 in interactive_console
  File "/home/hyi/cpython/Lib/_pyrepl/__main__.py", line 10 in <module>
  File "/home/hyi/cpython/Lib/runpy.py", line 88 in _run_code
  File "/home/hyi/cpython/Lib/runpy.py", line 198 in _run_module_as_main

Extension modules: _testcapi (total: 1)
Aborted
[hyi@rocky cpython]$ git branch -a
  3.12
* 3.13

Python 3.14.0rc2+ (heads/3.14:161543f4f11, Sep 9 2025, 09:43:56) [GCC 11.5.0 20240719 (Red Hat 11.5.0-5)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import _testcapi
>>> _testcapi.set_nomemory(0)
Fatal Python error: _PyRefchain_Trace: _Py_hashtable_set() memory allocation failed
Python runtime state: initialized

Current thread 0x0000ffffa00c6020 [python] (most recent call first):
  File "/home/hyi/cpython/Lib/_pyrepl/console.py", line 181 in runcode
  File "/home/hyi/cpython/Lib/_pyrepl/console.py", line 226 in runsource
  File "/home/hyi/cpython/Lib/code.py", line 324 in push
  File "/home/hyi/cpython/Lib/_pyrepl/simple_interact.py", line 151 in run_multiline_interactive_console
  File "/home/hyi/cpython/Lib/_pyrepl/main.py", line 58 in interactive_console

Extension modules: _testcapi (total: 1)
Aborted
[hyi@rocky cpython]$ git branch -a
  3.12
  3.13
* 3.14
  main
  remotes/origin/3.10
  remotes/origin/3.11
  remotes/origin/3.12
  remotes/origin/3.13
  remotes/origin/3.14
  remotes/origin/3.9
  remotes/origin/HEAD -> origin/main
  remotes/origin/main

Python 3.15.0a0 (heads/main:5edfe55acf2, Sep 9 2025, 09:49:31) [GCC 11.5.0 20240719 (Red Hat 11.5.0-5)] on linux
Type "help", "copyright", "credits" or "license" for more information.
^[[A>>> import _testcapi
KeyboardInterrupt
>>> import _testcapi
>>> _testcapi.set_nomemory(0)
Fatal Python error: _PyRefchain_Trace: _Py_hashtable_set() memory allocation failed
Python runtime state: initialized

Current thread 0x0000ffffa2ca8020 [python] (most recent call first):
  File "/home/hyi/cpython/Lib/_pyrepl/console.py", line 182 in runcode
  File "/home/hyi/cpython/Lib/_pyrepl/console.py", line 239 in runsource
  File "/home/hyi/cpython/Lib/code.py", line 324 in push
  File "/home/hyi/cpython/Lib/_pyrepl/simple_interact.py", line 151 in run_multiline_interactive_console
  File "/home/hyi/cpython/Lib/_pyrepl/main.py", line 58 in interactive_console

Extension modules: _testcapi (total: 1)
Aborted
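The "Aborted" lines in these sessions are where the -6 comes from: a fatal Python error ends in abort(), and on POSIX `subprocess` reports a child killed by signal N as return code -N, with SIGABRT being 6 on Linux. A small sketch that shows this, assuming a POSIX system and using a hypothetical helper name:

```python
import signal
import subprocess
import sys


def returncode_after_abort():
    """Run a child that calls os.abort() and return its exit status."""
    proc = subprocess.run(
        [sys.executable, "-c", "import os; os.abort()"],
        capture_output=True,
    )
    return proc.returncode


if __name__ == "__main__":
    code = returncode_after_abort()
    # On POSIX, a child killed by signal N is reported as -N.
    print(code, code == -signal.SIGABRT)
```

On Windows the child is not killed by a signal, so the comparison above is only meaningful on POSIX platforms.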

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
@yihong0618
Contributor Author

Update: we have already run into this before.
Reusing the existing code avoids it.
This is already fixed in the latest commit:

    # Python built with Py_TRACE_REFS fail with a fatal error in
    # _PyRefchain_Trace() on memory allocation error.
    @unittest.skipIf(support.Py_TRACE_REFS, 'cannot test Py_TRACE_REFS build')

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
Member

@ZeroIntensity ZeroIntensity left a comment

Alas, this looks good to me. I've left two minor comment changes as suggestions below. I'll run the buildbots one more time and then I'll merge this. Thanks for being responsive and respectful to my feedback :)

@ZeroIntensity ZeroIntensity added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Sep 10, 2025
@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @ZeroIntensity for commit 5ab08c7 🤖

Results will be shown at:

https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F138491%2Fmerge

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Sep 10, 2025
@yihong0618
Contributor Author

> Alas, this looks good to me. I've left two minor comment changes as suggestions below. I'll run the buildbots one more time and then I'll merge this. Thanks for being responsive and respectful to my feedback :)

Thank you, I learned a lot from this.

@yihong0618
Contributor Author

yihong0618 commented Sep 10, 2025

The test failure does not seem to be related to this patch. I'll merge the review suggestions first.

Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
@ZeroIntensity ZeroIntensity merged commit afec6a5 into python:3.13 Sep 10, 2025
41 checks passed