Thread.start() can hang indefinitely if the new thread fails (MemoryError) during its initialization #140746

@ryv-odoo

Bug report

Bug description:

There is a race condition in the threading module where a parent thread calling Thread.start() can wait forever if the newly created thread crashes with a MemoryError during its internal bootstrap process.

Case Explanation

Consider a "Serving Thread" that creates threads on demand (e.g., one per HTTP request):

  • When this Serving Thread calls Thread.start(), the OS-level thread is successfully created (via pthread_create() on Linux) [1]. The parent thread then waits for the new thread to signal that it has started correctly by calling self._started.wait() [2].
  • The new thread starts, but before it can signal the parent thread (the Serving Thread) that it is alive with self._started.set() [3], it encounters a MemoryError.
  • This MemoryError can occur at the C level during the PyObject_Call to the _bootstrap method, or inside the _bootstrap_inner method before _started.set() is reached, often due to memory pressure from other threads (or a heap limit being reached).
  • This exception is caught by the C-level entry point thread_run(), which calls _PyErr_WriteUnraisableMsg [4] and prints "Exception ignored in thread started by: ...".

The new thread then exits without ever setting the _started event, and the parent thread blocks indefinitely in _started.wait().
This also leaves the threading module in an inconsistent state, as the "zombie" thread object may not be correctly removed from the _limbo dict.
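
To make the handshake concrete, here is a minimal model of the sequence described above. It is a sketch, not the actual threading code: simulate_start and bootstrap are hypothetical names, the MemoryError is raised by hand, and the parent waits with a timeout so the example terminates, whereas Thread.start() as described above waits with no timeout.

```python
import _thread
import threading


def simulate_start(fail_before_signal, timeout=1.0):
    """Model of the start() handshake.

    Returns True if the child signaled that it started, False if the
    parent gave up waiting after `timeout` seconds.
    """
    started = threading.Event()  # plays the role of Thread._started

    def bootstrap():
        if fail_before_signal:
            # Stand-in for a MemoryError hit before _started.set().
            # The low-level thread reports it as an unraisable exception
            # and exits; nothing ever sets the event.
            raise MemoryError("simulated failure during bootstrap")
        started.set()

    _thread.start_new_thread(bootstrap, ())
    # With no timeout, this wait would never return in the failing case.
    return started.wait(timeout)


if __name__ == "__main__":
    print("normal start signaled:", simulate_start(False))
    print("failed start signaled:", simulate_start(True))
```

In the failing case the child is already gone by the time the parent waits, so only the timeout (which the real code path does not have) lets the parent continue.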

How to Reproduce

This has been observed in high-concurrency server applications under heavy, sustained load, where heap memory can be rapidly consumed and exhausted by concurrent threads [5].
This is a race condition that is difficult to reproduce reliably, as it requires triggering a MemoryError at a specific moment.

I found a (deterministic?) way to reproduce the issue by restricting the heap memory until it reaches a threshold where we can start a new thread, but this new thread won't get enough memory for its initialization.

On some machines (and depending on the Python version), it is sometimes necessary to tweak HARD_LIMIT_START / LIMIT_REDUCTION. I reproduced it on an Ubuntu-based machine with Python 3.11/3.12/3.13/3.14.

import resource
import threading
import gc

def handler():
    pass

def serving():
    # These may need tweaking (depending on the Python version and the system)
    HARD_LIMIT_START = 30_000_000
    LIMIT_REDUCTION = 5_000

    for _ in range(500_000):
        gc.collect(2)  # Force getting back memory: seems to increase the determinism of the script

        # Limit the heap size available for this process
        resource.setrlimit(resource.RLIMIT_DATA, (HARD_LIMIT_START, HARD_LIMIT_START * 2))
        try:
            handler_thread = threading.Thread(target=handler)
            print(f'Start Thread: {handler_thread} - Heap size limit : {HARD_LIMIT_START}')
            handler_thread.start()
            handler_thread.join()
            HARD_LIMIT_START -= LIMIT_REDUCTION
        except RuntimeError as r:  # If Python refused to launch a new Thread
            print(f'RuntimeError: {r} - Cannot start the thread at all => error not detected.')
            return

serving_thread = threading.Thread(target=serving)
serving_thread.start()
serving_thread.join()
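
Because the failure mode is a hang rather than a crash, it can help to run the reproducer under an external watchdog when trying different HARD_LIMIT_START / LIMIT_REDUCTION values. The following is only a sketch: classify_run is an illustrative helper (not part of any library), repro.py is a hypothetical file name for the script above, and the 60-second timeout is an arbitrary assumption.

```python
import subprocess
import sys


def classify_run(argv, timeout):
    """Run a command and report whether it exited or had to be killed.

    Returns "exited" if the process finished within `timeout` seconds,
    "hang" if it was still running (subprocess.run kills it in that case).
    """
    try:
        subprocess.run(argv, timeout=timeout)
    except subprocess.TimeoutExpired:
        return "hang"
    return "exited"


if __name__ == "__main__":
    # Assumes the reproducer above was saved as repro.py (hypothetical name).
    print(classify_run([sys.executable, "repro.py"], timeout=60))
```

A "hang" result only says the process did not finish in time; whether it is stuck in Thread.start() still has to be confirmed, for example from the last "Start Thread" line printed by the script.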

Expected Behavior

I am not sure whether this is an "accepted" limitation of CPython threads. IMO, Thread.start() shouldn't hang indefinitely if the low-level thread is dead.

I haven't tried to fix it yet (if a fix is possible). I would prefer to get your opinions on this first.

CPython versions tested on:

3.12, 3.13, 3.14

Operating systems tested on:

Linux

Footnotes

  1. https://github.com/python/cpython/blob/25243b1461e524560639ebe54bab9b689b6cc31e/Python/thread_pthread.h#L284

  2. https://github.com/python/cpython/blob/f463d05a0979aada4fadcd43ff721b1ff081d2aa/Lib/threading.py#L999

  3. https://github.com/python/cpython/blob/f463d05a0979aada4fadcd43ff721b1ff081d2aa/Lib/threading.py#L1064

  4. https://github.com/python/cpython/blob/89a79fc919419bfe817da13bc2a4437908d7fc07/Modules/_threadmodule.c#L1122

  5. https://github.com/odoo/odoo/blob/350f7b10d4048b84d4a5f9e5aca9a88b8b971801/odoo/service/server.py#L273

Labels: stdlib, type-bug