
[3.14t] Free-threaded GC performance regression on tuples #142531

@MaxiBoether


Bug report

Bug description:

I am observing a significant performance regression in the Python 3.14.2 Free-Threading build compared to the build with GIL when constructing large lists of tuples.

This seems related to the recent work in issue #139951 and PR #140262. While the standard 3.14.2 build appears to have resolved the performance regression, the free-threading build is still significantly slower in this scenario. I have confirmed that 3.14.0 (regular build) shows the regression and that 3.14.2 (regular) has fixed it (thanks to the previous PR), but 3.14.2t (at least the version shipped by uv) still has the issue.

I also checked the output of gc.is_tracked (see below) on the free-threaded and standard builds. There is a difference, but I have not yet dug into how GC works on standard vs. free-threaded Python, sorry.

Benchmark Results (N=15,000,000) (r6id.8xlarge AWS instance)

| Build | GC State | Time | gc.is_tracked says |
|---|---|---|---|
| 3.14.0 (Standard) | Enabled | 142.876s | True |
| 3.14.0 (Standard) | Disabled | 1.068s | True |
| 3.14.0 (Free-Threaded) | Enabled | 37.074s | False |
| 3.14.0 (Free-Threaded) | Disabled | 0.637s | False |
| 3.14.2 (Standard) | Enabled | 0.974s | True |
| 3.14.2 (Standard) | Disabled | 0.969s | True |
| 3.14.2 (Free-Threaded) | Enabled | 36.679s | False |
| 3.14.2 (Free-Threaded) | Disabled | 0.644s | False |
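To make the size of the gap easier to compare across builds, the timings in the table above can be turned into a "GC enabled / GC disabled" ratio per build. This is only arithmetic on the reported numbers; the dictionary layout and the `gc_slowdown` helper are illustrative names, not part of the benchmark.

```python
# Timings in seconds, copied from the table above: (build, gc_enabled) -> time.
timings = {
    ("3.14.0 standard", True): 142.876,
    ("3.14.0 standard", False): 1.068,
    ("3.14.0 free-threaded", True): 37.074,
    ("3.14.0 free-threaded", False): 0.637,
    ("3.14.2 standard", True): 0.974,
    ("3.14.2 standard", False): 0.969,
    ("3.14.2 free-threaded", True): 36.679,
    ("3.14.2 free-threaded", False): 0.644,
}


def gc_slowdown(build):
    """Return time with GC enabled divided by time with GC disabled."""
    return timings[(build, True)] / timings[(build, False)]


# Print one ratio per build, in a stable order.
for build in sorted({name for name, _ in timings}):
    print(f"{build:<22} {gc_slowdown(build):6.1f}x")
```

On these numbers, enabling GC costs roughly 57x on 3.14.2 free-threaded, while on 3.14.2 standard the two runs are within about 1% of each other.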

Reproduction Script

import time
import gc
import sys

def benchmark_step(disable_gc):
    if disable_gc: 
        gc.disable()
    
    t0 = time.perf_counter()
    _ = [(0, 0, i) for i in range(15_000_000)]
    dt = time.perf_counter() - t0
    
    status = "GC DISABLED" if disable_gc else "GC ENABLED "
    print(f"{status}: {dt:.3f}s")
    
    if disable_gc:
        gc.enable()

if __name__ == "__main__":
    print(f"Python Version: {sys.version}")
    sample = (0, 0, 1)
    print(f"Is tuple(int, int, int) tracked? {gc.is_tracked(sample)}")
    
    print("--- Benchmark ---")
    benchmark_step(disable_gc=False)
    benchmark_step(disable_gc=True)

In more detail, this is my 3.14.2 regular build:
Python Version: 3.14.2 (main, Dec 9 2025, 19:03:28) [Clang 21.1.4 ]

And my free threaded build:
Python Version: 3.14.2 free-threading build (main, Dec 9 2025, 19:03:17) [Clang 21.1.4 ]
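For anyone reproducing this, it may help to confirm programmatically which kind of build is running, since the version string alone is easy to misread. The sketch below is an assumption about what is useful to report, not something from the original script; `describe_build` is a hypothetical helper. It relies on the `Py_GIL_DISABLED` config variable and on `sys._is_gil_enabled()`, which is only available on 3.13 and later, hence the fallback.

```python
import sys
import sysconfig


def describe_build():
    """Return facts about the running interpreter relevant to this report."""
    # Py_GIL_DISABLED is 1 on free-threaded builds and 0 or None otherwise.
    free_threaded = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() exists on 3.13+; assume the GIL is on if it is missing.
    gil_check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = gil_check() if gil_check is not None else True
    return {
        "version": sys.version.split()[0],
        "free_threaded_build": free_threaded,
        "gil_enabled": gil_enabled,
    }


if __name__ == "__main__":
    for key, value in describe_build().items():
        print(f"{key}: {value}")
```

A free-threaded build can still run with the GIL re-enabled at runtime, which is why the two flags are reported separately.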

It's particularly weird that the regression goes away when GC is disabled, even though the tuple is reported as untracked. Is it possible there is a disconnect between the flag that gc.is_tracked checks and the actual logic used by the free-threaded collector?
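One way to narrow this down might be to count how many collector passes actually run while the list is being built. If the count is high on the free-threaded build, the time is probably going into repeated collections rather than into per-tuple tracking. The sketch below is not part of the original report: `count_collections` is a hypothetical helper, and it assumes `gc.callbacks` is invoked on the free-threaded build the same way as on the standard one.

```python
import gc
import time


def count_collections(n):
    """Build a list of n small tuples and count GC passes that ran meanwhile."""
    passes = []

    def on_gc(phase, info):
        # The collector calls this with phase "start" and "stop"; count starts only.
        if phase == "start":
            passes.append(info["generation"])

    gc.callbacks.append(on_gc)
    try:
        t0 = time.perf_counter()
        data = [(0, 0, i) for i in range(n)]
        elapsed = time.perf_counter() - t0
    finally:
        # Always unregister so later code is not affected by the callback.
        gc.callbacks.remove(on_gc)
    return len(data), len(passes), elapsed


if __name__ == "__main__":
    size, n_passes, elapsed = count_collections(1_000_000)
    print(f"{size} tuples, {n_passes} GC passes, {elapsed:.3f}s")
```

Running this on both builds with the same `n` and comparing the pass counts would show whether the collector is being triggered far more often on one of them.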

cc @colesbury @nascheme @markshannon

CPython versions tested on:

3.14

Operating systems tested on:

Linux


Labels: 3.14 (bugs and security fixes), 3.15 (new features, bugs and security fixes), performance (Performance or resource usage), topic-free-threading, type-bug (An unexpected behavior, bug, or error)
