GH-139389: Do not track immutable tuples in PyTuple_Pack #139390

sergey-miryanov · 2025-09-28T07:41:59Z

When PyTuple_Pack is called, all of its arguments are already fully constructed. If we know that all of them are immutable, we can skip tracking the new tuple in the GC, because the GC would untrack it eventually anyway.
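
To make the idea concrete, here is a rough Python-level model of the decision. It is only a sketch, not the C code in the PR: `ATOMIC_TYPES`, `may_need_tracking` and `tuple_needs_tracking` are hypothetical names, and the list of atomic types is an assumption chosen for illustration. Anything that is not known to be atomic is conservatively treated as needing tracking.

```python
import gc

# Assumption: a small set of immutable, atomic types that can never
# take part in a reference cycle. The real check lives in C.
ATOMIC_TYPES = (int, float, complex, str, bytes, bool, type(None))


def may_need_tracking(item):
    """Return True if a tuple holding `item` might be part of a cycle."""
    if type(item) in ATOMIC_TYPES:
        return False
    if type(item) is tuple:
        # A nested tuple is only safe if it is itself untracked.
        return gc.is_tracked(item)
    # Be conservative for everything else (lists, dicts, instances, ...).
    return True


def tuple_needs_tracking(items):
    """Model: track the new tuple only if some item may need tracking."""
    return any(may_need_tracking(item) for item in items)


print(tuple_needs_tracking([1, "a", None]))  # False
print(tuple_needs_tracking([1, [2, 3]]))     # True
```

Note that the check walks every item, so its cost grows with the length of the tuple.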

I have a PR ready and benchmark results:

Geometric mean: 1.01x faster (Win11 x64, 11th Gen Intel(R) Core(TM) i5-11600K @ 3.90GHz, 48d0d0d)

All benchmarks:

+--------------------------+----------+------------------------+
| Benchmark                | main     | tuples                 |
+==========================+==========+========================+
| async_generators         | 435 ms   | 430 ms: 1.01x faster   |
+--------------------------+----------+------------------------+
| asyncio_tcp              | 750 ms   | 756 ms: 1.01x slower   |
+--------------------------+----------+------------------------+
| asyncio_tcp_ssl          | 1.91 sec | 1.92 sec: 1.01x slower |
+--------------------------+----------+------------------------+
| comprehensions           | 22.1 us  | 21.8 us: 1.01x faster  |
+--------------------------+----------+------------------------+
| bench_mp_pool            | 104 ms   | 103 ms: 1.01x faster   |
+--------------------------+----------+------------------------+
| bench_thread_pool        | 1.29 ms  | 1.27 ms: 1.01x faster  |
+--------------------------+----------+------------------------+
| coroutines               | 28.2 ms  | 27.7 ms: 1.02x faster  |
+--------------------------+----------+------------------------+
| coverage                 | 88.5 ms  | 86.3 ms: 1.02x faster  |
+--------------------------+----------+------------------------+
| crypto_pyaes             | 90.1 ms  | 86.7 ms: 1.04x faster  |
+--------------------------+----------+------------------------+
| deepcopy                 | 310 us   | 307 us: 1.01x faster   |
+--------------------------+----------+------------------------+
| deepcopy_memo            | 36.4 us  | 36.1 us: 1.01x faster  |
+--------------------------+----------+------------------------+
| deltablue                | 5.19 ms  | 4.85 ms: 1.07x faster  |
+--------------------------+----------+------------------------+
| django_template          | 45.5 ms  | 45.8 ms: 1.01x slower  |
+--------------------------+----------+------------------------+
| docutils                 | 2.47 sec | 2.45 sec: 1.01x faster |
+--------------------------+----------+------------------------+
| dulwich_log              | 86.2 ms  | 86.9 ms: 1.01x slower  |
+--------------------------+----------+------------------------+
| fannkuch                 | 449 ms   | 441 ms: 1.02x faster   |
+--------------------------+----------+------------------------+
| float                    | 85.3 ms  | 82.5 ms: 1.03x faster  |
+--------------------------+----------+------------------------+
| create_gc_cycles         | 1.17 ms  | 1.17 ms: 1.01x faster  |
+--------------------------+----------+------------------------+
| gc_traversal             | 2.97 ms  | 2.88 ms: 1.03x faster  |
+--------------------------+----------+------------------------+
| generators               | 43.0 ms  | 41.6 ms: 1.03x faster  |
+--------------------------+----------+------------------------+
| genshi_text              | 28.9 ms  | 28.7 ms: 1.01x faster  |
+--------------------------+----------+------------------------+
| go                       | 160 ms   | 153 ms: 1.04x faster   |
+--------------------------+----------+------------------------+
| hexiom                   | 8.39 ms  | 8.13 ms: 1.03x faster  |
+--------------------------+----------+------------------------+
| json_dumps               | 8.62 ms  | 8.69 ms: 1.01x slower  |
+--------------------------+----------+------------------------+
| logging_format           | 12.5 us  | 12.2 us: 1.02x faster  |
+--------------------------+----------+------------------------+
| logging_silent           | 139 ns   | 140 ns: 1.01x slower   |
+--------------------------+----------+------------------------+
| logging_simple           | 11.3 us  | 11.1 us: 1.01x faster  |
+--------------------------+----------+------------------------+
| mako                     | 14.2 ms  | 14.4 ms: 1.01x slower  |
+--------------------------+----------+------------------------+
| mdp                      | 1.47 sec | 1.50 sec: 1.02x slower |
+--------------------------+----------+------------------------+
| meteor_contest           | 104 ms   | 102 ms: 1.02x faster   |
+--------------------------+----------+------------------------+
| nbody                    | 114 ms   | 113 ms: 1.01x faster   |
+--------------------------+----------+------------------------+
| pickle_pure_python       | 439 us   | 436 us: 1.01x faster   |
+--------------------------+----------+------------------------+
| pprint_safe_repr         | 953 ms   | 916 ms: 1.04x faster   |
+--------------------------+----------+------------------------+
| pprint_pformat           | 1.95 sec | 1.88 sec: 1.04x faster |
+--------------------------+----------+------------------------+
| pyflate                  | 506 ms   | 492 ms: 1.03x faster   |
+--------------------------+----------+------------------------+
| python_startup           | 28.5 ms  | 27.4 ms: 1.04x faster  |
+--------------------------+----------+------------------------+
| python_startup_no_site   | 23.2 ms  | 22.2 ms: 1.05x faster  |
+--------------------------+----------+------------------------+
| raytrace                 | 361 ms   | 345 ms: 1.05x faster   |
+--------------------------+----------+------------------------+
| regex_compile            | 146 ms   | 146 ms: 1.01x faster   |
+--------------------------+----------+------------------------+
| regex_effbot             | 2.03 ms  | 2.02 ms: 1.01x faster  |
+--------------------------+----------+------------------------+
| regex_v8                 | 23.9 ms  | 22.7 ms: 1.06x faster  |
+--------------------------+----------+------------------------+
| richards                 | 66.1 ms  | 59.9 ms: 1.10x faster  |
+--------------------------+----------+------------------------+
| richards_super           | 71.6 ms  | 68.7 ms: 1.04x faster  |
+--------------------------+----------+------------------------+
| scimark_fft              | 300 ms   | 294 ms: 1.02x faster   |
+--------------------------+----------+------------------------+
| scimark_lu               | 135 ms   | 131 ms: 1.03x faster   |
+--------------------------+----------+------------------------+
| scimark_monte_carlo      | 83.3 ms  | 82.4 ms: 1.01x faster  |
+--------------------------+----------+------------------------+
| scimark_sor              | 157 ms   | 150 ms: 1.05x faster   |
+--------------------------+----------+------------------------+
| scimark_sparse_mat_mult  | 4.27 ms  | 4.35 ms: 1.02x slower  |
+--------------------------+----------+------------------------+
| spectral_norm            | 122 ms   | 118 ms: 1.03x faster   |
+--------------------------+----------+------------------------+
| sqlglot_optimize         | 60.7 ms  | 60.9 ms: 1.00x slower  |
+--------------------------+----------+------------------------+
| sympy_expand             | 501 ms   | 503 ms: 1.00x slower   |
+--------------------------+----------+------------------------+
| sympy_sum                | 143 ms   | 144 ms: 1.01x slower   |
+--------------------------+----------+------------------------+
| sympy_str                | 287 ms   | 292 ms: 1.02x slower   |
+--------------------------+----------+------------------------+
| telco                    | 7.26 ms  | 7.33 ms: 1.01x slower  |
+--------------------------+----------+------------------------+
| tomli_loads              | 2.23 sec | 2.25 sec: 1.01x slower |
+--------------------------+----------+------------------------+
| typing_runtime_protocols | 189 us   | 185 us: 1.02x faster   |
+--------------------------+----------+------------------------+
| unpack_sequence          | 65.4 ns  | 68.7 ns: 1.05x slower  |
+--------------------------+----------+------------------------+
| unpickle                 | 13.9 us  | 14.1 us: 1.01x slower  |
+--------------------------+----------+------------------------+
| unpickle_pure_python     | 303 us   | 300 us: 1.01x faster   |
+--------------------------+----------+------------------------+
| xml_etree_parse          | 130 ms   | 130 ms: 1.01x slower   |
+--------------------------+----------+------------------------+
| xml_etree_iterparse      | 107 ms   | 108 ms: 1.01x slower   |
+--------------------------+----------+------------------------+
| xml_etree_process        | 79.2 ms  | 78.6 ms: 1.01x faster  |
+--------------------------+----------+------------------------+
| Geometric mean           | (ref)    | 1.01x faster           |
+--------------------------+----------+------------------------+

Benchmark hidden because not significant (20): 2to3, chaos, deepcopy_reduce, genshi_xml, html5lib, json_loads, nqueens, pathlib, pickle, pickle_dict, pickle_list, pidigits, regex_dna, sqlglot_normalize, sqlglot_parse, sqlglot_transpile, sqlite_synth, sympy_integrate, unpickle_list, xml_etree_generate

This doesn't hurt performance, and it can reduce the number of objects the GC has to check and untrack.
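
A quick way to see the "check and untrack" work from Python is to count how many freshly built tuples are tracked before and after a full collection. This is a minimal sketch using only `gc.is_tracked` and `gc.collect`; the helper names are hypothetical, and the "before" count depends on the CPython version and on how the tuple was built, so no particular number is claimed for it.

```python
import gc


def count_tracked(objects):
    """Count how many of the given objects are currently GC-tracked."""
    return sum(1 for obj in objects if gc.is_tracked(obj))


def tracked_before_and_after(n=1000):
    """Build n tuples of atomic items and count tracked ones around a collection."""
    was_enabled = gc.isenabled()
    gc.disable()  # avoid an automatic collection between the two counts
    try:
        tuples = [tuple([i, str(i)]) for i in range(n)]
        before = count_tracked(tuples)
        gc.collect()  # a full collection gives the GC a chance to untrack them
        after = count_tracked(tuples)
    finally:
        if was_enabled:
            gc.enable()
    return before, after


before, after = tracked_before_and_after()
print(f"tracked before collection: {before}, after: {after}")
```

Every tuple counted in "before" is one the collector has to visit at least once just to find out that it never needed tracking.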

Issue: Do not track immutable tuples in PyTuple_Pack #139389

sergey-miryanov · 2025-09-28T07:44:20Z

I'm not sure this needs a NEWS entry, since it is an implementation detail, but I'll defer to the triagers' and core developers' decision.

sergey-miryanov · 2025-09-28T07:44:34Z

Sorry, misclick.

ZeroIntensity · 2025-09-28T15:23:57Z

PyTuple_Pack is user-facing, so let's add a blurb.

eendebakpt · 2025-09-28T20:20:51Z

Which cases would benefit from this change to PyTuple_Pack? We could apply a similar optimization to other ways of constructing a tuple (e.g. _PyTuple_FromArray, or inside dictiter_iternextitem), but it is not immediately clear to me which ones would benefit (no GC tracking) and which would not (checking all the arguments is expensive).

PyTuple_Pack itself is slow (due to varargs), so there should not be many places where it is called frequently.

Note: I used this code to check which cases are affected:

import gc
import random

a = (1, 2, 3, 4)
b = tuple([1, 2, 3])
c = tuple(list([1, 2, 3]))
d = (2, 3) * (random.randint(2, 4) + 2)

print(f'{gc.is_tracked(a)=}')  # still tracked
print(f'{gc.is_tracked(b)=}')
print(f'{gc.is_tracked(c)=}')
print(f'{gc.is_tracked(d)=}')

# Tuples produced by iterating over dict items
items = {10: 1, 11: 2}

one_dict_item = next(iter(items.items()))
print(f'{gc.is_tracked(one_dict_item)=}')  # still tracked!

l = []
l.extend(items.items())
print(f'{gc.is_tracked(l[0])=}')  # here it works! l[0] is not tracked

for tp in items.items():
    print(f'{gc.is_tracked(tp)=}')  # tracked, even though this might be a common use case

sergey-miryanov · 2025-09-29T05:39:50Z

@eendebakpt Yeah, I did the same for:

_PyTuple_FromArray
_PyTuple_FromStackRefStealOnSuccess
_PyTuple_FromArraySteal
tuple_concat
tuple_repeat
tuple_subscript

Collecting stats and microbenchmarking now.
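
For anyone who wants to poke at these paths from Python, the sketch below reports whether the result of each operation is tracked right after creation. The mapping is my assumption from the function names (`+` for tuple_concat, `*` for tuple_repeat, slicing for tuple_subscript, `tuple(list)` for the array-based constructors), and `tracking_by_operation` is a hypothetical helper. The output depends on the CPython build, so no expected values are shown.

```python
import gc


def tracking_by_operation(base):
    """Return {operation: gc.is_tracked(result)} for tuples built at run time."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep an automatic collection from untracking results early
    try:
        n = len(base)
        results = {
            "tuple(list)": tuple(list(base)),
            "concat (a + b)": base + base,
            "repeat (a * n)": base * n,
            "slice (a[1:])": base[1:],
        }
        return {name: gc.is_tracked(value) for name, value in results.items()}
    finally:
        if was_enabled:
            gc.enable()


# Build the base tuple at run time so the compiler cannot fold the results.
for name, tracked in tracking_by_operation(tuple([1, 2, 3])).items():
    print(f"{name:16} tracked={tracked}")
```

Running this on main and on the PR branch should show which construction paths the change actually affects.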

sergey-miryanov · 2025-09-29T07:40:10Z

Tests fail because the instrumentation is rather crude and only works on my machine.

Do not track immutable tuples in PyTuple_Pack

fbb7342

bedevere-app bot added the awaiting review label Sep 28, 2025

bedevere-app bot mentioned this pull request Sep 28, 2025

Do not track immutable tuples in PyTuple_Pack #139389

Open

sergey-miryanov closed this Sep 28, 2025

sergey-miryanov reopened this Sep 28, 2025

Extra stats

be72b3f

Implement for all

04f0f66
