Description
Bug report
Bug description:
Summary
I am currently benchmarking the performance of the experimental --disable-gil mode in Python 3.14b2.
In most CPU-bound cases, I observed excellent improvements (×3–×4 speedups). However, in one more realistic test case, performance drops significantly when using --disable-gil.
This issue is not really a bug report but performance feedback, and an invitation to discuss whether this behavior is expected.
Reproducible test case
Nearest-neighbor search using a few threads over a list of 10 million 3D points (list[list[float]]).
The algorithm:
- splits the search range evenly across threads
- uses a shared dict, guarded by a threading.Lock, to store the result
- each thread compares distance2(p, query_point) and updates shared data if needed
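The steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the repository: the names nearest_neighbor, worker, and best are hypothetical, and the unlocked pre-check before taking the lock is an assumption about how "updates shared data if needed" might be implemented.

```python
import random
import threading


def distance2(p, q):
    """Squared Euclidean distance between two 3D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 + (p[2] - q[2]) ** 2


def nearest_neighbor(points, query_point, n_threads=4):
    """Return (index, squared distance) of the point closest to query_point."""
    # Shared result, guarded by a lock (hypothetical layout of the shared dict).
    best = {"index": -1, "dist2": float("inf")}
    lock = threading.Lock()

    def worker(start, stop):
        for i in range(start, stop):
            d = distance2(points[i], query_point)
            # Assumption: cheap unlocked pre-check, re-checked under the lock
            # so that a concurrent update is never overwritten by a worse one.
            if d < best["dist2"]:
                with lock:
                    if d < best["dist2"]:
                        best["dist2"] = d
                        best["index"] = i

    # Split the index range evenly across threads (ceiling division).
    n = len(points)
    chunk = (n + n_threads - 1) // n_threads
    threads = [
        threading.Thread(target=worker, args=(k * chunk, min((k + 1) * chunk, n)))
        for k in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return best["index"], best["dist2"]


if __name__ == "__main__":
    # Small sample only; the benchmark described here uses 10 million points.
    random.seed(0)
    pts = [[random.random() for _ in range(3)] for _ in range(1000)]
    print(nearest_neighbor(pts, [0.5, 0.5, 0.5]))
```

Every thread in this sketch reads the same shared dict and the same list of small list objects on each iteration, which is where the candidate explanations raised in this report (lock contention, list access overhead, memory locality) would show up.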
Repository with code and Dockerfile:
https://github.com/basileMarchand/benchmark_python3.14_nogil
Benchmark article:
https://dev.to/basilemarchand/benchmarks-of-python-314b2-with-disable-gil-1ml3
🔢 Results
Mode | Elapsed Time |
---|---|
Python GIL ON | 2.38 s |
Python GIL OFF | 3.61 s ❗ |
C++ + threads | 0.88 s ✅ |
Question
This test is the only case in my benchmarks that runs slower with --disable-gil, even though it is purely CPU-bound.
Is this performance drop expected with the current implementation of nogil?
Any hints on what could explain it? (lock contention, list access overhead, memory locality, etc.)
Environment
- Python: 3.14.0b2, compiled with --disable-gil
- Platform: Docker / Linux (see repo for details)
- Memory: 10M Python objects (list of lists)
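When comparing the two modes, it can help to confirm inside the benchmark process that the GIL is really off, since a free-threaded build can still run with the GIL enabled (for example via PYTHON_GIL=1 or -X gil=1, or after importing an extension module that does not declare free-threading support). A small check, where gil_status is a hypothetical helper name:

```python
import sys
import sysconfig


def gil_status():
    """Return (is_free_threaded_build, is_gil_enabled_now)."""
    # Py_GIL_DISABLED is 1 on free-threaded builds, 0 or None otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() exists on Python 3.13+; on older versions the
    # GIL is always on, so fall back to True.
    gil_enabled = bool(getattr(sys, "_is_gil_enabled", lambda: True)())
    return free_threaded_build, gil_enabled


if __name__ == "__main__":
    build, enabled = gil_status()
    print(f"free-threaded build: {build}, GIL enabled at runtime: {enabled}")
```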
Thanks for your amazing work on nogil; it's a huge leap forward 🚀
CPython versions tested on:
3.14
Operating systems tested on:
Linux