Slower performance with --disable-gil on threaded nearest-neighbor benchmark (10M points)

# Bug report

### Bug description:


## Summary

I am benchmarking the experimental free-threaded build of Python 3.14.0b2 (configured with `--disable-gil`).  
In most CPU-bound cases, I observed excellent improvements (×3–×4 speedups). However, in one more realistic test case, performance **drops** significantly when using `--disable-gil`.

This issue is not really a bug report, but performance feedback and an invitation to discuss whether this behavior is expected.

---

## Reproducible test case

**Nearest-neighbor search** using a few threads over a list of 10 million 3D points (`list[list[float]]`).

The algorithm:
- splits the search range evenly across threads
- stores the current best result in a shared `dict` protected by a `threading.Lock`
- each thread computes `distance2(p, query_point)` for its points, compares it with the current best, and updates the shared data if it is smaller
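For readers who do not want to open the repository, the steps above can be sketched roughly as follows. This is a minimal sketch, not the benchmark code itself: the function name `nearest_neighbor`, the `n_threads` parameter, the keys of the shared `dict`, and the unlocked pre-check before taking the lock are all assumptions made for illustration. Only `distance2`, the shared `dict`, and the `threading.Lock` come from the description.

```python
import random
import threading


def distance2(p, q):
    """Squared Euclidean distance between two 3D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 + (p[2] - q[2]) ** 2


def nearest_neighbor(points, query_point, n_threads=4):
    # Shared result, guarded by a lock (key names are hypothetical).
    best = {"index": -1, "dist2": float("inf")}
    lock = threading.Lock()

    def worker(start, stop):
        for i in range(start, stop):
            d = distance2(points[i], query_point)
            # Assumption: cheap unlocked pre-check, then re-check under the lock.
            if d < best["dist2"]:
                with lock:
                    if d < best["dist2"]:
                        best["dist2"] = d
                        best["index"] = i

    # Split the index range evenly across threads (last chunk may be shorter).
    chunk = (len(points) + n_threads - 1) // n_threads
    threads = []
    for k in range(n_threads):
        start, stop = k * chunk, min((k + 1) * chunk, len(points))
        threads.append(threading.Thread(target=worker, args=(start, stop)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return best["index"], best["dist2"]


if __name__ == "__main__":
    # Small input for a quick check; the benchmark itself uses 10 million points.
    random.seed(0)
    pts = [[random.random() for _ in range(3)] for _ in range(100_000)]
    print(nearest_neighbor(pts, [0.5, 0.5, 0.5]))
```

Note that every thread reads the same `points` list, the same inner lists, and the same `best` dict on each iteration, so all threads touch the same shared objects throughout the loop.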

### Repository with code and Dockerfile: 

https://github.com/basileMarchand/benchmark_python3.14_nogil

### Benchmark article:

https://dev.to/basilemarchand/benchmarks-of-python-314b2-with-disable-gil-1ml3

---

## 🔢 Results

| Mode           | Elapsed Time |
|----------------|--------------|
| Python GIL ON  | 2.38 s       |
| Python GIL OFF | 3.61 s ❗     |
| C++ + threads  | 0.88 s ✅     |

---

## Question

This test is the only case in my benchmarks that runs **slower** with `--disable-gil` (3.61 s versus 2.38 s, about 1.5× slower), even though it is purely CPU-bound.

Is this performance drop expected with the current implementation of `nogil`?

Any hints on what could explain it? (lock contention, list access overhead, memory locality, etc.)
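One way to probe the lock-contention hypothesis would be to time a variant in which each thread keeps its own local best and the results are merged once after all threads have joined, so that no lock and no shared `dict` are touched inside the hot loop. The sketch below is not taken from the repository; the name `nearest_neighbor_local` and the `n_threads` parameter are hypothetical. If this variant is also slower without the GIL, the shared result is probably not the main cause, and access to the shared `points` list becomes the more likely suspect.

```python
import threading


def distance2(p, q):
    """Squared Euclidean distance between two 3D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 + (p[2] - q[2]) ** 2


def nearest_neighbor_local(points, query_point, n_threads=4):
    # One result slot per thread; each slot is written once, so no lock is used.
    results = [(float("inf"), -1)] * n_threads

    def worker(k, start, stop):
        best_d, best_i = float("inf"), -1
        for i in range(start, stop):
            d = distance2(points[i], query_point)
            if d < best_d:
                best_d, best_i = d, i
        results[k] = (best_d, best_i)

    chunk = (len(points) + n_threads - 1) // n_threads
    threads = []
    for k in range(n_threads):
        start, stop = k * chunk, min((k + 1) * chunk, len(points))
        threads.append(threading.Thread(target=worker, args=(k, start, stop)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Merge the per-thread results once, after all threads have finished.
    best_d, best_i = min(results)
    return best_i, best_d
```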

---

## Environment

- Python: 3.14.0b2, compiled with `--disable-gil`
- Platform: Docker / Linux (see repo for details)
- Memory: 10M Python objects (list of lists)
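To make the environment easier to confirm, a short check like the one below can be run inside the container before each benchmark. It is a sketch that assumes Python 3.13 or later for `sys._is_gil_enabled()` and falls back to reporting the GIL as enabled on older versions; the helper name `gil_status` is hypothetical. The runtime check is useful because a free-threaded build can still run with the GIL enabled, for example when an imported extension module does not declare free-threading support.

```python
import sys
import sysconfig


def gil_status():
    # Py_GIL_DISABLED is 1 on free-threaded builds, 0 or None otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() exists on Python 3.13+; assume the GIL is on if absent.
    gil_enabled = bool(getattr(sys, "_is_gil_enabled", lambda: True)())
    return {"free_threaded_build": free_threaded_build, "gil_enabled": gil_enabled}


if __name__ == "__main__":
    print(sys.version)
    print(gil_status())
```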

Thanks for your amazing work on `nogil`. It’s a huge leap forward 🚀




### CPython versions tested on:

3.14

### Operating systems tested on:

Linux
