nogil multi-threading is slower than multi-threading with gil for CPU bound #118749

Closed as not planned
@Venkat2811

Description


Bug report

Bug description:

Hello Team,

Thanks for the great work so far and for the recent nogil efforts. I wanted to explore the performance of asyncio with CPU/GPU-bound multi-threading in a nogil setup, so I started with the simple benchmark below:

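import sys
from concurrent.futures import ThreadPoolExecutor


def fib(n):
    # Naive recursion; with these base cases fib(34) == 9227465.
    if n < 2:
        return 1
    return fib(n - 1) + fib(n - 2)


# Number of worker threads; override with the first command-line argument.
threads = 8
if len(sys.argv) > 1:
    threads = int(sys.argv[1])

# Submit one CPU-bound task per worker thread.
with ThreadPoolExecutor(max_workers=threads) as executor:
    for _ in range(threads):
        executor.submit(lambda: print(fib(34)))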

Original Source

Results:

  • Python 3.13.0a6 with the GIL:
    htop shows 2 running tasks, and only one core is at 100% utilization.
$ time /usr/bin/python3.13 fib.py      
9227465
9227465
9227465
9227465
9227465
9227465
9227465
9227465
/usr/bin/python3.13 fib.py  6,46s user 0,04s system 100% cpu 6,447 total
  • libpython3.13-nogil amd64 3.13.0~a6-1+jammy2 source nogil:
    htop shows 9 running tasks and 8 cores at close to 100% utilization, but the run is slower.
$ time /usr/bin/python3.13-nogil -X gil=0 fib.py
9227465
9227465
9227465
9227465
9227465
9227465
9227465
9227465
/usr/bin/python3.13-nogil -X gil=0 fib.py  168,81s user 0,06s system 788% cpu 21,410 total
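
To put the two `time` outputs side by side, here is a small sketch of the arithmetic. The figures are copied from the runs above with comma decimals converted to dots; the variable names are only illustrative.

```python
# Figures copied from the two `time` outputs above (seconds).
gil = {"user": 6.46, "wall": 6.447}
nogil = {"user": 168.81, "wall": 21.410}

# How much longer the nogil run took in wall-clock time.
wall_slowdown = nogil["wall"] / gil["wall"]

# How much more total CPU time the nogil run consumed.
cpu_ratio = nogil["user"] / gil["user"]

# Average number of cores kept busy during the nogil run.
cores_busy = nogil["user"] / nogil["wall"]

print(f"wall-clock slowdown: {wall_slowdown:.2f}x")  # about 3.32x
print(f"CPU time ratio:      {cpu_ratio:.1f}x")      # about 26.1x
print(f"average busy cores:  {cores_busy:.2f}")      # about 7.88
```

So the nogil run keeps nearly 8 cores busy (consistent with the 788% figure) yet spends roughly 26 times more CPU time on the same work, which is why it finishes about 3.3 times later.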

My CPU: AMD Ryzen 7 5800X 8-Core Processor
OS: Ubuntu 22.04.4 LTS

So it looks like there is overhead when using multiple cores. Is this expected with this version? Are the results similar on Intel and M1 CPUs as well?
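
For anyone who wants to compare on other hardware, a minimal sketch of a scaling check follows. It is not part of the original benchmark: `gil_status` and `timed_run` are hypothetical helper names, `N` is an arbitrary small workload chosen so the script finishes quickly, and `sys._is_gil_enabled()` is assumed to be available only on CPython 3.13 and later (the sketch falls back to reporting `True` otherwise).

```python
import sys
import time
from concurrent.futures import ThreadPoolExecutor

N = 25  # Assumed small workload; raise it (e.g. to 34) for a longer run.


def fib(n):
    # Same recursive definition as the benchmark in this report.
    if n < 2:
        return 1
    return fib(n - 1) + fib(n - 2)


def gil_status():
    # sys._is_gil_enabled() exists on CPython 3.13+; older builds always have a GIL.
    check = getattr(sys, "_is_gil_enabled", None)
    return check() if check is not None else True


def timed_run(workers, n):
    # Run `workers` identical fib(n) tasks, one per thread, and time the batch.
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as executor:
        results = list(executor.map(fib, [n] * workers))
    return time.perf_counter() - start, results


if __name__ == "__main__":
    print("GIL enabled:", gil_status())
    for workers in (1, 2, 4, 8):
        elapsed, _ = timed_run(workers, N)
        print(f"{workers} thread(s): {elapsed:.3f}s total")
```

With the GIL enabled, the total time should grow roughly in proportion to the thread count, since the tasks run one at a time. If threads scale well without the GIL, the total should stay close to the single-thread time; a total that grows faster than the thread count points to per-thread overhead like that reported here.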

The results documented here for version 3.9.12 on Intel are better.

CPython versions tested on:

3.13

Operating systems tested on:

Linux
