Task scheduling regression #54101

Open
sgaure opened this issue Apr 16, 2024 · 1 comment
Labels
domain:multithreading Base.Threads and related functionality kind:regression Regression in behavior compared to a previous version

sgaure commented Apr 16, 2024

Here is an MWE. It runs a varying number of tasks in parallel, each with the same workload. Each task yields every 1024th iteration (`i & 0x3ff == 0`). Optionally, the tasks can be given different priorities (0, 1, ...). On 1.10.2 all runs are equally fast. On master (1.12-DEV), a run takes longer the more tasks are running, unless the tasks have different priorities.

using Base.Threads
using InteractiveUtils

const allthreads = 1:nthreads()

function work(n, prio, setprio)
    setprio && (current_task().priority = atomic_add!(prio, UInt16(1)))
    x = pi/4
    for i in 1:n
        x *= 4*(1-x)
        (i & 0x3ff == 0) && yield()
    end
    return x
end

function run(ntasks, setprio)
    prio = Atomic{UInt16}(0)
    @sync for i in 1:ntasks
        @spawn work(10_000_000, prio, setprio)
    end
end

function run(setprio)
    println(setprio ? "Different priority:" : "Same priority:")
    for i in allthreads
        print(i," ")
        @time run(i, setprio)
    end
end

# burn in
run.(allthreads, false)
run.(allthreads, true)

run(true)
run(false)

versioninfo()
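
As a side check on how often each task yields: the mask `0x3ff` is 1023, so the condition in `work` fires on every multiple of 1024 rather than every 1000th iteration. The following is a minimal sketch, separate from the MWE, that counts those iterations for the `n = 10_000_000` used above; the variable name `yields_per_task` is introduced here for illustration only.

```julia
# Count how often the yield condition from `work` fires for n = 10_000_000.
# In Julia `&` binds tighter than `==`, so the test is (i & 0x3ff) == 0.
n = 10_000_000
yields_per_task = count(i -> i & 0x3ff == 0, 1:n)

println("yield() calls per task: ", yields_per_task)
println("matches n ÷ 1024: ", yields_per_task == n ÷ 1024)
```

Each task therefore passes through the scheduler several thousand times per run, which is why scheduling overhead can show up in the timings.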

Output (master):

Different priority:
1   0.117872 seconds (16 allocations: 976 bytes)
2   0.078606 seconds (22 allocations: 1.469 KiB)
3   0.077994 seconds (28 allocations: 1.984 KiB)
4   0.068771 seconds (34 allocations: 2.500 KiB)
5   0.071072 seconds (40 allocations: 3.016 KiB)
6   0.068591 seconds (46 allocations: 3.531 KiB)
7   0.067906 seconds (52 allocations: 4.047 KiB)
8   0.071773 seconds (58 allocations: 4.562 KiB)
9   0.073936 seconds (65 allocations: 5.438 KiB)
10   0.074904 seconds (71 allocations: 5.953 KiB)
11   0.075124 seconds (77 allocations: 6.469 KiB)
12   0.073555 seconds (83 allocations: 6.984 KiB)
Same priority:
1   0.104835 seconds (16 allocations: 976 bytes)
2   0.103137 seconds (22 allocations: 1.469 KiB)
3   0.148487 seconds (28 allocations: 1.984 KiB)
4   0.156543 seconds (34 allocations: 2.500 KiB)
5   0.169090 seconds (40 allocations: 3.016 KiB)
6   0.216241 seconds (46 allocations: 3.531 KiB)
7   0.259929 seconds (52 allocations: 4.047 KiB)
8   0.293499 seconds (58 allocations: 4.562 KiB)
9   0.316822 seconds (65 allocations: 5.438 KiB)
10   0.338824 seconds (71 allocations: 5.953 KiB)
11   0.385623 seconds (77 allocations: 6.469 KiB)
12   0.406967 seconds (83 allocations: 6.984 KiB)
Julia Version 1.12.0-DEV.317
Commit 0e28cf6abf* (2024-04-08 10:46 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 24 × AMD Ryzen Threadripper PRO 5945WX 12-Cores
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, znver3)
Threads: 12 default, 0 interactive, 12 GC (on 24 virtual cores)
Environment:
  JULIA_EXCLUSIVE = 1
  JULIA_NUM_THREADS = auto

Output (1.10.2):

Different priority:
1   0.087411 seconds (16 allocations: 976 bytes)
2   0.069971 seconds (22 allocations: 1.469 KiB)
3   0.063508 seconds (28 allocations: 1.984 KiB)
4   0.059385 seconds (34 allocations: 2.500 KiB)
5   0.057421 seconds (40 allocations: 3.016 KiB)
6   0.068471 seconds (46 allocations: 3.531 KiB)
7   0.062917 seconds (52 allocations: 4.047 KiB)
8   0.062996 seconds (58 allocations: 4.562 KiB)
9   0.062980 seconds (65 allocations: 5.406 KiB)
10   0.065663 seconds (71 allocations: 5.922 KiB)
11   0.074091 seconds (77 allocations: 6.438 KiB)
12   0.069455 seconds (83 allocations: 6.953 KiB)
Same priority:
1   0.066261 seconds (16 allocations: 976 bytes)
2   0.071509 seconds (22 allocations: 1.469 KiB)
3   0.064522 seconds (28 allocations: 1.984 KiB)
4   0.060123 seconds (34 allocations: 2.500 KiB)
5   0.070699 seconds (40 allocations: 3.016 KiB)
6   0.066037 seconds (46 allocations: 3.531 KiB)
7   0.064814 seconds (52 allocations: 4.047 KiB)
8   0.064782 seconds (58 allocations: 4.562 KiB)
9   0.066858 seconds (65 allocations: 5.406 KiB)
10   0.073471 seconds (71 allocations: 5.922 KiB)
11   0.073256 seconds (77 allocations: 6.438 KiB)
12   0.079457 seconds (83 allocations: 6.953 KiB)
Julia Version 1.10.2
Commit bd47eca2c8a (2024-03-01 10:14 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 24 × AMD Ryzen Threadripper PRO 5945WX 12-Cores
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 12 default, 0 interactive, 6 GC (on 24 virtual cores)
Environment:
  JULIA_EXCLUSIVE = 1
  JULIA_NUM_THREADS = auto
@giordano giordano added kind:regression Regression in behavior compared to a previous version domain:multithreading Base.Threads and related functionality labels Apr 16, 2024
@giordano giordano added this to the 1.12 milestone Apr 16, 2024
maleadt (Member) commented Apr 16, 2024

> On 1.10.2 all runs are equally fast.

I think it always creeps up a little bit (in the same-priority case); it's just much more pronounced on master.

Bisected that increment to #50427... cc @gbaraldi
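
To put a number on "creeps up" versus "much more pronounced", the same-priority timings reported above can be compared directly. This is a rough sketch using only the 1-task and 12-task figures from the two outputs; the names `timings` and `slowdown` are hypothetical helpers, and single `@time` samples are noisy, so the ratios are indicative only.

```julia
# Same-priority timings in seconds, copied from the outputs above
# (1 task and 12 tasks, on master and on 1.10.2).
timings = (
    master  = (t1 = 0.104835, t12 = 0.406967),
    v1_10_2 = (t1 = 0.066261, t12 = 0.079457),
)

# Ratio of the 12-task time to the 1-task time.
slowdown(t) = t.t12 / t.t1

for (name, t) in pairs(timings)
    println(name, ": ", round(slowdown(t); digits = 2), "x")
end
```

On these samples the 12-task run is roughly 3.9 times slower than the 1-task run on master, against roughly 1.2 times on 1.10.2.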
