I looked into using scoped values to hold temporary arrays, so that parallel tasks can reuse them instead of allocating. However, accessing a scoped value appears to allocate, whereas access through task-local storage (TLS) does not. This is unfortunate, since GC activity in parallel tasks can be a performance problem.
using Base.Threads
using Base.ScopedValues  # provides ScopedValue and @with (Julia 1.11+)
using BenchmarkTools

@noinline function tlsfun()
    tlsvec = get!(() -> [0], task_local_storage(), :myvec)::Vector{Int}  # fetch or lazily create the per-task vector
    tlsvec[1] += 1
    return nothing
end

const dynvec = ScopedValue([0])

@noinline function dynfun()
    dvec = dynvec[]  # read the vector bound in the current scope
    dvec[1] += 1
    return nothing
end

function tlsrun()
    @sync for _ in 1:nthreads()
        @spawn for _ in 1:100000; tlsfun(); end
    end
end

function dynrun()
    @sync for _ in 1:nthreads()
        @with dynvec=>[0] @spawn for _ in 1:100000; dynfun(); end  # each task gets its own vector
    end
end

@btime tlsrun()
@btime dynrun()
versioninfo()
Output (the first timing is `tlsrun()`, the second is `dynrun()`, followed by `versioninfo()`):
2.326 ms (202 allocations: 21.03 KiB)
8.238 ms (2400274 allocations: 36.64 MiB)
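As a rough sanity check on these two lines, the sketch below divides the extra allocations and bytes of `dynrun()` by the number of `dynfun()` calls. It assumes `nthreads()` was 24 during the run, as the `Threads: 24 default` line of the `versioninfo()` output suggests; all variable names are illustrative.

```julia
# Assumption: 24 tasks were spawned (nthreads() == 24), each making 100000 calls.
ntasks = 24
calls_per_task = 100_000
accesses = ntasks * calls_per_task            # total dynfun() calls

# Figures copied from the @btime output.
tls_allocs, dyn_allocs = 202, 2_400_274
tls_bytes = 21.03 * 2^10                      # 21.03 KiB
dyn_bytes = 36.64 * 2^20                      # 36.64 MiB

# Treat the TLS run as the baseline cost of spawning and syncing the tasks.
allocs_per_access = (dyn_allocs - tls_allocs) / accesses
bytes_per_access = (dyn_bytes - tls_bytes) / accesses

println(round(allocs_per_access; digits = 2))  # prints 1.0
println(round(bytes_per_access; digits = 1))   # prints 16.0
```

Under that assumption the numbers work out to about one 16-byte allocation per scoped-value access, which is consistent with the access itself allocating rather than `@with` or `@spawn`.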
Julia Version 1.12.0-DEV.121
Commit bc2212cc0e* (2024-03-04 01:20 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 24 × AMD Ryzen Threadripper PRO 5945WX 12-Cores
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, znver3)
Threads: 24 default, 0 interactive, 12 GC (on 24 virtual cores)
Environment:
  JULIA_EDITOR = emacs -nw
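One possible way to limit the cost in the meantime is to read the scoped value once per task and pass the vector down as an argument, so the hot loop never performs the lookup. The following is only a sketch of that idea, not a fix for the underlying allocation: `scratch`, `bump!` and `dynrun_hoisted` are hypothetical names, and it assumes Julia 1.11 or newer, where `Base.ScopedValues` is available. It returns the per-task counters via `fetch` so the result can be checked.

```julia
using Base.Threads
using Base.ScopedValues  # assumption: Julia 1.11 or newer

const scratch = ScopedValue([0])  # hypothetical name, plays the role of `dynvec`

# Hypothetical helper: receives the buffer as an argument instead of
# reading the scoped value itself.
@noinline function bump!(buf::Vector{Int})
    buf[1] += 1
    return nothing
end

function dynrun_hoisted(ncalls = 100_000)
    tasks = map(1:nthreads()) do _
        @with scratch=>[0] @spawn begin
            buf = scratch[]        # single scoped-value lookup per task
            for _ in 1:ncalls
                bump!(buf)         # hot loop works on the local binding only
            end
            buf[1]                 # final counter, returned for checking
        end
    end
    return fetch.(tasks)
end
```

This reduces the number of scoped-value accesses from one per call to one per task. It only helps when the buffer can be threaded through the call chain, which is exactly the plumbing that scoped values and TLS are meant to avoid, so it does not remove the need for allocation-free access.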