You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Tullio was detecting that i is unsafe to multi-thread, but wasn't passing this to KernelAbstractions:
using CUDA, Tullio, KernelAbstractions
let A = CUDA.zeros(4), I =cu([1,3,1]), v =cu([1,2,3])
@tullio A[I[i]] += v[i]
end#+RESULTS:
: 4-element CuArray{Float32,1}:
: 1.0
: 0.0
: 2.0
: 0.0
After my attempt at a fix in 4d30a1f (which runs on the CPU):
julia>let A = CUDA.zeros(4), I =cu([1,3,1]), v =cu([1,2,3])
@tullio A[I[i]] += v[i]
end
ERROR: AssertionError:length(__workgroupsize) <=length(ndrange)
Stacktrace:
[1] partition at /home/user/.julia/packages/KernelAbstractions/jAutM/src/nditeration.jl:103 [inlined]
[2] partition(::KernelAbstractions.Kernel{CUDADevice,KernelAbstractions.NDIteration.StaticSize{(256,)},KernelAbstractions.NDIteration.DynamicSize,var"#gpu_##🇨🇺#253#3"}, ::Tuple{}, ::Nothing) at /home/user/.julia/packages/KernelAbstractions/jAutM/src/KernelAbstractions.jl:385
Besides not running, this is also too cautious, as it would apply to @tullio A[I[i]] = v[i] where it would be more justifiable to over-write -- there is no guaranteed order, but accumulation ought to be right.
@dfdx points out that ScatterNNlib.jl has an approach using atomic_add which may be worth trying to copy:
From here: https://discourse.julialang.org/t/increment-elements-of-array-by-index/49694/5
Tullio was detecting that
i
is unsafe to multi-thread, but wasn't passing this to KernelAbstractions:After my attempt at a fix in 4d30a1f (which runs on the CPU):
Besides not running, this is also too cautious, as it would apply to
@tullio A[I[i]] = v[i]
where it would be more justifiable to over-write -- there is no guaranteed order, but accumulation ought to be right.@dfdx points out that ScatterNNlib.jl has an approach using
atomic_add
which may be worth trying to copy:https://github.com/yuehhua/ScatterNNlib.jl/blob/74622bf40437f16e10468e8a35dae824527c83cf/src/cuda/cuarray.jl#L6-L17
The text was updated successfully, but these errors were encountered: