julia> T = CuArray; N = 2^4; A = T(Float64.(1:N)); R = T(ones(Float64, N)); CUDA.@time Base.mapreducedim!(identity, +, R, A); R
  1.299799 seconds (2.15 M CPU allocations: 110.093 MiB, 2.06% gc time)
16-element CuArray{Float64,1}:
  96.0
  97.0
  98.0
  99.0
 100.0
 101.0
 102.0
 103.0
 104.0
 105.0
 106.0
 107.0
 108.0
 109.0
 110.0
 111.0
I assume this is because in CUDA.jl/src/mapreduce.jl we assume that a neutral element is given, but we actually don't have one available.
Passing an explicit init causes R to be overwritten in place, without its initial value being added in.
julia> T = CuArray; N = 2^4; A = T(Float64.(1:N)); R = T(ones(Float64, N)); CUDA.@time CUDA.GPUArrays.mapreducedim!(identity, +, R, A; init=0.0); R
 22.333327 seconds (28.82 M CPU allocations: 1.430 GiB, 4.32% gc time), 0.00% GPU gc time
16-element CuArray{Float64,1}:
  1.0
  2.0
  3.0
  4.0
  5.0
  6.0
  7.0
  8.0
  9.0
 10.0
 11.0
 12.0
 13.0
 14.0
 15.0
 16.0
The in-place mapreducedim! APIs are not exported, and even in Base they assume, although this is undocumented, that R is filled with neutral elements. The difference here is that the GPU implementation performs multiple reductions that combine the available values with the neutral ones, so the resulting value is larger. Even on the CPU, though, the existing value of R is used: Base.mapreducedim!(identity, +, ones(Float64,1), ones(Float64,1)) results in 2. The GPU-specific version with init is only there to avoid having to copy R and its neutral elements when we need to reduce in multiple steps.
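To make the CPU behaviour described above concrete, here is a minimal sketch that needs no GPU. It assumes a Julia 1.x session; Base.mapreducedim! is internal and unexported, so its behaviour may differ between Julia versions. The variable names (R_ones, R_zero, r_acc, r_sum) are only illustrative.

```julia
# CPU-only sketch; Base.mapreducedim! is an internal, unexported function.
A = Float64.(1:16)

# R pre-filled with ones: the existing contents of R take part in the reduction.
R_ones = ones(Float64, 16)
Base.mapreducedim!(identity, +, R_ones, A)   # R_ones is now A .+ 1

# R pre-filled with the neutral element of + yields the plain result.
R_zero = zeros(Float64, 16)
Base.mapreducedim!(identity, +, R_zero, A)   # R_zero is now equal to A

# The same holds when a dimension is actually reduced.
r_acc = ones(Float64, 1)
Base.mapreducedim!(identity, +, r_acc, A)    # 1 + sum(1:16)

# The exported sum! initialises its destination first, so old contents are ignored.
r_sum = ones(Float64, 1)
sum!(r_sum, A)                               # sum(1:16)
```

If an in-place reduction is needed from user code, the exported functions such as sum! are probably the safer choice, since they take care of filling R with the neutral element themselves.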
Referenced code: CUDA.jl/src/mapreduce.jl, lines 104 to 108 in b109fda
cc: @jpsamaroo