Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Base.mapreducedim returns wrong answer with non-zero target array #720

Closed
vchuravy opened this issue Feb 18, 2021 · 2 comments
Closed

Base.mapreducedim returns wrong answer with non-zero target array #720

vchuravy opened this issue Feb 18, 2021 · 2 comments

Comments

@vchuravy
Copy link
Member

julia> T = CuArray; N = 2^4; A = T(Float64.(1:N)); R = T(ones(Float64,N)); CUDA.@time Base.mapreducedim!(identity, +, R, A); R
  1.299799 seconds (2.15 M CPU allocations: 110.093 MiB, 2.06% gc time)
16-element CuArray{Float64,1}:
  96.0
  97.0
  98.0
  99.0
 100.0
 101.0
 102.0
 103.0
 104.0
 105.0
 106.0
 107.0
 108.0
 109.0
 110.0
 111.0

I assume this is because in

CUDA.jl/src/mapreduce.jl

Lines 104 to 108 in b109fda

neutral = if neutral === nothing
R[Iout]
else
neutral
end
we assume that a neutral element is given, but we actually don't have one available.

Passing a explicit init causes R to be updated in place without adding the first value in.

julia> T = CuArray; N = 2^4; A = T(Float64.(1:N)); R = T(ones(Float64,N)); CUDA.@time CUDA.GPUArrays.mapreducedim!(identity, +, R, A; init=0.0); R
 22.333327 seconds (28.82 M CPU allocations: 1.430 GiB, 4.32% gc time), 0.00% GPU gc time
16-element CuArray{Float64,1}:
  1.0
  2.0
  3.0
  4.0
  5.0
  6.0
  7.0
  8.0
  9.0
 10.0
 11.0
 12.0
 13.0
 14.0
 15.0
 16.0

cc: @jpsamaroo

@vchuravy vchuravy added the bug Something isn't working label Feb 18, 2021
@maleadt
Copy link
Member

maleadt commented Feb 19, 2021

The in-place mapreducedim APIs are not exported, but even in Base they assume that R is filled with neutral elements (although undocumented). The difference here is that the GPU implementation performs multiple reductions using available values and the neutral ones, hence the resulting value is larger, but even in the CPU case the value is used (hence Base.mapreducedim!(identity, +, ones(Float64,1), ones(Float64,1)) results in 2). The GPU-specific version with init is just there to avoid having to copy R and its neutral elements in the case we need to reduce using multiple steps.

@maleadt maleadt removed the bug Something isn't working label Feb 19, 2021
@vchuravy
Copy link
Member Author

ah typical. source of errors sits 90cm in front of my monitor.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants