The type used to accumulate the intermediate values of a scan differs among our back-end implementations.
This results in inconsistent scan results between back-ends when the input and output containers have different value types.
For example, scanning over an array of 16-bit int inputs and storing into an array of 32-bit int outputs could truncate the accumulated values to 16 bits even though the output can store 32 bits.
All back-ends should consistently use the output type as the accumulation type.
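The truncation described above can be reproduced without any back-end library. The following is a minimal sketch, not RAJA, CUB, or rocPRIM code: `inclusive_sum_scan` is a hypothetical helper, and its `Acc` template parameter is an assumption introduced here only to make the accumulation type explicit.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of RAJA, CUB, or rocPRIM): an inclusive sum
// scan whose running total is held in an explicitly chosen type Acc.
template <typename Acc, typename In, typename Out>
void inclusive_sum_scan(const std::vector<In>& in, std::vector<Out>& out)
{
  out.resize(in.size());
  Acc running = Acc(0);
  for (std::size_t i = 0; i < in.size(); ++i) {
    // The sum is narrowed back to Acc on every step, so a small Acc loses
    // the high bits before the value ever reaches the wider output.
    running = static_cast<Acc>(running + in[i]);
    out[i] = static_cast<Out>(running);
  }
}
```

With 16-bit inputs `{20000, 20000}` and a 32-bit output, `inclusive_sum_scan<std::int32_t>` stores 40000 in the second element. `inclusive_sum_scan<std::int16_t>` cannot represent 40000 in its accumulator, so the second output differs (typically -25536 on two's-complement platforms) even though the output element is wide enough.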
Looking into this more closely, no released version of rocPRIM has an AccType template argument as of ROCm 6.1.0, though this template argument does exist on rocPRIM develop. So it doesn't look like we can fix this in RAJA until a future version of ROCm. @gunney1
We default our initial values to the type from the input container.
https://github.com/LLNL/RAJA/blob/develop/include/RAJA/pattern/scan.hpp#L262
We use the type of the output container to accumulate in the sequential backend.
https://github.com/LLNL/RAJA/blob/develop/include/RAJA/policy/sequential/scan.hpp#L146
CUB uses the type of the output container to accumulate for the CUDA backend.
https://github.com/dmlc/cub/blob/05eb57faa0a4cac37c2a86fdf4b4dc865a95a1a3/cub/agent/agent_scan.cuh#L110
rocPRIM uses the type of the input container to accumulate for the HIP backend.
https://github.com/ROCm/rocPRIM/blob/develop/rocprim/include/rocprim/device/device_scan.hpp#L653