Currently, CUDA kernels get passed const structs like LuDecomposeParam which include pointers to arrays of indices. However, the array element types are not const. Explore whether creating something like a ConstLuDecomposeParam struct that has data member pointers to const data, and then having this type be used as the argument to CUDA kernels (with appropriate assignment operator overloads) will result in better pre-fetching of data in memory during the execution of the kernel.
@mattldawson My bad! After adding the const qualifier to all the CUDA kernels, the CUDA unit tests run significantly faster. Thus I think it is helpful to add it here.
Possible definition of new struct:
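The struct definition originally posted at this point is not reproduced here. As a rough sketch of the idea only, the following shows one way such a struct could look. The member names `indices_` and `indices_size_`, the simplified `LuDecomposeParam` stand-in, and the `SumIndices` helper are all hypothetical and are not taken from the project. The code is plain C++ so it can be checked on the host; in a real kernel the struct would be passed by value as the kernel argument.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for the existing parameter struct:
// a pointer to mutable index data plus its length.
struct LuDecomposeParam
{
  std::size_t* indices_ = nullptr;
  std::size_t indices_size_ = 0;
};

// Sketch of the proposed variant: same members, but the pointer
// refers to const data, so code receiving it cannot write through it.
struct ConstLuDecomposeParam
{
  const std::size_t* indices_ = nullptr;
  std::size_t indices_size_ = 0;

  ConstLuDecomposeParam() = default;

  // Converting constructor from the mutable version.
  ConstLuDecomposeParam(const LuDecomposeParam& other)
      : indices_(other.indices_),
        indices_size_(other.indices_size_)
  {
  }

  // Assignment overload from the mutable version.
  ConstLuDecomposeParam& operator=(const LuDecomposeParam& other)
  {
    indices_ = other.indices_;
    indices_size_ = other.indices_size_;
    return *this;
  }
};

// The struct must stay trivially copyable to be passed by value to a kernel.
static_assert(std::is_trivially_copyable<ConstLuDecomposeParam>::value,
              "kernel arguments are copied bytewise");

// Host-side stand-in for a kernel body (hypothetical): reads the indices only.
inline std::size_t SumIndices(const ConstLuDecomposeParam param)
{
  assert(param.indices_ != nullptr || param.indices_size_ == 0);
  std::size_t sum = 0;
  for (std::size_t i = 0; i < param.indices_size_; ++i)
    sum += param.indices_[i];
  return sum;
}
```

Because the conversion is implicit, existing code that builds a `LuDecomposeParam` could pass it directly to a function taking `ConstLuDecomposeParam`. Whether this changes the generated device code or the measured run time is exactly what the issue proposes to explore.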
Alternatively, we can replace the existing LuDecomposeParam struct with the new one above (without the initializer) and see if it works too.