parallel_launch_local_memory and cuda 7.5 #125

bathmatt · 2015-11-09T21:30:43Z

Getting this error with CUDA 7.5:
/home/mbetten/Trilinos/cuda-intrepid-install-opt/include/Cuda/Kokkos_CudaExec.hpp(181):
Error: Formal parameter space overflowed (4096 bytes max) in function ZN6Kokkos4Impl33cuda_parallel_launch_local_memoryINS0_11ParallelForI19WeightChargeFunctorNS_10TeamPolicyINS_4CudaEvS5_EEEEEEvT

Christian said:
OK, I found it. It is in the new, more accurate function that figures out the best team size, etc. You can find it in this file:
kokkos/core/src/Cuda/Kokkos_Cuda_Internal.hpp

For now, if you replace all "cuda_parallel_launch_local" with "cuda_parallel_launch_constant" in that file, it should work again.

I need to split the functions and make the "Large" check a template parameter, so that both branches are not instantiated for each functor. Bummer. We also need to add a test with a functor larger than 4 kB to our test suite to catch this next time.

Christian
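A rough sketch of the kind of oversized functor such a regression test could use. This is plain host-side C++, not Kokkos code: `OversizedFunctor`, `kParamSpaceLimit`, and `exceeds_param_space` are hypothetical names, and the 4096-byte limit is taken from the error message above.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: the 4096-byte formal parameter limit quoted in the error message.
constexpr std::size_t kParamSpaceLimit = 4096;

// Hypothetical functor that carries enough by-value state to exceed the limit.
struct OversizedFunctor {
  double weights[1024];  // 1024 doubles, 8192 bytes with 8-byte doubles
  double bias;

  // Trivial body so the functor can actually be called in a test.
  double operator()(std::size_t i) const { return weights[i % 1024] + bias; }
};

// Fail at compile time if the functor ever shrinks below the limit,
// because the test would then no longer exercise the large-functor path.
static_assert(sizeof(OversizedFunctor) > kParamSpaceLimit,
              "test functor must be larger than the parameter space limit");

// Helper that reports whether a functor type is too large to be passed
// as a kernel argument under the assumed limit.
template <class Functor>
constexpr bool exceeds_param_space() {
  return sizeof(Functor) > kParamSpaceLimit;
}
```

In a real test, a functor like this would be handed to a parallel dispatch with a team policy so that the launch code is instantiated for a type above the limit.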

Fixes an issue with cuda_get_max_block_size and cuda_get_opt_block_size. The choice of constant vs. local memory is now a template parameter whose default is determined by the size of the existing DriverType template parameter. This also changes the interface by adding a new shmem_extra argument, which is required for lambdas because the functor in those cases doesn't have a shmem size function. Both functions are part of the Impl namespace and thus not public yet.
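The idea of defaulting a template parameter from the size of DriverType can be sketched as follows. This is an illustration, not the actual Kokkos code: `LaunchSelector`, `LaunchPath`, `select_launch_path`, and the threshold `kMaxKernelArgBytes` are hypothetical, and the threshold Kokkos really uses may differ. Because the choice is a template parameter, only the matching specialization is instantiated for a given functor type.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: use the 4096-byte limit from the error message as the threshold.
constexpr std::size_t kMaxKernelArgBytes = 4096;

// Hypothetical tag naming the two launch mechanisms discussed in the thread.
enum class LaunchPath { ConstantMemory, LocalMemory };

// Primary template: the "Large" flag defaults from sizeof(DriverType).
// Small drivers take the local-memory path (driver passed as a kernel argument).
template <class DriverType,
          bool Large = (sizeof(DriverType) > kMaxKernelArgBytes)>
struct LaunchSelector {
  static constexpr LaunchPath path = LaunchPath::LocalMemory;
};

// Specialization for large drivers: take the constant-memory path instead.
// Only one of the two definitions is instantiated per DriverType.
template <class DriverType>
struct LaunchSelector<DriverType, true> {
  static constexpr LaunchPath path = LaunchPath::ConstantMemory;
};

// Convenience wrapper returning the selected path by value.
template <class DriverType>
constexpr LaunchPath select_launch_path() {
  return LaunchSelector<DriverType>::path;
}

// Hypothetical drivers on either side of the threshold.
struct SmallFunctor { double scale; };
struct LargeFunctor { char payload[8192]; };
```

With this layout, `select_launch_path<SmallFunctor>()` yields `LaunchPath::LocalMemory` and `select_launch_path<LargeFunctor>()` yields `LaunchPath::ConstantMemory`. A caller can still force either path by passing the second template argument explicitly.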

crtrott · 2015-11-12T20:53:49Z

Pushed to master

crtrott added the Bug (broken / incorrect code; it could be Kokkos' responsibility, or others', e.g., Trilinos) label Nov 10, 2015

crtrott added the bug - fix pushed to develop branch label Nov 11, 2015

crtrott closed this as completed Nov 12, 2015

crtrott self-assigned this Sep 20, 2016
