parallel_reduce with RDC crashes linker #232

Closed
ibaned opened this issue Mar 30, 2016 · 3 comments

Comments

ibaned commented Mar 30, 2016

As in the situation described in the commit message of a98c130, I am unable to link any code that uses parallel_reduce with Kokkos and CUDA when relocatable device code (RDC) is enabled; without RDC the same code links. The linker segfaults:

nvcc error   : 'nvlink' died due to signal 11 (Invalid memory reference)

Several standalone example programs and one large library confirm that parallel_reduce+RDC is the problematic combination. I also reproduced this with many Kokkos commits going back to 2015 and was unable to find a commit without this behavior.

If this is supposed to work, here is more information to reproduce on Shannon:

../generate_makefile.bash \
--prefix=$HOME/kokkos-install \
--with-cuda=/home/projects/x86-64-sandybridge-nvidia/cuda/7.5.7 \
--arch=Kepler37 \
--with-cuda-options="enable_lambda,rdc" \
--compiler=$HOME/kokkos/config/nvcc_wrapper

#include <Kokkos_Core.hpp>

int main(int argc, char** argv)
{
  Kokkos::initialize(argc, argv);
  int n = 5;
  double sum = 0;
  Kokkos::parallel_reduce(n, KOKKOS_LAMBDA(int i, double& update) {
    update += 42;
  }, sum);
  Kokkos::finalize();
}
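
For checking the result once the link succeeds: the reduction above adds 42 on each of n = 5 iterations, so sum should come out as 210. A host-only sketch of the same computation, with no Kokkos or CUDA dependency, is below. The name `reference_sum` is hypothetical (not a Kokkos API), and the loop in a constexpr function assumes C++14 or later.

```cpp
// Host-only reference for the reproducer above: each of the n iterations
// adds 42 to the accumulator, mirroring the body of the KOKKOS_LAMBDA.
// `reference_sum` is a hypothetical helper, not part of Kokkos.
// Requires C++14 or later (loop inside a constexpr function).
constexpr double reference_sum(int n) {
  double sum = 0;
  for (int i = 0; i < n; ++i) {
    sum += 42;
  }
  return sum;
}
```

Comparing the device result against `reference_sum(n)` would confirm that a build which links also reduces correctly.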

hcedwar commented Mar 30, 2016

This is a known bug in the nvcc linker in all current versions of CUDA. A bug report has been filed with NVIDIA, and a resolution should be available soon.

ibaned commented Mar 30, 2016

I see. Until then, I suppose my best option is to try not to rely on RDC. parallel_reduce is essential to my algorithms, so that has to stay. Thank you.

hcedwar commented Apr 6, 2016

This nvcc linker bug is fixed in CUDA 8, which will be publicly available this summer, as announced at the GPU Technology Conference on April 5, 2016.
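
Until then, a build could refuse the problematic combination on older toolkits. The following is a minimal sketch, assuming the fix does land in CUDA 8.0 and that the toolkit version is expressed in the `CUDART_VERSION` encoding (major × 1000 + minor × 10, so 7.5 is 7050 and 8.0 is 8000). The name `rdc_reduce_links` is hypothetical.

```cpp
// Returns true when a toolkit version (in CUDART_VERSION encoding:
// major * 1000 + minor * 10) is expected to link parallel_reduce with RDC.
// Assumption: the nvlink crash is fixed from CUDA 8.0 onward, as stated
// in this thread. `rdc_reduce_links` is a hypothetical helper.
constexpr bool rdc_reduce_links(int cudart_version) {
  return cudart_version >= 8000;
}
```

In a CUDA translation unit this could back a `static_assert(rdc_reduce_links(CUDART_VERSION), "...")`, guarded by whatever macro the build defines when RDC is enabled, so the failure shows up as a readable compile error instead of an nvlink segfault.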

hcedwar closed this as completed Apr 6, 2016