Bug in TestCuda_Other.cpp: most likely assembly inserted into Device code #515

Closed
crtrott opened this issue Oct 28, 2016 · 6 comments
Member

crtrott commented Oct 28, 2016

This is from nightly testing with Cuda 8 on my machine. Reproduce:

generate_makefile.bash --with-cuda --with-openmp --arch=Kepler35,SNB
ptxas <macro util>, line 10; error   : Arguments mismatch for instruction 'mov'
ptxas fatal   : Ptx assembly aborted due to errors
make[2]: *** [TestCuda_Other.o] Error 255
make[2]: *** Waiting for unfinished jobs.

SHA c11e14c still builds fine. My guess is that it's the memory fence changes in memory_pool.

crtrott added the Bug label (broken / incorrect code; it could be Kokkos' responsibility, or others', e.g., Trilinos) Oct 28, 2016
crtrott added this to the Fall 2016 milestone Oct 28, 2016
crtrott self-assigned this Oct 28, 2016
Member Author

crtrott commented Oct 28, 2016

This is the offending commit:
4105848

Member Author

crtrott commented Oct 28, 2016

Reproduce on Kokkos-Dev:

module load sems-gcc/5.3.0
module load kokkos-cuda/8.0.44
generate_makefile.bash --with-cuda --arch=Kepler35,SNB

@nmhamster
Contributor

@crtrott this looks like an X86 instruction and it doesn't appear there are any MOV instructions in the commit you have outlined.

Member Author

crtrott commented Oct 28, 2016

Yeah, I know; this looks to me like a compiler issue ...
Clang for CUDA (using CUDA 8.0 for the PTX translation) and CUDA 7.5 both build fine.

crtrott added the Compiler Issue label (an issue that Kokkos cannot / should not fix; Kokkos must communicate to relevant vendor) Oct 28, 2016
Member Author

crtrott commented Oct 28, 2016

Also, if you leave SNB off the --arch list, it works.

crtrott removed the Compiler Issue label Oct 29, 2016
crtrott added a commit that referenced this issue Oct 29, 2016
…evice

Protecting the macros by __CUDA_ARCH__ makes them only come in when
compiling the host phase. This fixes bug #515 which was caused
by load_fence and store_fence introducing ASM code into the device
phase.
Member Author

crtrott commented Oct 29, 2016

So this was caused by insufficient protection of assembly code in load_fence and store_fence.
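
For readers who have not seen this failure mode before, here is a minimal sketch of the kind of guard the fix describes. It is not the actual Kokkos code: the names `example_load_fence`, `example_store_fence`, `publish_and_read`, and the macro `EXAMPLE_USE_X86_FENCE_ASM` are hypothetical and used for illustration only. The idea is that `__CUDA_ARCH__` is defined only while the CUDA compiler processes the device pass, so testing for it keeps x86 inline assembly out of the code that ptxas has to assemble.

```cpp
#include <atomic>

// Hypothetical guard macro (not a Kokkos macro): emit x86 fence assembly only
// when we are NOT in the CUDA device pass and the host target is x86-64 with a
// GNU-compatible compiler.
#if !defined(__CUDA_ARCH__) && defined(__x86_64__) && defined(__GNUC__)
#define EXAMPLE_USE_X86_FENCE_ASM 1
#else
#define EXAMPLE_USE_X86_FENCE_ASM 0
#endif

// Hypothetical stand-in for a load fence.
inline void example_load_fence() {
#if EXAMPLE_USE_X86_FENCE_ASM
  // Host-only path: x86 load fence.
  asm volatile("lfence" ::: "memory");
#else
  // Portable fallback so the sketch builds with any host compiler.
  // Assumption: a real device path would use a CUDA fence such as
  // __threadfence() here instead.
  std::atomic_thread_fence(std::memory_order_acquire);
#endif
}

// Hypothetical stand-in for a store fence.
inline void example_store_fence() {
#if EXAMPLE_USE_X86_FENCE_ASM
  // Host-only path: x86 store fence.
  asm volatile("sfence" ::: "memory");
#else
  std::atomic_thread_fence(std::memory_order_release);
#endif
}

// Tiny single-threaded exercise of both fences: write, fence, read back.
inline int publish_and_read(int value) {
  static int payload = 0;
  payload = value;
  example_store_fence();
  example_load_fence();
  return payload;
}
```

Without the `__CUDA_ARCH__` test, a guard that only checks for an x86 host (as selecting SNB in `--arch` presumably enabled) would also be true during the device pass, which is consistent with the ptxas error reported above.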
