(Rebase) Partial fix to compile time issues w/nvcc + Kokkos_ENABLE_DEBUG_BOUNDS_CHECK #7013
…eUnitTest_Default
Doesn't pass Windows CUDA testing (doesn't like noreturn on something)
Co-authored-by: Daniel Arndt <arndtd@ornl.gov>
Do we still need
kokkos/core/src/Kokkos_Abort.hpp
Lines 39 to 46 in 835dbf5
#if defined(KOKKOS_ENABLE_DEBUG_BOUNDS_CHECK)
// required to workaround failures in random number generator unit tests with
// pre-volta architectures
#define KOKKOS_IMPL_ABORT_NORETURN
#else
// cuda_abort aborts when building for other platforms than macOS
#define KOKKOS_IMPL_ABORT_NORETURN [[noreturn]]
#endif
then?
Can/should we handle the [[noreturn]] part elsewhere?
Isn't this fix mostly about using the __noinline__ specifier?
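For readers following the question above, here is a minimal host-only sketch of the two ingredients under discussion: a noreturn marker that is switched off in a debug configuration, and a failure path that is kept out of line. Every name in it (EXAMPLE_DEBUG_BOUNDS_CHECK, EXAMPLE_ABORT_NORETURN, EXAMPLE_NOINLINE, example_abort, checked_get) is hypothetical and only mirrors the shape of the quoted macro. It is not the Kokkos implementation, and it throws instead of aborting so the behavior can be observed.

```cpp
#include <stdexcept>

// Hypothetical stand-in for the conditional macro quoted above:
// drop [[noreturn]] when the (assumed) debug flag is defined.
#if defined(EXAMPLE_DEBUG_BOUNDS_CHECK)
#define EXAMPLE_ABORT_NORETURN
#else
#define EXAMPLE_ABORT_NORETURN [[noreturn]]
#endif

// Hypothetical host-side analogue of a no-inline specifier.
#if defined(__GNUC__) || defined(__clang__)
#define EXAMPLE_NOINLINE __attribute__((noinline))
#elif defined(_MSC_VER)
#define EXAMPLE_NOINLINE __declspec(noinline)
#else
#define EXAMPLE_NOINLINE
#endif

// Failure path: the standard attribute comes first in the declaration,
// and the body is kept out of line so callers only carry a call to it.
// Throws rather than aborts, purely so this sketch can be exercised.
EXAMPLE_ABORT_NORETURN EXAMPLE_NOINLINE void example_abort(const char* msg) {
  throw std::runtime_error(msg);
}

// Bounds-checked read that delegates the failure case to example_abort.
inline int checked_get(const int* data, int size, int i) {
  if (i < 0 || i >= size) example_abort("index out of bounds");
  return data[i];
}
```

With an array of three ints, checked_get(a, 3, 1) returns the middle element, while checked_get(a, 3, 5) reaches example_abort. Whether keeping the failure path out of line is what drives the compile-time difference in this PR is the open question in the comment, not something this sketch demonstrates.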
core/src/Cuda/Kokkos_Cuda_abort.hpp
extern __device__ void __assertfail(const void *message, const void *file,
unsigned int line, const void *function,
size_t charsize);
__device__ [[noreturn]] void __assertfail(const void *message, const void *file,
OK, I can see how the extern specifier didn't make much sense but, out of curiosity, is that change necessary? (Not asking to change anything at this time.)
Also, for consistency I would have slightly preferred if the [[noreturn]] attribute came first.
Not sure if the extern change was necessary, but it looks like Clang CUDA also chokes on the [[noreturn]] not being first, so I moved it.
Retest this please
Clang CUDA doesn't work:
@crtrott let me see if reversing the order of
@dalg24 interestingly I get
Also, just tested and (maybe unsurprisingly) changing this doesn't change compilation time.
There are errors with illegal instructions. Is that real?
I suspect maybe they are... I'm looking at it this week
Retest this please
Rebase of #6596
For reference when configured with:
This (rebased) PR builds on my machine with -j 16 in 20 mins. Current develop takes 70 mins.
I have tested this on a Pascal arch machine, and all tests pass (except for the 1 GB memory test, since I don't have that on my machine). I specifically ran the random number tests 100 times in a row with no failures.