Cuda build with clang 10 has errors with the atomic unit tests #3237
Comments
I believe the issue is that there is no explicit
template <typename T>
KOKKOS_INLINE_FUNCTION T atomic_fetch_or(volatile T* const dest, const T val) {
  asm("/* How did we get to here? */"); /* Added so I could track this down */
  return Impl::atomic_fetch_oper(Impl::OrOper<T, const T>(), dest, val);
}
in Atomic_Generic.hpp. I don't know why clang 10 doesn't find this one, or whether this is the intended one in this case.
I believe this is a compiler bug in clang. Here is my pseudocode of how clang filters the overload set for host and device functions:
auto matchDeviceType(caller, callee){
  if(isHostDevice(callee)){ // HostDevice is always a match
    return HostDevice;
  }
  if(isDevice(caller) && isDevice(callee)){
    return SameSide;
  }
  if((isDevice(caller) && isHost(callee)) || (isHost(caller) && isDevice(callee))){
    return WrongSide;
  }
  return SameSide; // remaining case: host caller and host callee
}
void filterOverloadSet(candidates, caller){
  anyGoodCalls = false;
  for(auto &Cand : candidates){
    if(matchDeviceType(caller, Cand) == SameSide){ // Problem is here: HostDevice is also good.
      anyGoodCalls = true;
    }
  }
  if(anyGoodCalls){ // Since this is false in our case, the host func never gets removed
    eraseAllWrongSide(candidates);
  }
}
Where they don't remove When I add
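To make the pseudocode above concrete, here is a small host-only C++ model of the filter. It is only a sketch: `Target`, `Match`, `filterAsDescribed`, and `filterCountingHostDevice` are hypothetical names invented for illustration and are not clang's actual data structures or code. The second filter encodes one reading of the "HostDevice is also good" remark, which is an assumption about the intended behavior, not a confirmed fix.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical model types; these are NOT clang's real data structures.
enum class Target { Host, Device, HostDevice };
enum class Match { WrongSide, HostDevice, SameSide };

// Classify a callee relative to the side currently being compiled.
// Assumption: the caller has already been resolved to Host or Device.
Match matchDeviceType(Target caller, Target callee) {
  assert(caller != Target::HostDevice);
  if (callee == Target::HostDevice) return Match::HostDevice;
  return caller == callee ? Match::SameSide : Match::WrongSide;
}

// Remove every candidate that lives on the wrong side for this caller.
void eraseAllWrongSide(std::vector<Target>& candidates, Target caller) {
  candidates.erase(
      std::remove_if(candidates.begin(), candidates.end(),
                     [&](Target c) {
                       return matchDeviceType(caller, c) == Match::WrongSide;
                     }),
      candidates.end());
}

// Filter as described in the pseudocode: wrong-side candidates are erased
// only when some candidate is an exact SameSide match.
std::vector<Target> filterAsDescribed(std::vector<Target> candidates,
                                      Target caller) {
  bool anyGoodCalls = std::any_of(
      candidates.begin(), candidates.end(), [&](Target c) {
        return matchDeviceType(caller, c) == Match::SameSide;
      });
  if (anyGoodCalls) eraseAllWrongSide(candidates, caller);
  return candidates;
}

// Assumed alternative: a HostDevice candidate also counts as a good call.
std::vector<Target> filterCountingHostDevice(std::vector<Target> candidates,
                                             Target caller) {
  bool anyGoodCalls = std::any_of(
      candidates.begin(), candidates.end(), [&](Target c) {
        return matchDeviceType(caller, c) != Match::WrongSide;
      });
  if (anyGoodCalls) eraseAllWrongSide(candidates, caller);
  return candidates;
}
```

With a device-side caller and the candidate set `{HostDevice, Host}`, `filterAsDescribed` keeps both candidates, so the host-only overload stays in the set, while `filterCountingHostDevice` leaves only the `HostDevice` candidate.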
Hopefully the following gets seen soon: https://bugs.llvm.org/show_bug.cgi?id=46922. I have emailed the author of some of the code I changed.
I don't think there is an easy way to work around this. We can work around it for our atomics, but the following fails using C++11:
#include <initializer_list>
__host__ __device__ int element(std::initializer_list<int> il) {
  int result = 0;
  for(auto i : il){
    result += i;
  }
  return result;
}
Because
I think this was resolved in Kokkos by PR #3259, marking as InDevelop.
@ndellingwood we worked around it, but the underlying issue is still present. We can probably close this though.
With clang 10 and newer, unit test cuda 1 fails to build:
Errors:
This has been replicated on other systems so I don't think it is arch dependent, but I am on sm61 and cuda 10.1, in case it matters.