-
Notifications
You must be signed in to change notification settings - Fork 407
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clang compilation of CUDA backend on Windows #3345
Conversation
Can one of the admins verify this patch? |
@@ -55,6 +55,8 @@ | |||
#include <cerrno> | |||
#ifndef _WIN32 | |||
#include <unistd.h> | |||
#else | |||
#include <Windows.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed because msvc includes it by default but msvc-cl doesn't?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needed because clang-cl also includes it, but pure clang++ doesn't.
@@ -381,7 +381,7 @@ KOKKOS_INTERNAL_INLINE_DEVICE_IF_CUDA_ARCH bool _atomic_compare_exchange_strong( | |||
T* dest, T compare, T val, MemoryOrderSuccess, MemoryOrderFailure, | |||
typename std::enable_if< | |||
(sizeof(T) == 1 || sizeof(T) == 2 || sizeof(T) == 4 || | |||
sizeof(T) == 8) && | |||
sizeof(T) == 8 || sizeof(T) == 16) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this work also with msvc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow seem it doesn't. This was my attempt of a clean fix for #3344.
This is the msvc (clang-cl) error:
2>C:\Code\kokkos\core\src\impl/Kokkos_Atomic_Generic.hpp(210,16): error : no matching function for call to 'atomic_compare_exchange'
2>C:\Code\kokkos\core\src\impl/Kokkos_Atomic_Generic.hpp(556,16): message : in instantiation of function template specialization 'Kokkos::Impl::atomic_fetch_oper<Kokkos::Impl::AddOper<long long, const long long>, long long>' requested here
2>C:\Code\kokkos\core\src\impl/Kokkos_Atomic_Increment.hpp(132,11): message : in instantiation of function template specialization 'Kokkos::atomic_fetch_add<long long>' requested here
2>C:\Code\kokkos\core\src/impl/Kokkos_Atomic_Compare_Exchange_Strong.hpp(154,12): message : candidate function not viable: no known conversion from 'unsigned long long *' to 'volatile int *const' for 1st argument
2>C:\Code\kokkos\core\src/impl/Kokkos_Atomic_Compare_Exchange_Strong.hpp(162,13): message : candidate function not viable: no known conversion from 'unsigned long long *' to 'volatile long *const' for 1st argument
2>C:\Code\kokkos\core\src/impl/Kokkos_Atomic_Compare_Exchange_Strong.hpp(174,21): message : candidate function not viable: no known conversion from 'unsigned long long *' to 'volatile unsigned int *const' for 1st argument
2>C:\Code\kokkos\core\src/impl/Kokkos_Atomic_Compare_Exchange_Strong.hpp(180,22): message : candidate function not viable: no known conversion from 'unsigned long long *' to 'volatile unsigned long *const' for 1st argument
2>C:\Code\kokkos\core\src/impl/Kokkos_Atomic_Compare_Exchange_Strong.hpp(189,10): message : candidate template ignored: requirement 'sizeof(unsigned long long) == sizeof(int)' was not satisfied [with T = unsigned long long]
2>C:\Code\kokkos\core\src/impl/Kokkos_Atomic_Compare_Exchange_Strong.hpp(208,10): message : candidate template ignored: requirement 'sizeof(unsigned long long) != sizeof(int) && sizeof(unsigned long long) == sizeof(long)' was not satisfied [with T = unsigned long long]
2>C:\Code\kokkos\core\src/impl/Kokkos_Atomic_Compare_Exchange_Strong.hpp(253,10): message : candidate template ignored: requirement '(sizeof(unsigned long long) != 4) && (sizeof(unsigned long long) != 8)' was not satisfied [with T = unsigned long long]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're okay with this - I'll implement a define to distinguish between clang++
and clang-cl,
because msvc
gets activated when _MSC_VER
is found.
#if defined(_MSC_VER) && !defined(KOKKOS_COMPILER_INTEL)
#define KOKKOS_COMPILER_MSVC _MSC_VER
#endif
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tried to fix this using c9535e4, see below.
OK to test. |
This looks generally good. I will test on my machine too on windows. Thanks for this! |
You will also need to fix the indentation using
|
We can apply the format patch if you like, and its clang 8.0.0 (yeah it is that specific for clang-format ...). |
Sure, please do.
I was in doubt (and forgot to push) a change to include |
As the new commit summary writes, it solves the preprocessor clashing between |
Please look at I applied clang-format on the two commits and edited the author name from I tried to push directly to your branch but the permissions changed and it was rejected. |
@dalg24 I have no idea what happened (long time haven't used Windows)... |
Either grant me permission to push to your branch and I will fix it or force push my branch to yours. |
Retest this please |
1 similar comment
Retest this please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some issues see comments.
cmake/kokkos_enable_devices.cmake
Outdated
@@ -25,6 +25,16 @@ IF (KOKKOS_ENABLE_PTHREAD) | |||
SET(KOKKOS_ENABLE_THREADS ON) | |||
ENDIF() | |||
|
|||
# as CMAKE_CXX_SIMULATE_ID does not work, detect clang-cl to avoid clang++, | |||
# cl, and clang-cl clashes, requires CMake >= 3.15 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't work we have lower cmake version as minimum
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had this in mind. For a lower-version cmake, it will default to clang-cl. Users that specifically use clang++ need CMake >= 3.15.
#link omp library from LLVM lib dir | ||
IF(KOKKOS_COMPILER_CLANG_MSVC) | ||
#for clang-cl expression /openmp yields an error, so directly add the specific Clang flag | ||
SET(ClangOpenMPFlag /clang:-fopenmp=libomp) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This got added in the Travis build now, and thus fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I see the error, will change it. Thanks.
* add --cuda-path clang option * add KOKKOS_COMPILER_CLANG_MSVC if using clang-cl * Clang CUDA compilation needs GNU atomics activated * fixed clashing of Windows and GNU atomic functions declarations
* add --cuda-path clang option * add KOKKOS_COMPILER_CLANG_MSVC if using clang-cl * Clang CUDA compilation needs GNU atomics activated * fixed clashing of Windows and GNU atomic functions declarations
I fixed the CUDA path issue in CMake on a branch which also rebases on top of #3532: (my branch is on crtrott/windows-clang-with-cuda). But it turns out that compared to 3532 this broke the nvcc windows build. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This breaks the existing nvcc build on windows. Question: do you want me to force push a rebase to this branch, or shall I open a new PR with the rebase of this branch on #3532 and then we continue there?
I think I figured it out. https://github.com/crtrott/kokkos/tree/windows-clang-with-cuda |
Also @j8asic can you check my branch and see whether your build still works as expected? All our builds now work (nvcc on linux, clang for cuda on linux, nvcc in VS, msvc in VS, clang-cl in VS). |
I force pushed my rebase onto this. |
…nto windows-clang-with-cuda
Checked out https://github.com/crtrott/kokkos/tree/windows-clang-with-cuda
I usually comment the include line to continue...
Don't know why nvcc is detected as MSVC... |
Hi, what do you mean with "general in Windows" the appveyor test on Windows did pass, so at least a simple MSVC build seems to work. Regarding NVCC: you need some setup fancy. See here: #3063 (comment) Note you have to update all the paths according to your install. And those paths change every time you update Visual Studio ... |
My Second point is that
Finally, Do you want me to fix CMake errors and push it to my branch? |
Yeah try pushing the fixes to cmake and we can recheck if everything works. |
Note: the travis failure is probably an unrelated timeout. |
@j8asic if you still want this in for the next release we need to do this soon. We are starting release process next week. |
Thanks @j8asic . I will check this again with our configs, and if it all passes we can have that in the next release |
@crtrott you're welcome. It works, but for example I'm not sure what happens when TPL_INCLUDES is a list. Btw I'm not fixed to a release, and am completely demotivated, since I wasted a month trying to deploy on Windows and lost users:
|
oh man that is a bummer. These compiler issues are really a drain on all of us. |
Passed with MSVC, NVCC+MSVC and MSVC-Clang on my machine. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks reasonable.
As
clang-cl
is does not enableCUDA
compilation (/clang:--cuda***
options do not work), the code changes here enableclang++
to compile Kokkos. So one can simply useCMake
withninja
and build the project. Most of the errors were due to clashing of preprocessor defines (see also #3344).Changes:
-x cu
for Clang, as it already uses-x cuda
__thread
instead of__declspec(thread)
, see here_atomic_compare_exchange_strong
to acceptsizeof(T) == 16
(needed when using a pair of int64 as arg)Windows.h
forGetCurrentProcessId
inKokkos_Core.cpp