-
Notifications
You must be signed in to change notification settings - Fork 409
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cuda: Fix nvcc warnings #7021
Cuda: Fix nvcc warnings #7021
Conversation
Retest this please |
47b41bd
to
9c09b07
Compare
@@ -345,7 +345,7 @@ pipeline { | |||
sh '''rm -rf build && mkdir -p build && cd build && \ | |||
../gnu_generate_makefile.bash \ | |||
--with-options=compiler_warnings \ | |||
--cxxflags="-Werror -Werror all-warnings" \ | |||
--cxxflags="-Werror -Werror all-warnings -Xcudafe --diag_suppress=20208" \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I needed to move pragma push into the Complex.hpp
to avoid seeing the warning. At this point, it seems to be better to just disable the warning for this build in the compiler flags.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does all_warnings
accompany the second -Werror
? I'm really asking for clarification on why -Werror
is needed twice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-Werror
is for the host compiler and -Werror all-warnings
is for the device compiler.
@@ -186,7 +186,7 @@ class Cuda { | |||
/// | |||
/// This matches the __CUDA_ARCH__ specification. | |||
KOKKOS_DEPRECATED static size_type device_arch() { | |||
const cudaDeviceProp& cudaProp = Cuda().cuda_device_prop(); | |||
const cudaDeviceProp cudaProp = Cuda().cuda_device_prop(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What warning was raised here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dangling reference
@@ -4910,7 +4910,7 @@ class NeverThrown { | |||
class GTEST_TEST_CLASS_NAME_(test_suite_name, test_name) \ | |||
: public parent_class { \ | |||
public: \ | |||
GTEST_TEST_CLASS_NAME_(test_suite_name, test_name)() = default; \ | |||
GTEST_TEST_CLASS_NAME_(test_suite_name, test_name)() { (void)test_info_; }\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please reference issues or rejected prs on googletest that show that there is no interest on resolving this upstream?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
google/googletest#4104 (comment) mentions that nvcc
is an unsupported compiler (in a slightly different context).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could it be a problem if we later want to update gtest ?
What do you think of pre-installing all Kokkos dependencies in the CI so that we use the flags we want solely on Kokkos ? We could then compile gtest with the host compiler ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could it be a problem if we later want to update gtest ?
We will notice if this is still a problem or not.
What do you think of pre-installing all Kokkos dependencies in the CI so that we use the flags we want solely on Kokkos ? We could then compile gtest with the host compiler ?
This is not about installing gtest
but rather using it. We compiler warnings when we define unit tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not about installing
gtest
but rather using it. We compiler warnings when we define unit tests.
I had in mind that using an installed gtest would treat gtest include header path as system path thus removing the warnings ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could try that (in a separate pull request); it's certainly a bigger change than changing this one line.
Retest this please. |
8e60eb6
to
bc4a6dd
Compare
Retest this please |
The only errors in Cuda builds are
fixed by #7049. |
bc4a6dd
to
2c3fd02
Compare
Rebased after #7049 was merged. |
// Suppress "'long double' is treated as 'double' in device code" | ||
#ifdef KOKKOS_COMPILER_NVCC | ||
#ifdef __NVCC_DIAG_PRAGMA_SUPPORT__ | ||
#pragma nv_diagnostic push | ||
#pragma nv_diag_suppress 20208 | ||
#else | ||
#ifdef __CUDA_ARCH__ | ||
#pragma diagnostic push | ||
#pragma diag_suppress 20208 | ||
#endif | ||
#endif | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel that is not the right approach for that test.
We already have all sorts of guards for long double
in these, I feel like we should consider reworking them like we did for the math function (I don;t remember the details but I see that it is not part of this PR meaning we somehow resolved it without suppressing the diagnostic altogether)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could just disable all tests for long double
when we compile with Cuda
support (which is what we are doing for other configurations) but these seem to work fine after suppressing the warning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the long double
tests on CUDA already effectively switched off, in that they are run as double
in CUDA device code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, everything that uses TestNumericTraits
at least, see
kokkos/core/unit_test/TestNumericTraits.hpp
Lines 182 to 204 in 63f0520
#if (defined(KOKKOS_COMPILER_NVCC) && defined(KOKKOS_ENABLE_CUDA)) || \ | |
defined(KOKKOS_ENABLE_SYCL) || defined(KOKKOS_ENABLE_OPENMPTARGET) | |
template <class Tag> | |
struct TestNumericTraits< | |
#if defined(KOKKOS_ENABLE_CUDA) | |
Kokkos::Cuda, | |
#elif defined(KOKKOS_ENABLE_SYCL) | |
Kokkos::Experimental::SYCL, | |
#else | |
Kokkos::Experimental::OpenMPTarget, | |
#endif | |
long double, Tag> { | |
template <class T> | |
using trait = typename Tag::template trait<T>; | |
TestNumericTraits() { | |
(void)take_address_of(trait<long double>::value); | |
// Do nothing on the device. | |
// According to the doc | |
// https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#constexpr-variables | |
// the traits member constant value cannot be directly used in device code. | |
} | |
}; | |
#endif |
Only |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These fixes look fine, however, questions remain about more (potentially) complicated fixes for warnings associated with methods in TestStdAlgorithmsCommon.hpp
that appear in CUDA-11.2.x compilations -- do we want to fix warnings that will likely disappear with CUDA updates?
@@ -345,7 +345,7 @@ pipeline { | |||
sh '''rm -rf build && mkdir -p build && cd build && \ | |||
../gnu_generate_makefile.bash \ | |||
--with-options=compiler_warnings \ | |||
--cxxflags="-Werror -Werror all-warnings" \ | |||
--cxxflags="-Werror -Werror all-warnings -Xcudafe --diag_suppress=20208" \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does all_warnings
accompany the second -Werror
? I'm really asking for clarification on why -Werror
is needed twice.
// Suppress "'long double' is treated as 'double' in device code" | ||
#ifdef KOKKOS_COMPILER_NVCC | ||
#ifdef __NVCC_DIAG_PRAGMA_SUPPORT__ | ||
#pragma nv_diagnostic push | ||
#pragma nv_diag_suppress 20208 | ||
#else | ||
#ifdef __CUDA_ARCH__ | ||
#pragma diagnostic push | ||
#pragma diag_suppress 20208 | ||
#endif | ||
#endif | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the long double
tests on CUDA already effectively switched off, in that they are run as double
in CUDA device code?
The changes here are sufficient for the configurations in the CI. We can discuss other configurations in the original issue or a follow-up. nvcc 11.0 in particular will raise quite a bit more warnings and it's not clear to me if that's worth putting a lot of effort in (which might make code less readable). |
OK, sounds good. I'll see if there are any more "low hanging fruit" fixes to do in the issue. |
Nice fixes :) |
Fixes #6991. Currently, we only turn warnings into errors. We should incorporate all the fixes for the warnings here as well, though.