Fixes constexpr errors with frounding-math on gcc < 10. #4278

Merged on Sep 2, 2021 (7 commits)

Chrismarsh
Contributor

@Chrismarsh Chrismarsh commented Aug 31, 2021

Use std::numeric_limits digits10 instead of bespoke version
trilinos/Trilinos#9056
#4267

@dalg24-jenkins
Collaborator

Can one of the admins verify this patch?

#define DIGITS10_HELPER_INTEGRAL(TYPE) \
-template <> struct digits10_helper<TYPE> { static constexpr int value = digits_helper<TYPE>::value * log10_2; };
+template <> struct digits10_helper<TYPE> { static constexpr int value = std::numeric_limits<TYPE>::digits10; };
Contributor

You can't use std::numeric_limits. This is not available in all backends and for all relevant configurations.

Contributor Author

OK, I've fixed that. I was not aware, sorry.

…ackends and follow gcc's constexpr-valid approximation of log10(2)
@Chrismarsh
Contributor Author

hi @masterleinad, oh ok. I've amended this PR to use
8 * 643L / 2136.

This follows the log10_2 approximation in gcc.

It is not clear to me why the 8bit is baked into the log10_2 constant, so I've just made that more clear

@masterleinad
Contributor

It is not clear to me why the 8bit is baked into the log10_2 constant, so I've just made that more clear

That might be a bug. I am seeing

0
16
16
19
36
38
74
77
151
154
151
154
6
15
18

for

std::cout << Kokkos::Experimental::digits10<bool>::value << std::endl;
std::cout << Kokkos::Experimental::digits10<char>::value << std::endl;
std::cout << Kokkos::Experimental::digits10<signed char>::value << std::endl;
std::cout << Kokkos::Experimental::digits10<unsigned char>::value << std::endl;
std::cout << Kokkos::Experimental::digits10<short>::value << std::endl;
std::cout << Kokkos::Experimental::digits10<unsigned short>::value << std::endl;
std::cout << Kokkos::Experimental::digits10<int>::value << std::endl;
std::cout << Kokkos::Experimental::digits10<unsigned int>::value << std::endl;
std::cout << Kokkos::Experimental::digits10<long int>::value << std::endl;
std::cout << Kokkos::Experimental::digits10<unsigned long int>::value << std::endl;
std::cout << Kokkos::Experimental::digits10<long long int>::value << std::endl;
std::cout << Kokkos::Experimental::digits10<unsigned long long int>::value << std::endl;
std::cout << Kokkos::Experimental::digits10<float>::value << std::endl;
std::cout << Kokkos::Experimental::digits10<double>::value << std::endl;
std::cout << Kokkos::Experimental::digits10<long double>::value << std::endl;

but std::numeric_limits<>::digits10 gives me

0
2
2
2
4
4
9
9
18
19
18
19
6
15
18

see https://godbolt.org/z/K7W4jfx64.
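
Both columns above can be reproduced by hand. The sketch below assumes that the pre-PR helper multiplies the bit count by the constant 2.41 (as in `digits_helper<TYPE>::value * log10_2`) and compares it with the integer-only form `bits * 643L / 2136` that gcc uses for `digits10`. The function names are hypothetical and not part of Kokkos; the bit counts assume the usual 8/16/32/64-bit integer widths.

```cpp
// Hypothetical helpers for illustration only; neither is part of Kokkos.

// Pre-PR behaviour as read from the diff: bit count times 2.41, truncated.
// 2.41 is roughly 8 * log10(2), so the result is about 8 times too large.
constexpr int digits10_scaled_by_2_41(int bits) {
  return static_cast<int>(bits * 2.41);
}

// Integer-only form: 643/2136 approximates log10(2), and multiplying by the
// bit count before dividing keeps the precision despite integer truncation.
constexpr int digits10_integer_only(int bits) {
  return static_cast<int>(bits * 643L / 2136);
}

// signed char has 7 value bits, unsigned char has 8.
static_assert(digits10_scaled_by_2_41(7) == 16, "value reported by Kokkos");
static_assert(digits10_integer_only(7) == 2, "value from std::numeric_limits");
static_assert(digits10_scaled_by_2_41(8) == 19, "value reported by Kokkos");
static_assert(digits10_integer_only(8) == 2, "value from std::numeric_limits");

// A 32-bit int has 31 value bits; a 64-bit unsigned type has 64.
static_assert(digits10_scaled_by_2_41(31) == 74, "value reported by Kokkos");
static_assert(digits10_integer_only(31) == 9, "value from std::numeric_limits");
static_assert(digits10_scaled_by_2_41(64) == 154, "value reported by Kokkos");
static_assert(digits10_integer_only(64) == 19, "value from std::numeric_limits");
```

The floating-point types are unaffected in the listing above, which is consistent with the factor only being applied in the integral helper.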

@masterleinad
Contributor

The compiler bug is described at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96862. I am wondering whether the workaround here really covers all situations or it just fails for another value. Would you mind checking if you can compile all Kokkos unit tests with -frounding-math using the problematic compiler?

@@ -141,7 +141,7 @@ template <> struct digits_helper<double> { static constexpr int value = DBL_MANT
template <> struct digits_helper<long double> { static constexpr int value = LDBL_MANT_DIG; };
template <class> struct digits10_helper {};
template <> struct digits10_helper<bool> { static constexpr int value = 0; };
-constexpr double log10_2 = 2.41;
+constexpr double log10_2 = 8*643L/2136; // The fraction 643/2136 approximates log10(2) to 7 significant digits. Follows gcc impl
Contributor

Shouldn't this be

Suggested change
-constexpr double log10_2 = 8*643L/2136; // The fraction 643/2136 approximates log10(2) to 7 significant digits. Follows gcc impl
+constexpr double log10_2 = 643./2136; // The fraction 643/2136 approximates log10(2) to 7 significant digits. Follows gcc impl

? Note both the removal of the multiplication by 8 here and the change that makes sure the fraction is computed as a floating-point number.

Contributor Author

I don't believe so.

a) It appears that any floating point in constexpr under -frounding-math will trigger that bug, so constexpr double log10_2 = 643./2136; puts us back to where we started.

b) The result needs to be rounded down, as having 2.41 digits makes no sense. AFAIK int/int rounds toward 0, which is correct and desired. Indeed, I think this log10_2 should actually be a constexpr int, not a double.

Contributor

Well, your code in the godbolt link shows that what you are proposing is equivalent to double log10_2=2 which gives wrong results when compared to std::numeric_limits<>::digits10. I think I would in the end be OK with just hardcoding the values for digits10 instead of computing them at compile-time as workaround.
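
The arithmetic behind this objection can be checked at compile time. The sketch below uses only integer constant expressions, so it should not depend on the floating-point constexpr behaviour under discussion; the variable names are made up for illustration.

```cpp
// Both operands are integers, so the whole expression is integer arithmetic:
// 8 * 643L == 5144, and 5144 / 2136 truncates to 2.
constexpr long scaled_constant = 8 * 643L / 2136;
static_assert(scaled_constant == 2, "truncates to 2, not 2.41");

// Dropping the factor 8 while keeping integer operands truncates to 0.
constexpr long unscaled_constant = 643L / 2136;
static_assert(unscaled_constant == 0, "truncates to 0, not 0.30103");

// Multiplying by the bit count before dividing keeps the precision.
// For 31 value bits (a 32-bit int): 31 * 643 == 19933, 19933 / 2136 == 9.
constexpr long digits10_31_bits = 31 * 643L / 2136;
static_assert(digits10_31_bits == 9, "matches digits10 of a 32-bit int");

// Multiplying by the already truncated constant instead gives 62.
constexpr long digits10_31_bits_truncated = 31 * scaled_constant;
static_assert(digits10_31_bits_truncated == 62, "far too large");
```

In other words, the order of operations matters: the division has to come last if the fraction is kept in integer form.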

@dalg24
Member

dalg24 commented Aug 31, 2021

For reference https://godbolt.org/z/Kh6v4Ezs3

@dalg24
Member

dalg24 commented Aug 31, 2021

OK to test

@Chrismarsh
Contributor Author

Chrismarsh commented Sep 1, 2021

Would you mind checking if you can compile all Kokkos unit tests with -frounding-math using the problematic compiler?

$ CXXFLAGS=-frounding-math cmake ../kokkos/ -DCMAKE_INSTALL_PREFIX=~/install-kokkos  -DKokkos_ENABLE_TESTS=On
  [...]

$ make
[...] successfully builds

$ make test
Running tests...
Test project /globalhome/cbm038/HPC/build-kokkos
      Start  1: KokkosCore_UnitTest_Serial1
 1/41 Test  #1: KokkosCore_UnitTest_Serial1 ..................   Passed   16.19 sec
      Start  2: KokkosCore_UnitTest_Serial2
 2/41 Test  #2: KokkosCore_UnitTest_Serial2 ..................   Passed    2.71 sec
      Start  3: KokkosCore_UnitTest_SerialGraph
 3/41 Test  #3: KokkosCore_UnitTest_SerialGraph ..............   Passed    0.01 sec
      Start  4: KokkosCore_UnitTest_Default
 4/41 Test  #4: KokkosCore_UnitTest_Default ..................   Passed    0.02 sec
      Start  5: KokkosCore_UnitTest_PushFinalizeHook
 5/41 Test  #5: KokkosCore_UnitTest_PushFinalizeHook .........   Passed    0.01 sec
      Start  6: KokkosCore_UnitTest_Develop
 6/41 Test  #6: KokkosCore_UnitTest_Develop ..................   Passed    0.01 sec
      Start  7: KokkosCore_UnitTest_LogicalSpaces
 7/41 Test  #7: KokkosCore_UnitTest_LogicalSpaces ............   Passed    0.01 sec
      Start  8: KokkosCore_UnitTest_EventCorrectness
 8/41 Test  #8: KokkosCore_UnitTest_EventCorrectness .........   Passed    0.01 sec
      Start  9: KokkosCore_ProfilingTestLibraryLoadHelp
 9/41 Test  #9: KokkosCore_ProfilingTestLibraryLoadHelp ......   Passed    0.01 sec
      Start 10: KokkosCore_ProfilingTestLibraryCmdLineHelp
10/41 Test #10: KokkosCore_ProfilingTestLibraryCmdLineHelp ...   Passed    0.01 sec
      Start 11: KokkosCore_ProfilingTestLibraryLoad
11/41 Test #11: KokkosCore_ProfilingTestLibraryLoad ..........   Passed    0.01 sec
      Start 12: KokkosCore_ProfilingTestLibraryCmdLine
12/41 Test #12: KokkosCore_ProfilingTestLibraryCmdLine .......   Passed    0.01 sec
      Start 13: KokkosCore_UnitTest_StackTraceTest
13/41 Test #13: KokkosCore_UnitTest_StackTraceTest ...........   Passed    0.02 sec
      Start 14: KokkosCore_UnitTest_DefaultInit_1
14/41 Test #14: KokkosCore_UnitTest_DefaultInit_1 ............   Passed    0.01 sec
      Start 15: KokkosCore_UnitTest_DefaultInit_2
15/41 Test #15: KokkosCore_UnitTest_DefaultInit_2 ............   Passed    0.01 sec
      Start 16: KokkosCore_UnitTest_DefaultInit_3
16/41 Test #16: KokkosCore_UnitTest_DefaultInit_3 ............   Passed    0.01 sec
      Start 17: KokkosCore_UnitTest_DefaultInit_4
17/41 Test #17: KokkosCore_UnitTest_DefaultInit_4 ............   Passed    0.01 sec
      Start 18: KokkosCore_UnitTest_DefaultInit_5
18/41 Test #18: KokkosCore_UnitTest_DefaultInit_5 ............   Passed    0.01 sec
      Start 19: KokkosCore_UnitTest_DefaultInit_6
19/41 Test #19: KokkosCore_UnitTest_DefaultInit_6 ............   Passed    0.01 sec
      Start 20: KokkosCore_UnitTest_DefaultInit_7
20/41 Test #20: KokkosCore_UnitTest_DefaultInit_7 ............   Passed    0.01 sec
      Start 21: KokkosCore_UnitTest_DefaultInit_8
21/41 Test #21: KokkosCore_UnitTest_DefaultInit_8 ............   Passed    0.01 sec
      Start 22: KokkosCore_UnitTest_DefaultInit_9
22/41 Test #22: KokkosCore_UnitTest_DefaultInit_9 ............   Passed    0.01 sec
      Start 23: KokkosCore_UnitTest_DefaultInit_10
23/41 Test #23: KokkosCore_UnitTest_DefaultInit_10 ...........   Passed    0.01 sec
      Start 24: KokkosCore_UnitTest_DefaultInit_11
24/41 Test #24: KokkosCore_UnitTest_DefaultInit_11 ...........   Passed    0.01 sec
      Start 25: KokkosCore_UnitTest_DefaultInit_12
25/41 Test #25: KokkosCore_UnitTest_DefaultInit_12 ...........   Passed    0.01 sec
      Start 26: KokkosCore_UnitTest_DefaultInit_13
26/41 Test #26: KokkosCore_UnitTest_DefaultInit_13 ...........   Passed    0.01 sec
      Start 27: KokkosCore_UnitTest_DefaultInit_14
27/41 Test #27: KokkosCore_UnitTest_DefaultInit_14 ...........   Passed    0.01 sec
      Start 28: KokkosCore_UnitTest_DefaultInit_15
28/41 Test #28: KokkosCore_UnitTest_DefaultInit_15 ...........   Passed    0.01 sec
      Start 29: KokkosCore_UnitTest_DefaultInit_16
29/41 Test #29: KokkosCore_UnitTest_DefaultInit_16 ...........   Passed    0.01 sec
      Start 30: KokkosCore_UnitTest_DefaultInit_17
30/41 Test #30: KokkosCore_UnitTest_DefaultInit_17 ...........   Passed    0.01 sec
      Start 31: KokkosCore_UnitTest_DefaultInit_18
31/41 Test #31: KokkosCore_UnitTest_DefaultInit_18 ...........   Passed    0.01 sec
      Start 32: KokkosCore_IncrementalTest_SERIAL
32/41 Test #32: KokkosCore_IncrementalTest_SERIAL ............   Passed    0.20 sec
      Start 33: KokkosCore_UnitTest_CTestDevice
33/41 Test #33: KokkosCore_UnitTest_CTestDevice ..............   Passed    0.01 sec
      Start 34: KokkosCore_UnitTest_CMakePassCmdLineArgs0
34/41 Test #34: KokkosCore_UnitTest_CMakePassCmdLineArgs0 ....   Passed    0.01 sec
      Start 35: KokkosCore_PerfTestExec
35/41 Test #35: KokkosCore_PerfTestExec ......................   Passed  128.86 sec
      Start 36: KokkosCore_PerformanceTest_Atomic
36/41 Test #36: KokkosCore_PerformanceTest_Atomic ............   Passed    0.05 sec
      Start 37: KokkosCore_PerformanceTest_Atomic_MinMax
37/41 Test #37: KokkosCore_PerformanceTest_Atomic_MinMax .....   Passed    0.20 sec
      Start 38: KokkosCore_PerformanceTest_Mempool
38/41 Test #38: KokkosCore_PerformanceTest_Mempool ...........   Passed    0.01 sec
      Start 39: KokkosCore_PerformanceTest_TaskDag
39/41 Test #39: KokkosCore_PerformanceTest_TaskDag ...........   Passed    0.01 sec
      Start 40: KokkosContainers_UnitTest_Serial
40/41 Test #40: KokkosContainers_UnitTest_Serial .............   Passed   14.95 sec
      Start 41: KokkosAlgorithms_UnitTest
41/41 Test #41: KokkosAlgorithms_UnitTest ....................   Passed   13.98 sec

100% tests passed, 0 tests failed out of 41

Total Test time (real) = 177.50 sec

This is with this PR -- is that what you wanted? Or did you want it with the pre-PR code?

@@ -141,7 +141,7 @@ template <> struct digits_helper<double> { static constexpr int value = DBL_MANT
template <> struct digits_helper<long double> { static constexpr int value = LDBL_MANT_DIG; };
template <class> struct digits10_helper {};
template <> struct digits10_helper<bool> { static constexpr int value = 0; };
-constexpr double log10_2 = 2.41;
+constexpr int log10_2 = 8*643L/2136; // The fraction 643/2136 approximates log10(2) to 7 significant digits. Follows gcc impl
Member

@dalg24 dalg24 Sep 1, 2021

This yields 0. Correction: I missed the 8. As Daniel pointed out, it yields 2.

@masterleinad
Contributor

This is with this PR -- is that what you wanted? Or did you want it with the pre-PR code?

That should be fine.

@dalg24 dalg24 mentioned this pull request Sep 1, 2021
@crtrott
Member

crtrott commented Sep 1, 2021

#4281 supersedes this, including a test, and it is NOT passing tests on all compilers.

@Chrismarsh
Contributor Author

Given this PR is addressing two problems and half of it is, seemingly, quite fiddly and better dealt with in #4281 as a separate PR, I think this should be amended to only address the core/src/Kokkos_Tuners.hpp change. What do you think?

@crtrott
Member

crtrott commented Sep 1, 2021

Yeah we could just get the tuner thing through separately.

Co-authored-by: Damien L-G <dalg24+github@gmail.com>
@crtrott crtrott added the "Blocks Promotion" label (Overview issue for release-blocking bugs) Sep 1, 2021
@crtrott
Member

crtrott commented Sep 1, 2021

OK, you can't make it const ... because then the copy constructor and assignment don't work (which ignores constexpr stuff, I guess).
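
As a general illustration of why a `const` data member interferes with copying, here is a minimal sketch. The structs are hypothetical stand-ins and do not reproduce the class in core/src/Kokkos_Tuners.hpp, whose members are not shown in this thread, so the exact failure there may differ.

```cpp
#include <type_traits>

// Hypothetical stand-in: a class with a const non-static data member.
struct WithConstMember {
  const int count = 0;
};

// Same layout without the const qualifier.
struct WithPlainMember {
  int count = 0;
};

// A const member cannot be assigned to, so the implicitly declared copy
// assignment operator is defined as deleted.
static_assert(!std::is_copy_assignable<WithConstMember>::value,
              "copy assignment is deleted");

// In this simplified case copy construction is still available.
static_assert(std::is_copy_constructible<WithConstMember>::value,
              "copy construction still works here");

// Without const, both operations are generated as usual.
static_assert(std::is_copy_assignable<WithPlainMember>::value,
              "copy assignment works");
static_assert(std::is_copy_constructible<WithPlainMember>::value,
              "copy construction works");
```

Removing the qualifier, as done in this PR, restores the implicitly generated assignment operator.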

@Chrismarsh
Contributor Author

Yes, I'm seeing the same thing on a macOS version I don't have locally. I'll remove it.

Chrismarsh added a commit to Chrismarsh/Trilinos that referenced this pull request Sep 2, 2021
@crtrott
Member

crtrott commented Sep 2, 2021

Retest this please.

@crtrott crtrott merged commit ae270a1 into kokkos:develop Sep 2, 2021