Add half_t and bhalf_t limits #5778

e10harvey · 2023-01-18T18:51:46Z

Move these limits from KokkosKernels. Mostly epsilon is used in unit-tests. ArithTraits also uses the other limits.

Related to kokkos/kokkos-kernels#1414.

cz4rs · 2023-01-18T19:32:51Z

Is this relevant?
llvm/llvm-project@059b823

e10harvey · 2023-01-18T19:34:35Z

Is this relevant? llvm/llvm-project@059b823

It looks relevant. I don't think all toolchains define these constants though.

cz4rs · 2023-01-18T19:36:14Z

Is this relevant? llvm/llvm-project@059b823

It looks relevant. I don't think all toolchains define these constants though.

Specifically, changing MIN_EXP to -13 and MAX_EXP to 16 (as it happens in the linked commit due to C specification).

dalg24

Why define macros rather than provide specialization of the numeric traits?

crtrott · 2023-01-23T20:39:55Z

Yeah: definitely lets just specialize the numeric traits. We need those anyway ...

e10harvey · 2023-01-26T00:18:53Z

Yea, I was thinking about putting them in numeric traits. Let me know if you want the defines somewhere else. I'd prefer to keep the defines.

crtrott

We don't need the macros for anything. Those should go away.
Also the numeric traits specialization for half should be in the half header not in the Kokkos_NumericTraits header.

core/src/Kokkos_Half.hpp

masterleinad · 2023-02-02T14:50:56Z

diff --git a/core/src/Kokkos_Half.hpp b/core/src/Kokkos_Half.hpp
index 656b177cf..59bbac948 100644
--- a/core/src/Kokkos_Half.hpp
+++ b/core/src/Kokkos_Half.hpp
@@ -321,9 +321,7 @@ class alignas(FloatType) floating_point_wrapper {
   KOKKOS_FUNCTION
   floating_point_wrapper(long long rhs) : val(cast_to_wrapper(rhs, val).val) {}
   KOKKOS_FUNCTION
-  constexpr floating_point_wrapper(unsigned short rhs) {
-    reinterpret_cast<unsigned short>(val) = rhs;
-  }
+  constexpr floating_point_wrapper(unsigned short rhs) : val(cast_to_wrapper(rhs, val).val) {}
   KOKKOS_FUNCTION
   floating_point_wrapper(unsigned int rhs)
       : val(cast_to_wrapper(rhs, val).val) {}
@@ -865,7 +863,7 @@ class alignas(FloatType) floating_point_wrapper {
 template <class T>
 static KOKKOS_INLINE_FUNCTION constexpr Kokkos::Experimental::half_t
 cast_to_wrapper(T x, const volatile Kokkos::Impl::half_impl_t::type&) {
-  return Kokkos::Experimental::cast_to_half(x);
+  return Kokkos::Experimental::half_t::impl_type(x);
 }
 
 #ifdef KOKKOS_IMPL_BHALF_TYPE_DEFINED
@@ -1046,7 +1044,7 @@ struct Kokkos::Experimental::Impl::round_error_helper<
 template <>
 struct Kokkos::Experimental::Impl::norm_min_helper<
     Kokkos::Experimental::half_t> {
-  static constexpr int value = 0.000000059604645F;
+  static constexpr Kokkos::Experimental::half_t value = 0.000000059604645F;
 };
 // Quiet not a half precisioin number
 template <>
diff --git a/core/src/SYCL/Kokkos_SYCL_Half_Conversion.hpp b/core/src/SYCL/Kokkos_SYCL_Half_Conversion.hpp
index d7b0271e1..89a167903 100644
--- a/core/src/SYCL/Kokkos_SYCL_Half_Conversion.hpp
+++ b/core/src/SYCL/Kokkos_SYCL_Half_Conversion.hpp
@@ -29,7 +29,7 @@ namespace Experimental {
 KOKKOS_INLINE_FUNCTION
 half_t cast_to_half(half_t val) { return val; }
 
-KOKKOS_INLINE_FUNCTION
+constexpr KOKKOS_INLINE_FUNCTION
 half_t cast_to_half(float val) { return half_t::impl_type(val); }
 KOKKOS_INLINE_FUNCTION
 half_t cast_to_half(double val) { return half_t::impl_type(val); }

is enough to fix this for SYCL.

This reverts commit 1b399bd.

e10harvey · 2023-02-02T15:46:32Z

diff --git a/core/src/Kokkos_Half.hpp b/core/src/Kokkos_Half.hpp
index 656b177cf..59bbac948 100644
--- a/core/src/Kokkos_Half.hpp
+++ b/core/src/Kokkos_Half.hpp
@@ -321,9 +321,7 @@ class alignas(FloatType) floating_point_wrapper {
   KOKKOS_FUNCTION
   floating_point_wrapper(long long rhs) : val(cast_to_wrapper(rhs, val).val) {}
   KOKKOS_FUNCTION
-  constexpr floating_point_wrapper(unsigned short rhs) {
-    reinterpret_cast<unsigned short>(val) = rhs;
-  }
+  constexpr floating_point_wrapper(unsigned short rhs) : val(cast_to_wrapper(rhs, val).val) {}
   KOKKOS_FUNCTION
   floating_point_wrapper(unsigned int rhs)
       : val(cast_to_wrapper(rhs, val).val) {}
@@ -865,7 +863,7 @@ class alignas(FloatType) floating_point_wrapper {
 template <class T>
 static KOKKOS_INLINE_FUNCTION constexpr Kokkos::Experimental::half_t
 cast_to_wrapper(T x, const volatile Kokkos::Impl::half_impl_t::type&) {
-  return Kokkos::Experimental::cast_to_half(x);
+  return Kokkos::Experimental::half_t::impl_type(x);
 }
 
 #ifdef KOKKOS_IMPL_BHALF_TYPE_DEFINED
@@ -1046,7 +1044,7 @@ struct Kokkos::Experimental::Impl::round_error_helper<
 template <>
 struct Kokkos::Experimental::Impl::norm_min_helper<
     Kokkos::Experimental::half_t> {
-  static constexpr int value = 0.000000059604645F;
+  static constexpr Kokkos::Experimental::half_t value = 0.000000059604645F;
 };
 // Quiet not a half precisioin number
 template <>
diff --git a/core/src/SYCL/Kokkos_SYCL_Half_Conversion.hpp b/core/src/SYCL/Kokkos_SYCL_Half_Conversion.hpp
index d7b0271e1..89a167903 100644
--- a/core/src/SYCL/Kokkos_SYCL_Half_Conversion.hpp
+++ b/core/src/SYCL/Kokkos_SYCL_Half_Conversion.hpp
@@ -29,7 +29,7 @@ namespace Experimental {
 KOKKOS_INLINE_FUNCTION
 half_t cast_to_half(half_t val) { return val; }
 
-KOKKOS_INLINE_FUNCTION
+constexpr KOKKOS_INLINE_FUNCTION
 half_t cast_to_half(float val) { return half_t::impl_type(val); }
 KOKKOS_INLINE_FUNCTION
 half_t cast_to_half(double val) { return half_t::impl_type(val); }

is enough to fix this for SYCL.

The same changes on CUDA give:

kokkos/core/src/Kokkos_Half.hpp(1043): error: expression must have a constant value

masterleinad · 2023-02-02T17:05:21Z

The same changes on CUDA give:

Yes, I am seeing

/tmp/kokkos/core/src/Kokkos_Half.hpp(1024): error: expression must have a constant value
/tmp/kokkos/core/src/Kokkos_Half.hpp(866): note #2703-D: cannot call non-constexpr function "__half::__half(float)"
/soft/compilers/cuda/cuda-11.6.2/include/cuda_fp16.hpp(201): here
/tmp/kokkos/core/src/Kokkos_Half.hpp(309): note #2693-D: called from:

indicating that none of the conversion functions is constexpr in Cuda.

ajpowelsnl · 2023-02-08T16:50:35Z

@e10harvey - does this issue need some discussion today?

e10harvey · 2023-02-08T17:23:53Z

@e10harvey - does this issue need some discussion today?

@ajpowelsnl - thanks for following up. I think it needs another review by @crtrott and @dalg24 -- please let me know if you have any questions.

masterleinad · 2023-02-08T19:16:25Z

There are failing tests.

masterleinad · 2023-02-08T19:17:35Z

[ RUN      ] defaultdevicetype.bhalf_operators
11: /var/jenkins/workspace/Kokkos/core/unit_test/TestHalfOperators.hpp:926: Failure
11: The difference between f_device_actual_lhs(op_test) and f_device_expected_lhs(op_test) is 0.00020498037338256836, which exceeds epsilon, where
11: f_device_actual_lhs(op_test) evaluates to -0.765625,
11: f_device_expected_lhs(op_test) evaluates to -0.76542001962661743, and
11: epsilon evaluates to 1.1920928955078125e-07.
11: /var/jenkins/workspace/Kokkos/core/unit_test/TestHalfOperators.hpp:926: Failure
11: The difference between f_device_actual_lhs(op_test) and f_device_expected_lhs(op_test) is 3.9145350456237793e-05, which exceeds epsilon, where
11: f_device_actual_lhs(op_test) evaluates to 0.234619140625,
11: f_device_expected_lhs(op_test) evaluates to 0.23457999527454376, and
11: epsilon evaluates to 1.1920928955078125e-07.
11: /var/jenkins/workspace/Kokkos/core/unit_test/TestHalfOperators.hpp:926: Failure
11: The difference between f_device_actual_lhs(op_test) and f_device_expected_lhs(op_test) is 0.00020503997802734375, which exceeds epsilon, where
11: f_device_actual_lhs(op_test) evaluates to 1.234375,
11: f_device_expected_lhs(op_test) evaluates to 1.2345800399780273, and
11: epsilon evaluates to 1.1920928955078125e-07.
11: /var/jenkins/workspace/Kokkos/core/unit_test/TestHalfOperators.hpp:926: Failure
11: The difference between f_device_actual_lhs(op_test) and f_device_expected_lhs(op_test) is 0.00020503997802734375, which exceeds epsilon, where
11: f_device_actual_lhs(op_test) evaluates to 2.234375,
11: f_device_expected_lhs(op_test) evaluates to 2.2345800399780273, and
11: epsilon evaluates to 1.1920928955078125e-07.
11: [  FAILED  ] defaultdevicetype.bhalf_operators (0 ms)

e10harvey · 2023-02-08T19:17:58Z

Hi @mhoemmen ! Is there a way to invoke CUDA half and bhalf casting intrinsics at compile time?

crtrott

Ok I think I get it now.

crtrott · 2023-03-15T16:03:57Z

Uhoh work graph test failed???

crtrott · 2023-03-15T16:04:22Z

I mean workgraph is clearly not an issue coming from this I would expect but this is weird

crtrott · 2023-03-15T16:04:31Z

Retest this please

dalg24 · 2023-03-15T16:22:16Z

How did you come up with these distinguished values?

Ping

e10harvey · 2023-03-16T13:20:37Z

@dalg24, see response in #5778 (comment).

e10harvey · 2023-03-16T13:23:35Z

I mean workgraph is clearly not an issue coming from this I would expect but this is weird

I don't see the workgraph failure. Instead, I see that OPENMPTARGET-Clang failed in 0s. How do I triage this?

dalg24 · 2023-03-16T15:51:43Z

@dalg24, see response in #5778 (comment).

Is there documentation somewhere that you can refer to?

e10harvey · 2023-03-16T15:59:19Z

Is there documentation somewhere that you can refer to?

@dalg24: I'm confused. These values are removed now.

dalg24 · 2023-03-16T16:17:13Z

Is there documentation somewhere that you can refer to?

@dalg24: I'm confused. These values are removed now.

What I am asking is "where would one find out what is the maximum value that is representable with half type". Presumably you looked that up somewhere or you did some experiment. Please document how you came up with these numbers.

e10harvey · 2023-03-16T16:20:32Z

Is there documentation somewhere that you can refer to?

@dalg24: I'm confused. These values are removed now.

What I am asking is "where would one find out what is the maximum value that is representable with half type". Presumably you looked that up somewhere or you did some experiment. Please document how you came up with these numbers.

Gotcha. I will document this. Should this be documented in the Kokkos_Half.hpp file?

dalg24 · 2023-03-16T16:26:25Z

Gotcha. I will document this. Should this be documented in the Kokkos_Half.hpp file?

At the very least here in the discussion but if there is a nice reference yes also add as a comment in code.

e10harvey · 2023-03-20T13:36:05Z

Gotcha. I will document this. Should this be documented in the Kokkos_Half.hpp file?

At the very least here in the discussion but if there is a nice reference yes also add as a comment in code.

OK, thanks for the clarifications, @dalg24. I did not follow a particular reference. I simply looked up definitions of the descriptions for each type of constant, as well as the format of each half precision type (i.e. N bits for significand and M bits for mantissa) and then computed the constants. For the format of the half type I followed https://ieeexplore.ieee.org/abstract/document/9973611. For computing the values, I used https://observablehq.com/@benaubin/floating-point or python, depending on whether I wanted to set individual bits in the half precision type or evaluate an expression. In some cases, I also set the bits via hex constants in the Kokkos half type and used printf to print the base10 representation.

In short, I do not have a single reference for where these values came from but I did my best to verify each constant is correct. Perhaps there is a better reference for these constants, but I did not find one during my half precision development work.

I also documented the human readable description and equation or high bits above each constants in Kokkos_Half.hpp.

I hope this helps.

e10harvey · 2023-03-28T22:10:46Z

@dalg24: If you think a509f58 suffices, I will add similar docs for bhalf_t.

dalg24 · 2023-03-29T12:29:27Z

Retest this please

e10harvey · 2023-03-29T21:02:29Z

The GCC 8.4.0 pipeline stage in jenkins timed out.

e10harvey · 2023-03-29T21:02:48Z

Retest this please

dalg24 · 2023-03-29T22:47:22Z

The GCC 8.4.0 pipeline stage in jenkins timed out.

We are having some issues with the CPU only testing machines. I would have merged w/o that one :/

dalg24 · 2023-03-30T12:58:48Z

Retest this please

dalg24 · 2023-03-30T13:00:16Z

Evan dug out a build that had only GCC 8.4 not run
https://cloud.cees.ornl.gov/jenkins-ci/blue/organizations/jenkins/Kokkos/detail/Kokkos/12521/pipeline/57/
That is good enough for me.

e10harvey requested review from dalg24 and crtrott January 18, 2023 18:51

e10harvey self-assigned this Jan 18, 2023

ajpowelsnl requested a review from lucbv January 18, 2023 19:15

dalg24 previously requested changes Jan 20, 2023

View reviewed changes

weinbe2 mentioned this pull request Jan 25, 2023

Further update to CUDA occupancy calculation #5739

Merged

crtrott requested changes Jan 26, 2023

View reviewed changes

Add half_t and bhalf_t limits

7eb3b93

e10harvey force-pushed the kk_issue1414 branch from 95d5a75 to 7eb3b93 Compare January 26, 2023 15:16

e10harvey requested review from dalg24 and crtrott January 26, 2023 15:16

dalg24 reviewed Jan 26, 2023

View reviewed changes

core/src/Kokkos_Half.hpp Show resolved Hide resolved

dalg24 reviewed Jan 26, 2023

View reviewed changes

core/src/Kokkos_Half.hpp Show resolved Hide resolved

Try using constexpr

1b399bd

Revert "Try using constexpr"

19c73cb

This reverts commit 1b399bd.

e10harvey requested a review from dalg24 February 2, 2023 15:41

Fix norm_min_helper value type

2540e9d

core/unit_test: Add inline comment

ab45335

e10harvey requested review from dalg24, crtrott and masterleinad March 2, 2023 20:35

crtrott approved these changes Mar 15, 2023

View reviewed changes

Add half_t docs

a509f58

dalg24 approved these changes Mar 29, 2023

View reviewed changes

dalg24 merged commit 0476985 into kokkos:develop Mar 30, 2023

e10harvey mentioned this pull request May 17, 2023

Kokkos half precision support #3531

Open

17 tasks

e10harvey mentioned this pull request Sep 13, 2023

CHANGELOG: 4.2.0 #6197

Closed

e10harvey mentioned this pull request Oct 24, 2023

Incorrect half_t and bhalf_t nan and infinity numeric trait values? #6476

Closed

Add half_t and bhalf_t limits #5778

Add half_t and bhalf_t limits #5778

Conversation

e10harvey commented Jan 18, 2023

cz4rs commented Jan 18, 2023

e10harvey commented Jan 18, 2023

cz4rs commented Jan 18, 2023 • edited

dalg24 left a comment • edited

Choose a reason for hiding this comment

crtrott commented Jan 23, 2023

e10harvey commented Jan 26, 2023

crtrott left a comment

Choose a reason for hiding this comment

masterleinad commented Feb 2, 2023

e10harvey commented Feb 2, 2023

masterleinad commented Feb 2, 2023

ajpowelsnl commented Feb 8, 2023

e10harvey commented Feb 8, 2023

masterleinad commented Feb 8, 2023

masterleinad commented Feb 8, 2023

e10harvey commented Feb 8, 2023

crtrott left a comment

Choose a reason for hiding this comment

crtrott commented Mar 15, 2023

crtrott commented Mar 15, 2023

crtrott commented Mar 15, 2023

dalg24 commented Mar 15, 2023

e10harvey commented Mar 16, 2023

e10harvey commented Mar 16, 2023

dalg24 commented Mar 16, 2023

e10harvey commented Mar 16, 2023

dalg24 commented Mar 16, 2023

e10harvey commented Mar 16, 2023

dalg24 commented Mar 16, 2023

e10harvey commented Mar 20, 2023

e10harvey commented Mar 28, 2023

dalg24 commented Mar 29, 2023

e10harvey commented Mar 29, 2023

e10harvey commented Mar 29, 2023

dalg24 commented Mar 29, 2023

dalg24 commented Mar 30, 2023

dalg24 commented Mar 30, 2023

cz4rs commented Jan 18, 2023 •

edited

dalg24 left a comment •

edited