core/src: Add half single and double mixed compare (LT,GT,LE,GE) #6407

e10harvey · 2023-09-05T14:06:30Z

Related to #6404:

Support mixed precision in operator>, operator>=, operator<=
Do not downcast T for operator<

e10harvey · 2023-09-05T14:07:39Z

@dalg24, @crtrott: Shall I add all mixed precision comparison operators to this PR or would a separate issue and PR be preferred?

aprokop · 2023-09-05T15:09:51Z

Should the tests be expanded to include bfloat16?

e10harvey · 2023-09-05T15:24:11Z

Should the tests be expanded to include bfloat16?

Yes, bfloat16 is covered by these tests.

e10harvey · 2023-09-06T15:37:52Z

retest this please

crtrott · 2023-09-06T17:09:41Z

Does this support half_t vs bhalf_t comparison? Also what about 1<half_t(2) that doesn't work or? I wonder if we should fix that. For example we could have comparisons explicitly for float and double (not as templated functions) and then have a templated one which converts both args to double or to T?

e10harvey · 2023-09-06T17:35:45Z

Does this support half_t vs bhalf_t comparison?

No.

Also what about 1<half_t(2) that doesn't work or? I wonder if we should fix that. For example we could have comparisons explicitly for float and double (not as templated functions) and then have a templated one which converts both args to double or to T?

WRT 1<half_t(2), I can check this.

Shall I add all of these mixed precision comparison operators to this PR or would a separate issue and PR be preferred?

crtrott · 2023-09-06T17:53:46Z

I think it might be better to do it in one go because there could be issues of ambiguity? @dalg24 thoughts?

crtrott · 2023-09-06T18:58:28Z

Ok convinced now separate PRs are better.

crtrott · 2023-09-06T18:59:23Z

Oh but should we have >= <=

dalg24 · 2023-09-07T11:59:47Z

What about operator== and operator!= ?
Make sure your description clearly states that this PR is adding comparison operators when the other operand is float or double.
Also I don't understand

Do not downcast T for operator<`

in the description. What do you mean?

e10harvey · 2023-09-07T13:15:03Z

What about operator== and operator!= ?

#6412

Make sure your description clearly states that this PR is adding comparison operators when the other operand is float or double.

Done.

Also I don't understand
> Do not downcast T for operator<`
in the description.  What do you mean?

See #6404 (comment)

dalg24 · 2023-09-07T14:27:28Z

What about operator== and operator!= ?

#6412

I don't understand your answer

e10harvey · 2023-09-07T14:42:45Z

What about operator== and operator!= ?

#6412

I don't understand your answer

Oh, we agreed to do that in a follow-on issue / PR during the developer meeting yesterday. This PR was just supposed to address the bug report in #6404 although I added >= and <= too. I filed #6412 to address adding the remaining operator overloads.

core/src/impl/Kokkos_Half_FloatingPointWrapper.hpp

masterleinad · 2023-09-27T16:00:10Z

core/src/impl/Kokkos_Half_FloatingPointWrapper.hpp

+  template <class T>
+  KOKKOS_FUNCTION friend std::enable_if_t<
+      std::is_same_v<T, float> || std::is_same_v<T, double>, bool>
+  operator<(floating_point_wrapper lhs, T rhs) {
+    return T(lhs) < rhs;
+  }
+
+  template <class T>
+  KOKKOS_FUNCTION friend std::enable_if_t<
+      std::is_same_v<T, float> || std::is_same_v<T, double>, bool>
+  operator<(T lhs, floating_point_wrapper rhs) {
+    return lhs < T(rhs);
+  }


Wouldn't it be better to provide the overload only for double? Types convertible to double should then still work (but don't with the current approach since both overloads would be feasible comparing, e.g., with an int).

Suggested change

template <class T>

KOKKOS_FUNCTION friend std::enable_if_t<

std::is_same_v<T, float> || std::is_same_v<T, double>, bool>

operator<(floating_point_wrapper lhs, T rhs) {

return T(lhs) < rhs;

}

template <class T>

KOKKOS_FUNCTION friend std::enable_if_t<

std::is_same_v<T, float> || std::is_same_v<T, double>, bool>

operator<(T lhs, floating_point_wrapper rhs) {

return lhs < T(rhs);

}

KOKKOS_FUNCTION friend bool

operator<(floating_point_wrapper lhs, double rhs) {

return T(lhs) < rhs;

}

edit: changed float to double.

Note that the SYCL implementation just casts everything to float: https://github.com/intel/llvm/blob/83aafddac12e9b2df8e1764d3fba0ffe7ef6c596/sycl/include/sycl/ext/oneapi/bfloat16.hpp#L176-L198.

Wouldn't it be better to provide the overload only for double? Types convertible to double should then still work (but don't with the current approach since both overloads would be feasible comparing, e.g., with an int

I like the simplicity of this approach. I believe that for other mixed precision expressions involving built-in types T1 op T2, the compiler will implicitly up-cast either T1 or T2. So, providing only the double overload could result in unexpected behavior by up-casting half-precision operands mixed with single-precision operands to double. That is my understanding.

For a comparison with an int, https://godbolt.org/z/efnxf7e5n gives an error:

ambiguous overload for 'operator<' (operand types are 'Foo' and 'int')

If only providing the double or float overload we are getting a similar warning with GCC and an error with clang (if the conversion operator is not explicit).

If we use one templated version, everything works fine, and we can do whatever we want in the comparison operator. This does different casts depending on the types used, see https://cppinsights.io/s/2155d1dd.

If we don't provide the operator, implicit conversion kicks in which means we always cast this type to float and we are then doing comparisons based on the other type used, see https://cppinsights.io/s/1c62be16. It's not quite clear to me where this is failing at the moment.
If implicit conversion isn't sufficient so that we actually need to declare comparison operators, it seems most sensible to me to provide a templated overload that cast the half type to float and let the compiler deal with other casts required.
For backends that define KOKKOS_HALF_IS_FULL_TYPE_ON_ARCH, we can of course, just cast to impl_type instead of float.

With the current approach we are getting

/home/darndt/kokkos/core/unit_test/TestHalfOperators.hpp:903:52: error: invalid operands to binary expression ('int' and 'const Kokkos::Experimental::Impl::floating_point_wrapper<sycl::detail::half_impl::half>') actual_lhs(LT_S_H) = static_cast<int>(h_lhs) < h_rhs; ~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~~~~ /home/darndt/kokkos/core/unit_test/TestHalfOperators.hpp:377:7: note: in instantiation of member function 'Test::Functor_TestHalfOperators<Kokkos::View<double *, Kokkos::Experimental::SYCL>, Kokkos::Experimental::Impl::floating_point_wrapper<sycl::detail::half_impl::half>>::operator()' requested here run_on_host(0); ^ /home/darndt/kokkos/core/unit_test/TestHalfOperators.hpp:980:50: note: in instantiation of member function 'Test::Functor_TestHalfOperators<Kokkos::View<double *, Kokkos::Experimental::SYCL>, Kokkos::Experimental::Impl::floating_point_wrapper<sycl::detail::half_impl::half>>::Functor_TestHalfOperators' requested here Functor_TestHalfOperators<ViewType, half_type> f_device(h_lhs, h_rhs); ^ /home/darndt/kokkos/core/unit_test/TestHalfOperators.hpp:1052:5: note: in instantiation of function template specialization 'Test::__test_half_operators<Kokkos::Experimental::Impl::floating_point_wrapper<sycl::detail::half_impl::half>>' requested here __test_half_operators<half_t>(h_lhs + cast_to_half(i + 1), ^ /soft/restricted/CNDA/updates/2023.05.15.001/oneapi/compiler/eng-20230614/compiler/linux/bin-llvm/../include/sycl/half_type.hpp:530:3: note: candidate function not viable: no known conversion from 'const Kokkos::Experimental::Impl::floating_point_wrapper<sycl::detail::half_impl::half>' to 'const half' for 2nd argument OP(<) ^ [...]

(but actual_lhs(LT_H_S) = h_lhs < static_cast<int>(h_rhs); still works).

In short, it's not clear to me how the compiler can figure out that half-precision operations should be used without being explicit here.

Right, we wouldn't use half-precision operations if we cast to float explicitly and then it's a question about correctness vs. performance.
I guess it would be best to forward the operation to the actual half_type in case that it's not just a storage type.

Are you OK with merging this bug-fix and addressing integral types in a follow-on?

I think it's a pretty easy fix to do

KOKKOS_FUNCTION friend std::enable_if_t<std::is_convertible_v<T, float>, bool> operator<(const floating_point_wrapper& lhs, const T& rhs) { #ifdef KOKKOS_HALF_IS_FULL_TYPE_ON_ARCH return lhs.val < rhs; #else return static_cast<float>(lhs) < rhs; #endif }

to actually use half-precision operations if possible. If you want to restrict this overload to float and double for now and add testing for int in a later pull request, that's fine with me.

Are you suggesting we remove the member functions for LT,GT,LE, and GE?

Can we? Since we implement both operator<(T, half_type) and operator<half_type, T, I would expect operator<(half_type, half_type) to be ambiguous if not explicitly provided.

masterleinad

Please discuss #6407 (review).

- Check whether T is convertible to float - Try upcasting floating_point_wrapper to float and relying on the toolchains implicit upcasting to kick in - Try comparing impl_type with T if impl_type is a full type

masterleinad · 2023-10-03T18:27:44Z

core/src/impl/Kokkos_Half_FloatingPointWrapper.hpp

  template <class T>
-  KOKKOS_FUNCTION friend std::enable_if_t<
-      std::is_same_v<T, float> || std::is_same_v<T, double>, bool>
+  KOKKOS_FUNCTION friend std::enable_if_t<std::is_convertible_v<T, float> &&
+      (std::is_same_v<T, float> || std::is_same_v<T, double>), bool>
  operator>=(floating_point_wrapper lhs, T rhs) {
    return T(lhs) >= rhs;
  }

  template <class T>
-  KOKKOS_FUNCTION friend std::enable_if_t<
-      std::is_same_v<T, float> || std::is_same_v<T, double>, bool>
+  KOKKOS_FUNCTION friend std::enable_if_t<std::is_convertible_v<T, float> &&
+      (std::is_same_v<T, float> || std::is_same_v<T, double>), bool>
  operator>=(T lhs, floating_point_wrapper rhs) {
    return lhs >= T(rhs);


What about these?

core/src/impl/Kokkos_Half_FloatingPointWrapper.hpp

Co-authored-by: Daniel Arndt <arndtd@ornl.gov>

masterleinad · 2023-10-03T20:31:10Z

core/src/impl/Kokkos_Half_FloatingPointWrapper.hpp

+#ifdef KOKKOS_HALF_IS_FULL_TYPE_ON_ARCH
+    // CAUTION: this may result in different implicit casting across backends
+    return lhs.val < rhs;
+#else


Looks like for Cuda there is only the operator<(__half, __half) overload.

e10harvey requested review from brian-kelley and nmm0 September 5, 2023 14:06

e10harvey mentioned this pull request Sep 5, 2023

Missing operations for half_t on HIP #6404

Closed

e10harvey changed the title ~~core/src: Add half mixed precision comparison operator>~~ core/src: Add half mixed precision LT, GT comparison operators Sep 5, 2023

e10harvey added the Bug Broken / incorrect code; it could be Kokkos' responsibility, or others’ (e.g., Trilinos) label Sep 5, 2023

e10harvey requested review from crtrott and removed request for nmm0 September 5, 2023 14:49

e10harvey changed the title ~~core/src: Add half mixed precision LT, GT comparison operators~~ core/src: Add half mixed precision LT, GT, LE, GE comparison operators Sep 6, 2023

e10harvey mentioned this pull request Sep 6, 2023

Add support for mixed precision comparison with half #6412

Open

crtrott approved these changes Sep 6, 2023

View reviewed changes

dalg24 requested a review from nliber September 7, 2023 12:00

e10harvey changed the title ~~core/src: Add half mixed precision LT, GT, LE, GE comparison operators~~ core/src: Add half mixed precision (w/ float and double) LT, GT, LE, GE comparison operators Sep 7, 2023

crtrott approved these changes Sep 7, 2023

View reviewed changes

nliber requested changes Sep 7, 2023

View reviewed changes

core/src/impl/Kokkos_Half_FloatingPointWrapper.hpp Show resolved Hide resolved

core/src/impl/Kokkos_Half_FloatingPointWrapper.hpp Outdated Show resolved Hide resolved

e10harvey mentioned this pull request Sep 26, 2023

Use hidden friends for half operator overloads #6468

Open

e10harvey force-pushed the half_mixed_comp_ops branch from 44cbf48 to d30f9b7 Compare September 26, 2023 13:33

e10harvey changed the title ~~core/src: Add half mixed precision (w/ float and double) LT, GT, LE, GE comparison operators~~ core/src: Add half single and double mixed compare (LT,GT,LE,GE) Sep 26, 2023

e10harvey requested review from nliber and crtrott September 26, 2023 13:34

masterleinad reviewed Sep 27, 2023

View reviewed changes

nliber approved these changes Sep 27, 2023

View reviewed changes

masterleinad requested changes Sep 27, 2023

View reviewed changes

core/src: Add half single and double mixed compare (LT,GT,LE,GE)

c6b6420

e10harvey marked this pull request as draft October 2, 2023 15:16

e10harvey force-pushed the half_mixed_comp_ops branch from d30f9b7 to c6b6420 Compare October 2, 2023 15:20

e10harvey marked this pull request as ready for review October 2, 2023 18:07

e10harvey added 2 commits October 3, 2023 11:55

Implement PR feedback:

9907b73

- Check whether T is convertible to float - Try upcasting floating_point_wrapper to float and relying on the toolchains implicit upcasting to kick in - Try comparing impl_type with T if impl_type is a full type

Add missing endif

8b71e96

masterleinad reviewed Oct 3, 2023

View reviewed changes

Add missing ifdefs

aeba588

masterleinad reviewed Oct 3, 2023

View reviewed changes

core/src/impl/Kokkos_Half_FloatingPointWrapper.hpp Outdated Show resolved Hide resolved

Update core/src/impl/Kokkos_Half_FloatingPointWrapper.hpp

d0b9636

Co-authored-by: Daniel Arndt <arndtd@ornl.gov>

masterleinad reviewed Oct 3, 2023

View reviewed changes

Remove HALF_IS_FULL_TYPE branch

1cc775b

masterleinad approved these changes Oct 3, 2023

View reviewed changes

crtrott merged commit 2075ae7 into kokkos:develop Oct 4, 2023
27 of 28 checks passed

masterleinad mentioned this pull request Oct 11, 2023

CHANGELOG: 4.2.0 #6197

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

core/src: Add half single and double mixed compare (LT,GT,LE,GE) #6407

core/src: Add half single and double mixed compare (LT,GT,LE,GE) #6407

e10harvey commented Sep 5, 2023 •

edited

e10harvey commented Sep 5, 2023 •

edited

aprokop commented Sep 5, 2023

e10harvey commented Sep 5, 2023

e10harvey commented Sep 6, 2023

crtrott commented Sep 6, 2023

e10harvey commented Sep 6, 2023 •

edited

crtrott commented Sep 6, 2023

crtrott commented Sep 6, 2023

crtrott commented Sep 6, 2023

dalg24 commented Sep 7, 2023

e10harvey commented Sep 7, 2023

dalg24 commented Sep 7, 2023

e10harvey commented Sep 7, 2023

masterleinad Sep 27, 2023 •

edited

masterleinad Sep 27, 2023

e10harvey Sep 27, 2023 •

edited

masterleinad Sep 27, 2023 •

edited

masterleinad Sep 28, 2023

masterleinad Oct 2, 2023

e10harvey Oct 2, 2023

masterleinad Oct 2, 2023

e10harvey Oct 2, 2023

masterleinad Oct 2, 2023

masterleinad left a comment

masterleinad Oct 3, 2023

e10harvey Oct 3, 2023

masterleinad Oct 3, 2023

e10harvey Oct 3, 2023

core/src: Add half single and double mixed compare (LT,GT,LE,GE) #6407

core/src: Add half single and double mixed compare (LT,GT,LE,GE) #6407

Conversation

e10harvey commented Sep 5, 2023 • edited

e10harvey commented Sep 5, 2023 • edited

aprokop commented Sep 5, 2023

e10harvey commented Sep 5, 2023

e10harvey commented Sep 6, 2023

crtrott commented Sep 6, 2023

e10harvey commented Sep 6, 2023 • edited

crtrott commented Sep 6, 2023

crtrott commented Sep 6, 2023

crtrott commented Sep 6, 2023

dalg24 commented Sep 7, 2023

e10harvey commented Sep 7, 2023

dalg24 commented Sep 7, 2023

e10harvey commented Sep 7, 2023

masterleinad Sep 27, 2023 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

e10harvey Sep 27, 2023 • edited

Choose a reason for hiding this comment

masterleinad Sep 27, 2023 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

masterleinad left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

e10harvey commented Sep 5, 2023 •

edited

e10harvey commented Sep 5, 2023 •

edited

e10harvey commented Sep 6, 2023 •

edited

masterleinad Sep 27, 2023 •

edited

e10harvey Sep 27, 2023 •

edited

masterleinad Sep 27, 2023 •

edited