-
Notifications
You must be signed in to change notification settings - Fork 407
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
core/src: Add half single and double mixed compare (LT,GT,LE,GE) #6407
Conversation
Should the tests be expanded to include bfloat16? |
Yes, bfloat16 is covered by these tests. |
retest this please |
Does this support half_t vs bhalf_t comparison? Also what about |
No.
WRT Shall I add all of these mixed precision comparison operators to this PR or would a separate issue and PR be preferred? |
I think it might be better to do it in one go because there could be issues of ambiguity? @dalg24 thoughts? |
Ok convinced now separate PRs are better. |
Oh but should we have |
|
Done.
See #6404 (comment) |
I don't understand your answer |
Oh, we agreed to do that in a follow-on issue / PR during the developer meeting yesterday. This PR was just supposed to address the bug report in #6404 although I added >= and <= too. I filed #6412 to address adding the remaining operator overloads. |
44cbf48
to
d30f9b7
Compare
template <class T> | ||
KOKKOS_FUNCTION friend std::enable_if_t< | ||
std::is_same_v<T, float> || std::is_same_v<T, double>, bool> | ||
operator<(floating_point_wrapper lhs, T rhs) { | ||
return T(lhs) < rhs; | ||
} | ||
|
||
template <class T> | ||
KOKKOS_FUNCTION friend std::enable_if_t< | ||
std::is_same_v<T, float> || std::is_same_v<T, double>, bool> | ||
operator<(T lhs, floating_point_wrapper rhs) { | ||
return lhs < T(rhs); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be better to provide the overload only for double
? Types convertible to double
should then still work (but don't with the current approach since both overloads would be feasible comparing, e.g., with an int
).
template <class T> | |
KOKKOS_FUNCTION friend std::enable_if_t< | |
std::is_same_v<T, float> || std::is_same_v<T, double>, bool> | |
operator<(floating_point_wrapper lhs, T rhs) { | |
return T(lhs) < rhs; | |
} | |
template <class T> | |
KOKKOS_FUNCTION friend std::enable_if_t< | |
std::is_same_v<T, float> || std::is_same_v<T, double>, bool> | |
operator<(T lhs, floating_point_wrapper rhs) { | |
return lhs < T(rhs); | |
} | |
KOKKOS_FUNCTION friend bool | |
operator<(floating_point_wrapper lhs, double rhs) { | |
return T(lhs) < rhs; | |
} |
edit: changed float
to double
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that the SYCL
implementation just casts everything to float
: https://github.com/intel/llvm/blob/83aafddac12e9b2df8e1764d3fba0ffe7ef6c596/sycl/include/sycl/ext/oneapi/bfloat16.hpp#L176-L198.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be better to provide the overload only for
double
? Types convertible todouble
should then still work (but don't with the current approach since both overloads would be feasible comparing, e.g., with anint
I like the simplicity of this approach. I believe that for other mixed precision expressions involving built-in types T1 op T2
, the compiler will implicitly up-cast either T1
or T2
. So, providing only the double overload could result in unexpected behavior by up-casting half-precision operands mixed with single-precision operands to double. That is my understanding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For a comparison with an int
, https://godbolt.org/z/efnxf7e5n gives an error:
ambiguous overload for 'operator<' (operand types are 'Foo' and 'int')
If only providing the double
or float
overload we are getting a similar warning with GCC and an error with clang (if the conversion operator is not explicit).
If we use one templated version, everything works fine, and we can do whatever we want in the comparison operator. This does different casts depending on the types used, see https://cppinsights.io/s/2155d1dd.
If we don't provide the operator, implicit conversion kicks in which means we always cast this type to float and we are then doing comparisons based on the other type used, see https://cppinsights.io/s/1c62be16. It's not quite clear to me where this is failing at the moment.
If implicit conversion isn't sufficient so that we actually need to declare comparison operators, it seems most sensible to me to provide a templated overload that cast the half type to float
and let the compiler deal with other casts required.
For backends that define KOKKOS_HALF_IS_FULL_TYPE_ON_ARCH
, we can of course, just cast to impl_type
instead of float
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the current approach we are getting
/home/darndt/kokkos/core/unit_test/TestHalfOperators.hpp:903:52: error: invalid operands to binary expression ('int' and 'const Kokkos::Experimental::Impl::floating_point_wrapper<sycl::detail::half_impl::half>')
actual_lhs(LT_S_H) = static_cast<int>(h_lhs) < h_rhs;
~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~~~~
/home/darndt/kokkos/core/unit_test/TestHalfOperators.hpp:377:7: note: in instantiation of member function 'Test::Functor_TestHalfOperators<Kokkos::View<double *, Kokkos::Experimental::SYCL>, Kokkos::Experimental::Impl::floating_point_wrapper<sycl::detail::half_impl::half>>::operator()' requested here
run_on_host(0);
^
/home/darndt/kokkos/core/unit_test/TestHalfOperators.hpp:980:50: note: in instantiation of member function 'Test::Functor_TestHalfOperators<Kokkos::View<double *, Kokkos::Experimental::SYCL>, Kokkos::Experimental::Impl::floating_point_wrapper<sycl::detail::half_impl::half>>::Functor_TestHalfOperators' requested here
Functor_TestHalfOperators<ViewType, half_type> f_device(h_lhs, h_rhs);
^
/home/darndt/kokkos/core/unit_test/TestHalfOperators.hpp:1052:5: note: in instantiation of function template specialization 'Test::__test_half_operators<Kokkos::Experimental::Impl::floating_point_wrapper<sycl::detail::half_impl::half>>' requested here
__test_half_operators<half_t>(h_lhs + cast_to_half(i + 1),
^
/soft/restricted/CNDA/updates/2023.05.15.001/oneapi/compiler/eng-20230614/compiler/linux/bin-llvm/../include/sycl/half_type.hpp:530:3: note: candidate function not viable: no known conversion from 'const Kokkos::Experimental::Impl::floating_point_wrapper<sycl::detail::half_impl::half>' to 'const half' for 2nd argument
OP(<)
^
[...]
(but actual_lhs(LT_H_S) = h_lhs < static_cast<int>(h_rhs);
still works).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In short, it's not clear to me how the compiler can figure out that half-precision operations should be used without being explicit here.
Right, we wouldn't use half-precision
operations if we cast to float
explicitly and then it's a question about correctness vs. performance.
I guess it would be best to forward the operation to the actual half_type
in case that it's not just a storage type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you OK with merging this bug-fix and addressing integral types in a follow-on?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's a pretty easy fix to do
KOKKOS_FUNCTION friend std::enable_if_t<std::is_convertible_v<T, float>, bool>
operator<(const floating_point_wrapper& lhs, const T& rhs) {
#ifdef KOKKOS_HALF_IS_FULL_TYPE_ON_ARCH
return lhs.val < rhs;
#else
return static_cast<float>(lhs) < rhs;
#endif
}
to actually use half-precision operations if possible. If you want to restrict this overload to float
and double
for now and add testing for int
in a later pull request, that's fine with me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you suggesting we remove the member functions for LT,GT,LE, and GE?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we? Since we implement both operator<(T, half_type)
and operator<half_type, T
, I would expect operator<(half_type, half_type)
to be ambiguous if not explicitly provided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please discuss #6407 (review).
d30f9b7
to
c6b6420
Compare
- Check whether T is convertible to float - Try upcasting floating_point_wrapper to float and relying on the toolchains implicit upcasting to kick in - Try comparing impl_type with T if impl_type is a full type
template <class T> | ||
KOKKOS_FUNCTION friend std::enable_if_t< | ||
std::is_same_v<T, float> || std::is_same_v<T, double>, bool> | ||
KOKKOS_FUNCTION friend std::enable_if_t<std::is_convertible_v<T, float> && | ||
(std::is_same_v<T, float> || std::is_same_v<T, double>), bool> | ||
operator>=(floating_point_wrapper lhs, T rhs) { | ||
return T(lhs) >= rhs; | ||
} | ||
|
||
template <class T> | ||
KOKKOS_FUNCTION friend std::enable_if_t< | ||
std::is_same_v<T, float> || std::is_same_v<T, double>, bool> | ||
KOKKOS_FUNCTION friend std::enable_if_t<std::is_convertible_v<T, float> && | ||
(std::is_same_v<T, float> || std::is_same_v<T, double>), bool> | ||
operator>=(T lhs, floating_point_wrapper rhs) { | ||
return lhs >= T(rhs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch
Co-authored-by: Daniel Arndt <arndtd@ornl.gov>
#ifdef KOKKOS_HALF_IS_FULL_TYPE_ON_ARCH | ||
// CAUTION: this may result in different implicit casting across backends | ||
return lhs.val < rhs; | ||
#else |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like for Cuda
there is only the operator<(__half, __half)
overload.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
Related to #6404:
T
for operator<