Harmonize Custom Reductions over nesting levels #802
Comments
I fixed up the Kokkos::Experimental::Max one to also work for inner levels, using pretty much the infrastructure already in place to support Kokkos::Max (with some ifdefs for the difference between taking pointers and taking references in join, init, etc.). I tried this code:
Kokkos::parallel_reduce(Kokkos::TeamPolicy<>(N/1024,32),
KOKKOS_LAMBDA( const Kokkos::TeamPolicy<>::member_type& team, Scalar& lmax) {
Scalar team_max;
for(int rr = 0; rr<R; rr++) {
int i = team.league_rank();
Kokkos::parallel_reduce(Kokkos::TeamThreadRange(team,32),
[&] (const int& j, Scalar& thread_max) {
Scalar t_max;
Kokkos::parallel_reduce(Kokkos::ThreadVectorRange(team,32),
[&] (const int& k, Scalar& max_) {
if(a((i*32 + j)*32 + k)>max_) max_ = a((i*32 + j)*32 + k);
},Kokkos::Experimental::Max<Scalar>(t_max));
if(t_max>thread_max) thread_max = t_max;
},Kokkos::Experimental::Max<Scalar>(team_max));
}
if(team_max>lmax) lmax = team_max;
},Kokkos::Experimental::Max<Scalar>(max));
On KNL with N = 1000000 and R = 10000, this takes 5.7s with 256 threads using Kokkos::Max and 4.4s with Kokkos::Experimental::Max for the inner level.
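To make the nesting explicit, here is a minimal serial sketch of the same three-level max reduction in plain C++, without Kokkos. The function name `nested_max` and the fixed sizes of 32 are assumptions for illustration only. Each level starts its partial result from the max identity (the lowest representable value) and joins it into the level above, which is what the nested reducers are expected to do.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Serial emulation of the league / team-thread / vector-lane nesting.
// Hypothetical helper, not part of Kokkos.
// Precondition: a.size() >= league_size * 32 * 32.
inline double nested_max(const std::vector<double>& a, int league_size) {
  const int team_size = 32;   // stands in for TeamThreadRange(team, 32)
  const int vector_len = 32;  // stands in for ThreadVectorRange(team, 32)
  const double identity = std::numeric_limits<double>::lowest();

  double global_max = identity;
  for (int i = 0; i < league_size; ++i) {
    double team_max = identity;  // per-team partial, initialized to the identity
    for (int j = 0; j < team_size; ++j) {
      double t_max = identity;   // per-thread partial, initialized to the identity
      for (int k = 0; k < vector_len; ++k) {
        const std::size_t idx =
            (static_cast<std::size_t>(i) * team_size + j) * vector_len + k;
        if (a[idx] > t_max) t_max = a[idx];
      }
      if (t_max > team_max) team_max = t_max;    // join thread result into team
    }
    if (team_max > global_max) global_max = team_max;  // join team into global
  }
  return global_max;
}
```

If the partials were left uninitialized or set to 0, an all-negative input would give the wrong answer, which is why the identity matters at every level.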
Need to document new requirements for custom reductions, then change the design.
Requirements:
Common non-summation custom reductions (e.g., product, min, max, and, or) require initialization of thread-local temporary values to an identity that is appropriate for that reduction operator. Some of these identity values are defined in
x = reduce( x , identity );
x = reduce( identity , x );
for all possible values of x.
struct Kokkos::reduction_identity<T> {
constexpr static T sum(); // 0
constexpr static T prod(); // 1
constexpr static T max(); // minimum value
constexpr static T min(); // maximum value
constexpr static T bor(); // 0, only for integer type
constexpr static T band(); // ~0 (all bits set), only for integer type
};
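The two-sided identity requirement can be checked mechanically. The following is a minimal standalone sketch, not the Kokkos implementation: `identity_sketch`, `is_two_sided_identity`, and `check_int_identities` are hypothetical names that mirror the interface listed above using `std::numeric_limits`.

```cpp
#include <algorithm>
#include <limits>

// Hypothetical stand-in mirroring the interface listed above; not Kokkos code.
template <class T>
struct identity_sketch {
  static constexpr T sum()  { return T(0); }
  static constexpr T prod() { return T(1); }
  // lowest(), not min(): for floating-point types min() is the smallest
  // positive normal value, which is not an identity for max.
  static constexpr T max()  { return std::numeric_limits<T>::lowest(); }
  static constexpr T min()  { return std::numeric_limits<T>::max(); }
  static constexpr T bor()  { return T(0); }
  // All bits set; only meaningful (and only instantiated) for integer types.
  static constexpr T band() { return static_cast<T>(~T(0)); }
};

// True when x == reduce(x, id) and x == reduce(id, x).
template <class T, class Reduce>
bool is_two_sided_identity(Reduce reduce, T id, T x) {
  return reduce(x, id) == x && reduce(id, x) == x;
}

// Checks every operator's identity against one sample value.
inline bool check_int_identities(int x) {
  using I = identity_sketch<int>;
  return is_two_sided_identity([](int a, int b) { return a + b; }, I::sum(), x) &&
         is_two_sided_identity([](int a, int b) { return a * b; }, I::prod(), x) &&
         is_two_sided_identity([](int a, int b) { return std::max(a, b); }, I::max(), x) &&
         is_two_sided_identity([](int a, int b) { return std::min(a, b); }, I::min(), x) &&
         is_two_sided_identity([](int a, int b) { return a | b; }, I::bor(), x) &&
         is_two_sided_identity([](int a, int b) { return a & b; }, I::band(), x);
}
```

The same helper shows why the bitwise-and identity has to be all bits set: using 1 as the identity fails for x = 6, since 6 & 1 is 0.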
I currently have custom types for reductions; they no longer work. I need to sum up a vector and also take a weighted sum of that vector, which can be done with a single reduction but with 2 doubles returned. I had worked with @crtrott to create the type and they previously worked. Is it now possible to do that with something a If not, I tried implementing the struct above:
namespace Kokkos {
template<class T>
struct reduction_identity;
template<>
struct reduction_identity<sum_2_numbers> {
KOKKOS_FORCEINLINE_FUNCTION constexpr static sum_2_numbers sum()
{
return static_cast<sum_2_numbers>(sum_2_numbers(0.,0.));
}
KOKKOS_FORCEINLINE_FUNCTION constexpr static sum_2_numbers prod()
{
sum_2_numbers r();
r.sum_one = 1.0;
r.sum_two = 1.0;
return r;
}
KOKKOS_FORCEINLINE_FUNCTION constexpr static sum_2_numbers max()
{
sum_2_numbers r();
r.sum_one = DBL_MIN;
r.sum_two = DBL_MIN;
return r;
}
KOKKOS_FORCEINLINE_FUNCTION constexpr static sum_2_numbers min()
{
sum_2_numbers r();
r.sum_one = DBL_MAX;
r.sum_two = DBL_MAX;
return r;
}
};
}
However, I don't know much about I only use the sum operator, so maybe I only need to include the specialization for sum. Even so I get that error is the sum operator. I see 2 potential solutions:
As always, help is greatly appreciated @crtrott @hcedwar and I would be happy to track this in a separate issue if that would be helpful.
Fix in place, add
Current fix:
struct sum_2_numbers {
double sum_one;
double sum_two;
KOKKOS_INLINE_FUNCTION
constexpr sum_2_numbers()
: sum_one(0), sum_two(0) { }
...
};
I still think the ability to use something like
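For reference, a minimal standalone sketch of how a two-value sum type with constexpr constructors could look. It is plain C++ without Kokkos: `identity_sketch` is a hypothetical stand-in for a `reduction_identity` specialization, `reduce_both` is a hypothetical serial stand-in for the parallel reduction, and the two-argument constructor and `operator+=` are assumptions about the elided part of the struct. In Kokkos code the members would additionally carry `KOKKOS_INLINE_FUNCTION`. Note that in the earlier attempt `sum_2_numbers r();` declares a function rather than an object, and `DBL_MIN` is the smallest positive double, not the most negative one.

```cpp
#include <cstddef>
#include <vector>

struct sum_2_numbers {
  double sum_one;
  double sum_two;

  // constexpr constructors let the identity be built in a constant expression.
  constexpr sum_2_numbers() : sum_one(0), sum_two(0) {}
  constexpr sum_2_numbers(double a, double b) : sum_one(a), sum_two(b) {}

  // Component-wise join of two partial results.
  sum_2_numbers& operator+=(const sum_2_numbers& rhs) {
    sum_one += rhs.sum_one;
    sum_two += rhs.sum_two;
    return *this;
  }
};

// Hypothetical stand-in for a reduction_identity specialization; only the
// sum identity is provided because only the sum operator is used.
template <class T>
struct identity_sketch;

template <>
struct identity_sketch<sum_2_numbers> {
  // A single return statement keeps this valid as constexpr even in C++11.
  static constexpr sum_2_numbers sum() { return sum_2_numbers(); }
};

static_assert(identity_sketch<sum_2_numbers>::sum().sum_one == 0.0,
              "sum identity must be zero");

// Serial stand-in for the reduction: plain sum and weighted sum in one pass.
// Precondition: w.size() >= v.size().
inline sum_2_numbers reduce_both(const std::vector<double>& v,
                                 const std::vector<double>& w) {
  sum_2_numbers total = identity_sketch<sum_2_numbers>::sum();
  for (std::size_t i = 0; i < v.size(); ++i) {
    total += sum_2_numbers(v[i], w[i] * v[i]);
  }
  return total;
}
```

For example, `reduce_both({1.0, 2.0, 3.0}, {0.5, 1.0, 2.0})` accumulates the plain sum in `sum_one` and the weighted sum in `sum_two` in a single pass.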
We have currently a number of different ways of doing custom reductions which need some unifying.
The four main things are: