Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consistency between Range/MDRange/TeamPolicy reduction interface and nested reduction interface #6317

Open
masterleinad opened this issue Jul 27, 2023 · 2 comments · May be fixed by #6921
Open
Assignees
Labels
Enhancement Improve existing capability; will potentially require voting Refactor Tidying up: make code code, cleaner, simpler, understandable and robust

Comments

@masterleinad
Copy link
Contributor

masterleinad commented Jul 27, 2023

As noted in https://kokkosteam.slack.com/archives/C5BGU5NDQ/p1690480258950949?thread_ts=1690468200.131139&cid=C5BGU5NDQ, we currently don't support functors as reducers for nested reductions but for Range/MDRange/TeamPolicy reductiions. We should consider unifying the interfaces so that an example like

#include <Kokkos_Core.hpp>

struct MyValueType {
  int value;
};

struct MyFunctor {
  KOKKOS_FUNCTION void join(MyValueType& left, const MyValueType& right) const {
    left.value += right.value;
  }

  KOKKOS_FUNCTION void init(MyValueType& arg) const { arg.value = 0; }

  KOKKOS_FUNCTION void final(MyValueType& arg) const {}

  KOKKOS_FUNCTION void operator()(const int i, MyValueType& tmp) const {
    tmp.value += i;
  }
};

template <class ExecSpace>
struct Hierarchical_Red_A {
  void run(const int pN, const int sX) {
    using team_policy = Kokkos::TeamPolicy<ExecSpace>;
    using member_type = typename Kokkos::TeamPolicy<ExecSpace>::member_type;

    Kokkos::parallel_for(
        "Team", team_policy(pN, Kokkos::AUTO),
        KOKKOS_LAMBDA(const member_type& team) {
          const int n = team.league_rank();
          MyValueType out{};

          Kokkos::parallel_reduce(Kokkos::TeamThreadRange(team, sX),
                                  MyFunctor{}, out);
        });
  }
};

would work. Changes for the host backends could look like

diff --git a/core/src/impl/Kokkos_HostThreadTeam.hpp b/core/src/impl/Kokkos_HostThreadTeam.hpp
index 94db1f49b..6a2575f3b 100644
--- a/core/src/impl/Kokkos_HostThreadTeam.hpp
+++ b/core/src/impl/Kokkos_HostThreadTeam.hpp
@@ -543,19 +543,21 @@ class HostThreadTeamMember {
     team_reduce(reducer, reducer.reference());
   }
 
-  template <typename ReducerType>
-  KOKKOS_INLINE_FUNCTION std::enable_if_t<is_reducer<ReducerType>::value>
-  team_reduce(ReducerType const& reducer,
-              typename ReducerType::value_type contribution) const noexcept {
-    KOKKOS_IF_ON_HOST((
-        if (1 < m_data.m_team_size) {
-          using value_type = typename ReducerType::value_type;
+  template <typename ReducerType, typename ValueType>
+  KOKKOS_INLINE_FUNCTION void team_reduce(ReducerType const& reducer,
+                                          ValueType& contribution) const
+      noexcept {
+    KOKKOS_IF_ON_HOST((if (1 < m_data.m_team_size) {
+      typename Impl::FunctorAnalysis<
+          Impl::FunctorPatternInterface::REDUCE,
+          TeamPolicy<Kokkos::DefaultHostExecutionSpace>, ReducerType,
+          ValueType>::Reducer wrapped_reducer(reducer);
 
       if (0 != m_data.m_team_rank) {
         // Non-root copies to their local buffer:
         /*reducer.copy( (value_type*) m_data.team_reduce_local()
                     , reducer.data() );*/
-            *((value_type*)m_data.team_reduce_local()) = contribution;
+        *((ValueType*)m_data.team_reduce_local()) = contribution;
       }
 
       // Root does not overwrite shared memory until all threads arrive
@@ -568,25 +570,24 @@ class HostThreadTeamMember {
         //
         // This thread sums contributed values
         for (int i = 1; i < m_data.m_team_size; ++i) {
-              value_type* const src =
-                  (value_type*)m_data.team_member(i)->team_reduce_local();
+          ValueType* const src =
+              (ValueType*)m_data.team_member(i)->team_reduce_local();
 
-              reducer.join(contribution, *src);
+          wrapped_reducer.join(&contribution, src);
         }
 
         // Copy result to root member's buffer:
         // reducer.copy( (value_type*) m_data.team_reduce() , reducer.data()
         // );
-            *((value_type*)m_data.team_reduce()) = contribution;
-            reducer.reference()                  = contribution;
+        *((ValueType*)m_data.team_reduce()) = contribution;
         m_data.team_rendezvous_release();
         // This thread released all other threads from 'team_rendezvous'
         // with a return value of 'false'
       } else {
         // Copy from root member's buffer:
-            reducer.reference() = *((value_type*)m_data.team_reduce());
+        contribution = *((ValueType*)m_data.team_reduce());
       }
-        } else { reducer.reference() = contribution; }))
+    }))
 
     KOKKOS_IF_ON_DEVICE(((void)reducer; (void)contribution;
                          Kokkos::abort("HostThreadTeamMember team_reduce\n");))
@@ -787,17 +788,16 @@ KOKKOS_INLINE_FUNCTION
     parallel_reduce(Impl::TeamThreadRangeBoundariesStruct<iType, Member> const&
                         loop_boundaries,
                     Closure const& closure, ValueType& result) {
-  ValueType val;
+  ValueType val{};
   Sum<ValueType> reducer(val);
-  reducer.init(val);
 
   for (iType i = loop_boundaries.start; i < loop_boundaries.end;
        i += loop_boundaries.increment) {
-    closure(i, reducer.reference());
+    closure(i, val);
   }
 
-  loop_boundaries.thread.team_reduce(reducer);
-  result = reducer.reference();
+  loop_boundaries.thread.team_reduce(closure, val);
+  result = val;
 }
 
 /*template< typename iType, class Space
@ajpowelsnl
Copy link
Contributor

@masterleinad - a discussion topic for the Wed. meeting?

@ajpowelsnl ajpowelsnl added Refactor Tidying up: make code code, cleaner, simpler, understandable and robust Enhancement Improve existing capability; will potentially require voting labels Jul 31, 2023
@masterleinad
Copy link
Contributor Author

@masterleinad - a discussion topic for the Wed. meeting?

The only question to answer is if we actually care about this and eventually assign somebody based on priority.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement Improve existing capability; will potentially require voting Refactor Tidying up: make code code, cleaner, simpler, understandable and robust
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants