Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TeamPolicy with reducers with valuetypes without += broken on CUDA #2410

Closed
crtrott opened this issue Oct 1, 2019 · 5 comments
Closed

TeamPolicy with reducers with valuetypes without += broken on CUDA #2410

crtrott opened this issue Oct 1, 2019 · 5 comments
Assignees
Labels
Bug Broken / incorrect code; it could be Kokkos' responsibility, or others’ (e.g., Trilinos)

Comments

@crtrott
Copy link
Member

crtrott commented Oct 1, 2019

Reported here:
trilinos/Trilinos#6000

I root caused the issue to our TeamPolicy::team_size_max function not taking reducers in addition to the functor and the parallel reduce tag. Hence during the "figure out how many registers this kernel will use" step, it tries to instantiate the kernel as if it were a sum reduction.
Technically that makes that wrong for any reducer other than sum. But for every reducer which has a native value type (like double) it happens to work, though the register count determination might be slightly off - if so it would result in a runtime dispatch error.

Here is a reproducer:

#include<Kokkos_Core.hpp>
#include<cmath>

int main(int argc, char* argv[]) {
  Kokkos::initialize(argc,argv);
  {
    using ExecSpace = Kokkos::DefaultExecutionSpace;
    using ReducerType = Kokkos::MinMax<double, ExecSpace>;
    using ReducerValueType = typename ReducerType::value_type;
    using DynamicScheduleType = Kokkos::Schedule<Kokkos::Dynamic>;
    using TeamPolicyType = Kokkos::TeamPolicy<ExecSpace, DynamicScheduleType>;
    using TeamHandleType = typename TeamPolicyType::member_type;

    static constexpr int num_teams = 1000;
    static constexpr int num_threads = 1000;
    ReducerValueType val;
    ReducerType reducer(val);
    Kokkos::parallel_reduce(
      TeamPolicyType(num_teams, Kokkos::AUTO),
      KOKKOS_LAMBDA(const TeamHandleType& team, ReducerValueType& teamVal) {
    }, reducer);
  }
  Kokkos::finalize();
}

To fix this: the (non-deprecated) max and recommended team size functions need an option to take in a reducer too. There are two options:

  • add an overload team_size_max(FunctorType,ParallelReduceTag,Reducer)
  • make ParallelReduceTag templated on ReducerType

I am thinking option 1 is better.

What needs to happen:

  • add a test which exposes this problem: i.e. test ALL reducers with TeamPolicy too
  • add that overload
  • use it inside the ParallelReduce class to determine max and recommended team size where appropriate
@crtrott crtrott added Bug Broken / incorrect code; it could be Kokkos' responsibility, or others’ (e.g., Trilinos) Blocks Promotion Overview issue for release-blocking bugs labels Oct 1, 2019
@crtrott crtrott added this to the 3.0 Release milestone Oct 1, 2019
@crtrott crtrott added this to To do in Milestone: Release 3.0 via automation Oct 1, 2019
@DavidPoliakoff DavidPoliakoff self-assigned this Oct 1, 2019
@crtrott
Copy link
Member Author

crtrott commented Oct 23, 2019

I think there is a band aid solution where we just hand-code something like 128 instead of calling team_size_max.

crtrott added a commit to crtrott/kokkos that referenced this issue Oct 23, 2019
Tests are missing!
@crtrott
Copy link
Member Author

crtrott commented Oct 23, 2019

OK I put in a PR with a proper fix for CUDA (but the new function overloads were not implemented for the other backends yet - but they can simply drop the reducer and call the overloads without reducers).

@srajama1
Copy link

srajama1 commented Nov 1, 2019

@crtrott @DavidPoliakoff This needs to be patched to Trilinos.

@masterleinad
Copy link
Contributor

@srajama1 See #2499 that pull request will be merged eventually.

@srajama1
Copy link

srajama1 commented Nov 1, 2019

@masterleinad : Thanks !

dalg24 added a commit that referenced this issue Nov 12, 2019
Fix issue #2410 max_team_size not compiling for reducers with scalar types without +=
@crtrott crtrott added bug - fix pushed to develop branch and removed Blocks Promotion Overview issue for release-blocking bugs labels Nov 13, 2019
@crtrott crtrott moved this from In progress to Done in Milestone: Release 3.0 Nov 13, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Broken / incorrect code; it could be Kokkos' responsibility, or others’ (e.g., Trilinos)
Projects
No open projects
Development

No branches or pull requests

5 participants