Fix requests for large team scratch sizes #4728

Merged
merged 10 commits into from
May 16, 2022

masterleinad
Contributor

Fixes #4715 by converting a bunch of places to take size_t as the type for allocation sizes.
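To illustrate why the type of an allocation size matters, here is a minimal sketch in plain C++. It assumes the sizes previously passed through a 32-bit integer somewhere along the way, which this thread does not state explicitly. The helpers `fits_in_int32`, `to_int32_checked`, and `narrowed_to_32_bits` are hypothetical names for illustration and are not part of Kokkos.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not a Kokkos function): can `bytes` be held by a
// 32-bit signed integer without loss?
constexpr bool fits_in_int32(std::uint64_t bytes) {
  return bytes <=
         static_cast<std::uint64_t>(std::numeric_limits<std::int32_t>::max());
}

// Hypothetical helper: narrow only when the conversion is lossless.
// The assert fires in debug builds if the request is too large.
inline std::int32_t to_int32_checked(std::uint64_t bytes) {
  assert(fits_in_int32(bytes));
  return static_cast<std::int32_t>(bytes);
}

// Hypothetical helper: the value that survives when a 64-bit byte count is
// narrowed to 32 unsigned bits (a well-defined conversion modulo 2^32).
constexpr std::uint32_t narrowed_to_32_bits(std::uint64_t bytes) {
  return static_cast<std::uint32_t>(bytes);
}

constexpr std::uint64_t GiB = std::uint64_t{1} << 30;

// A 3 GiB request does not fit in a 32-bit signed integer.
static_assert(!fits_in_int32(3 * GiB), "3 GiB exceeds INT32_MAX");
// A 5 GiB request silently becomes 1 GiB once only 32 bits are kept.
static_assert(narrowed_to_32_bits(5 * GiB) == GiB, "5 GiB mod 2^32 is 1 GiB");
```

Carrying the size as a 64-bit `size_t` end to end avoids both failure modes on typical 64-bit platforms.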

@kyungjoo-kim
Contributor

I think this fix will work fine. The unit test, however, might not pass on every testing platform: it requests a very large amount of shared memory, which can be further amplified by n_teams, so a platform without enough memory could fail. Since the test only checks that a large size stays large inside Kokkos (rather than turning into a nonsensical value through overflow), it can be limited to Kokkos::HostSpace::execution_space with n_team=1.
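A rough sketch of the amplification concern, in plain C++ rather than the actual Kokkos test: the total memory is modeled as the per-team request times the number of teams, with an explicit overflow check. The function `total_scratch_bytes` and the demo function are hypothetical, and the simple multiplication is an assumption about how the total scales, not a description of the Kokkos allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical helper (not Kokkos API): computes per_team_bytes * n_teams.
// Returns false, leaving `total` untouched, if the product would not fit
// in size_t.
inline bool total_scratch_bytes(std::size_t per_team_bytes,
                                std::size_t n_teams, std::size_t& total) {
  if (per_team_bytes != 0 &&
      n_teams > std::numeric_limits<std::size_t>::max() / per_team_bytes) {
    return false;  // product would wrap around
  }
  total = per_team_bytes * n_teams;
  return true;
}

// With a single team the total equals the request, so a test can check that
// a large value survives unchanged without multiplying the memory demand.
inline void single_team_keeps_request() {
  const std::size_t big = std::numeric_limits<std::size_t>::max() / 2;
  std::size_t total = 0;
  assert(total_scratch_bytes(big, 1, total) && total == big);
  // Four teams with the same request no longer fit in size_t.
  assert(!total_scratch_bytes(big, 4, total));
}
```

Keeping the team count at 1 isolates the property under test (the size is carried through without overflow) from the question of whether the platform has enough memory for many teams.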

core/unit_test/TestTeam.hpp (outdated, review thread resolved)
@masterleinad
Contributor Author

Retest this please.

@masterleinad
Contributor Author

Relies on #4798 for OpenMPTarget.

@masterleinad masterleinad marked this pull request as ready for review February 17, 2022 17:39
core/src/Cuda/Kokkos_Cuda_Team.hpp (outdated, review thread resolved)
core/src/HIP/Kokkos_HIP_Team.hpp (outdated, review thread resolved)
core/src/Kokkos_ExecPolicy.hpp (review thread resolved)
core/src/Cuda/Kokkos_Cuda_Parallel.hpp (outdated, review thread resolved)
core/src/impl/Kokkos_HostThreadTeam.hpp (review thread resolved)
core/unit_test/TestTeamBasic.hpp (outdated, review thread resolved)
core/unit_test/TestTeamBasic.hpp (outdated, review thread resolved)
@masterleinad masterleinad linked an issue Feb 24, 2022 that may be closed by this pull request
@masterleinad masterleinad force-pushed the fix_large_scratch_size branch 5 times, most recently from e7d072a to ac1a116 Compare March 14, 2022 15:07
@@ -895,7 +895,7 @@ class ParallelFor<FunctorType, Kokkos::TeamPolicy<Properties...>,

     const size_t pool_reduce_size = 0;  // Never shrinks
     const size_t team_reduce_size = TEAM_REDUCE_SIZE * m_policy.team_size();
-    const size_t team_shared_size = m_shmem_size + m_policy.scratch_size(1);
+    const size_t team_shared_size = m_shmem_size;
Member

Why is the scratch_size(1) removed here?

Contributor Author

m_shmem_size already includes m_policy.scratch_size(1), as the constructor shows:

 inline ParallelFor(const FunctorType& arg_functor, const Policy& arg_policy)
      : m_instance(t_openmp_instance),
        m_functor(arg_functor),
        m_policy(arg_policy),
        m_shmem_size(arg_policy.scratch_size(0) + arg_policy.scratch_size(1) +
                     FunctorTeamShmemSize<FunctorType>::value(
                         arg_functor, arg_policy.team_size())) {}
};
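The arithmetic behind this reply can be sketched with a small stand-in in plain C++. `MockPolicy` and the free functions below are hypothetical and only mimic the sums quoted in this thread; they are not the Kokkos classes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a team policy: per-level scratch requests in bytes.
struct MockPolicy {
  std::size_t level0_bytes;
  std::size_t level1_bytes;
  std::size_t scratch_size(int level) const {
    return level == 0 ? level0_bytes : level1_bytes;
  }
};

// Mirrors the constructor's initializer: both scratch levels plus the
// functor's own team shared memory requirement.
inline std::size_t shmem_size(const MockPolicy& p, std::size_t functor_bytes) {
  return p.scratch_size(0) + p.scratch_size(1) + functor_bytes;
}

// Expression before the change: level 1 is added a second time.
inline std::size_t team_shared_size_before(const MockPolicy& p,
                                           std::size_t functor_bytes) {
  return shmem_size(p, functor_bytes) + p.scratch_size(1);
}

// Expression after the change: level 1 is counted once.
inline std::size_t team_shared_size_after(const MockPolicy& p,
                                          std::size_t functor_bytes) {
  return shmem_size(p, functor_bytes);
}

inline void demo_double_count() {
  const MockPolicy p{100, 4000};
  assert(team_shared_size_after(p, 50) == 4150);
  assert(team_shared_size_before(p, 50) == 8150);  // 4000 bytes counted twice
}
```

In this model the two expressions differ by exactly `scratch_size(1)`. Whether the real OpenMP backend relied on that extra space is a separate question that the sketch does not address.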

@crtrott crtrott merged commit 2f65ac1 into kokkos:develop May 16, 2022
@crtrott
Member

crtrott commented May 17, 2022

This is probably the culprit for #5025. My guess is that some bug in the OpenMP scratch memory management does require the larger scratch size after all.

Development

Successfully merging this pull request may close these issues.

shared memory allocation integer overflow
5 participants