Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OpenMPTarget hierarchical #3411

Merged
merged 9 commits into from
Oct 14, 2020
Merged

Conversation

rgayatri23
Copy link
Contributor

A newer PR for Hierarchial parallelism in OpenMPTarget backend.
Passes the first 11 incremental tests.
Kokkos league level maps to teams distribute,
Thread level maps to parallel for and
Vector level maps to simd

Reduction only works for "+" operator.

@dalg24
Copy link
Member

dalg24 commented Sep 23, 2020

Add to whitelist

test.run(8, 16);
test.run(11, 13);

// OpenMPTarget backend only accepts >= 32 threads per team
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it need to be a multiple of 32 though?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No it does not need to be a multiple of 32.
Actually I just realized that the clang compiler actually does (32+num_threads-requested-per-team) .
So essentially even if I assign team_size = 32, the actual block generated on the NVIDIA GPU is for 64 threads per team.

@@ -50,6 +50,442 @@

#include <Kokkos_Atomic.hpp>

//----------------------------------------------------------------------------
//----------------------------------------------------------------------------
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please say so in the description of the PR or the commit message when you are moving code with no edit.
I had to diff it with the code removed from core/src/OpenMPTarget/Kokkos_OpenMPTarget_Parallel.hpp to find it out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes sorry about this, even @crtrott pointed this out. I will be careful next time.

Comment on lines 825 to 831
// Minimum team size should be 32 for OpenMPTarget backend.
if (team_size_request < 32) {
printf(
"OpenMPTarget backend requires a minimum of 32 threads per team.\n");
exit(EXIT_FAILURE);
} else
m_team_size = team_size_request;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the bit that matters.

Consider using one of the more standard way to raise error in Kokkos, Impl::throw_runtime_exception or abort

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be abort.

crtrott and others added 6 commits September 29, 2020 21:34
…tests to pass on Gen9. Removed TeamReductionScan test from unit tests.
Merge branch 'OpenMPTarget_Hierarchial' of github.com:rgayatri23/kokkos into OpenMPTarget_Hierarchial
…eam size of 32 for only the OpenMPTarget backend. Use long long int instead of int64_t in incremental test11* to be in sync with the SYCL backend.
…sed in case the team size is less than 32.
@dalg24 dalg24 merged commit c4f78ff into kokkos:develop Oct 14, 2020
@rgayatri23 rgayatri23 deleted the OpenMPTarget_Hierarchial branch March 24, 2023 22:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants