
OpenMPTarget: Changes to the hierarchical parallelism #3808

Merged: 5 commits merged into kokkos:develop on Mar 3, 2021

Conversation

@rgayatri23 (Contributor) commented Feb 24, 2021

Updates:

  1. Use the `num_teams` clause to restrict the number of OpenMPTarget teams generated to the maximum possible number of in-flight teams, avoiding the use of `atomic_compare_exchange`.
  2. Manually workshare the "league" loop in hierarchical parallelism to avoid code between `teams distribute` and `parallel`; this gives a 2x speedup on a performance test case. A sketch of the pattern follows this list.
     Idea provided by Christopher Daley from NERSC. @cdaley
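
A minimal standalone sketch of the pattern described in update 2 (all names here are hypothetical, not the actual Kokkos source): the league loop becomes a grid-stride loop placed inside the `parallel` region, so no serial code sits between `teams` and `parallel`, and `num_teams` caps the launched teams at the number that can be concurrently in flight.

// Sketch only: grid-stride league loop inside the parallel region.
#include <omp.h>
#include <algorithm>

void hierarchical_for(int league_size, int team_size, int max_active_teams) {
  // Never request more teams than can run concurrently.
  const int nteams = std::min(league_size, max_active_teams);
#pragma omp target teams num_teams(nteams) thread_limit(team_size)
#pragma omp parallel
  {
    const int blockIdx = omp_get_team_num();
    const int gridDim  = omp_get_num_teams();
    // Each team visits league ids blockIdx, blockIdx + gridDim, ...
    for (int league_id = blockIdx; league_id < league_size;
         league_id += gridDim) {
      // ... construct the team handle for (league_id, league_size) and
      // invoke the user functor; team-level work inside the functor is
      // shared among the team's threads ...
    }
  }
}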

@Rombur (Member) left a comment:
The new code is a lot cleaner but I am not sure it works for every corner case.


// Iterate through the number of teams until league_size and assign the
// league_id accordingly
for (int league_id = blockIdx, track = 0; league_id < league_size;
     league_id += gridDim, ++track) {
@Rombur (Member) commented:

Does that loop always work? I feel that this works only if league_size < OpenMPTargetExec::MAX_ACTIVE_TEAMS and num_teams = nteams, which the standard does not guarantee. The comment indicates that you want to iterate through the number of teams, but league_size is an upper bound on that number, right?

@rgayatri23 (Contributor, Author) replied:

@Rombur - The standard guarantees that the number of teams specified in the `num_teams` clause is always less than or equal to the maximum number of concurrently running teams. So I think the loop works by looping around the generated gridDim. I tried it with a test case where it successfully works when league_size > max_active_teams. I understand that it might not hit the corner case that could break the logic.

const int gridDim = omp_get_num_teams();

for (int league_id = blockIdx, track = 0; league_id < league_size;
league_id += gridDim, ++track) {
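
To support that claim, a quick host-side sketch (hypothetical values, with gridDim smaller than league_size) showing that the grid-stride loop covers every league id exactly once across all teams:

// Sketch only: verify grid-stride coverage when league_size > gridDim.
#include <cassert>
#include <vector>

int main() {
  const int league_size = 10, gridDim = 4;  // hypothetical values
  std::vector<int> visits(league_size, 0);
  for (int blockIdx = 0; blockIdx < gridDim; ++blockIdx)
    for (int league_id = blockIdx; league_id < league_size;
         league_id += gridDim)
      ++visits[league_id];
  for (int v : visits) assert(v == 1);  // no id skipped or repeated
  return 0;
}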
@Rombur (Member) commented:

Same comment as above.

m_functor(team);
}
} else
printf("`num_teams` clause was not respected. \n");
@Rombur (Member) commented:

Abort?
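
For reference, a sketch of what the suggested hard failure might look like. `Kokkos::abort` is the usual Kokkos entry point for this; whether it is usable at this exact point inside the target region is an assumption of this sketch.

// Sketch only: replace the diagnostic printf with a hard failure.
// Assumes Kokkos::abort can be called here, which is not verified.
#include <Kokkos_Core.hpp>

inline void fail_if_num_teams_ignored(int nteams, int max_active_teams) {
  if (nteams > max_active_teams)
    Kokkos::abort("`num_teams` clause was not respected.\n");
}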

@crtrott (Member) left a comment:

Looks good

@Rombur (Member) left a comment:

This looks good. I would remove the old code right now unless you have a reason to keep it for now. We can always use git to look at the old implementation.


int* lock_array = OpenMPTargetExec::get_lock_array(max_active_teams);

// Saving the older implementation that uses `atomic_compare_exchange` to
// calculate the shared memory block index and `distribute` clause to distribute
// teams.
@Rombur (Member) commented:

I am not a fan of keeping dead code around. The code is already saved in git.

@rgayatri23 (Contributor, Author) replied:

@Rombur The only reason I kept the old code in there is a slight apprehension that I might not have covered all corner cases; it would be easier to switch back to the older code this way. But you are right, we should get rid of it in the future, once all compilers supporting OpenMP offload are mature enough to reach this point.

@dalg24 previously requested changes Mar 3, 2021
// Saving the older implementation that uses `atomic_compare_exchange` to
// calculate the shared memory block index and `distribute` clause to distribute
// teams.
#if 0
@dalg24 (Member) commented Mar 3, 2021:

Please change to

#define KOKKOS_IMPL_USE_NEW_CODE_PATH // choose a more sensible name
#ifdef KOKKOS_IMPL_USE_NEW_CODE_PATH
<new_code>
#else
<old_code>
#endif

...
#ifdef KOKKOS_IMPL_USE_NEW_CODE_PATH
<new_code>
#else
<old_code>
#endif

#undef KOKKOS_IMPL_USE_NEW_CODE_PATH
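
For reference, the PR ultimately adopted this structure under the name `KOKKOS_IMPL_LOCK_FREE_HIERARCHICAL` (see the commit list below). Roughly, with placeholder bodies rather than the actual Kokkos source:

// Sketch only: the suggested guard with the macro name from the final
// commit; the two code-path bodies are placeholders.
#define KOKKOS_IMPL_LOCK_FREE_HIERARCHICAL

#ifdef KOKKOS_IMPL_LOCK_FREE_HIERARCHICAL
// new path: grid-stride league loop, no atomic_compare_exchange
#else
// old path: `distribute` clause plus atomic_compare_exchange for the
// shared-memory block index
#endif

#undef KOKKOS_IMPL_LOCK_FREE_HIERARCHICAL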

Commit added: OpenMPTarget: Adding `KOKKOS_IMPL_LOCK_FREE_HIERARCHICAL` macro to switch between hierarchical implementations.

@masterleinad (Contributor) commented:
Retest this please.

@crtrott dismissed dalg24’s stale review March 3, 2021 23:47

Was addressed

@crtrott merged commit e17aadb into kokkos:develop Mar 3, 2021
masterleinad pushed a commit to masterleinad/kokkos that referenced this pull request Mar 11, 2021
* OpenMPTarget: Changes to the hierarchical ParallelFor and ParallelReduce implementations.

* OpenMPTarget: Edited comments for better understanding.

* OpenMPTarget: Adding a check for the `num_teams` clause.

* OpenMPTarget: Added back the older hierarchical parallelism in comments in case for revert.

* OpenMPTarget: Adding `KOKKOS_IMPL_LOCK_FREE_HIERARCHICAL` macro to switch between hierarchical implementations.

Co-authored-by: rgayatri <rgayatri@lbl.gov>
@rgayatri23 deleted the OpenMPTarget_optimizations branch March 24, 2023 22:03