Adding ExecutionSpace partitioning function #4096

crtrott · 2021-06-15T21:42:21Z

This allows the creation of multiple instances, but only CUDA provides a real implementation; every other backend returns the same instance multiple times for now.

The design is a customization point, which should hopefully allow for ADL.
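A minimal sketch of how an ADL-friendly customization point could look. It uses stand-in types rather than real Kokkos classes: `generic::partition_space`, `backend::FakeCuda`, `FakeSerial`, and `split_in_two` are hypothetical names chosen for illustration, and the integer `stream` field only imitates a stream handle.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace generic {
// Default behavior: return n copies of the instance that was passed in.
template <class ExecSpace>
std::vector<ExecSpace> partition_space(const ExecSpace& space, std::size_t n) {
  return std::vector<ExecSpace>(n, space);
}
}  // namespace generic

namespace backend {
// Hypothetical backend type; `stream` stands in for a real stream handle.
struct FakeCuda {
  int stream = 0;
};

// Backend overload declared next to its type so that ADL can find it.
inline std::vector<FakeCuda> partition_space(const FakeCuda& space,
                                             std::size_t n) {
  std::vector<FakeCuda> out(n);
  for (std::size_t i = 0; i < n; ++i)
    out[i].stream = space.stream + 1 + static_cast<int>(i);
  return out;
}
}  // namespace backend

// Hypothetical backend without its own overload.
struct FakeSerial {
  int id = 0;
};

template <class ExecSpace>
std::vector<ExecSpace> split_in_two(const ExecSpace& space) {
  using generic::partition_space;  // make the default visible as a fallback
  return partition_space(space, std::size_t{2});  // unqualified: ADL applies
}
```

With this layout, `split_in_two` picks the `backend` overload for `FakeCuda` and falls back to the generic copy-returning version for `FakeSerial`, without the caller naming either namespace.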

DavidPoliakoff · 2021-06-16T16:51:17Z

core/src/Kokkos_Core.hpp

+//   Customization point for backends
+//   Default behavior is to return the passed in instance
+template <class ExecSpace, class... Args>
+std::array<ExecSpace, sizeof...(Args)> partition_space(ExecSpace space,


Are we intentionally requiring people to partition into a compile-time sized set of args? Should we have a variant where this takes a vector of weights and returns a vector of instances?

Hm, interesting question. We could have both; that said, I didn't see a real need for that much runtime choice.

The partition count in EMPIRE will be runtime-determined

That said, it could just call the current unweighted CUDA implementation that just makes separate streams one by one, but that's probably not the usage you were going for.

done, using std::vector now

core/src/Kokkos_Cuda.hpp

core/src/Cuda/Kokkos_Cuda_Instance.cpp

PhilMiller · 2021-06-16T17:59:33Z

core/unit_test/TestExecSpacePartitioning.hpp

+void check_equalness(ExecSpace, ExecSpace) {}
+
+#ifdef KOKKOS_ENABLE_CUDA
+void check_equalness(Kokkos::Cuda exec1, Kokkos::Cuda exec2) {


Perhaps check_distinctive?

PhilMiller · 2021-06-16T18:04:53Z

core/unit_test/TestExecSpacePartitioning.hpp

+      sum1);
+  Kokkos::parallel_reduce(
+      Kokkos::RangePolicy<TEST_EXECSPACE>(instances[1], 0, N), SumFunctor(),
+      sum2);


This poses the same testing challenge as I noted in #4059 - confirming that various kernels actually ran on distinct execution spaces, and that they were at least hypothetically concurrent.

the distinctive part is probably good enough for now
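A rough sketch of the distinctness check discussed in this thread, using stand-in types instead of Kokkos execution spaces. `FakeHost`, `FakeDevice`, and `distinct_if_checkable` are hypothetical names, and the helper returns `bool` (unlike the `void` helper in the diff) only so the result is easy to assert on.

```cpp
#include <cassert>

// Hypothetical backend whose partitions are just copies of one instance.
struct FakeHost {};

// Hypothetical backend whose instances carry a stream-like handle.
struct FakeDevice {
  int stream;
};

// Generic fallback: backends that return copies have nothing to compare,
// so the check passes trivially.
template <class ExecSpace>
bool distinct_if_checkable(ExecSpace, ExecSpace) {
  return true;
}

// Overload for the backend that really partitions: two instances count as
// distinct when their handles differ.
inline bool distinct_if_checkable(FakeDevice a, FakeDevice b) {
  return a.stream != b.stream;
}
```

This only confirms that the instances differ. It says nothing about whether kernels launched on them actually ran concurrently, which is the harder testing question raised above.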

crtrott · 2021-06-16T18:20:51Z

Allow runtime size via some vector type.
Return always std::vector.
No normalization requirement.
Support float and int, but make sure it's either, not a mix?

This allows the creation of multiple instances, but only CUDA provides a real implementation; every other backend returns the same instance multiple times for now.

crtrott · 2021-07-01T22:07:21Z

I addressed all the points. For the int/float question, I allow a mix: I am using a fold expression to static_assert that all arguments are arithmetic, and that fold expression is protected by a feature-test macro.
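For readers unfamiliar with the pattern, here is a small standalone sketch of a fold-expression `static_assert` guarded by the `__cpp_fold_expressions` feature-test macro. `count_partitions` is a hypothetical helper, not part of Kokkos; it only counts its arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical helper: accepts any mix of arithmetic weights and reports
// how many partitions were requested.
template <class... Args>
constexpr std::size_t count_partitions(Args...) {
#ifdef __cpp_fold_expressions
  // Checked only when the compiler supports C++17 fold expressions.
  // Mixing int and float is fine; non-arithmetic types are rejected.
  static_assert((... && std::is_arithmetic<Args>::value),
                "partitioning arguments must be integers or floats");
#endif
  return sizeof...(Args);
}
```

A call such as `count_partitions(1, 1.)` compiles and yields 2, while `count_partitions(1, "two")` is rejected at compile time when fold expressions are available. Without them the check is silently skipped, which is the trade-off of guarding it behind the macro.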

crtrott · 2021-07-01T22:37:08Z

@PhilMiller @DavidPoliakoff I think I addressed everything

PhilMiller · 2021-07-02T12:38:23Z

For runtime-determined partition count, I think the partition functions need to take std::vector rather than variadic arguments, too?

DavidPoliakoff · 2021-07-02T13:16:33Z

@PhilMiller yeah, I believe we discussed this in the meeting. Should either be a vector, or a template<template typename> thingy, and I recommend vector

masterleinad · 2021-07-06T18:22:05Z

core/src/Kokkos_Cuda.hpp

+template <class... Args>
+std::vector<Cuda> partition_space(Cuda space, Args...) {
+  std::vector<Cuda> instances(sizeof...(Args));
+#ifdef __cpp_fold_expressions
+  static_assert(
+      (... && std::is_arithmetic_v<Args>),
+      "Kokkos Error: partitioning arguments must be integers or floats");
+#endif
+  for (int s = 0; s < int(sizeof...(Args)); s++) {
+    cudaStream_t stream;
+    CUDA_SAFE_CALL(cudaStreamCreate(&stream));
+    instances[s] = Cuda(stream, true);
+  }
+  return instances;
+}


/var/jenkins/workspace/Kokkos/core/src/Kokkos_Cuda.hpp:260:40: error: unused parameter 'space' [clang-diagnostic-unused-parameter]
std::vector<Cuda> partition_space(Cuda space, Args...) {
                                       ^
/var/jenkins/workspace/Kokkos/core/src/Kokkos_Cuda.hpp:269:5: error: use of undeclared identifier 'CUDA_SAFE_CALL' [clang-diagnostic-error]
    CUDA_SAFE_CALL(cudaStreamCreate(&stream));
    ^

Also moved the CUDA overloads to the Instances header file.

PhilMiller · 2021-07-16T21:21:46Z

I think the variadic version only needs to be implemented once, in the generic code, since it can then call the version taking a vector<T> weights and let the overload resolution happen there.
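A sketch of that suggestion under stated assumptions: one generic variadic overload packs its arguments into a `std::vector` and forwards to the vector overloads, where overload resolution selects the backend-specific version. `FakeHost` and `FakeDevice` are hypothetical stand-ins for execution spaces, and converting every weight to `double` is an illustrative choice, not something the PR specifies.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct FakeHost {
  int id = 0;
};
struct FakeDevice {
  int stream = 0;
};

// Generic vector overload: return one copy of `space` per weight.
template <class ExecSpace, class T>
std::vector<ExecSpace> partition_space(const ExecSpace& space,
                                       const std::vector<T>& weights) {
  return std::vector<ExecSpace>(weights.size(), space);
}

// Backend vector overload: give every partition its own stream-like handle.
template <class T>
std::vector<FakeDevice> partition_space(const FakeDevice& space,
                                        const std::vector<T>& weights) {
  std::vector<FakeDevice> out(weights.size());
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i].stream = space.stream + 1 + static_cast<int>(i);
  return out;
}

// The only variadic overload: collect the weights and defer to the vector
// overloads, which are more specialized and therefore win the nested call.
template <class ExecSpace, class... Args>
std::vector<ExecSpace> partition_space(const ExecSpace& space, Args... args) {
  std::vector<double> weights{static_cast<double>(args)...};
  return partition_space(space, weights);
}
```

Here `partition_space(device, 1, 2.5, 3)` reaches the `FakeDevice` vector overload, and `partition_space(host, 1, 1.)` reaches the generic one, so no backend needs its own variadic version.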

PhilMiller

This will suit EMPIRE's needs well enough. The interpretation of the weights will become an interesting question down the line, but it'll be good to get this step integrated.

DavidPoliakoff

I'm very lightly opposed to having a vector and a variadic overload. If anybody else wants to back this up, I'll change it to a request changes. But I think this is a good implementation of what we're aiming at

dalg24 · 2021-07-20T12:06:59Z

core/src/HIP/Kokkos_HIP_Instance.hpp

+}
+
+template <class T>
+std::vector<HIP> partition_space(const HIP &, std::vector<T> &weights) {


Why are weights by non-const reference?
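To illustrate why the reference type matters, here is a small sketch with hypothetical names (`FakeSpace`, `partition_by_const_ref`, `partition_by_mutable_ref`). It assumes the function only reads the weights, as the overloads shown in this PR appear to.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct FakeSpace {
  int id = 0;
};

// Takes the weights by const reference: accepts lvalues, const vectors,
// and temporaries alike.
template <class T>
std::vector<FakeSpace> partition_by_const_ref(const FakeSpace& space,
                                              const std::vector<T>& weights) {
  return std::vector<FakeSpace>(weights.size(), space);
}

// Takes the weights by non-const reference, like the signature in the diff:
// only non-const lvalues can bind to the parameter.
template <class T>
std::vector<FakeSpace> partition_by_mutable_ref(const FakeSpace& space,
                                                std::vector<T>& weights) {
  return std::vector<FakeSpace>(weights.size(), space);
}
```

With these definitions, `partition_by_const_ref(FakeSpace{}, std::vector<int>{1, 1, 2})` compiles, whereas the same call to `partition_by_mutable_ref` does not, because a temporary cannot bind to a non-const lvalue reference. The caller would first have to store the weights in a named, non-const vector.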

dalg24 · 2021-07-20T12:12:00Z

core/unit_test/TestExecSpacePartitioning.hpp

+};
+
+template <class ExecSpace>
+void check_distinctive(ExecSpace, ExecSpace) {}


You mean distinct from one another?

dalg24 · 2021-07-20T12:21:20Z

core/unit_test/TestExecSpacePartitioning.hpp

+
+TEST(TEST_CATEGORY, partitioning_by_args) {
+  auto instances =
+      Kokkos::Experimental::partition_space(TEST_EXECSPACE(), 1, 1.);


Did we capture anywhere the rationale for being able to mix types?

crtrott force-pushed the partition_space branch from 76288b2 to 811212e Compare June 15, 2021 23:06

DavidPoliakoff reviewed Jun 16, 2021

PhilMiller reviewed Jun 16, 2021

core/src/Kokkos_Cuda.hpp (outdated)

PhilMiller reviewed Jun 16, 2021

core/src/Cuda/Kokkos_Cuda_Instance.cpp (outdated)

PhilMiller reviewed Jun 16, 2021

This was referenced Jun 16, 2021

Implementation Strawman: Generic splitting of ExecSpace instances #3901

Closed

Implementation Sketch: Portable Stream creation/destruction #3892

Closed

crtrott added 2 commits July 1, 2021 16:05

Adding ExecutionSpace partitioning function

8e4c0c6

This allows the creation of multiple instances, only CUDA implements an actual implementation, everyone else returns the same thing multiple times for now.

Partitioning: use std::vector, address review, add hip

afa5343

crtrott force-pushed the partition_space branch from 3602985 to afa5343 Compare July 1, 2021 22:06

masterleinad reviewed Jul 6, 2021

crtrott added this to In progress in Kokkos Release 3.5 Jul 14, 2021

Merge branch 'develop' into partition_space

d798b21

crtrott force-pushed the partition_space branch from 1489655 to 77bca56 Compare July 15, 2021 23:26

crtrott moved this from In progress to Awaiting Feedback in Kokkos Release 3.5 Jul 15, 2021

crtrott force-pushed the partition_space branch 3 times, most recently from 62c6a23 to 62ef2f3 Compare July 16, 2021 20:24

Exec Space Instances: added vector based overload

85f8782

Also moved the CUda overloads to the Instances header file.

crtrott force-pushed the partition_space branch from 62ef2f3 to 85f8782 Compare July 16, 2021 20:57

PhilMiller approved these changes Jul 16, 2021

DavidPoliakoff approved these changes Jul 19, 2021

crtrott merged commit ab87061 into kokkos:develop Jul 19, 2021

Kokkos Release 3.5 automation moved this from Awaiting Feedback to Done Jul 19, 2021

dalg24 reviewed Jul 20, 2021

masterleinad mentioned this pull request Feb 4, 2022

Request: Kokkos helper for creating execution-space containing stream? #3895

Closed

masterleinad mentioned this pull request May 31, 2023

Allow passing a temporary std::vector to partition_space #6167

Merged
