Changes for indirect launch of SYCL parallel reduce #3511

nliber · 2020-10-22T00:49:30Z

These are updates for SYCL parallel reduce.

masterleinad · 2020-10-22T13:28:31Z

Which of these commits contain the changes you really want? I am happy to help clean this up.

masterleinad · 2020-10-27T15:05:05Z

You will need to fix the indentation.

masterleinad · 2020-10-27T18:23:14Z

Apparently, the indentation is still not correct. Are you using ./scripts/apply-clang-format?

masterleinad · 2020-10-28T21:52:03Z

After #3480 has been merged, this should be rebased.

masterleinad · 2020-11-17T13:31:17Z

Hmmm... There are a lot of unrelated changes and conflicting files here. Which of the commits do you actually care about? (Maybe just rebase onto upstream/develop.)

masterleinad

I am mostly fine with this pull request, but I am a little concerned about memory being used after the underlying allocation is freed, and about whether we can (or want to) provide some kind of safeguard mechanism.

core/src/SYCL/Kokkos_SYCL_Instance.hpp

core/unit_test/incremental/Test11a_ParallelFor_TeamThreadRange.hpp

core/unit_test/incremental/Test11b_ParallelFor_TeamVectorRange.hpp

core/unit_test/incremental/Test11c_ParallelFor_ThreadVectorRange.hpp

core/src/SYCL/Kokkos_SYCL_Parallel_Reduce.hpp

masterleinad

Overall this looks OK to me after rebasing. I leave it up to you whether to change the return type of memcpy_from to std::unique_ptr<T>.

core/src/SYCL/Kokkos_SYCL_Instance.hpp

core/src/SYCL/Kokkos_SYCL_Parallel_Reduce.hpp
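
For illustration, the std::unique_ptr<T> return type suggested in the review above could look roughly like the sketch below. This is a host-only example and not the code from this pull request: the names copy_out and Result are hypothetical, and std::memcpy stands in for whatever queue-based copy the real memcpy_from performs.

```cpp
#include <cassert>
#include <cstring>
#include <memory>
#include <type_traits>

// Hypothetical stand-in for a reduction result that lives in a raw buffer.
struct Result {
  int count;
  double sum;
};

// Hypothetical helper (not the Kokkos API): copies one T out of raw storage
// and returns an owning pointer, so the caller cannot forget to free the copy.
template <class T>
std::unique_ptr<T> copy_out(const void* src) {
  static_assert(std::is_trivially_copyable_v<T>,
                "a bytewise copy is only valid for trivially copyable types");
  static_assert(std::is_default_constructible_v<T>,
                "the destination object is default-constructed first");
  assert(src != nullptr);
  auto dst = std::make_unique<T>();
  // Assumption: a plain host memcpy replaces the device-to-host copy here.
  std::memcpy(dst.get(), src, sizeof(T));
  return dst;
}
```

A caller would write something like `auto r = copy_out<Result>(buffer);` and read `r->sum`; the copy is released automatically when `r` goes out of scope, independent of the lifetime of the source buffer.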

masterleinad · 2020-12-10T13:58:23Z

This looks good to me but you still need to rebase and fix the formatting.

masterleinad

Still looks OK to me. Just some comments. We also need someone else looking at this.

.gitignore

core/src/SYCL/Kokkos_SYCL_Instance.hpp

core/src/SYCL/Kokkos_SYCL_Parallel_Range.hpp

core/src/SYCL/Kokkos_SYCL_Instance.hpp

crtrott

Looks largely good. I have some questions and one request, assuming it has no drawbacks I have not thought about: allocate the memory via memory spaces so the tools can track it.

core/src/SYCL/Kokkos_SYCL_Instance.hpp

core/src/SYCL/Kokkos_SYCL_Parallel_Range.hpp

core/src/SYCL/Kokkos_SYCL_Instance.hpp

masterleinad · 2021-01-11T20:37:13Z

core/src/SYCL/Kokkos_SYCL_Parallel_Reduce.hpp

-        sycl::malloc(sizeof(*m_result_ptr), q, sycl::usm::alloc::shared));
+    using ReductionResultMem =
+        Experimental::Impl::SYCLInternal::ReductionResultMem;
+    ReductionResultMem& reductionResultMem = instance.m_reductionResultMem;


In hindsight, and with respect to #3671, I am not sure we want to treat the result pointer the same way as the kernels. In particular, I am not sure we really want to store the result in shared rather than device memory, but I am fine with addressing that later.

masterleinad · 2021-01-14T00:04:56Z

Hmmm... Looks like something went wrong here. Is this rebased on top of upstream/develop?

nliber · 2021-01-14T00:09:01Z

> Hmmm... Looks like something went wrong here. Is this rebased on top of upstream/develop?

Yeah, I don't think that worked. Rebasing again now...

masterleinad · 2021-01-14T20:59:33Z

Retest this please.

masterleinad · 2021-01-14T21:16:32Z

core/src/SYCL/Kokkos_SYCL_Parallel_Range.hpp

@@ -190,7 +187,7 @@ class Kokkos::Impl::ParallelFor<FunctorType, Kokkos::MDRangePolicy<Traits...>,
      return {global_sizes, local_sizes};
    }
    if constexpr (Policy::rank == 6) {
-      // id0,id1 encoded within first index; id2,id3 to second index; id4,id5 to
+      // id0,id1 encoded within first index; id2,id3 to second index; id4,id5


Suggested change

-      // id0,id1 encoded within first index; id2,id3 to second index; id4,id5
+      // id0,id1 encoded within first index; id2,id3 to second index; id4,id5 to
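
The comment in the hunk above describes folding the six MDRange indices pairwise into fewer launch indices. A minimal sketch of that idea follows; it is not the Kokkos implementation. The helper names pack2, unpack2, and packed_extents are hypothetical, and both the row-major ordering within each pair and the assumption that the last pair goes to a third index are guesses made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper: folds (a, b), with 0 <= b < extent_b, into one index.
inline std::size_t pack2(std::size_t a, std::size_t b, std::size_t extent_b) {
  assert(b < extent_b);
  return a * extent_b + b;
}

struct IndexPair {
  std::size_t a;
  std::size_t b;
};

// Hypothetical helper: inverse of pack2 for the same extent_b.
inline IndexPair unpack2(std::size_t packed, std::size_t extent_b) {
  assert(extent_b > 0);
  return {packed / extent_b, packed % extent_b};
}

// Six extents collapse to three launch extents, one per index pair
// (assumed pairing: 0-1, 2-3, 4-5).
inline std::array<std::size_t, 3> packed_extents(
    const std::array<std::size_t, 6>& e) {
  return {e[0] * e[1], e[2] * e[3], e[4] * e[5]};
}
```

Inside a kernel, each of the three launch indices would be passed through unpack2 with the second extent of its pair to recover the original two indices.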

core/src/SYCL/Kokkos_SYCL_Instance.hpp

masterleinad · 2021-01-15T19:53:48Z

There is still a conflict here.

memory. Using device memory for it.
Delete redundant ParallelFor
Fix up parallel_for
Because of issues under SYCL+CUDA (possibly with swapping sycl::queue), made USMObjectMem non-movable and added a .reset method.
Due to suspected issues with sycl::queue assignment under SYCL+CUDA, inside USMObjectMem made m_q optional and added appropriate checks: either checking m_q directly or m_data, because "m_data || !m_q" is a class invariant
Improved the documentation of USMObjectMem class invariants
Removed extra assert from USMObjectMem::Deleter
Simplified USMObjectMem::reserve
More improvements to USMObjectMem invariant documentation
Removed gdb files from .gitignore
Minor fix to comments
Clarified asserts
WIP: in USMObjectMem, added a fence() function to replace calls to memcopied.wait() for dealing with synchronous errors reported by exceptions.
Generalized the internal fence() function in USMObjectMem
In USMObjectMem, replaced throw with throw_runtime_exception
Added FIXME_SYCL for future work
Minor changes for P/R 3511

Nevin added a FIXME to that effect

nliber requested a review from masterleinad October 22, 2020 00:49

masterleinad force-pushed the indirect-launch branch from 27f31e1 to bf368ed Compare October 23, 2020 18:05

nliber force-pushed the indirect-launch branch from 9a4e0c2 to be75ca3 Compare November 20, 2020 02:40

masterleinad reviewed Nov 20, 2020

masterleinad reviewed Nov 25, 2020

masterleinad reviewed Dec 8, 2020

masterleinad mentioned this pull request Dec 15, 2020

[WIP] SYCL indirect kernel launch #3666

Closed

nliber force-pushed the indirect-launch branch from 9f8f832 to 9e04ea2 Compare December 17, 2020 00:17

masterleinad reviewed Dec 17, 2020

crtrott requested changes Dec 18, 2020

crtrott previously requested changes Dec 18, 2020

dalg24 reviewed Dec 18, 2020

masterleinad reviewed Jan 11, 2021

nliber force-pushed the indirect-launch branch from 19f3b11 to 5868550 Compare January 14, 2021 00:34

masterleinad reviewed Jan 14, 2021

masterleinad approved these changes Jan 14, 2021

dalg24 mentioned this pull request Jan 15, 2021

SYCL basic TeamPolicy #3654

Merged

dalg24 reviewed Jan 15, 2021

nliber force-pushed the indirect-launch branch from 5318c77 to 5587569 Compare January 15, 2021 21:09

nliber force-pushed the indirect-launch branch from 5587569 to cad3a89 Compare January 15, 2021 22:24

dalg24 merged commit 72f3961 into kokkos:develop Jan 16, 2021
