Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changes for indirect launch of SYCL parallel reduce #3511

Merged
merged 1 commit into from
Jan 16, 2021

Conversation

nliber
Copy link
Contributor

@nliber nliber commented Oct 22, 2020

These are updates for SYCL parallel reduce.

@masterleinad
Copy link
Contributor

Which of these commits contain the changes you really want? I am happy to help cleaning this up.

@masterleinad
Copy link
Contributor

You will need to fix the indentation.

@masterleinad
Copy link
Contributor

Apparently, the indentation is still not correct. Are you using ./scripts/apply-clang-format?

@masterleinad
Copy link
Contributor

After #3480 has been merged, this should be rebased.

@masterleinad
Copy link
Contributor

Hmmm... There are a lot of unrelated changes and conflicting files here. Which of the commits do you actually care for? (Maybe just rebase onto upstream/develop).

Copy link
Contributor

@masterleinad masterleinad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am mostly fine with this pull request but I am a little concerned about the memory being used after the underlying allocation is freed and whether we can/want to provide some kind of safeguard mechanism.

core/src/SYCL/Kokkos_SYCL_Instance.hpp Outdated Show resolved Hide resolved
core/src/SYCL/Kokkos_SYCL_Instance.hpp Outdated Show resolved Hide resolved
core/src/SYCL/Kokkos_SYCL_Instance.hpp Show resolved Hide resolved
core/src/SYCL/Kokkos_SYCL_Instance.hpp Outdated Show resolved Hide resolved
core/src/SYCL/Kokkos_SYCL_Instance.hpp Show resolved Hide resolved
Copy link
Contributor

@masterleinad masterleinad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks OK to me after rebasing. I leave it up to you to change the return type of memcpy_from to std::unique_ptr<T> or not.

core/src/SYCL/Kokkos_SYCL_Instance.hpp Outdated Show resolved Hide resolved
core/src/SYCL/Kokkos_SYCL_Instance.hpp Show resolved Hide resolved
core/src/SYCL/Kokkos_SYCL_Parallel_Reduce.hpp Outdated Show resolved Hide resolved
core/src/SYCL/Kokkos_SYCL_Parallel_Reduce.hpp Outdated Show resolved Hide resolved
@masterleinad
Copy link
Contributor

This looks good to me but you still need to rebase and fix the formatting.

Copy link
Contributor

@masterleinad masterleinad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still looks OK to me. Just some comments. We also need someone else looking at this.

.gitignore Outdated Show resolved Hide resolved
core/src/SYCL/Kokkos_SYCL_Instance.hpp Outdated Show resolved Hide resolved
core/src/SYCL/Kokkos_SYCL_Instance.hpp Outdated Show resolved Hide resolved
core/src/SYCL/Kokkos_SYCL_Parallel_Range.hpp Outdated Show resolved Hide resolved
core/src/SYCL/Kokkos_SYCL_Parallel_Range.hpp Outdated Show resolved Hide resolved
core/src/SYCL/Kokkos_SYCL_Instance.hpp Show resolved Hide resolved
core/src/SYCL/Kokkos_SYCL_Instance.hpp Show resolved Hide resolved
core/src/SYCL/Kokkos_SYCL_Instance.hpp Outdated Show resolved Hide resolved
core/src/SYCL/Kokkos_SYCL_Instance.hpp Outdated Show resolved Hide resolved
Copy link
Member

@crtrott crtrott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks largely good. Some questions, and one request - assuming it doesn't have any drawbacks I have not thought about: allocate the memory via memory spaces so the tools can track it.

core/src/SYCL/Kokkos_SYCL_Instance.hpp Outdated Show resolved Hide resolved
core/src/SYCL/Kokkos_SYCL_Instance.hpp Outdated Show resolved Hide resolved
core/src/SYCL/Kokkos_SYCL_Instance.hpp Show resolved Hide resolved
core/src/SYCL/Kokkos_SYCL_Instance.hpp Show resolved Hide resolved
core/src/SYCL/Kokkos_SYCL_Instance.hpp Outdated Show resolved Hide resolved
core/src/SYCL/Kokkos_SYCL_Instance.hpp Outdated Show resolved Hide resolved
core/src/SYCL/Kokkos_SYCL_Instance.hpp Show resolved Hide resolved
core/src/SYCL/Kokkos_SYCL_Parallel_Range.hpp Show resolved Hide resolved
core/src/SYCL/Kokkos_SYCL_Parallel_Range.hpp Outdated Show resolved Hide resolved
core/src/SYCL/Kokkos_SYCL_Parallel_Range.hpp Outdated Show resolved Hide resolved
core/src/SYCL/Kokkos_SYCL_Instance.hpp Outdated Show resolved Hide resolved
core/src/SYCL/Kokkos_SYCL_Instance.hpp Outdated Show resolved Hide resolved
sycl::malloc(sizeof(*m_result_ptr), q, sycl::usm::alloc::shared));
using ReductionResultMem =
Experimental::Impl::SYCLInternal::ReductionResultMem;
ReductionResultMem& reductionResultMem = instance.m_reductionResultMem;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In hindsight and wrt to #3671, I am not quite sure if we want to treat the result ptr the same as the kernels. In particular, I am not sure if we really want to store the result in shared rather than device space but I am fine with addressing that later.

@masterleinad
Copy link
Contributor

Hmmm... Looks like something went wrong here. Is this rebased on top of upstream/develop?

@nliber
Copy link
Contributor Author

nliber commented Jan 14, 2021

Hmmm... Looks like something went wrong here. Is this rebased on top of upstream/develop?

Yeah, I don't think that worked. Rebasing again now...

@masterleinad
Copy link
Contributor

Retest this please.

@@ -190,7 +187,7 @@ class Kokkos::Impl::ParallelFor<FunctorType, Kokkos::MDRangePolicy<Traits...>,
return {global_sizes, local_sizes};
}
if constexpr (Policy::rank == 6) {
// id0,id1 encoded within first index; id2,id3 to second index; id4,id5 to
// id0,id1 encoded within first index; id2,id3 to second index; id4,id5
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// id0,id1 encoded within first index; id2,id3 to second index; id4,id5
// id0,id1 encoded within first index; id2,id3 to second index; id4,id5 to

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@dalg24 dalg24 mentioned this pull request Jan 15, 2021
core/src/SYCL/Kokkos_SYCL_Instance.hpp Outdated Show resolved Hide resolved
core/src/SYCL/Kokkos_SYCL_Instance.hpp Show resolved Hide resolved
core/src/SYCL/Kokkos_SYCL_Instance.hpp Show resolved Hide resolved
@masterleinad
Copy link
Contributor

There is still a conflict here.

memory.  Using device memory for it.

Delete redundant ParallelFor

Fix up parallel_for

Because of issues under SYCL+CUDA (possibly with swapping sycl::queue),
made USMObjectMem non-movable and added a .reset method.

Due to suspected issues with sycl::queue assignment under
SYCL+CUDA, inside USMObjectMem made m_q optional
and added appropriate checks: either checking m_q
directly or m_data, because "m_data || !m_q" is a class
invariant

Improved the documentation of USMObjectMem class invariants

Removed extra assert from USMObjectMem::Deleter

Simplified USMObjectMem::reserve

More improvements to USMObjectMem invariant documentation

Removed gdb files from .gitignore

Minor fix to comments

Clarified asserts

WIP: in USMObjectMem, added a fence() function to replace calls to memcopied.wait()
for dealing with sychronous errors reported by exceptions.

Generalized the internal fence() function in USMObjectMem

In USMObjectMem, replaced throw with throw_runtime_exception

Added FIXME_SYCL for future work

Minor changes for P/R 3511
@dalg24 dalg24 dismissed crtrott’s stale review January 15, 2021 22:58

Nevin added a FIXME to that effect

@dalg24 dalg24 merged commit 72f3961 into kokkos:develop Jan 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants