SYCL TeamPolicy nested parallel_reduce #3783
This should work now.
@dalg24 I addressed your comments.
What is the plan for relaxing the requirement for the vector length to be 1?
My current plan is to have all the functionality working for |
Left some comments regarding folding team_size/vector_length into one dimension.
looks good to me now.
Retest this please.
@@ -493,6 +584,7 @@ KOKKOS_INLINE_FUNCTION void parallel_for(
    const Impl::ThreadVectorRangeBoundariesStruct<iType, Impl::SYCLTeamMember>&
        loop_boundaries,
    const Closure& closure) {
  // FIXME_SYC: adapt for vector_length!=1
Should be FIXME_SYCL (so we don't miss it later).
Based on top of #3763, this pull request implements TeamPolicy parallel_reduce. Currently, vector_length=1 is assumed.
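To make the vector_length=1 assumption concrete, here is a minimal serial model of what a nested team-level reduction computes. This is a sketch in plain standard C++, not the Kokkos or SYCL code from this pull request: the names `team_reduce_model` and `league_sum_model` are hypothetical, and the strided work split is only one plausible way to distribute iterations across a team.

```cpp
#include <cassert>
#include <vector>

// Hypothetical serial model (NOT Kokkos or SYCL code): emulates what a
// team-level nested reduction over [0, n) computes when vector_length == 1.
// Each of the team_size "threads" accumulates a strided partial sum, then
// the partials are combined into one value for the whole team.
template <class Functor>
long team_reduce_model(int team_size, int n, Functor f) {
  std::vector<long> partial(team_size, 0);
  for (int rank = 0; rank < team_size; ++rank) {
    // Thread `rank` handles i = rank, rank + team_size, rank + 2*team_size, ...
    for (int i = rank; i < n; i += team_size) f(i, partial[rank]);
  }
  long result = 0;
  for (long p : partial) result += p;  // combine step (sum reduction)
  return result;
}

// Outer reduction over the league: each team contributes its reduced value.
long league_sum_model(int league_size, int team_size, int n) {
  long total = 0;
  for (int league_rank = 0; league_rank < league_size; ++league_rank) {
    total += team_reduce_model(team_size, n,
                               [](int i, long& update) { update += i; });
  }
  return total;
}
```

With vector_length=1 there is no extra per-thread lane dimension, so the team rank alone identifies a worker. Calling `league_sum_model(4, 8, 100)` sums 0..99 once per team across four teams.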