Threads: add missing broadcast to TeamThreadRange parallel_scan (kokkos#6601)

* try

* use reference
fnrizzi committed Nov 17, 2023
1 parent 1a14531 commit 8fd8c94
Showing 1 changed file with 5 additions and 1 deletion.
6 changes: 5 additions & 1 deletion core/src/Threads/Kokkos_Threads_Team.hpp
@@ -1001,8 +1001,10 @@ KOKKOS_INLINE_FUNCTION void parallel_scan(
lambda(i, scan_val, false);
}

+ auto & team_member = loop_bounds.thread;
+
// 'scan_val' output is the exclusive prefix sum
- scan_val = loop_bounds.thread.team_scan(scan_val);
+ scan_val = team_member.team_scan(scan_val);

#ifdef KOKKOS_ENABLE_PRAGMA_IVDEP
#pragma ivdep
@@ -1012,6 +1014,8 @@ KOKKOS_INLINE_FUNCTION void parallel_scan(
lambda(i, scan_val, true);
}

+ team_member.team_broadcast(scan_val, team_member.team_size() - 1);
+
return_val = scan_val;
}
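
Why the broadcast matters can be illustrated with a small serial model. This is a sketch only, not the Kokkos implementation: `simulate_team_scan` and its `broadcast_last` flag are hypothetical names, and the model assumes each team thread owns a contiguous chunk of the range. After the second pass, each thread's `scan_val` holds the running total up to the end of its own chunk, so only the last thread (rank `team_size() - 1`) holds the full sum unless that value is broadcast.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical serial model of the two-pass team scan shown in the diff.
// Entry t of the result is the value thread t would assign to return_val.
// Assumption: thread t owns the contiguous chunk [t*n/T, (t+1)*n/T).
std::vector<long> simulate_team_scan(const std::vector<long>& values,
                                     std::size_t team_size,
                                     bool broadcast_last) {
  assert(team_size > 0);
  const std::size_t n = values.size();
  std::vector<long> scan_val(team_size, 0);

  auto chunk_begin = [&](std::size_t t) { return t * n / team_size; };
  auto chunk_end = [&](std::size_t t) { return (t + 1) * n / team_size; };

  // Pass 1 (final == false): each thread sums its own chunk.
  for (std::size_t t = 0; t < team_size; ++t)
    for (std::size_t i = chunk_begin(t); i < chunk_end(t); ++i)
      scan_val[t] += values[i];

  // Models team_scan: each thread receives the exclusive prefix sum
  // of the per-thread totals.
  long running = 0;
  for (std::size_t t = 0; t < team_size; ++t) {
    const long local = scan_val[t];
    scan_val[t] = running;
    running += local;
  }

  // Pass 2 (final == true): each thread continues from its offset.
  for (std::size_t t = 0; t < team_size; ++t)
    for (std::size_t i = chunk_begin(t); i < chunk_end(t); ++i)
      scan_val[t] += values[i];

  // Models team_broadcast from the last rank: every thread ends up
  // with the full total instead of its own partial running sum.
  if (broadcast_last)
    for (std::size_t t = 0; t < team_size; ++t)
      scan_val[t] = scan_val[team_size - 1];

  return scan_val;
}
```

In this model, the values 1 through 6 with a team size of 3 give per-thread results of 3, 10, and 21 without the broadcast, and 21 on every thread with it.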
