The new reduction interface should integrate with kernel through the current kernel_param interface. Reduce arguments will be passed in, and the appropriate lambda arguments can be generated in a similar way to how they are generated in the forall interface:
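For reference, here is a minimal sketch of that forall-style interface (the array a and length N are assumed here, and the exact lambda parameter type has varied across RAJA versions):

double sum = 0;

// forall-style reduction: each expt::Reduce argument generates a matching
// lambda parameter referring to a thread-local partial value.
RAJA::forall<RAJA::loop_exec>(RAJA::RangeSegment(0, N),
  RAJA::expt::Reduce<RAJA::operators::plus>(&sum),
  [=](int i, double& s) {
    s += a[i];  // accumulate into the thread-local partial
  });
// After forall returns, the combined result is written back through &sum.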
Kernel's statement::Lambda allows arguments to be populated implicitly or explicitly, depending on how you define the statement::Lambda type. In the implicit case we need to populate lambda objects with all arguments required by the elements of the kernel_param tuple, regardless of their use in the lambda body:
data_t worksum = 0;

using EXEC_POL_I =
  RAJA::KernelPolicy<
    RAJA::statement::ForICount<1, RAJA::statement::Param<0>, RAJA::loop_exec,
      RAJA::statement::ForICount<0, RAJA::statement::Param<1>, RAJA::loop_exec,
        RAJA::statement::Lambda<0>
      >
    >,
    RAJA::statement::ForICount<0, RAJA::statement::Param<1>, RAJA::loop_exec,
      RAJA::statement::ForICount<1, RAJA::statement::Param<0>, RAJA::loop_exec,
        RAJA::statement::Lambda<1>
      >
    >
  >;

RAJA::kernel_param<EXEC_POL_I>(
  RAJA::make_tuple(col_range, row_range),  // segment tuple (ranges assumed; omitted in the original sketch)
  RAJA::make_tuple((int)0,                 // Param<0>: tile-local index for ForICount
                   (int)0,                 // Param<1>: tile-local index for ForICount
                   Tile_Array,             // Param<2>: local tile memory
                   RAJA::expt::Reduce<RAJA::operators::plus>(&worksum)),  // Param<3>: reduction target
  [=](int col, int row, int tx, int ty, TILE_MEM& Tile_Array, data_t& m_worksum)
  { ... },  // This lambda does the reduction work.
  [=](int col, int row, int tx, int ty, TILE_MEM& Tile_Array, data_t& m_worksum)
  { ... }   // This lambda does NOT do reduction work.
);
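For illustration, the elided body of the first lambda might fold one tile entry into the thread-local value (a sketch only; the indexing into Tile_Array is assumed):

[=](int col, int row, int tx, int ty, TILE_MEM& Tile_Array, data_t& m_worksum)
{
  // Hypothetical reduction work: accumulate the local tile entry into the
  // thread-local partial; Reduce combines the partials afterwards.
  m_worksum += Tile_Array(ty, tx);
}

Once kernel_param returns, the combined result is written back through &worksum; the second lambda receives m_worksum but never touches it.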
RAJA::kernel also allows arguments to be defined explicitly within a statement::Lambda type; here only Lambda<0> lists Params<1>, so only the first lambda receives the reduction argument:
data_t worksum = 0;

using EXEC_POL =
  RAJA::KernelPolicy<
    RAJA::statement::For<1, RAJA::loop_exec,
      RAJA::statement::For<0, RAJA::loop_exec,
        RAJA::statement::Lambda<0, RAJA::Segs<0, 1>, RAJA::Offsets<0, 1>, RAJA::Params<0, 1>>
      >
    >,
    RAJA::statement::For<0, RAJA::loop_exec,
      RAJA::statement::For<1, RAJA::loop_exec,
        RAJA::statement::Lambda<1, RAJA::Segs<0, 1>, RAJA::Offsets<0, 1>, RAJA::Params<0>>
      >
    >
  >;

RAJA::kernel_param<EXEC_POL>(
  RAJA::make_tuple(col_range, row_range),  // segment tuple (ranges assumed; omitted in the original sketch)
  RAJA::make_tuple(Tile_Array,             // Param<0>: local tile memory
                   RAJA::expt::Reduce<RAJA::operators::plus>(&worksum)),  // Param<1>: reduction target
  [=](int col, int row, int tx, int ty, TILE_MEM& Tile_Array, data_t& m_worksum) {
    ...
  },
  [=](int col, int row, int tx, int ty, TILE_MEM& Tile_Array) {
    ...
  }
);
Hey @mdavis36, in the implicit lambda case, are there typos where data_t m_red ought to be data_t & worksum? If so, is this implying that we need to pass the reduced data to each lambda, regardless of whether that lambda actually performs a reduction?
@rchen20 I updated the example above. The lambda argument itself is m_worksum, while the target for the final reduction result is worksum; these should be different. m_worksum is the thread-local value to be used before the actual reduction work is done later.
@mdavis36 If I'm reading the above correctly, would this essentially collapse all the various different reduction types (e.g. `RAJA::ReduceSum<RAJA::seq_reduce, int>`, `RAJA::ReduceSum<RAJA::omp_reduce_ordered, int>`, `RAJA::ReduceSum<RAJA::cuda_reduce, int>`, etc.) down to one single type? So you would only need one data type for all your different execution policies?
If so, I'd just like to say that I'd very much be for such a feature: forall loops of mine that contain those operations are the only ones I can't abstract away to a single forall abstraction (using something like the RAJA::expt::dynamic_forall feature) for all the execution policies I support in my libraries/apps (CPU, OpenMP, CUDA, HIP, etc.).
Unfortunately, std::variant and std::visit are still not supported on the device, at least to my current knowledge, which would otherwise have allowed a simple-ish solution to the above.
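For what it's worth, here is a sketch of what that single-type usage could look like with the forall-style interface above (sum_array and its arguments are hypothetical):

// One function template covers every backend: only the execution policy
// changes, not the reducer type.
template <typename EXEC_POL>
double sum_array(const double* a, int N)
{
  double sum = 0;
  RAJA::forall<EXEC_POL>(RAJA::RangeSegment(0, N),
    RAJA::expt::Reduce<RAJA::operators::plus>(&sum),
    [=] RAJA_HOST_DEVICE (int i, double& s) { s += a[i]; });
  return sum;
}

// Usage: sum_array<RAJA::seq_exec>(a, N), sum_array<RAJA::omp_parallel_for_exec>(a, N),
// sum_array<RAJA::cuda_exec<256>>(a, N), ... with no per-backend reducer type.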