# RAJA Performance Suite - RAJA_CUDA to RAJA_SYCL

RAJA provides abstractions for parallel loop execution but does not include memory management. The companion projects Umpire and CHAI provide portable memory management; here, however, we use our SYCL USM implementation directly.

Below is our SYCL USM implementation for `DAXPY`'s data setup.

In [None]:
%%writefile ./src/RAJAPerf/src/basic/DAXPY_Setup.hpp
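// Allocate device-resident USM buffers for the kernel's input and output
x = cl::sycl::malloc_device<Real_type>(iend, qu);
y = cl::sycl::malloc_device<Real_type>(iend, qu);
// Enqueue the host-to-device copies; each memcpy returns an event
auto e = qu.memcpy(x, m_x, iend * sizeof(Real_type));
auto e2 = qu.memcpy(y, m_y, iend * sizeof(Real_type));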
// Wait for memcpys to finish
e.wait();
e2.wait();


The RAJA kernel follows the same structure as other RAJA kernel executions: a range to iterate over and a loop body that receives each index. The interesting part is the template parameter, which sets the execution policy for the kernel. The existing RAJA implementation for running on CUDA is:
```
      RAJA::forall< RAJA::cuda_exec<block_size, true> >(
        RAJA::RangeSegment(ibegin, iend), [=] (Index_type i) {
        DAXPY_BODY;
      });
```
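To make the kernel's semantics concrete, here is a minimal host-side sketch of what the loop computes. It assumes that `Real_type` is `double` and that the `DAXPY_BODY` macro expands to `y[i] += a * x[i]`; both are assumptions about the suite, and `daxpy_reference` is a hypothetical helper name, not part of RAJA or RAJAPerf.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: Real_type is double in this sketch.
using Real_type = double;

// Sequential reference for DAXPY over the half-open range [ibegin, iend),
// the same iteration space that RAJA::RangeSegment(ibegin, iend) describes.
// Assumption: DAXPY_BODY is equivalent to y[i] += a * x[i].
void daxpy_reference(std::vector<Real_type>& y,
                     const std::vector<Real_type>& x,
                     Real_type a,
                     std::size_t ibegin,
                     std::size_t iend)
{
  assert(x.size() == y.size());
  assert(ibegin <= iend && iend <= y.size());
  for (std::size_t i = ibegin; i < iend; ++i) {
    y[i] += a * x[i];
  }
}
```

A host loop like this is handy as a baseline when checking that a ported kernel still produces the same values.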

All that is required to port this existing RAJA kernel to run with SYCL is to change the `cuda_exec` policy to a SYCL policy.  Below we use the `sycl_exec_trivial` policy because the `DAXPY` kernel is trivially copyable.

In [None]:
%%writefile ./src/RAJAPerf/src/basic/DAXPY_RAJA_Kernel.hpp

      RAJA::forall< RAJA::sycl_exec_trivial<block_size, true> >(
        RAJA::RangeSegment(ibegin, iend), [=] (Index_type i) {
        DAXPY_BODY;
      });
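Why does swapping one template argument suffice? The following toy model illustrates the general idea of policy-based dispatch in plain C++. It is only a sketch: `toy_seq_exec`, `toy_blocked_exec`, `ToyRange`, `toy_forall`, and `run_daxpy` are hypothetical names invented here, and this is not how RAJA is actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for execution policies; these are NOT RAJA types.
struct toy_seq_exec {};
template <std::size_t BlockSize>
struct toy_blocked_exec {};

// Half-open index range [begin, end).
struct ToyRange {
  std::size_t begin;
  std::size_t end;
};

// Backend 1: plain sequential loop.
template <typename Body>
void toy_forall_impl(toy_seq_exec, ToyRange r, Body body)
{
  for (std::size_t i = r.begin; i < r.end; ++i) body(i);
}

// Backend 2: walks the range in chunks of BlockSize, loosely mimicking how
// a GPU backend groups indices into blocks or work-groups.
template <std::size_t BlockSize, typename Body>
void toy_forall_impl(toy_blocked_exec<BlockSize>, ToyRange r, Body body)
{
  static_assert(BlockSize > 0, "BlockSize must be positive");
  for (std::size_t base = r.begin; base < r.end; base += BlockSize) {
    const std::size_t stop =
        (base + BlockSize < r.end) ? base + BlockSize : r.end;
    for (std::size_t i = base; i < stop; ++i) body(i);
  }
}

// Front end: the policy is a template parameter that selects the backend
// overload; the range and the loop body are the same for every policy.
template <typename Policy, typename Body>
void toy_forall(ToyRange r, Body body)
{
  toy_forall_impl(Policy{}, r, body);
}

// Runs one DAXPY-style body under the given policy and returns the result.
template <typename Policy>
std::vector<double> run_daxpy(std::vector<double> y,
                              const std::vector<double>& x,
                              double a)
{
  assert(x.size() == y.size());
  double* yp = y.data();
  const double* xp = x.data();
  toy_forall<Policy>(ToyRange{0, y.size()},
                     [=](std::size_t i) { yp[i] += a * xp[i]; });
  return y;
}
```

Calling `run_daxpy<toy_seq_exec>(y, x, a)` and `run_daxpy<toy_blocked_exec<2>>(y, x, a)` differs only in the policy type, which mirrors the way the port above changes the policy and leaves the range and `DAXPY_BODY` untouched.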

      

Again, we use our existing SYCL USM implementation to handle the data teardown.

In [None]:
%%writefile ./src/RAJAPerf/src/basic/DAXPY_Teardown.hpp
// Enqueue the device-to-host copy of the result
auto e3 = qu.memcpy(m_y, y, iend * sizeof(Real_type));
// Wait for memcpy to finish
e3.wait();
// Release the device allocations
cl::sycl::free(x, qu);
cl::sycl::free(y, qu);
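Once the result is back in host memory, it can be compared against a host-computed reference. The sketch below is one possible check, assuming `double` data already copied into `std::vector`s; `max_abs_diff` and `results_match` are hypothetical helpers and are not part of the performance suite, which has its own checksum mechanism.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Largest element-wise absolute difference between two equally sized arrays.
// Returns NaN as soon as any difference is NaN, so corrupted results are
// not silently ignored.
double max_abs_diff(const std::vector<double>& got,
                    const std::vector<double>& expected)
{
  assert(got.size() == expected.size());
  double worst = 0.0;
  for (std::size_t i = 0; i < got.size(); ++i) {
    const double d = std::fabs(got[i] - expected[i]);
    if (std::isnan(d)) return d;
    if (d > worst) worst = d;
  }
  return worst;
}

// True when every element agrees within the absolute tolerance tol.
// A NaN difference compares false and is therefore reported as a mismatch.
bool results_match(const std::vector<double>& got,
                   const std::vector<double>& expected,
                   double tol)
{
  return max_abs_diff(got, expected) <= tol;
}
```

The tolerance is a judgment call: floating-point results from a device may differ slightly from a host loop, so an exact comparison can be too strict.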


### Now let's rebuild the performance suite

In [1]:
!qsub build_RAJAPerf



### After it finishes building, let's run the performance suite with our `Base_SYCL DAXPY` kernel

In [None]:
!qstat

In [2]:
!qsub -l nodes=1:gpu:ppn=2 run_RAJAPerf



### After the run finishes, let's check the report

In [3]:
!cat output/RAJAPerf-timing.csv

