# Unified Shared Memory Example

The cell below writes a SYCL program that performs a simple computation on a GPU using unified shared memory (USM). It initializes an array on the host, multiplies each element by 5 on the GPU, and prints the result. The code uses `malloc_shared` to allocate memory that both the host and the device can access through the same pointer, so no explicit copies between the CPU and GPU are needed.

In [None]:
%%writefile compute.cpp
#include <sycl/sycl.hpp>
#include <iostream>

int main(){
    //# select device for offload
    sycl::queue q(sycl::gpu_selector_v);
    std::cout << "Offload Device: " << q.get_device().get_info<sycl::info::device::name>() << "\n";

    //# initialize some data array
    const int N = 16;
    auto data = sycl::malloc_shared<float>(N, q);
    for(int i=0;i<N;i++) data[i] = i;

    //# computation on GPU
    q.single_task([=](){
        for(int i=0;i<N;i++) data[i] = data[i] * 5;
    }).wait();

    //# print output
    for(int i=0;i<N;i++) std::cout << data[i] << "\n";

    //# release the USM allocation
    sycl::free(data, q);
}
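
To check the printed values without a GPU, the kernel's arithmetic can be mirrored on the host. The following is a minimal sketch in plain C++ (no SYCL required). `scale_reference` and `matches_reference` are hypothetical helper names introduced here for illustration; they are not part of SYCL or of the program above.

```cpp
#include <cassert>   // for the assert() call described in the note below
#include <cstddef>
#include <vector>

// Hypothetical helper: host-only mirror of the kernel body.
// The program initializes data[i] = i and then computes data[i] * 5,
// so element i of the reference is i * factor.
std::vector<float> scale_reference(std::size_t n, float factor) {
    std::vector<float> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = static_cast<float>(i) * factor;
    }
    return out;
}

// Hypothetical helper: returns true when the first n elements of `data`
// equal the host reference. Exact comparison is safe here because the
// values are small integers, which float represents exactly.
bool matches_reference(const float* data, std::size_t n, float factor) {
    const std::vector<float> expected = scale_reference(n, factor);
    for (std::size_t i = 0; i < n; ++i) {
        if (data[i] != expected[i]) return false;
    }
    return true;
}
```

One way to use this, assuming the helpers are pasted above `main` in `compute.cpp`, is to add `assert(matches_reference(data, N, 5.0f));` after the `.wait()` call. Because the pointer returned by `malloc_shared` is valid on the host, it can be passed to the helper directly.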

The cell below creates a shell script that sets up the oneAPI environment, compiles the SYCL code with the Intel oneAPI DPC++/C++ compiler (`icpx`), and runs the executable only if compilation succeeds.

In [None]:
%%writefile ./run-dot.sh
#!/bin/bash 
source /opt/intel/oneapi/setvars.sh > /dev/null 2>&1
icpx -fsycl compute.cpp
if [ $? -eq 0 ]; then ./a.out; fi

The final cell makes the shell script executable and runs it, which compiles the SYCL program, executes it on the GPU, and prints the results.

In [None]:
!chmod u+x ./run-dot.sh && ./run-dot.sh