# Using HIP on Setonix

```bash
salloc --account ${PAWSEY_PROJECT} --ntasks 1 --mem 4GB --cpus-per-task 1 --time 1:00:00 --gpus 1 --partition work
```

```bash
salloc --account ${PAWSEY_PROJECT} --ntasks 1 --mem 4GB --cpus-per-task 1 --time 1:00:00 --gpus-per-node 1 --gpus-per-task 1 --partition work
```

```bash
salloc --account ${PAWSEY_PROJECT} --ntasks 1 --mem 4GB --cpus-per-task 1 --time 1:00:00 --gpus-per-task 2 --partition gpu
```
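The three allocation requests above differ only in their GPU-request flags and partition. As a minimal sketch for comparing such variants, the helper below prints an `salloc` command without running it. The function name `gpu_salloc_cmd` is hypothetical and the fixed flags are copied from the commands above; it makes no claim about which variant the scheduler on Setonix accepts.

```shell
# Hypothetical helper (not part of Slurm): print an salloc command, do not run it.
# Usage: gpu_salloc_cmd <partition> "<gpu flags>"
gpu_salloc_cmd() {
    local partition
    local gpu_flags
    partition="$1"
    gpu_flags="$2"
    # The account is taken from PAWSEY_PROJECT, as in the commands above.
    if [ -z "${PAWSEY_PROJECT:-}" ]; then
        echo "PAWSEY_PROJECT is not set" >&2
        return 1
    fi
    printf 'salloc --account %s --ntasks 1 --mem 4GB --cpus-per-task 1 --time 1:00:00 %s --partition %s\n' \
        "$PAWSEY_PROJECT" "$gpu_flags" "$partition"
}

# Example (prints the third command above once PAWSEY_PROJECT is set):
#   gpu_salloc_cmd gpu "--gpus-per-task 2"
```

Printing the command first makes it easy to inspect what will be requested before submitting it.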

```bash
module load rocm
```

```bash
CC -ggdb -fPIC -I../include -I/opt/rocm/hip/include -I/opt/rocm/hsa/include -I/opt/rocm/llvm/lib/clang/14.0.0/include  mat_elementwise.cpp -o mat_elementwise.exe
```

```bash
CC -ggdb -std=c++11 -D_HIP_ROCclr__ --offload-arch=gfx90a -D__HIP_ARCH_GFX90a=1 --rocm-path=/opt/rocm -L /opt/rocm/hip/lib -lamdhip64 -x hip -I../include mat_elementwise.cpp -o mat_elementwise.exe
```

```bash
module load craype-accel-amd-gfx90a
```

This worked

```bash
module load PrgEnv-cray 
```

This seemed to work

```bash
CC -std=c++11 -D_HIP_ROCclr__ --offload-arch=gfx90a -D__HIP_ARCH_GFX90a=1 --rocm-path=/opt/rocm -L /opt/rocm/hip/lib -lamdhip64 -x hip -I../include mat_elementwise.cpp -o mat_elementwise.exe
```
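To reuse the compile line above for other source files, it can be wrapped in a small function. This is only a sketch: the function name `hip_compile_cmd` is hypothetical, the flags are copied unchanged from the command above, and using `ROCM_PATH` with a default of `/opt/rocm` is an assumption about the environment. The function prints the command rather than running it.

```shell
# Hypothetical helper: print the CC compile line used above for a given source file.
# Usage: hip_compile_cmd <source.cpp> [output]
hip_compile_cmd() {
    local src
    local out
    local rocm
    src="$1"
    # Default output name: replace the .cpp suffix with .exe
    out="${2:-${src%.cpp}.exe}"
    # Assumption: ROCM_PATH points at the ROCm install; fall back to /opt/rocm
    rocm="${ROCM_PATH:-/opt/rocm}"
    printf 'CC -std=c++11 -D_HIP_ROCclr__ --offload-arch=gfx90a -D__HIP_ARCH_GFX90a=1 --rocm-path=%s -L %s/hip/lib -lamdhip64 -x hip -I../include %s -o %s\n' \
        "$rocm" "$rocm" "$src" "$out"
}

# Example: print the command for mat_elementwise.cpp
#   hip_compile_cmd mat_elementwise.cpp
```

To run the printed command instead of just displaying it, pipe the output to `sh` after checking it.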

In [None]:
AMDGPU Thread agent-id:queue-id:dispatch-num:wave-id (work-group-z,work-group-y,work-group-x)/work-group-thread-index

## Exercise: compiling your first HIP application

The file [hello_devices.cpp](hello_devices.cpp) is a complete HIP application that reports the size of on-device memory and the maximum buffer size that can be allocated within that memory.

* **Step 1.** From the Jupyter launcher start a Terminal and use **cd** to navigate to the **src/L1_Introduction** directory in the course material:

```bash
cd src/L1_Introduction
```

* **Step 2.** You need to know where the HIP runtime library and the HIP header files are located. For the ROCm installation referenced in the compile commands above, the locations are as follows:

| File | Directory |
| :--- | :--- |
| HIP runtime library (libamdhip64.so) | /opt/rocm/hip/lib |
| HIP C++ headers directory (hip) | /opt/rocm/hip/include |


In the Terminal use **ls** to list the contents of these directories and locate the **hip** directory in which the HIP header files are located. 

* **Step 3.** Compile the application source file **hello_devices.cpp** using the **g++** compiler. Because **g++** is not a HIP-aware compiler, the HIP headers need the platform macro **__HIP_PLATFORM_AMD__** to be defined. The compilation command should look like this:

```bash
g++ -g -O2 -D__HIP_PLATFORM_AMD__ -I/opt/rocm/hip/include -I../include -L/opt/rocm/hip/lib hello_devices.cpp\
    -o hello_devices.exe -lamdhip64
```

On Linux you can add the directory that contains the **hip** headers directory to your **CPATH** environment variable, and the location of **libamdhip64.so** to both your **LIBRARY_PATH** and **LD_LIBRARY_PATH** environment variables. Then you won't need to explicitly tell the compiler where the HIP resources are.

```bash
g++ -g -O2 -D__HIP_PLATFORM_AMD__ -I../include hello_devices.cpp -o hello_devices.exe -lamdhip64
```
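The environment-variable setup described above can be sketched as follows. This is a minimal illustration, not a site-supported script: the helper names `find_dir_with` and `prepend_path` are hypothetical, the candidate directories are only the ones that appear in this document, and the file names searched for are the real HIP header `hip/hip_runtime.h` and the library `libamdhip64.so` that the earlier `-lamdhip64` flag links against. The prepend helper avoids leaving an empty element in the variable, since GCC and the dynamic loader both treat an empty element as the current directory.

```shell
# Hypothetical helper: print the first directory (from the remaining arguments)
# that contains the relative path given as the first argument; return 1 if none does.
find_dir_with() {
    local rel
    local dir
    rel="$1"
    shift
    for dir in "$@"; do
        if [ -e "$dir/$rel" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Hypothetical helper: print "<dir>:<current>" or just "<dir>" when <current> is
# empty, so that no stray leading or trailing colon is introduced.
prepend_path() {
    local dir
    local current
    dir="$1"
    current="$2"
    if [ -z "$current" ]; then
        printf '%s\n' "$dir"
    else
        printf '%s:%s\n' "$dir" "$current"
    fi
}

# Candidate directories are assumptions; adjust them for your system.
if inc_dir=$(find_dir_with hip/hip_runtime.h /opt/rocm/hip/include /usr/include); then
    CPATH=$(prepend_path "$inc_dir" "${CPATH:-}")
    export CPATH
fi
if lib_dir=$(find_dir_with libamdhip64.so /opt/rocm/hip/lib /usr/lib/x86_64-linux-gnu); then
    LIBRARY_PATH=$(prepend_path "$lib_dir" "${LIBRARY_PATH:-}")
    LD_LIBRARY_PATH=$(prepend_path "$lib_dir" "${LD_LIBRARY_PATH:-}")
    export LIBRARY_PATH LD_LIBRARY_PATH
fi
```

If neither candidate directory contains the file, the variables are left untouched and the explicit `-I` and `-L` flags are still needed.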

* **Step 4.** Now run the application

```bash
./hello_devices.exe
```

You should see at least one device printed, along with its name and memory sizes. Now that you know how to tell the compiler about HIP, you can use the **make** command in that directory to compile the example.

In [1]:
!make clean; make

rm -r *.exe
g++ -std=c++11 -g -O2 -fopenmp -I/usr/include -I../include -L/usr/lib64 hello_devices.cpp\
	-o hello_devices.exe -lOpenCL -lomp
In file included from hello_devices.cpp:2:0:
../include/cl_helper.hpp: In function ‘_cl_command_queue** h_create_command_queues(_cl_device_id**, _cl_context**, cl_uint, cl_uint, cl_bool, cl_bool)’:
         );
         ^
In file included from /usr/include/CL/opencl.h:24:0,
                 from ../include/cl_helper.hpp:15,
                 from hello_devices.cpp:2:
/usr/include/CL/cl.h:1906:1: note: declared here
 clCreateCommandQueue(cl_context                     context,
 ^~~~~~~~~~~~~~~~~~~~


This application is rather rudimentary. A far more sophisticated query application is **clinfo**, which reports a great deal of information on the OpenCL platforms and devices available on a system. Here we use clinfo to list the available platforms and devices.

In [1]:
!clinfo -l

Platform #0: Intel(R) FPGA Emulation Platform for OpenCL(TM)
 `-- Device #0: Intel(R) FPGA Emulation Device
Platform #1: Intel(R) OpenCL
 `-- Device #0: AMD EPYC 7571
Platform #2: AMD Accelerated Parallel Processing
