# Exercise: Convert the GPU chessboard code to managed memory

In this exercise we will use the memory management material from Lesson 6 to convert the chessboard_GPU exercise to use managed memory.

Below is a standard 8x8 chess board:

<figure style="margin: 1em; margin-left:auto; margin-right:auto; width:70%;">
    <img src="../images/Chess_board.svg">
    <figcaption style= "text-align:lower; margin:1em; float:bottom; vertical-align:bottom;">A chess board of size 8x8.</figcaption>
</figure>
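As a point of reference for what a correct result looks like, the pattern can be sketched on the host in a few lines of plain C++. This is only an illustration: the names `square_color` and `make_chessboard`, the 0/1 encoding, and the row-major layout are assumptions made for this sketch, and the exercise sources may store the board differently.

```cpp
#include <cstddef>
#include <vector>

// Assumed encoding for this sketch: 1 marks a dark square, 0 a light one.
// Row 0 is taken to be the top rank, which makes square (0, 0) light.
inline int square_color(int row, int col) {
    return (row + col) % 2;
}

// Build an n x n board stored in row-major order (an assumed layout).
std::vector<int> make_chessboard(int n) {
    std::vector<int> board(static_cast<std::size_t>(n) * n);
    for (int row = 0; row < n; ++row) {
        for (int col = 0; col < n; ++col) {
            board[static_cast<std::size_t>(row) * n + col] = square_color(row, col);
        }
    }
    return board;
}
```

Calling `make_chessboard(8)` gives a 64-element board in which neighboring squares along a row or column always differ, which is the property to look for in the GPU output.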

In [1]:
import os
os.environ['PATH'] = f"{os.environ['PATH']}:../../../install/bin"

## The exercise (TLDR version)

The Fortran source is in [chessboard_mm.f90](chessboard_mm.f90), and the C++ source that contains the `fill_chessboard` kernel is in [kernel_code.cpp](kernel_code.cpp). Both source files work and produce a correct result. Your task is to convert the memory allocation steps to use managed memory. The steps required are:

0. Replace the calls to `hipMalloc` with calls to `hipMallocManaged`, using the right arguments.
1. Make sure the right memory is being passed to the kernel.
2. Remove the call to `hipMemcpy`. Why is it no longer needed?

## Compile and run the exercise

The code below compiles, installs and runs the `chessboard_mm` program. Until all the pieces are in place, the code doesn't produce meaningful output.

In [2]:
!build chessboard_mm; run chessboard_mm 

[ 16%] Built target kinds_lib
[ 50%] Built target common_lib
[100%] Built target chessboard_mm
[  1%] Built target kinds_lib
[  5%] Built target common_lib
[  8%] Built target tensoradd_simple
[ 11%] Built target tensoradd_allocatable
[ 15%] Built target tensoradd_pointer
[ 18%] Built target tensoradd_function
[ 25%] Built target tensoradd_module
[ 32%] Built target tensoradd_cfun
[ 33%] Built target kernels_hipfort_example
[ 37%] Built target tensoradd_hip_cptr
[ 40%] Built target tensoradd_hip_fptr
[ 45%] Built target tensoradd_hip_oo
[ 49%] Built target chessboard_CPU_answer
[ 52%] Built target chessboard_CPU
[ 57%] Built target chessboard_GPU
[ 62%] Built target chessboard_GPU_answer
[ 67%] Built target chessboard_mm
[ 72%] Built target chessboard_mm_answer
[ 76%] Built target paged_mem
[ 79%] Built target pinned_mem
[ 83%] Built target managed_mem
[ 86%] Built target memcpy_sync
[35m[1mScanning dependencies of target memcpy_async[0m
[ 89%] Built target memcpy_async
[ 93%] Built

## Compile and run the answer

A simple solution to the problem is in [chessboard_answer.f90](chessboard_answer.f90). You're welcome to check the code for any help you might need.

In [3]:
!build chessboard_mm_answer; run chessboard_mm_answer

[ 16%] Built target kinds_lib
[ 50%] Built target common_lib
[100%] Built target chessboard_mm_answer
[  1%] Built target kinds_lib
[  5%] Built target common_lib
[  8%] Built target tensoradd_simple
[ 11%] Built target tensoradd_allocatable
[ 15%] Built target tensoradd_pointer
[ 18%] Built target tensoradd_function
[ 25%] Built target tensoradd_module
[ 32%] Built target tensoradd_cfun
[ 33%] Built target kernels_hipfort_example
[ 37%] Built target tensoradd_hip_cptr
[ 40%] Built target tensoradd_hip_fptr
[ 45%] Built target tensoradd_hip_oo
[ 49%] Built target chessboard_CPU_answer
[ 52%] Built target chessboard_CPU
[ 57%] Built target chessboard_GPU
[ 62%] Built target chessboard_GPU_answer
[ 67%] Built target chessboard_mm
[ 72%] Built target chessboard_mm_answer
[ 76%] Built target paged_mem
[ 79%] Built target pinned_mem
[ 83%] Built target managed_mem
[ 86%] Built target memcpy_sync
[ 89%] Built target memcpy_async
[ 93%] Built target tensoradd_hip_pinned
[ 96%] Built target te