# Debugging A Simple Image Processing Program Using The Compute Sanitizer
## 03 Memcheck part 1

In this section, we introduce memcheck, one of the tools provided by
compute-sanitizer. Memcheck detects invalid memory accesses on the GPU, that is
to say, accesses that fall outside a valid allocation or that are misaligned.
Such accesses result in undefined behavior and may lead to application crashes,
invalid results, or data corruption.
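To make the two conditions concrete, here is a minimal host-side C++ sketch of what "out of bounds" and "misaligned" mean for a single access. It is only a conceptual model, not how compute-sanitizer is implemented; the `Allocation` struct and both function names are hypothetical, and alignment is modeled as natural alignment (the address is a multiple of the access size).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical description of one device allocation (not a compute-sanitizer type).
struct Allocation {
    std::uintptr_t base;  // address of the first valid byte
    std::size_t size;     // number of valid bytes
};

// True if the access [addr, addr + accessSize) lies entirely inside the allocation.
bool inBounds(const Allocation& a, std::uintptr_t addr, std::size_t accessSize) {
    if (addr < a.base) return false;
    const std::size_t offset = addr - a.base;
    return offset <= a.size && accessSize <= a.size - offset;
}

// True if the address is a multiple of the access size (natural alignment).
bool isAligned(std::uintptr_t addr, std::size_t accessSize) {
    assert(accessSize != 0);
    return addr % accessSize == 0;
}
```

Under this model, a 1-byte read one past the end of a 16-byte allocation fails `inBounds`, and a 4-byte read at an address ending in `0x2` fails `isAligned`.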

Let's build and then run the application with the following commands:

```sh
    nvcc convolution.cu -lpng -o convolution
    ./convolution
```

The application applies an edge-detection convolution matrix to the image
[checkerboard.png](code/checkerboard.png) and an identity matrix to [icon.png](code/icon.png). The expected results
are the following:

![Checkerboard ref](images/checkerboard-ref.png)
![Icon ref](images/icon.png)

However, when we open [checkerboard-out.png](code/checkerboard-out.png) and [icon-out.png](code/icon-out.png) we can observe the following:

![Checkerboard fail 1](images/checkerboard-fail1.png)
![Icon fail 2](images/icon-fail2.png)

Output like this suggests memory corruption, or a programming error that leads
to invalid memory accesses. This is the kind of problem the compute-sanitizer
memcheck tool is designed to find. We can run memcheck with the following command:

```sh
    compute-sanitizer ./convolution
```

Note that the `--tool memcheck` option is not necessary, as memcheck is the
default compute-sanitizer tool. We get the following output:

```
========= COMPUTE-SANITIZER
========= Invalid __global__ read of size 1 bytes
=========     at 0x130 in convolution(pixel const*, int, int, float const*, int, pixel*)
=========     by thread (0,1,0) in block (0,0,0)
=========     Address 0x7f8efeffff81 is out of bounds
=========     Saved host backtrace up to driver entry point at kernel launch time
=========     Host Frame: [0x27e53a]
=========                in /usr/lib/x86_64-linux-gnu/libcuda.so.1
=========     Host Frame: [0xf53b]
=========                in /home/nvidia/sanitizer-demo/./convolution
=========     Host Frame: [0x5af60]
=========                in /home/nvidia/sanitizer-demo/./convolution
=========     Host Frame: [0x9e3e]
=========                in /home/nvidia/sanitizer-demo/./convolution
=========     Host Frame: [0x9c7b]
=========                in /home/nvidia/sanitizer-demo/./convolution
=========     Host Frame: [0x9ce8]
=========                in /home/nvidia/sanitizer-demo/./convolution
=========     Host Frame: [0x9298]
=========                in /home/nvidia/sanitizer-demo/./convolution
=========     Host Frame: [0x98ef]
=========                in /home/nvidia/sanitizer-demo/./convolution
=========     Host Frame:__libc_start_main [0x270b3]
=========                in /lib/x86_64-linux-gnu/libc.so.6
=========     Host Frame: [0x8c0e]
=========                in /home/nvidia/sanitizer-demo/./convolution
========= 
    [...]
```

The tool reports several invalid accesses in our kernel, more specifically
invalid reads from global memory. However, we forgot something that could be
helpful: when we compiled the application for the first time, we did not
include debug flags. compute-sanitizer can use the debug information these
flags add to give more precise reports. Here, `-g` and `-G` generate host and
device debug information, and `-Xcompiler -rdynamic` keeps host function names
available for the backtrace. Let's recompile and run compute-sanitizer again
with the following commands:

```sh
    nvcc -Xcompiler -rdynamic -g -G -lpng convolution.cu -o convolution
    compute-sanitizer ./convolution
```

Alternatively, we could have used `-lineinfo` instead of `-G` to get line
information without disabling device code optimizations.
We get the following output:

```
========= COMPUTE-SANITIZER
========= Invalid __global__ read of size 1 bytes
=========     at 0x1c30 in /home/nvidia/sanitizer-demo/convolution.cu:63:convolution(pixel const*, int, int, float const*, int, pixel*)
=========     by thread (3,3,0) in block (0,0,0)
=========     Address 0x7f0844fffe8c is out of bounds
=========     Saved host backtrace up to driver entry point at kernel launch time
=========     Host Frame: [0x27e53a]
=========                in /usr/lib/x86_64-linux-gnu/libcuda.so.1
=========     Host Frame: [0x1053b]
=========                in /home/nvidia/sanitizer-demo/./convolution
=========     Host Frame: [0x5bf60]
=========                in /home/nvidia/sanitizer-demo/./convolution
=========     Host Frame: [0xae3e]
=========                in /home/nvidia/sanitizer-demo/./convolution
=========     Host Frame:__device_stub__Z11convolutionPK5pixeliiPKfiPS_(pixel const *, int, int, float const *, int, pixel*) [0xac7b]
=========                in /home/nvidia/sanitizer-demo/./convolution
=========     Host Frame:convolution(pixel const *, int, int, float const *, int, pixel*) [0xace8]
=========                in /home/nvidia/sanitizer-demo/./convolution
=========     Host Frame: [0xa298]
=========                in /home/nvidia/sanitizer-demo/./convolution
=========     Host Frame:main [0xa8ef]
=========                in /home/nvidia/sanitizer-demo/./convolution
=========     Host Frame:__libc_start_main [0x270b3]
=========                in /lib/x86_64-linux-gnu/libc.so.6
=========     Host Frame:_start [0x9c0e]
=========                in /home/nvidia/sanitizer-demo/./convolution
========= 
    [...]
```

Each invalid access now reports the source file and line where it occurred
(here, `convolution.cu:63`), and the host backtrace shows function names. By
inspecting the code at that line, we can find our mistake. Please take a moment
to look for it before reading the solution:

On line 63, the index used to access `image` is incorrect. It
should be `image[cXidx + cYIdx * width]`. Correct the mistake, recompile, and
try to run the application again.
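The corrected expression is the usual row-major offset: column plus row times width. Here is a minimal host-side C++ sketch of that indexing with an explicit bounds check. It can also serve as a CPU reference to compare GPU output against. It is not the tutorial's kernel: the names `pixelIndex` and `convolveReference` are hypothetical, the image is assumed to be single-channel `float` rather than the tutorial's `pixel` type, the matrix is applied without flipping, and neighbors outside the image are simply skipped.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major offset of element (x, y) in a 2D array that is `width` elements wide.
std::size_t pixelIndex(int x, int y, int width) {
    return static_cast<std::size_t>(x) +
           static_cast<std::size_t>(y) * static_cast<std::size_t>(width);
}

// Apply a square matrixDim x matrixDim filter to a single-channel image.
// Neighbors that fall outside the image are skipped, so no access is out of bounds.
std::vector<float> convolveReference(const std::vector<float>& image, int width, int height,
                                     const std::vector<float>& matrix, int matrixDim) {
    assert(image.size() == static_cast<std::size_t>(width) * height);
    assert(matrix.size() == static_cast<std::size_t>(matrixDim) * matrixDim);

    std::vector<float> out(image.size(), 0.0f);
    const int half = matrixDim / 2;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float sum = 0.0f;
            for (int j = 0; j < matrixDim; ++j) {
                for (int i = 0; i < matrixDim; ++i) {
                    const int nx = x + i - half;  // neighbor column
                    const int ny = y + j - half;  // neighbor row
                    if (nx < 0 || nx >= width || ny < 0 || ny >= height) continue;
                    sum += image[pixelIndex(nx, ny, width)] *
                           matrix[pixelIndex(i, j, matrixDim)];
                }
            }
            out[pixelIndex(x, y, width)] = sum;
        }
    }
    return out;
}
```

With a 3x3 identity matrix (a single 1 in the center), this function returns the input image unchanged, which is the behavior expected for `icon.png` above.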

We can observe the following for [checkerboard-out.png](code/checkerboard-out.png):

![Checkerboard fail 2](images/checkerboard-fail2.png)

And this for [icon-out.png](code/icon-out.png) (zoomed-in):

![Icon fail](images/icon-fail.png)

The results have improved, but they are still not exactly what we expect: some
pixels on the left side of the picture do not have the correct color. The
results also vary from one run to the next.

[Go to step 04](04_memcheck2.ipynb) to continue debugging.