# Summing matrix coefficients

In this notebook, we synthesize and implement an IP core that sums the coefficients of a $3\times 3$ matrix. First, make sure you have imported *coef_add.cpp* in Vitis 2024.2, then synthesized and implemented it. Also generate the Vivado 2024.2 design and export the bitstream and hardware handoff files: we will use them in this notebook.

## Imports

First, import the libraries needed to reset the FPGA, load the bitstream file and interact with the IP core.

In [None]:
from pynq import Overlay, DefaultIP, PL
from IPython.display import Image
import time
import numpy as np
import cv2

PL.reset()

## .bit & .hwh files

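Load the Overlay from the corresponding *.bit* file. The matching *.hwh* file must sit in the same directory with the same base name (here *sum.hwh*), since PYNQ reads the register map from it.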

In [2]:
overlay = Overlay("./sum.bit")

## Interact with the IP Core

We print the keys of `overlay.ip_dict` to get the IP core name. In this example, it is named **sum_0**. **processing_system7_0** corresponds to the Zynq SoC.

In [3]:
overlay.ip_dict.keys()

dict_keys(['sum_0', 'processing_system7_0'])

We assign the IP core to a variable to interact with it.

In [4]:
sum_0 = overlay.sum_0

We create a function named **get_register_offset** that returns the address offset of a given register. These offsets tell us where to write data in the IP core over AXI Lite.

In [6]:
def get_register_offset(overlay, ip, parameter):
    return overlay.ip_dict[ip]['registers'][parameter]['address_offset']

In [7]:
overlay.ip_dict["sum_0"]['registers']["res"]

{'address_offset': 16,
 'size': 32,
 'access': 'read-only',
 'description': 'Data signal of res',
 'fields': {'res': {'bit_offset': 0,
   'bit_width': 32,
   'access': 'read-only',
   'description': 'Bit 31 to 0 of res'}}}

In [8]:
overlay.ip_dict["sum_0"]['registers']["Memory_mat"]

{'address_offset': 64,
 'size': 64,
 'access': 'read-write',
 'description': 'Memory mat',
 'fields': {}}
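
The helper **get_register_offset** can return these two offsets directly instead of the full register description. The sketch below runs without a board: `FakeOverlay` and `fake_overlay` are hypothetical stand-ins that only model the `ip_dict` attribute, filled with the values printed above.

```python
# Hypothetical stand-in for a PYNQ Overlay: only ip_dict is modelled here.
class FakeOverlay:
    def __init__(self, ip_dict):
        self.ip_dict = ip_dict


def get_register_offset(overlay, ip, parameter):
    return overlay.ip_dict[ip]['registers'][parameter]['address_offset']


# Register descriptions copied from the outputs shown above.
fake_overlay = FakeOverlay({
    "sum_0": {
        "registers": {
            "res": {"address_offset": 16, "size": 32, "access": "read-only"},
            "Memory_mat": {"address_offset": 64, "size": 64, "access": "read-write"},
        }
    }
})

res_offset = get_register_offset(fake_overlay, "sum_0", "res")
mat_offset = get_register_offset(fake_overlay, "sum_0", "Memory_mat")
print(hex(res_offset), hex(mat_offset))  # 0x10 0x40
```

On the board, passing the real `overlay` object in place of `fake_overlay` should give the same two offsets.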

The value of the **res** parameter is stored at offset $16$, *i.e.* $0\text{x}10$, and **Memory_mat** starts at offset $64=0\text{x}40$.
We now create a list of the integer values $[\![0, 8]\!]$, which represents a $3\times 3$ matrix.

In [9]:
values = list(range(9))  # coefficients 0..8 of the flattened 3x3 matrix

Using AXI Lite, we write each value to its respective address. As **Memory_mat** begins at $0\text{x}40$, the $i$-th value should be placed at $0\text{x}40+4i$ for $i\in[\![0, 8]\!]$. Indeed, each integer occupies one 32-bit *word*, whose length is $4$ bytes.

In [10]:
for i in range(9):
    sum_0.write(0x40 + i*4, values[i])
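
The address arithmetic above can be checked without hardware. The following sketch assumes the $3\times 3$ matrix is flattened in row-major order (the notebook only fixes the order of the flat list, so this mapping is an assumption); `coefficient_offset`, `MAT_BASE` and `WORD_BYTES` are illustrative names.

```python
MAT_BASE = 0x40   # offset of Memory_mat reported by ip_dict
WORD_BYTES = 4    # one 32-bit integer per matrix coefficient


def coefficient_offset(row, col, n_cols=3):
    """Byte offset of matrix[row][col], assuming row-major storage."""
    return MAT_BASE + WORD_BYTES * (row * n_cols + col)


matrix = [[0, 1, 2],
          [3, 4, 5],
          [6, 7, 8]]

# (offset, value) pairs in the order they would be written over AXI Lite.
writes = [(coefficient_offset(r, c), matrix[r][c])
          for r in range(3) for c in range(3)]

for offset, value in writes:
    print(hex(offset), value)
```

The last coefficient lands at $0\text{x}40 + 4\times 8 = 0\text{x}60$, so the nine words occupy offsets $0\text{x}40$ to $0\text{x}63$.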

We start the IP core by writing the value $1$ to address $0\text{x}0$.

In [11]:
sum_0.write(0x00, 0x01)

In this example, the computation finishes almost immediately, so we can read the result directly without waiting. In practical applications, however, it is better to use a DMA so that the Python code is not blocked while the IP core is working.

In [None]:
sum_0.read(0x10)

36

In [13]:
sum(values)

36
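
Reading the result directly assumes the core has already finished. A minimal sketch of a safer pattern is to poll a "done" flag before reading **res**. It assumes the usual Vitis HLS control register layout at $0\text{x}0$ (bit 0 = start, bit 1 = done), which should be checked against the generated driver header for this core. `FakeSumIP` is a hypothetical software stand-in that mimics the `read`/`write` calls used above so the sketch runs without a board.

```python
import time

AP_START = 0x01  # assumed: bit 0 of the control register at 0x00
AP_DONE = 0x02   # assumed: bit 1 of the control register at 0x00


class FakeSumIP:
    """Hypothetical stand-in exposing read/write like the sum_0 object."""

    def __init__(self):
        self.regs = {}

    def write(self, offset, value):
        self.regs[offset] = value
        if offset == 0x00 and value & AP_START:
            # Emulate the core: sum the nine words starting at 0x40.
            self.regs[0x10] = sum(self.regs.get(0x40 + 4 * i, 0) for i in range(9))
            self.regs[0x00] = AP_DONE

    def read(self, offset):
        return self.regs.get(offset, 0)


def run_and_wait(ip, timeout_s=1.0):
    """Start the core, poll until it reports done, then return the result."""
    ip.write(0x00, AP_START)
    deadline = time.monotonic() + timeout_s
    while not (ip.read(0x00) & AP_DONE):
        if time.monotonic() > deadline:
            raise TimeoutError("core did not report completion")
        time.sleep(0.001)
    return ip.read(0x10)


ip = FakeSumIP()
for i, v in enumerate(range(9)):
    ip.write(0x40 + 4 * i, v)
result = run_and_wait(ip)
print(result)  # 36
```

On the board, `run_and_wait(sum_0)` would replace the separate start and read cells, provided the control bits match the assumption stated above.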