# Matrix multiplication

This example demonstrates a 32x32 single-precision floating-point matrix multiplication implemented in hardware. The data is streamed to and from the IP through an AXI DMA component.

First, the bitstream (.bit) and its associated hardware description (.hwh) are loaded onto the FPGA:

In [1]:
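from pynq import Overlay

# The .hwh file is expected next to the .bit file, with the same base name.
overlay = Overlay("/home/xilinx/overlays/matmul.bit")
# Overlay() already downloads the bitstream by default; the explicit call is kept for clarity.
overlay.download()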

The components in the design and all associated metadata can be found in the `ip_dict`.

In [2]:
list(overlay.ip_dict.keys())

['axi_dma_0', 'matmul_0', 'processing_system7_0']
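Each entry in `ip_dict` is itself a dictionary of metadata. The following is a minimal sketch of how that metadata could be summarized. It assumes the entries expose the `type`, `phys_addr` and `addr_range` keys, and falls back to `n/a` where an entry has no address. The helper name `describe_ips` is hypothetical and not part of PYNQ.

```python
def describe_ips(ip_dict):
    """Return one summary line per IP: name, type, base address, address range."""
    lines = []
    for name in sorted(ip_dict):
        entry = ip_dict[name]
        addr = entry.get("phys_addr")
        size = entry.get("addr_range")
        # Not every entry is guaranteed to carry an address, so guard the formatting.
        addr_text = hex(addr) if addr is not None else "n/a"
        size_text = hex(size) if size is not None else "n/a"
        ip_type = entry.get("type", "n/a")
        lines.append(f"{name}: type={ip_type} base={addr_text} range={size_text}")
    return lines
```

In the notebook, `print("\n".join(describe_ips(overlay.ip_dict)))` would list the three components above together with their address information.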

Next, the input and output matrices are allocated and populated with random data.

In [3]:
from pynq import allocate
import numpy as np

A = allocate(shape=(32,32), dtype=np.single)
B = allocate(shape=(32,32), dtype=np.single)
C = allocate(shape=(32,32), dtype=np.single)

A[:] = np.random.rand(32, 32)
B[:] = np.random.rand(32, 32)

Start the `matmul` IP by writing the start and auto-restart bits to its control register at offset `0x0`. This works through the IP's `mmio` interface because all components are memory mapped by default.

In [4]:
overlay.matmul_0.mmio.write(0x0, 0x81)
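The value `0x81` is easier to read when split into its individual bits. The sketch below assumes the IP uses the standard block-level control register that Vivado/Vitis HLS generates for an AXI4-Lite interface (start in bit 0, auto-restart in bit 7). That layout is an assumption about this particular `matmul` core, and the helper `decode_control` is hypothetical.

```python
# Control register bits at offset 0x00, assuming the HLS-generated
# AXI4-Lite block-level interface (an assumption about this core).
AP_START = 1 << 0      # write 1 to start the core
AP_DONE = 1 << 1       # set by the core when a run has finished
AP_IDLE = 1 << 2       # set by the core while it is idle
AP_READY = 1 << 3      # set by the core when it can accept new input
AUTO_RESTART = 1 << 7  # write 1 to restart automatically after each run

FLAGS = {
    "ap_start": AP_START,
    "ap_done": AP_DONE,
    "ap_idle": AP_IDLE,
    "ap_ready": AP_READY,
    "auto_restart": AUTO_RESTART,
}

def decode_control(value):
    """Map a raw control-register value to named boolean flags."""
    return {name: bool(value & mask) for name, mask in FLAGS.items()}

START_WITH_AUTO_RESTART = AP_START | AUTO_RESTART
print(hex(START_WITH_AUTO_RESTART))  # 0x81
print(decode_control(START_WITH_AUTO_RESTART))
```

Under the same assumption, reading the register back with `overlay.matmul_0.mmio.read(0x0)` and passing the result to `decode_control` would show whether the core is idle or done.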

Stream the A and B matrices to the IP, one after the other, then wait until the result has been streamed back into matrix C.

In [5]:
overlay.axi_dma_0.sendchannel.transfer(A)
overlay.axi_dma_0.sendchannel.wait()
overlay.axi_dma_0.sendchannel.transfer(B)
overlay.axi_dma_0.sendchannel.wait()
overlay.axi_dma_0.recvchannel.transfer(C)
overlay.axi_dma_0.recvchannel.wait()
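The transfer sequence can be wrapped in a small function so that it is reusable for several matrix pairs. This is a sketch only: `stream_matmul` is a hypothetical helper, and `FakeChannel` and `FakeDMA` are stand-ins that merely record calls so the ordering can be checked without hardware. The function relies only on the `sendchannel`/`recvchannel` `transfer()` and `wait()` calls used in the cell above.

```python
def stream_matmul(dma, a, b, c):
    """Send both operands, then receive the result, in the same order as the cell above."""
    for operand in (a, b):
        dma.sendchannel.transfer(operand)
        dma.sendchannel.wait()   # block until this operand has been sent
    dma.recvchannel.transfer(c)
    dma.recvchannel.wait()       # block until the result has arrived in c
    return c


class FakeChannel:
    """Stand-in for a DMA channel that only logs what was called."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def transfer(self, buffer):
        self.log.append((self.name, "transfer", buffer))

    def wait(self):
        self.log.append((self.name, "wait", None))


class FakeDMA:
    """Stand-in for the DMA driver, for dry runs without an FPGA."""

    def __init__(self):
        self.log = []
        self.sendchannel = FakeChannel("send", self.log)
        self.recvchannel = FakeChannel("recv", self.log)
```

On the board, the call would be `stream_matmul(overlay.axi_dma_0, A, B, C)`. A common variation is to start the receive transfer before sending the inputs, so that the IP never has to hold back its output. Whether that is needed here depends on how much the `matmul` core buffers internally.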

Now we can check whether the hardware result differs from the regular software version (computed with `@`):

In [6]:
print(A@B)
print(C)

[[10.582283   9.44932    8.677923  ... 10.936913   9.450773  11.509162 ]
 [ 9.852601   7.74787    6.9610806 ...  8.905417   7.4958234  9.296692 ]
 [ 8.124746   7.477977   6.189775  ...  7.8957343  6.4541636  9.025695 ]
 ...
 [ 8.692516   6.999814   6.1991825 ...  7.870747   6.946889   8.9476595]
 [ 7.959147   8.554923   7.000863  ...  8.619805   7.314119   9.380833 ]
 [ 9.062386   8.559791   7.176244  ...  8.954982   8.023294  10.618    ]]
[[10.582283   9.44932    8.677923  ... 10.936913   9.450773  11.509162 ]
 [ 9.852601   7.74787    6.9610806 ...  8.905417   7.4958234  9.296692 ]
 [ 8.124746   7.477977   6.189775  ...  7.8957343  6.4541636  9.025695 ]
 ...
 [ 8.692516   6.999814   6.1991825 ...  7.870747   6.946889   8.9476595]
 [ 7.959147   8.554923   7.000863  ...  8.619805   7.314119   9.380833 ]
 [ 9.062386   8.559791   7.176244  ...  8.954982   8.023294  10.618    ]]


The two printed results are identical, so the hardware output matches the software version for every entry shown.
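Printed arrays are abbreviated with `...`, so comparing them by eye only covers the corner entries. A programmatic check covers every element. The sketch below is dependency-free and uses hypothetical helper names (`matmul_reference`, `all_close`). It compares with a tolerance rather than exact equality, on the assumption that hardware and software may accumulate the single-precision sums in a different order.

```python
import math

def matmul_reference(a, b):
    """Plain triple-loop product: c[i][j] = sum over k of a[i][k] * b[k][j]."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]

def all_close(x, y, rel_tol=1e-5, abs_tol=1e-6):
    """True if x and y have the same shape and every element pair is within tolerance."""
    if len(x) != len(y):
        return False
    for row_x, row_y in zip(x, y):
        if len(row_x) != len(row_y):
            return False
        for p, q in zip(row_x, row_y):
            if not math.isclose(p, q, rel_tol=rel_tol, abs_tol=abs_tol):
                return False
    return True
```

With NumPy arrays, `np.allclose(A @ B, C)` performs a similar element-wise tolerance check in one call.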