# Introduction to Building Bitstreams with Vitis

This notebook shows how to use the Vitis command-line tools to build xclbin files. PYNQ works with any xclbin file, so the steps here are not PYNQ-specific. The aim is to produce an xclbin file containing a single vector addition kernel that can be used with the introductory notebooks.

The following prerequisites should be installed: Vitis and an Alveo development platform.

## Writing the Kernel in C

As a first example, we are going to create a simple accelerator that adds together two arrays of numbers. First we define the prototype of the function, which takes three array pointers (two for the inputs and one for the output) and the number of elements to operate on.

```C
    void vadd(int* in_a, int* in_b, int* out_c, int count) {
```

Next we need to tell the compiler how the arguments should be interpreted. There are two main types of argument in Vivado HLS that we need to consider: AXI-lite slave connections, which are generally used for registers that provide configuration, and AXI master connections, which the accelerator uses to access large blocks of data. For `vadd`, each of the three array arguments maps well to an AXI master port, as the accelerator will be reading and writing data in memory, together with an AXI-lite register that sets the address of the buffer. The `count` argument just needs an AXI-lite register. These types are specified using pragmas in the source code.

```C
    #pragma HLS INTERFACE m_axi port=in_a offset=slave
    #pragma HLS INTERFACE s_axilite port=in_a bundle=control
    #pragma HLS INTERFACE m_axi port=in_b offset=slave
    #pragma HLS INTERFACE s_axilite port=in_b bundle=control
    #pragma HLS INTERFACE m_axi port=out_c offset=slave
    #pragma HLS INTERFACE s_axilite port=out_c bundle=control
    #pragma HLS INTERFACE s_axilite port=count bundle=control
```

We also need to specify how the accelerator should be controlled. For a Vitis accelerator this should usually be via the same AXI-lite interface as the other arguments. This is specified using a pragma tied to the `return` port.

```C
    #pragma HLS INTERFACE s_axilite port=return bundle=control
```

The rest of the function is a simple loop that performs the addition.

```C
        for (int i = 0; i < count; ++i) {
            *out_c++ = *in_a++ + *in_b++;
        }
    }
```

Writing all of this into a single file, we end up with the following.

In [None]:
%%writefile vadd.c

void vadd(int* in_a, int* in_b, int* out_c, int count) {
#pragma HLS INTERFACE m_axi port=in_a offset=slave
#pragma HLS INTERFACE s_axilite port=in_a bundle=control
#pragma HLS INTERFACE m_axi port=in_b offset=slave
#pragma HLS INTERFACE s_axilite port=in_b bundle=control
#pragma HLS INTERFACE m_axi port=out_c offset=slave
#pragma HLS INTERFACE s_axilite port=out_c bundle=control
#pragma HLS INTERFACE s_axilite port=count bundle=control
#pragma HLS INTERFACE s_axilite port=return bundle=control
    for (int i = 0; i < count; ++i) {
        *out_c++ = *in_a++ + *in_b++;
    }
}
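
To have something to check the hardware against later, the loop above can be mirrored on the host. The following is a minimal sketch of such a software reference model and is not part of the original flow; the name `vadd_reference` is hypothetical, and the sketch assumes the sums fit in a signed 32-bit integer.

```python
import numpy as np

def vadd_reference(in_a, in_b, count):
    """Hypothetical host-side model of the vadd kernel loop."""
    a = np.asarray(in_a, dtype=np.int32)
    b = np.asarray(in_b, dtype=np.int32)
    # Only the first `count` elements are read, as in the C loop.
    return a[:count] + b[:count]

expected = vadd_reference([1, 2, 3, 4], [10, 20, 30, 40], 3)
print(expected)
```

Keeping the model separate from the kernel source means it can be run on any machine with NumPy, without Vitis or an Alveo card.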

## Compiling the accelerator

Creating the bitstream with Vitis is split into two phases, analogous to a conventional C or C++ flow. First each kernel is compiled to create a `.xo` file, after which the compiled kernels are linked together to generate the `.xclbin` file.

To compile the kernel we first need to set up the environment. For this we need to specify the platform that we should build against. In this example we point at the U200 DSA installed in its default location; the code below takes the first `.xpfm` file found under `/opt/xilinx/platforms`.

In [None]:
import glob

platform = glob.glob("/opt/xilinx/platforms/*/*.xpfm")[0]
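
Taking element `[0]` raises an `IndexError` when no platform is installed, and picks an arbitrary platform when several are. The sketch below is one possible way to make the choice explicit; `pick_platform` is a hypothetical helper, and the `"u200"` hint is an assumption about how the platform directory is named on the build machine.

```python
import glob

def pick_platform(paths, name_hint=None):
    """Hypothetical helper: choose one .xpfm path from a list of candidates."""
    candidates = sorted(paths)
    if name_hint is not None:
        # Keep only paths containing the hint (assumed naming, e.g. "u200").
        candidates = [p for p in candidates if name_hint in p]
    if not candidates:
        raise FileNotFoundError("no matching .xpfm platform file found")
    return candidates[0]

found = glob.glob("/opt/xilinx/platforms/*/*.xpfm")
try:
    platform = pick_platform(found, name_hint="u200")
except FileNotFoundError as err:
    # Without a matching platform, report the problem and leave `platform` unchanged.
    print(err)
```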

Next we call `v++` with the `-c` option to compile the kernel. The other parameters are

 * `--kernel` - The name of the kernel in the source file, as a single input can contain multiple kernels
 * `-f platform` - The platform to compile the kernel for
 * `-o file` - The name of the output object to write
 * `-t type` - The type of design to create - in this case hardware
 
In addition to the Vitis-specific options, most GCC options can be passed, e.g. `-I` for include paths and `-D` for preprocessor definitions.
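
The options listed above can also be assembled programmatically, which helps when several kernels share the same settings. This is a minimal sketch that only builds and prints the command; `compile_command` is a hypothetical helper, and `example.xpfm`, `include` and `DEBUG=1` are placeholder values rather than anything this design requires.

```python
import shlex

def compile_command(source, kernel, platform, output, target="hw",
                    include_dirs=(), defines=()):
    """Hypothetical helper: assemble a v++ compile command as an argument list."""
    cmd = ["v++", "-c", source, "-t", target,
           "--kernel", kernel, "-f", platform, "-o", output]
    for path in include_dirs:
        cmd += ["-I", path]   # extra include search path
    for macro in defines:
        cmd += ["-D", macro]  # preprocessor definition
    return cmd

# Placeholder arguments for illustration only.
cmd = compile_command("vadd.c", "vadd", "example.xpfm", "vadd.xo",
                      include_dirs=["include"], defines=["DEBUG=1"])
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Keeping the command as a list rather than a single string avoids quoting problems if it is later passed to `subprocess.run`.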

In [None]:
!v++ -c vadd.c -t hw --kernel vadd -f $platform -o vadd.xo

## Linking the system

The final step for creating the `.xclbin` is linking the system, which uses `v++` with the `-l` flag. As we are happy with the default options, we only need to provide the following arguments

 * `-o` to again specify the output file name
 * `-f` to specify the platform
 * `-t` to specify the target type, which should match the one used in the compile step
 * The `.xo` files to link together
 
Note that this can take a considerable amount of time depending on the power of the build computer.
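
Because the link step is slow, it can be worth skipping it when the `.xclbin` is already newer than its inputs. The following sketch shows one way to check this using file modification times; `needs_rebuild` is a hypothetical helper and not part of Vitis or PYNQ.

```python
import os

def needs_rebuild(output, inputs):
    """Hypothetical helper: True if `output` is missing or older than any input.

    Assumes every path in `inputs` exists whenever `output` exists.
    """
    if not os.path.exists(output):
        return True
    out_time = os.path.getmtime(output)
    return any(os.path.getmtime(path) > out_time for path in inputs)

if needs_rebuild("vadd.xclbin", ["vadd.xo"]):
    print("vadd.xclbin is missing or out of date; run the link step")
else:
    print("vadd.xclbin is up to date; the link step can be skipped")
```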

In [None]:
!v++ -l -t hw -o vadd.xclbin -f $platform vadd.xo

## Testing the design with PYNQ

Now we can try our newly generated bitstream with PYNQ by re-using the code present in the introductory notebook.

In [None]:
import pynq
import numpy as np

ol = pynq.Overlay('vadd.xclbin')

vadd = ol.vadd_1

in1 = pynq.allocate((1024,), 'u4')
in2 = pynq.allocate((1024,), 'u4')
out = pynq.allocate((1024,), 'u4')

in1[:] = np.random.randint(low=0, high=100, size=(1024,), dtype='u4')
in2[:] = 200

in1.sync_to_device()
in2.sync_to_device()

vadd.call(in1, in2, out, 1024)

out.sync_from_device()
np.array_equal(in1 + in2, out)
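
If the comparison returns `False`, it helps to know where the results diverge. The sketch below locates the first differing element; `first_mismatch` is a hypothetical helper, and the arrays are stand-ins for `in1 + in2` and `out` so that the example runs without an Alveo card.

```python
import numpy as np

def first_mismatch(expected, actual):
    """Hypothetical helper: index of the first differing element, or None."""
    diff = np.flatnonzero(np.asarray(expected) != np.asarray(actual))
    return int(diff[0]) if diff.size else None

# Stand-in data; in the notebook these would be `in1 + in2` and `out`.
expected = np.array([200, 201, 202, 203], dtype='u4')
actual = np.array([200, 201, 999, 203], dtype='u4')
print(first_mismatch(expected, actual))
```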

For more details on how to rebuild the rest of the bitstreams used in the example notebooks, check out the repository at https://github.com/Xilinx/Alveo-PYNQ

Copyright (C) 2020 Xilinx, Inc