# Radio Modulation with FINN - Notebook #4 of 5
This notebook walks you through simple usage of the FINN tools. FINN provides many tools for running inference of quantized neural networks on FPGAs. Users can either design their own dataflow-style architecture for a custom network, or use the template dataflow builder provided by FINN, which works with many common types of neural networks.

Because our model (VGG10) is compatible with the template builder that FINN provides, we do not need to design our own dataflow architecture. We can use the template, with a few minor additional steps to handle our 1D convolutional layers.

An example of the dataflow-style flow, going from training a Brevitas model to running the bitfile on an FPGA: [End-to-end flow](https://finn.readthedocs.io/en/latest/end_to_end_flow.html)

# FINN Dataflow Architecture
From here, we will set up FINN's standard builder with a few custom transformations, and export a bitfile that can be run on the FPGA using PYNQ.

The original version of the code below can be found here: [Original version](https://github.com/Xilinx/finn-examples/blob/main/build/vgg10-radioml/build.py)

Further information about setting up a builder for transformations can be found here: [Tutorial](https://github.com/Xilinx/finn/blob/main/notebooks/end2end_example/cybersecurity/3-build-accelerator-with-finn.ipynb)

## Defining custom steps for the builder
The builder has a few steps already prepared for us. However, since we are using 1D convolutional layers, we need to add two custom steps: one that converts the model's 3D tensors into the 4D layout FINN expects, and one that converts the final layers to hardware layers. FINN works with 4D (NHWC) tensors internally, even for feature maps with only one spatial dimension.

`step_pre_streamline` converts our model from 3D tensors to 4D tensors. This is needed because the model uses 1D convolutional layers. As a result, the input shape changes from `1x2x1024` to `1x2x1024x1`. The step also absorbs scalar multiply/add operations into the TopK node.
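
To make the shape change concrete, here is a minimal pure-Python sketch of the shape bookkeeping only. The helper names `to_4d` and `nchw_to_nhwc` are hypothetical and are not part of QONNX or FINN; the actual conversion is done by the `Change3DTo4DTensors` transformation, which also has to update the affected nodes. The second helper simply shows what the same tensor shape looks like once it is expressed in the NHWC layout mentioned above.

```python
def to_4d(shape):
    """Append a singleton spatial dimension: (N, C, H) -> (N, C, H, 1)."""
    if len(shape) != 3:
        raise ValueError(f"expected a 3D shape, got {shape}")
    return (*shape, 1)


def nchw_to_nhwc(shape):
    """Reorder a 4D shape from channels-first (NCHW) to channels-last (NHWC)."""
    n, c, h, w = shape
    return (n, h, w, c)


shape_3d = (1, 2, 1024)          # batch, I/Q channels, samples
shape_4d = to_4d(shape_3d)
print(shape_4d)                  # (1, 2, 1024, 1)
print(nchw_to_nhwc(shape_4d))    # (1, 1024, 1, 2)
```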

`step_convert_final_layers` converts the final layers (linear and TopK) to hardware layers.

The code below is taken from the following FINN example: [CustomSteps](https://github.com/Xilinx/finn-examples/blob/main/build/vgg10-radioml/custom_steps.py)

In [1]:
from qonnx.core.modelwrapper import ModelWrapper
from finn.util.visualization import showInNetron

from qonnx.transformation.change_3d_tensors_to_4d import Change3DTo4DTensors
from qonnx.transformation.general import GiveUniqueNodeNames

import finn.transformation.fpgadataflow.convert_to_hw_layers as to_hw
import finn.transformation.streamline.absorb as absorb
from finn.builder.build_dataflow_config import DataflowBuildConfig
import finn.builder.build_dataflow as build
import finn.builder.build_dataflow_config as build_cfg
from finn.util.basic import alveo_default_platform

def step_pre_streamline(model: ModelWrapper, cfg: DataflowBuildConfig):
    model = model.transform(Change3DTo4DTensors())
    model = model.transform(absorb.AbsorbScalarMulAddIntoTopK())
    return model


def step_convert_final_layers(model: ModelWrapper, cfg: DataflowBuildConfig):
    model = model.transform(to_hw.InferChannelwiseLinearLayer())
    model = model.transform(to_hw.InferLabelSelectLayer())
    model = model.transform(GiveUniqueNodeNames())
    return model

## Setting up Dataflow Builder

Define the path to the ONNX model and the target platform.

In this example:
- We will use the `tidy.onnx` model produced by the `network-surgery` step in the previous notebook (`notebook 3/5`).
- The only target platform in this example is `ZCU104`.

In [2]:
from datetime import datetime
import os
dt=datetime.today().strftime('%Y_%m_%d')
#Get the tidy.onnx model
model_name='radio_27ml_tidy'
# append the date and a random hex code so repeated runs do not overwrite each other
# (the trailing "/" makes the run name a directory component in the output path)
final_name=model_name+"_"+dt+"_"+os.urandom(3).hex()+"/"
model_file = '27ml_rf/models/radio_27ml_tidy.onnx'

# which platforms to build the networks for
zynq_platforms = ["ZCU104"]
alveo_platforms = []
platforms_to_build = zynq_platforms + alveo_platforms

# determine which shell flow to use for a given platform
# Since we are using ZCU104, it should return VIVADO_ZYNQ
def platform_to_shell(platform):
    if platform in zynq_platforms:
        return build_cfg.ShellFlowType.VIVADO_ZYNQ
    elif platform in alveo_platforms:
        return build_cfg.ShellFlowType.VITIS_ALVEO
    else:
        raise Exception("Unknown platform, can't determine ShellFlowType")

While FINN builds the bitfile, it creates multiple intermediate files that show the progress through the steps. These files are stored in the directory assigned to the environment variable `FINN_BUILD_DIR`.

For this example, we create a `tmp/` directory in our workspace and assign it to `FINN_BUILD_DIR` for easy access.

If the `tmp/` directory already exists, we clear its contents on every new run, so that only the latest intermediate files are kept.

**Notice:** The `tmp/` directory is not committed to our GitHub repository, to save space.

In [3]:
import os
import shutil
import glob
from pathlib import Path

# Create a temporary folder (if none exists) to store intermediate transformations
# This folder will hold all intermediate files generated while running Vivado synthesis
finn_build_dir=os.getcwd()+'/tmp/'
os.environ["FINN_BUILD_DIR"]=finn_build_dir

final_output_dir="output/"
Path(finn_build_dir).mkdir(exist_ok=True)
Path(final_output_dir).mkdir(exist_ok=True)

#Remove all intermediate transformations from previous runs 
print(f'temp files will be built in {finn_build_dir}')
print(f'removing old temp files in {finn_build_dir}')
files = glob.glob(f'{finn_build_dir}*')
for f in files:
    if os.path.isdir(f):
        shutil.rmtree(f)
    elif os.path.isfile(f):
        os.remove(f)


print(f'output will be generated in {final_output_dir}')

temp files will be built in /home/phu/repos/radio_finn_latest/RadioFINN/notebooks/Radio_27ML/tmp/
removing old temp files in /home/phu/repos/radio_finn_latest/RadioFINN/notebooks/Radio_27ML/tmp/
output will be generated in output/


## Setting Parameters for the Dataflow Architecture

For this example, we define:
1. `target_fps`: Target inference performance in frames per second.
2. `synth_clk_period_ns`: Target clock period (in nanoseconds) for Vivado synthesis.
3. `select_build_steps`: The sequence of steps in our build flow, going from the ONNX model to the bitfile that can be run on the FPGA.
4. `select_generate_output`: Which outputs and reports we want the build to generate.
    - Documentation on what the generated outputs mean: [Generated Outputs](https://finn.readthedocs.io/en/latest/command_line.html#generated-outputs)
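
The first two parameters are linked by simple arithmetic, sketched below with the values used in this notebook. The helper names are hypothetical, and the cycle-budget formula is an assumption about how a throughput target translates into a per-frame clock-cycle budget (clock frequency divided by target FPS); it is meant as a sanity check, not as a description of FINN's internals.

```python
def clock_mhz(clk_period_ns):
    """Clock frequency in MHz for a clock period given in nanoseconds."""
    return 1000.0 / clk_period_ns


def cycles_per_frame(clk_period_ns, target_fps):
    """Clock cycles available per frame at the target throughput (assumed formula)."""
    cycles_per_second = 1e9 / clk_period_ns
    return int(cycles_per_second / target_fps)


print(clock_mhz(5.0))               # 200.0
print(cycles_per_frame(5.0, 4500))  # 44444
```

With a 5 ns clock and a target of 4500 FPS, every layer would need to process one frame in roughly 44,000 cycles or fewer for the target to be reachable.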

    
Documentation for parameters can be found here: [BuildConfig](https://finn.readthedocs.io/en/latest/source_code/finn.builder.html#finn.builder.build_dataflow_config.DataflowBuildConfig)
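
The step list passed to the builder mixes strings (built-in steps) with plain Python functions (our custom steps). The toy sketch below illustrates that idea only: names are looked up in a registry, callables are used directly, and each step receives the model and the config and returns the updated model. The registry, the stand-in steps, and `run_steps` are all hypothetical and do not reproduce FINN's implementation.

```python
# Toy stand-ins for built-in steps; here a "model" is just a list that records
# which steps have touched it.
def step_tidy_up(model, cfg):
    return model + ["tidy_up"]


def step_streamline(model, cfg):
    return model + ["streamline"]


# Hypothetical registry mapping step names to functions.
BUILTIN_STEPS = {
    "step_tidy_up": step_tidy_up,
    "step_streamline": step_streamline,
}


def my_custom_step(model, cfg):
    return model + ["custom"]


def resolve_steps(steps):
    """Turn a mixed list of names and callables into a list of callables."""
    resolved = []
    for step in steps:
        if isinstance(step, str):
            resolved.append(BUILTIN_STEPS[step])
        elif callable(step):
            resolved.append(step)
        else:
            raise TypeError(f"unsupported step: {step!r}")
    return resolved


def run_steps(model, cfg, steps):
    """Apply each step in order, feeding the output of one into the next."""
    for step in resolve_steps(steps):
        model = step(model, cfg)
    return model


trace = run_steps([], None, ["step_tidy_up", my_custom_step, "step_streamline"])
print(trace)  # ['tidy_up', 'custom', 'streamline']
```

The order of the list matters: a custom step sees the model exactly as the previous step left it.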



In [4]:
# Target inference performance in frames per second
def select_target_fps(platform):
    return 4500

# Target clock period (in nanoseconds) for Vivado synthesis.
# Frequency (MHz) = 1000 / clock_period_ns 
# e.g. synth_clk_period_ns=5.0 will target a 200 MHz clock.
def select_clk_period(platform):
    return 5.0 

# assemble build flow from custom and pre-existing steps
def select_build_steps(platform):
    return [
        #------------Network-Preparation------
        "step_tidy_up",
        step_pre_streamline, #Custom steps above
        "step_streamline",
        "step_convert_to_hw",
        step_convert_final_layers,  #Custom steps above
        "step_create_dataflow_partition",
        "step_specialize_layers",
        "step_target_fps_parallelization",
        "step_apply_folding_config",
        "step_minimize_bit_width",  
        "step_generate_estimate_reports",
        #------------Hardware-Build-(finn generate instruction files for VITIS HLS)----
        "step_hw_codegen",
        "step_hw_ipgen",
        "step_set_fifo_depths",
        "step_create_stitched_ip",
        #------------HW-synthesis--------------------------
        "step_measure_rtlsim_performance",
        "step_out_of_context_synthesis",
        "step_synthesize_bitfile",
        "step_make_pynq_driver",
        "step_deployment_package",
    ]
    
#What information we want to see.
def select_generate_output(platform):
    return [
        build_cfg.DataflowOutputType.ESTIMATE_REPORTS,
        build_cfg.DataflowOutputType.STITCHED_IP,
        build_cfg.DataflowOutputType.RTLSIM_PERFORMANCE,
        build_cfg.DataflowOutputType.BITFILE, #This is how we tell the builder to generate the bitfile
        build_cfg.DataflowOutputType.DEPLOYMENT_PACKAGE,
        build_cfg.DataflowOutputType.PYNQ_DRIVER, 
    ]

## Set up the `start_dataflow` function
- The input is the `platform_name`. In our example, this is `ZCU104`.
- The function goes through 3 major steps:
    1. Get the `release platform name`, `shell flow type`, and `vitis platform` for the target board.
    2. Set up a config for the builder based on the output of step 1.
    3. Run the build flow.
- The output is the build config (`cfg`), which also records the output directory.


In [5]:
def start_dataflow(platform_name):
    # ----------------------- Get the platform of the target board -----------------------
    shell_flow_type = platform_to_shell(platform_name)
    if shell_flow_type == build_cfg.ShellFlowType.VITIS_ALVEO:
        vitis_platform = alveo_default_platform[platform_name]
        # for Alveo, use the Vitis platform name as the release name
        # e.g. xilinx_u250_xdma_201830_2
        release_platform_name = vitis_platform
    else:
        vitis_platform = None
        # for Zynq, use the board name as the release name
        # e.g. ZCU104
        release_platform_name = platform_name
    # platform_dir = "release/%s" % release_platform_name
    # os.makedirs(platform_dir, exist_ok=True)
    
    # ----------------------- Define the config for the build architecture ---------------
    cfg = build_cfg.DataflowBuildConfig(
        steps=select_build_steps(platform_name),
        output_dir=final_output_dir+"output_%s_%s" % (final_name, release_platform_name),
        synth_clk_period_ns=select_clk_period(platform_name),
        target_fps=select_target_fps(platform_name), #Target FPS, not guaranteed the model will achieve
        board=platform_name,
        shell_flow_type=shell_flow_type,
        vitis_platform=vitis_platform,
        split_large_fifos=True,
        standalone_thresholds=True,
        # enable extra performance optimizations (physopt)
        vitis_opt_strategy=build_cfg.VitisOptStrategyCfg.PERFORMANCE_BEST,
        generate_outputs=select_generate_output(platform_name),        
    )
    
    # ----------------------- Start the build flow ---------------------------------------
    # Start the build flow, with the input being the [onnx model] and the [config file]
    build.build_dataflow_cfg(model_file, cfg)
    
    return cfg#,platform_dir

## Organize output files
After running `start_dataflow()`, the output bitfile is generated, but it can be tedious to find.
The code below looks for the output files and copies them to the `deploy/[platform_name]_[model_name]/` directory.

`finn-accel.(bit|xclbin)`: generated Bitfile depending on the target platform

`finn-accel.hwh`: generated Hardware Handoff File
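
The renaming idea can be tried in isolation before running a full build. The sketch below uses a temporary directory with empty placeholder files; the helper `rename_accel_files` is hypothetical and is a simplified `pathlib` variant of the same idea, not the function used in this notebook.

```python
import tempfile
from pathlib import Path


def rename_accel_files(bitfile_dir, new_stem):
    """Rename every finn-accel.<ext> in bitfile_dir to <new_stem>.<ext>."""
    renamed = []
    # sorted() materialises the matches before any file is renamed
    for src in sorted(Path(bitfile_dir).glob("finn-accel.*")):
        dst = src.with_name(new_stem + src.suffix)
        src.rename(dst)
        renamed.append(dst.name)
    return renamed


with tempfile.TemporaryDirectory() as tmp:
    # empty placeholder files standing in for real build outputs
    for name in ("finn-accel.bit", "finn-accel.hwh"):
        (Path(tmp) / name).touch()
    print(rename_accel_files(tmp, "radio_27ml_tidy"))
    # ['radio_27ml_tidy.bit', 'radio_27ml_tidy.hwh']
```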

In [6]:
def organize_output_files(cfg):
    # copy the generated deployment package and rename the bitfile
    deploy_gen_dir = cfg.output_dir + "/deploy"
    # final_name ends with "/", so strip it before building paths and file names;
    # otherwise the renamed bitfile would point into a non-existent subdirectory
    run_name = final_name.rstrip("/")
    new_deploy_dir = "deploy/" + cfg.board + "_" + run_name
    print('directory needed for FPGA: ' + new_deploy_dir)
    files_to_check_and_rename = [
        "finn-accel.bit",
        "finn-accel.hwh",
        "finn-accel.xclbin",
    ]
    # copy output/[model]/deploy to deploy/[board]_[model]
    if os.path.exists(deploy_gen_dir):
        shutil.copytree(deploy_gen_dir, new_deploy_dir)
    # create the datasets folder (and its parents, if the copy above was skipped)
    Path(new_deploy_dir + '/datasets').mkdir(parents=True, exist_ok=True)
    shutil.copy("Tutorial5_Load_Bitsteam_on_FPGA.ipynb", new_deploy_dir + "/driver/")
    # rename each generated file after the model for better readability
    for f in files_to_check_and_rename:
        src_file = new_deploy_dir + "/bitfile/" + f
        new_file = src_file.replace("finn-accel", run_name)
        if os.path.isfile(src_file):
            os.rename(src_file, new_file)
            shutil.copy(new_file, new_deploy_dir + "/driver/")

## Start running the architecture
We will iterate through all platforms listed in `platforms_to_build` and run the dataflow build for each one.

In [7]:
print(platforms_to_build)

['ZCU104']


In [None]:
# create a release dir, used for finn-examples release packaging
os.makedirs("release", exist_ok=True)

# Iterate through all target platforms
# In this example, we only have 1 target platform (ZCU104)
for platform_name in platforms_to_build:
    
    cfg=start_dataflow(platform_name)

    organize_output_files(cfg)

Building dataflow accelerator from 27ml_rf/models/radio_27ml_tidy.onnx
Intermediate outputs will be generated in /home/phu/repos/radio_finn_latest/RadioFINN/notebooks/Radio_27ML/tmp/
Final outputs will be generated in output/output_radio_27ml_tidy_2025_02_26_25d933/_ZCU104
Build log is at output/output_radio_27ml_tidy_2025_02_26_25d933/_ZCU104/build_dataflow.log
Running step: step_tidy_up [1/20]
Running step: step_pre_streamline [2/20]
Running step: step_streamline [3/20]
Running step: step_convert_to_hw [4/20]
Running step: step_convert_final_layers [5/20]
Running step: step_create_dataflow_partition [6/20]
Running step: step_specialize_layers [7/20]
Running step: step_target_fps_parallelization [8/20]
Running step: step_apply_folding_config [9/20]
Running step: step_minimize_bit_width [10/20]
Running step: step_generate_estimate_reports [11/20]
Running step: step_hw_codegen [12/20]
Running step: step_hw_ipgen [13/20]


## Now that we have the generated bitfile, we can perform inference on the FPGA.

1. <ins>First we need to zip the generated directory `deploy/[your platform+model]`. There are many ways to do it. Here we can do it in the terminal:</ins>

```bash
cd [path_to_deploy/]
zip -r [zip_name].zip [name of folder to zip]
```
2. <ins>Then we can copy this zip file onto the FPGA. We can do that with the `scp` command</ins>

```bash
scp [/path/to/zip/file/you/want/to/copy] username@remoteaddress:[where/to/put/file/on/fpga]
```
3. <ins>On the FPGA, once we have verified that the zip file was copied, we can unzip it with the following command in the FPGA terminal:</ins>

```bash
unzip [filename].zip -d [destination]
```
4. <ins>To run the notebook on the FPGA, we can run the following command on the FPGA terminal:</ins>

```bash
sudo -E jupyter lab --port=8888 --allow-root
```
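
As an alternative to the `zip` command in step 1, the archive can also be created from Python with the standard library. This is a minimal sketch: the helper `zip_deploy_dir` and the `ZCU104_demo` folder are hypothetical, and the demo runs on a temporary directory rather than a real deployment package.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def zip_deploy_dir(deploy_dir):
    """Create <deploy_dir>.zip next to the folder and return the archive path."""
    deploy_dir = Path(deploy_dir).resolve()
    return shutil.make_archive(
        base_name=str(deploy_dir),      # archive path without the .zip suffix
        format="zip",
        root_dir=deploy_dir.parent,     # paths in the archive are relative to this
        base_dir=deploy_dir.name,       # only this folder is archived
    )


with tempfile.TemporaryDirectory() as tmp:
    # placeholder folder standing in for deploy/[your platform+model]
    demo = Path(tmp) / "ZCU104_demo"
    (demo / "driver").mkdir(parents=True)
    (demo / "driver" / "driver.py").write_text("# placeholder\n")

    archive = zip_deploy_dir(demo)
    with zipfile.ZipFile(archive) as zf:
        print(sorted(zf.namelist()))
```

Keeping the top-level folder name inside the archive means that unzipping on the board recreates a single directory instead of scattering files into the destination.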

# About The Report

<font color=orange> **NOTE**: Do not remove the generated files in `tmp/` yet. We will need them to run implementation in Vivado. </font>


### FINN Generated Reports
Inside the generated output folder (e.g. `output/[model_name]/_ZCU104/report/`), there are estimate reports generated by FINN.

Estimated performance (**throughput in fps, latency in ns, node with the highest cycle count**, ...) can be found in __estimate_network_performance.json__
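
Since the report is plain JSON, it can be inspected with the standard library. The sketch below makes no assumption about which keys a real report contains: it loads the file and lists whatever is there. The helper names are hypothetical, and the demo file, including its key names and values, is made-up placeholder data, not the result of a build.

```python
import json
import tempfile
from pathlib import Path


def load_report(report_path):
    """Load a JSON report file into a dictionary."""
    with open(report_path) as f:
        return json.load(f)


def format_report(report):
    """Return one 'key: value' line per entry, in file order."""
    return [f"{key}: {value}" for key, value in report.items()]


with tempfile.TemporaryDirectory() as tmp:
    # placeholder report; keys and numbers are illustrative assumptions only
    demo_path = Path(tmp) / "estimate_network_performance.json"
    demo_path.write_text(json.dumps({
        "estimated_throughput_fps": 1234.5,
        "estimated_latency_ns": 67890.0,
    }))
    for line in format_report(load_report(demo_path)):
        print(line)
```

For a real build, point `load_report` at the `report/estimate_network_performance.json` file inside your own output folder.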

### Run Implementation with Vivado
Aside from the generated reports, we can also run implementation on the generated Vivado project of the model to get **LUT**, **FF** and **BRAM** utilization.

Check that the generated Vivado project exists at `output/[model_name]/[platform]/stitched_ip/finn_vivado_stitch_proj.xpr`.

We can now start Vivado and open the project at the `finn_vivado_stitch_proj.xpr` path above.

Once the project is open, we can run synthesis and implementation in Vivado, which will give us the **Utilization reports**.
<br>
<br>
<br>
<img src="ref_images/VIVADO_run_implementation.png" width="1200"/>