# FINN Instrumentation Wrapper Flow (Part 1/2)
#### **NOTE: Make sure the Jupyter server was started from your FINN repository with the command `./run-docker.sh notebook`; the FINN builds will fail otherwise.**

This Jupyter notebook will build a simple model and platform to be used in running the instrumentation wrapper.

## Build the model using the FINN compiler
The build flow is similar to the flows described in the other FINN and FINN-examples notebooks. However, two additional steps are appended to the default `build_dataflow` steps:\
\
`test_step_gen_vitis_xo`\
`test_step_gen_instrumentation_wrapper`\
\
These steps will generate additional output products, namely `.xo` kernel IP files which will be used to link the FINN design and instrumentation wrapper to the hardware platform and allow it to be run through Vitis. \
\
First, the necessary modules are imported and the model file is specified. The model we will be using is TFC-w1a1, trained on the MNIST dataset. The board name and FPGA part are also given. In this case we are targeting the VMK180 from the Versal Prime series, though other Versal boards such as the VCK190 may also be compatible with this build flow. Finally, we set the platform's Vitis IP directory, into which the `.xo` files will be copied in order to build the Vitis platform.
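
Before launching a build, it can help to confirm that the files the flow reads are actually present. The sketch below is a hypothetical helper (`missing_inputs` is not part of FINN) that checks the model, folding configuration and wrapper template referenced in this notebook, assuming they sit relative to the notebook's working directory.

```python
import os


def missing_inputs(paths, base_dir="."):
    """Return the entries of `paths` that do not exist under `base_dir`."""
    return [p for p in paths if not os.path.exists(os.path.join(base_dir, p))]


# File names as used by the cells of this notebook
required = [
    "model.onnx",
    "folding_config.json",
    "templates/instrumentation_wrapper.template.cpp",
]

missing = missing_inputs(required)
if missing:
    print("Missing build inputs:", ", ".join(missing))
else:
    print("All build inputs found.")
```

If anything is reported missing, the build is likely to fail part-way through, so it is cheaper to fix it before starting.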

In [None]:
##
# Copyright (C) 2023, Advanced Micro Devices, Inc. All rights reserved.
##

import numpy as np
import os
import shutil
from qonnx.custom_op.registry import getCustomOp

import finn.builder.build_dataflow as build
import finn.builder.build_dataflow_config as build_cfg
import finn.util.data_packing as dpk
from finn.custom_op.fpgadataflow.templates import ipgentcl_template
from finn.transformation.fpgadataflow.vitis_build import CreateVitisXO
from finn.util.hls import CallHLS

model_file = "model.onnx"
model_name = "tfc_w1a1"

platform_name = "VMK180"
fpga_part = "xcvm1802-vsva2197-2MP-e-S"

vitis_ip_dir = "instr_wrap_platform/vitis/ip"

The aforementioned additional steps are then defined.
\
\
`test_step_gen_vitis_xo` will take the stitched model created using the FINN compiler and generate the `.xo` file for the FINN design.

In [None]:
def test_step_gen_vitis_xo(model, cfg):
    xo_dir = cfg.output_dir + "/xo"
    xo_dir = str(os.path.abspath(xo_dir))
    os.makedirs(xo_dir, exist_ok=True)
    model = model.transform(CreateVitisXO())
    xo_path = model.get_metadata_prop("vitis_xo")
    shutil.copy(xo_path, xo_dir)
    return model

`test_step_gen_instrumentation_wrapper` will first get the input and output properties of the FINN model. It will then use these values to fill out the template found in `templates/instrumentation_wrapper.template.cpp`, and save the filled template to an output file. It will also fill out a template and save a `.tcl` file for use in HLS synthesis of the instrumentation wrapper. These files will then be used to generate the `.xo` file for the instrumentation wrapper.
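
To make the template-filling idea concrete, here is a minimal, FINN-independent sketch. The folded shapes, stream widths and the one-line template are made-up values for illustration only; in the real step they come from the model's first and last nodes and from `templates/instrumentation_wrapper.template.cpp`. The helper names `beats_per_frame` and `fill_template` are hypothetical.

```python
import math


def beats_per_frame(folded_shape):
    """Stream transfers per frame: product of all dims except the last one."""
    return math.prod(folded_shape[:-1])


def fill_template(template, values):
    """Replace every @KEY@ placeholder in `template` with str(value)."""
    for key, value in values.items():
        template = template.replace("@%s@" % key, str(value))
    return template


# Hypothetical folded shapes and padded stream widths (not read from a model)
inp_shape_folded = [1, 16, 49]
out_shape_folded = [1, 1, 10]

values = {
    "ILEN": beats_per_frame(inp_shape_folded),
    "OLEN": beats_per_frame(out_shape_folded),
    "TI": "ap_uint<%d>" % 56,
    "TO": "ap_uint<%d>" % 8,
    "KO": out_shape_folded[-1],
}

# Toy template line; the real template file is considerably larger
toy_template = "ILEN=@ILEN@ OLEN=@OLEN@ TI=@TI@ TO=@TO@ KO=@KO@"
print(fill_template(toy_template, values))
```

The step defined in the next cell does the same thing with chained `str.replace` calls, one per placeholder.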

In [None]:
def test_step_gen_instrumentation_wrapper(model, cfg):
    xo_dir = cfg.output_dir + "/xo"
    xo_dir = str(os.path.abspath(xo_dir))
    os.makedirs(xo_dir, exist_ok=True)
    wrapper_output_dir = cfg.output_dir + "/instrumentation_wrapper"
    wrapper_output_dir = str(os.path.abspath(wrapper_output_dir))
    os.makedirs(wrapper_output_dir, exist_ok=True)
    # conservative max for pending feature maps: number of layers
    pending = len(model.graph.node)
    # query the parallelism-dependent folded input shape from the
    # node consuming the graph input
    inp_name = model.graph.input[0].name
    inp_node = getCustomOp(model.find_consumer(inp_name))
    inp_shape_folded = list(inp_node.get_folded_input_shape())
    inp_stream_width = inp_node.get_instream_width_padded()
    # number of beats per input is given by product of folded input
    # shape except the last dim (which is the stream width)
    ilen = np.prod(inp_shape_folded[:-1])
    ti = "ap_uint<%d>" % inp_stream_width
    # perform the same for the output
    out_name = model.graph.output[0].name
    out_node = getCustomOp(model.find_producer(out_name))
    out_shape_folded = list(out_node.get_folded_output_shape())
    out_stream_width = out_node.get_outstream_width_padded()
    olen = np.prod(out_shape_folded[:-1])
    to = "ap_uint<%d>" % out_stream_width
    ko = out_shape_folded[-1]
    # fill out instrumentation wrapper template
    with open("templates/instrumentation_wrapper.template.cpp", "r") as f:
        instrwrp_cpp = f.read()
    instrwrp_cpp = instrwrp_cpp.replace("@PENDING@", str(pending))
    instrwrp_cpp = instrwrp_cpp.replace("@ILEN@", str(ilen))
    instrwrp_cpp = instrwrp_cpp.replace("@OLEN@", str(olen))
    instrwrp_cpp = instrwrp_cpp.replace("@TI@", str(ti))
    instrwrp_cpp = instrwrp_cpp.replace("@TO@", str(to))
    instrwrp_cpp = instrwrp_cpp.replace("@KO@", str(ko))
    with open(wrapper_output_dir + "/top_instrumentation_wrapper.cpp", "w") as f:
        f.write(instrwrp_cpp)
    # fill out HLS synthesis tcl template
    prjname = "project_instrwrap"
    ipgentcl = ipgentcl_template
    ipgentcl = ipgentcl.replace("$PROJECTNAME$", prjname)
    ipgentcl = ipgentcl.replace("$HWSRCDIR$", wrapper_output_dir)
    ipgentcl = ipgentcl.replace("$TOPFXN$", "instrumentation_wrapper")
    ipgentcl = ipgentcl.replace("$FPGAPART$", cfg._resolve_fpga_part())
    ipgentcl = ipgentcl.replace("$CLKPERIOD$", str(cfg.synth_clk_period_ns))
    ipgentcl = ipgentcl.replace("$DEFAULT_DIRECTIVES$", "")
    ipgentcl = ipgentcl.replace("$EXTRA_DIRECTIVES$", "config_export -format xo")
    # use Vitis RTL kernel (.xo) output instead of IP-XACT
    ipgentcl = ipgentcl.replace("export_design -format ip_catalog", "export_design -format xo")
    with open(wrapper_output_dir + "/hls_syn.tcl", "w") as f:
        f.write(ipgentcl)
    # build bash script to launch HLS synth and call it
    code_gen_dir = wrapper_output_dir
    builder = CallHLS()
    builder.append_tcl(code_gen_dir + "/hls_syn.tcl")
    builder.set_ipgen_path(code_gen_dir + "/{}".format(prjname))
    builder.build(code_gen_dir)
    ipgen_path = builder.ipgen_path
    assert os.path.isdir(ipgen_path), "HLS IPGen failed: %s not found" % (ipgen_path)
    ip_path = ipgen_path + "/sol1/impl/ip"
    assert os.path.isdir(ip_path), "HLS IPGen failed: %s not found. Check log under %s" % (
        ip_path,
        code_gen_dir,
    )
    xo_path = code_gen_dir + "/{}/sol1/impl/export.xo".format(prjname)
    xo_instr_path = xo_dir + "/instrumentation_wrapper.xo"
    shutil.copy(xo_path, xo_instr_path)

    return model

With the additional steps defined, they can then be appended to the build flow. The other necessary configurations for the build will also be set.
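
The way the step list is customised can be sketched without FINN. The list below is only a stand-in for `build_cfg.default_build_dataflow_steps`, with illustrative step names, and `customise_steps` is a hypothetical helper. Custom steps are plain functions that take `(model, cfg)` and return the model, while built-in steps are referred to by name.

```python
# Stand-in for the default step list (names are illustrative only)
default_steps = ["step_a", "step_b", "step_c"]


def my_extra_step(model, cfg):
    """A custom step receives (model, cfg) and returns the model."""
    return model


def customise_steps(defaults, extra, drop=()):
    """Return a new list: `defaults` without `drop`, followed by `extra`."""
    steps = [s for s in defaults if s not in drop]
    return steps + list(extra)


steps = customise_steps(default_steps, [my_extra_step], drop=["step_b"])
print(steps[:2], steps[-1].__name__)
print(default_steps)  # the original list is left untouched
```

Unlike `list.remove`, this variant does not raise an error when the step to drop is absent from the defaults. That may or may not be what you want, depending on whether a missing step should be treated as a mistake.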

In [None]:
build_steps = build_cfg.default_build_dataflow_steps + [
    test_step_gen_vitis_xo,
    test_step_gen_instrumentation_wrapper,
]

build_steps.remove("step_specialize_to_rtl")

cfg = build.DataflowBuildConfig(
    steps=build_steps,
    board=platform_name,
    fpga_part=fpga_part,
    output_dir="output_%s_%s" % (model_name, platform_name),
    synth_clk_period_ns=3.3,
    folding_config_file="folding_config.json",
    stitched_ip_gen_dcp=False,
    generate_outputs=[
        build_cfg.DataflowOutputType.STITCHED_IP,
    ],
    save_intermediate_models=True,
)

Finally, the build will be launched. This will take a few minutes.

In [None]:
build.build_dataflow_cfg(model_file, cfg)

The build outputs, including the intermediate models and estimate reports, can be found in the `output_tfc_w1a1_<PLATFORM_NAME>` folder.
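
As a quick check that the two extra steps produced their kernels, the `xo` subfolder of the output directory can be inspected. This is a small sketch with a hypothetical helper, `find_xo_files`. The expected file names are the ones copied into the platform later in this notebook.

```python
import os


def find_xo_files(output_dir):
    """Return the sorted names of the .xo files in <output_dir>/xo, if any."""
    xo_dir = os.path.join(output_dir, "xo")
    if not os.path.isdir(xo_dir):
        return []
    return sorted(f for f in os.listdir(xo_dir) if f.endswith(".xo"))


expected = {"finn_design.xo", "instrumentation_wrapper.xo"}
found = set(find_xo_files("output_tfc_w1a1_VMK180"))
print("Missing kernels:", sorted(expected - found) or "none")
```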

## Build the instrumentation wrapper platform
With the FINN model built, the generated output `.xo` files can then be used to build the platform on which the instrumentation wrapper will be run. This will be done through the use of `Makefiles` and corresponding `make` commands. \
\
The editable variables used in the build are defined in the top-level `Makefile` in the `instr_wrap_platform` folder.

In [None]:
!sed -n 17,38p instr_wrap_platform/Makefile

The target for this build is hardware, and the Vivado ILA (integrated logic analyser) will not be used. The board we are targeting is the VMK180, which is connected to a remote machine with a hw_server set up for it. The design will be single-pumped (only one clock is used to drive it), and the clock frequency will be 200 MHz (the other clock will not be used in this case, but it generally runs at double the frequency of the slower clock). \
\
These default values will suffice for the build alone. However, in order to run the instrumentation wrapper we will need to connect to the board via the hw_server, so the hw_server variables should be changed to match your own hw_server setup. This can be done by opening the Makefile in the `instr_wrap_platform` folder from the Jupyter file browser, editing it and saving it.
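
Clock settings appear in two forms in this flow: the platform uses a frequency (200 MHz), while the FINN build configuration earlier in this notebook uses a period (`synth_clk_period_ns=3.3`). The short sketch below only does the unit conversion between the two. The function names are hypothetical, and the doubled clock simply follows the "double the frequency" rule of thumb mentioned above.

```python
def mhz_to_period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def period_ns_to_mhz(period_ns):
    """Clock frequency in MHz for a period given in nanoseconds."""
    return 1000.0 / period_ns


platform_clk_mhz = 200
print(mhz_to_period_ns(platform_clk_mhz))      # 5.0
print(2 * platform_clk_mhz)                    # 400, the unused faster clock
print(round(period_ns_to_mhz(3.3), 1))         # 303.0
```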

To start with the platform build, first the `.xo` files will be copied to the Vitis IP folder.

In [None]:
%%sh
cp output_tfc_w1a1_VMK180/xo/finn_design.xo instr_wrap_platform/vitis/ip/finn_design/src
cp output_tfc_w1a1_VMK180/xo/instrumentation_wrapper.xo instr_wrap_platform/vitis/ip/instrumentation_wrapper/src

Then, the necessary `make` commands will be run from the root of the platform directory. `make help` can be run to get a brief explanation of what each `make` rule does.

In [None]:
%%sh
cd instr_wrap_platform
make help

If any builds were run previously, `make clean` should be run first to remove their outputs so that a fresh build can be started. This prevents old or partial builds from interfering with the new build.

In [None]:
%%sh
cd instr_wrap_platform
make clean

To build the instrumentation wrapper, the four `make` commands \
`make vivado_platform`\
`make vitis_platform`\
`make vitis_ip`\
`make full_impl`\
must be run in succession.

Alternatively, instead of running each command separately, the `make all` command can be used to run all four steps in succession. \
\
The build will take a few minutes to complete.

In [None]:
%%sh
cd instr_wrap_platform
make all

Once the build has finished, the generated outputs can be found in the `instr_wrap_platform/vitis/build_hw` folder. The full Vivado project `prj.xpr` can be found in `build_hw/_x/link/vivado/vpl/prj`. The final platform block design can be viewed in this project. Vivado features such as utilisation reports can also be run to view the resource usage and other metrics of the platform.

## Alternative Method: Build the instrumentation wrapper as part of the FINN build flow
Alternatively, instead of building the platform separately from the FINN model, additional steps could be appended to the FINN build flow to build the platform once the FINN model has finished compiling and the corresponding `.xo` files have been generated. These steps would simply call the necessary `make` commands through the Python `subprocess` module. This way, both the FINN model and the instrumentation wrapper platform could be built in one go. The cell below shows how this could be done. To run it, uncomment the code by removing the `"""` at the start and the `""";` at the end.
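
When `make` is driven from Python, it is worth making sure that a failed target stops the flow instead of being silently ignored. The following is a minimal sketch of that idea, not the code used by this notebook: `run_make_targets` is a hypothetical helper, and its `runner` argument exists only so the logic can be exercised without invoking `make`.

```python
import subprocess

# The four targets listed earlier in this notebook
MAKE_TARGETS = ["vivado_platform", "vitis_platform", "vitis_ip", "full_impl"]


def run_make_targets(targets, cwd="instr_wrap_platform", runner=subprocess.run):
    """Run `make <target>` for each target in order, stopping on the first failure."""
    for target in targets:
        # With check=True, subprocess.run raises CalledProcessError
        # if the command exits with a non-zero status.
        runner(["make", target], cwd=cwd, check=True)


# Example call (commented out because it starts a long hardware build):
# run_make_targets(MAKE_TARGETS)
```

Passing the command as a list with `cwd=` avoids the need for `shell=True` and a `cd ... &&` prefix.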

In [None]:
"""
# An alternative method of building the instrumentation wrapper platform
# by appending steps to call the `make` commands to the FINN build flow

import numpy as np
import os
import shutil
import subprocess
from qonnx.custom_op.registry import getCustomOp

import finn.builder.build_dataflow as build
import finn.builder.build_dataflow_config as build_cfg
import finn.util.data_packing as dpk
from finn.custom_op.fpgadataflow.templates import ipgentcl_template
from finn.transformation.fpgadataflow.vitis_build import CreateVitisXO
from finn.util.hls import CallHLS

model_file = "model.onnx"
model_name = "tfc_w1a1"

platform_name = "VMK180"
fpga_part = "xcvm1802-vsva2197-2MP-e-S"

vitis_ip_dir = "instr_wrap_platform/vitis/ip"

def test_step_gen_vitis_xo(model, cfg):
    xo_dir = cfg.output_dir + "/xo"
    xo_dir = str(os.path.abspath(xo_dir))
    os.makedirs(xo_dir, exist_ok=True)
    model = model.transform(CreateVitisXO())
    xo_path = model.get_metadata_prop("vitis_xo")
    shutil.copy(xo_path, xo_dir)
    return model


def test_step_gen_instrumentation_wrapper(model, cfg):
    xo_dir = cfg.output_dir + "/xo"
    xo_dir = str(os.path.abspath(xo_dir))
    os.makedirs(xo_dir, exist_ok=True)
    wrapper_output_dir = cfg.output_dir + "/instrumentation_wrapper"
    wrapper_output_dir = str(os.path.abspath(wrapper_output_dir))
    os.makedirs(wrapper_output_dir, exist_ok=True)
    # conservative max for pending feature maps: number of layers
    pending = len(model.graph.node)
    # query the parallelism-dependent folded input shape from the
    # node consuming the graph input
    inp_name = model.graph.input[0].name
    inp_node = getCustomOp(model.find_consumer(inp_name))
    inp_shape_folded = list(inp_node.get_folded_input_shape())
    inp_stream_width = inp_node.get_instream_width_padded()
    # number of beats per input is given by product of folded input
    # shape except the last dim (which is the stream width)
    ilen = np.prod(inp_shape_folded[:-1])
    ti = "ap_uint<%d>" % inp_stream_width
    # perform the same for the output
    out_name = model.graph.output[0].name
    out_node = getCustomOp(model.find_producer(out_name))
    out_shape_folded = list(out_node.get_folded_output_shape())
    out_stream_width = out_node.get_outstream_width_padded()
    olen = np.prod(out_shape_folded[:-1])
    to = "ap_uint<%d>" % out_stream_width
    ko = out_shape_folded[-1]
    # fill out instrumentation wrapper template
    with open("templates/instrumentation_wrapper.template.cpp", "r") as f:
        instrwrp_cpp = f.read()
    instrwrp_cpp = instrwrp_cpp.replace("@PENDING@", str(pending))
    instrwrp_cpp = instrwrp_cpp.replace("@ILEN@", str(ilen))
    instrwrp_cpp = instrwrp_cpp.replace("@OLEN@", str(olen))
    instrwrp_cpp = instrwrp_cpp.replace("@TI@", str(ti))
    instrwrp_cpp = instrwrp_cpp.replace("@TO@", str(to))
    instrwrp_cpp = instrwrp_cpp.replace("@KO@", str(ko))
    with open(wrapper_output_dir + "/top_instrumentation_wrapper.cpp", "w") as f:
        f.write(instrwrp_cpp)
    # fill out HLS synthesis tcl template
    prjname = "project_instrwrap"
    ipgentcl = ipgentcl_template
    ipgentcl = ipgentcl.replace("$PROJECTNAME$", prjname)
    ipgentcl = ipgentcl.replace("$HWSRCDIR$", wrapper_output_dir)
    ipgentcl = ipgentcl.replace("$TOPFXN$", "instrumentation_wrapper")
    ipgentcl = ipgentcl.replace("$FPGAPART$", cfg._resolve_fpga_part())
    ipgentcl = ipgentcl.replace("$CLKPERIOD$", str(cfg.synth_clk_period_ns))
    ipgentcl = ipgentcl.replace("$DEFAULT_DIRECTIVES$", "")
    ipgentcl = ipgentcl.replace("$EXTRA_DIRECTIVES$", "config_export -format xo")
    # use Vitis RTL kernel (.xo) output instead of IP-XACT
    ipgentcl = ipgentcl.replace("export_design -format ip_catalog", "export_design -format xo")
    with open(wrapper_output_dir + "/hls_syn.tcl", "w") as f:
        f.write(ipgentcl)
    # build bash script to launch HLS synth and call it
    code_gen_dir = wrapper_output_dir
    builder = CallHLS()
    builder.append_tcl(code_gen_dir + "/hls_syn.tcl")
    builder.set_ipgen_path(code_gen_dir + "/{}".format(prjname))
    builder.build(code_gen_dir)
    ipgen_path = builder.ipgen_path
    assert os.path.isdir(ipgen_path), "HLS IPGen failed: %s not found" % (ipgen_path)
    ip_path = ipgen_path + "/sol1/impl/ip"
    assert os.path.isdir(ip_path), "HLS IPGen failed: %s not found. Check log under %s" % (
        ip_path,
        code_gen_dir,
    )
    xo_path = code_gen_dir + "/{}/sol1/impl/export.xo".format(prjname)
    xo_instr_path = xo_dir + "/instrumentation_wrapper.xo"
    shutil.copy(xo_path, xo_instr_path)

    return model


def test_step_export_xo(model, cfg):
    # Copy the generated .xo files to their respective Vitis IP directory;
    # check_call raises CalledProcessError if a copy fails
    subprocess.check_call(['cp', cfg.output_dir+"/xo/finn_design.xo", 'instr_wrap_platform/vitis/ip/finn_design/src'])
    subprocess.check_call(['cp', cfg.output_dir+"/xo/instrumentation_wrapper.xo", 'instr_wrap_platform/vitis/ip/instrumentation_wrapper/src'])
    return model


def test_step_build_platform(model, cfg):
    # Clean any previous/partial builds and then build full platform;
    # check_call raises if make fails so the FINN build stops with an error
    subprocess.check_call("cd instr_wrap_platform && make clean && make all", shell=True)
    return model


# Append the steps needed to build the platform
build_steps = build_cfg.default_build_dataflow_steps + [
    test_step_gen_vitis_xo,
    test_step_gen_instrumentation_wrapper,
    test_step_export_xo,
    test_step_build_platform,
]

build_steps.remove("step_specialize_to_rtl")

cfg = build.DataflowBuildConfig(
    steps=build_steps,
    board=platform_name,
    fpga_part=fpga_part,
    output_dir="output_%s_%s" % (model_name, platform_name),
    synth_clk_period_ns=3.3,
    folding_config_file="folding_config.json",
    stitched_ip_gen_dcp=False,
    generate_outputs=[
        build_cfg.DataflowOutputType.STITCHED_IP,
    ],
    save_intermediate_models=True,
)
model_file = "model.onnx"
build.build_dataflow_cfg(model_file, cfg)
""";

## Next steps
Once all the cells have finished running, the necessary builds will have been completed. **Due to a bug with Vitis XSCT tools, the instrumentation wrapper cannot be run from the notebook. It must be run from outside the notebook (e.g. from the command line that the Jupyter notebook server was started from).** \
\
The next notebook (`2-run_instr_wrap.ipynb`) details how the instrumentation wrapper is run from the command line. Note that the code cells in that notebook will not function as intended when executed, and are included only to show the commands needed to run the instrumentation wrapper.