# Part 2: onnx2versal

**This notebook contains only CLI commands and their outputs. Running the commands outside the notebook is recommended.**

Requirements:
* Vitis Software Platform 2022.2
* xilinx-versal-common image
* X86 XRT (only for software system emulation)
* Python 3

Inputs:
* ONNX model
* Data `.npy` file; the first dimension is assumed to be the batch dimension

Outputs:
* Quantized ONNX model
* AIE project files

## Generate quantized ONNX model
Quantizing the model improves performance at a slight cost in accuracy. Run the following to quantize the ONNX model using the ONNX Runtime (ORT) quantization pipeline. Ensure the file at DATA_NPY_PATH contains input data of the data type the model expects. For more details, see https://onnxruntime.ai/docs/performance/model-optimizations/quantization.html.

In [16]:
ONNX_PATH = "../models/hls4ml_jettag.onnx"
ONNX_INFER_PATH = "../models/hls4ml_jettag_infer.onnx"
QUANTIZED_ONNX_PATH = "../models/hls4ml_jettag_int8.onnx"
DATA_NPY_PATH = "../data/hls4ml_jettag/X_test.npy"

! python -m onnxruntime.quantization.preprocess --input {ONNX_PATH} --output {ONNX_INFER_PATH}
! python quantize_onnx.py {ONNX_INFER_PATH} {QUANTIZED_ONNX_PATH} {DATA_NPY_PATH}

Calibrated and quantized model saved.
../models/hls4ml_jettag_infer.onnx took: 59.69ms
../models/hls4ml_jettag_int8.onnx took: 111.68ms
Matched elements: 14397 / 830000 (1.7345783132530121%)
Max absolute difference: 0.6150588989257812
Max relative difference: 13.653030395507812

Running argmax on last dim...
Matched elements: 158786 / 166000 (95.65421686746988%)
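
The output above reports two kinds of agreement between the float32 and int8 models: an element-wise match count over all outputs, and a match count after taking the argmax over the last dimension. The comparison code in `quantize_onnx.py` is not shown in this notebook, so the following is only a minimal sketch of how such statistics could be computed. The function names, the tolerances, and the toy numbers are all assumptions made for illustration.

```python
import math

def match_fraction(reference, candidate, rel_tol=1e-5, abs_tol=1e-8):
    """Count element pairs that agree within a tolerance (tolerances are assumed)."""
    pairs = list(zip(reference, candidate))
    matched = sum(
        math.isclose(r, c, rel_tol=rel_tol, abs_tol=abs_tol) for r, c in pairs
    )
    return matched, len(pairs)

def argmax(row):
    """Index of the largest value in a row (first index wins on ties)."""
    return max(range(len(row)), key=row.__getitem__)

def argmax_agreement(reference_rows, candidate_rows):
    """Count rows whose predicted class (argmax over the last dim) is the same."""
    matched = sum(
        argmax(r) == argmax(c) for r, c in zip(reference_rows, candidate_rows)
    )
    return matched, len(reference_rows)

# Made-up outputs: two samples, five classes each (not taken from the real model).
float_out = [[0.70, 0.10, 0.10, 0.05, 0.05], [0.20, 0.50, 0.10, 0.10, 0.10]]
int8_out = [[0.66, 0.12, 0.10, 0.07, 0.05], [0.45, 0.30, 0.10, 0.10, 0.05]]

flat_f = [v for row in float_out for v in row]
flat_q = [v for row in int8_out for v in row]

print(match_fraction(flat_f, flat_q))          # (4, 10)
print(argmax_agreement(float_out, int8_out))   # (1, 2)
```

For a classifier, the argmax agreement is usually the more meaningful figure: quantization shifts almost every raw output value slightly, which is why the element-wise match rate can be low even when the predicted class rarely changes.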


## Generate AIE files

In [3]:
! python generate.py {ONNX_PATH} {DATA_NPY_PATH}

Saving tensor of shape (1, 16) into ../data/fc1_input.txt
Saving tensor of shape (1, 16) into ../data/k0gemm_in_shape1x16.txt
Saving tensor of shape (1, 64) into ../data/k0gemm_goldenout_shape1x64.txt
Saving tensor of shape (1, 64) into ../data/k3gemm_in_shape1x64.txt
Saving tensor of shape (1, 32) into ../data/k3gemm_goldenout_shape1x32.txt
Saving tensor of shape (1, 32) into ../data/k6gemm_in_shape1x32.txt
Saving tensor of shape (1, 32) into ../data/k6gemm_goldenout_shape1x32.txt
Padding Gemm tw (32, 5) to (32, 8)
Padding Gemm tbias (5,) to (8,)
Saving tensor of shape (1, 32) into ../data/k9gemm_in_shape1x32.txt
Padding to write (5,) to (6,)
Saving tensor of shape (6,) into ../data/k9gemm_goldenout_shape1x5.txt
Padding SoftmaxOp tin (1, 5) to (1, 8)
Saving tensor of shape (1, 8) into ../data/k11softmax_in_shape1x8.txt
Padding to write (5,) to (6,)
Saving tensor of shape (6,) into ../data/k11softmax_goldenout_shape1x5.txt
Disabled input padding for k0gemm, may result in choosing scala
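
Several log lines above report tensors being widened, for example `(32, 5)` to `(32, 8)` and `(5,)` to `(6,)`. The rule that `generate.py` applies is not shown in this notebook. The sketch below assumes zero-padding of the last dimension up to the next multiple of some alignment value; the helper names and the multiples 4 and 2 are hypothetical and were chosen only because they reproduce the shapes in the log.

```python
def padded_length(n, multiple):
    """Smallest multiple of `multiple` that is >= n (ceiling division)."""
    return -(-n // multiple) * multiple

def pad_last_dim(rows, multiple, fill=0.0):
    """Zero-pad every row of a 2-D list so its length is a multiple of `multiple`."""
    target = padded_length(len(rows[0]), multiple)
    return [list(row) + [fill] * (target - len(row)) for row in rows]

# A stand-in for the (32, 5) Gemm weight tensor; values are placeholders.
weights = [[1.0] * 5 for _ in range(32)]
padded = pad_last_dim(weights, multiple=4)  # alignment of 4 is an assumption

print(len(padded), len(padded[0]))  # 32 8
print(padded_length(5, 2))          # 6
```

Padded positions hold a fill value that the golden outputs do not cover, so any comparison against the `*_goldenout_*` files would need to ignore or strip those trailing elements.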

## Verify functionality, simulator profiling

The following commands require a Vitis installation (aiecompiler, v++, QEMU). Run `make help` to see the available options. Make recipes can be executed separately, e.g. `TARGET=sw_emu GRAPH=hls4ml_jettag make graph`. The commands in this section chain several recipes in one invocation.
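
When running outside the notebook, the same chained invocation can be assembled from a script. The following is a minimal sketch, not part of the repository: `build_make_command` and `run_recipes` are hypothetical helper names, and it assumes that `bash` is available at `/bin/bash` and that `sample_env_setup.sh` sits in the repository root, as the notebook cells suggest.

```python
import shlex
import subprocess

def build_make_command(target, graph, recipes):
    """Build the shell line used in the notebook cells for a chain of recipes."""
    env = f"TARGET={shlex.quote(target)} GRAPH={shlex.quote(graph)}"
    chained = " ".join(shlex.quote(r) for r in recipes)
    return f"source sample_env_setup.sh && {env} make {chained}"

def run_recipes(repo_root, target, graph, recipes):
    """Run the chained recipes from the repository root (requires Vitis)."""
    # `source` is a bash builtin, so the command is run through bash explicitly.
    return subprocess.run(
        build_make_command(target, graph, recipes),
        shell=True,
        executable="/bin/bash",  # assumed location of bash
        cwd=repo_root,
        check=True,
    )

print(build_make_command("sw_emu", "hls4ml_jettag", ["graph", "aiesim"]))
# source sample_env_setup.sh && TARGET=sw_emu GRAPH=hls4ml_jettag make graph aiesim
```

Calling `run_recipes("..", "sw_emu", "hls4ml_jettag", ["graph", "aiesim"])` would then correspond to the functional test cell, provided the Vitis tools are installed.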

For float32:

In [None]:
ONNX_PATH = "../models/hls4ml_jettag.onnx"
MODEL_NAME = "hls4ml_jettag"

For int8:

In [None]:
ONNX_PATH = "../models/hls4ml_jettag_int8.onnx"
MODEL_NAME = "hls4ml_jettag_int8"

### Functional test
Compiles the kernels as pthreads and runs the graph on x86simulator.

In [5]:
! cd ../ && source sample_env_setup.sh && TARGET=sw_emu GRAPH={MODEL_NAME} make graph aiesim

Aiecompiler:    /tools/Xilinx/Vitis/2022.2/aietools/bin/aiecompiler
Vivado:         /tools/Xilinx/Vivado/2022.2/bin/vivado
Vitis:          /tools/Xilinx/Vitis/2022.2/bin/vitis
Vitis HLS:      /tools/Xilinx/Vitis_HLS/2022.2/bin/vitis_hls
XILINX_X86_XRT: /opt/xilinx/xrt
mkdir -p /home/ruien/workspace/onnx2versal/build/hls4ml_jettag/sw_emu; \
cd /home/ruien/workspace/onnx2versal/build/hls4ml_jettag/sw_emu; \
aiecompiler -include=/home/ruien/workspace/onnx2versal/design/aie_src --verbose --Xpreproc="-DITER_CNT=1" --Xchess="main:backend.mist2.maxfoldk=256" --platform=/tools/Xilinx/Vitis/2022.2/base_platforms/xilinx_vck190_base_202220_1/xilinx_vck190_base_202220_1.xpfm --log-level=5 --pl-freq=500 --dataflow --stacksize=2048 --heapsize=2048 --workdir=Work --target=x86sim --Xpreproc=-O0 --Xpreproc=-D__LOG_VERBOSE__ --Xpreproc=-D__OUTPUT_INTER__ /home/ruien/workspace/onnx2versal/design/aie_src/graph_hls4ml_jettag.cpp 2>&1 | tee -a aiecompiler.log
aietools : /tools/Xilinx/Vitis/2022.2/aietools
I

### System functional test
Compiles the AIE kernels and application code into x86 models and the PL kernels into SystemC models, then runs them on the x86 system simulator.

In [6]:
! cd ../ && source sample_env_setup.sh && TARGET=sw_emu GRAPH={MODEL_NAME} make graph kernels xsa application package clean_reports run_emu

Aiecompiler:    /tools/Xilinx/Vitis/2022.2/aietools/bin/aiecompiler
Vivado:         /tools/Xilinx/Vivado/2022.2/bin/vivado
Vitis:          /tools/Xilinx/Vitis/2022.2/bin/vitis
Vitis HLS:      /tools/Xilinx/Vitis_HLS/2022.2/bin/vitis_hls
XILINX_X86_XRT: /opt/xilinx/xrt
make: Nothing to be done for 'graph'.
mkdir -p /home/ruien/workspace/onnx2versal/build/hls4ml_jettag/sw_emu; cd /home/ruien/workspace/onnx2versal/build/hls4ml_jettag/sw_emu; \
v++ --target sw_emu --platform /tools/Xilinx/Vitis/2022.2/base_platforms/xilinx_vck190_base_202220_1/xilinx_vck190_base_202220_1.xpfm --save-temps --temp_dir /home/ruien/workspace/onnx2versal/build/hls4ml_jettag/sw_emu/_x --verbose -g -c --hls.clock 312500000:s2mm -k s2mm /home/ruien/workspace/onnx2versal/design/pl_src/float_s2mm.cpp -o /home/ruien/workspace/onnx2versal/build/hls4ml_jettag/sw_emu/s2mm.sw_emu.xo
Option Map File Used: '/tools/Xilinx/Vitis/2022.2/data/vitis/vpp/optMap.xml'

****** v++ v2022.2.2 (64-bit)
  **** SW Build 3716524 on 2023-

### Performance test
Compiles the AIE kernels into SystemC models and runs them using the cycle-accurate aiesimulator.

In [7]:
! cd ../ && source sample_env_setup.sh && TARGET=hw_emu GRAPH={MODEL_NAME} make graph aiesim_profile

Aiecompiler:    /tools/Xilinx/Vitis/2022.2/aietools/bin/aiecompiler
Vivado:         /tools/Xilinx/Vivado/2022.2/bin/vivado
Vitis:          /tools/Xilinx/Vitis/2022.2/bin/vitis
Vitis HLS:      /tools/Xilinx/Vitis_HLS/2022.2/bin/vitis_hls
XILINX_X86_XRT: /opt/xilinx/xrt
mkdir -p /home/ruien/workspace/onnx2versal/build/hls4ml_jettag/hw_emu; \
cd /home/ruien/workspace/onnx2versal/build/hls4ml_jettag/hw_emu; \
aiecompiler -include=/home/ruien/workspace/onnx2versal/design/aie_src --verbose --Xpreproc="-DITER_CNT=1" --Xchess="main:backend.mist2.maxfoldk=256" --platform=/tools/Xilinx/Vitis/2022.2/base_platforms/xilinx_vck190_base_202220_1/xilinx_vck190_base_202220_1.xpfm --log-level=5 --pl-freq=500 --dataflow --stacksize=2048 --heapsize=2048 --workdir=Work --Xpreproc=-D__LOG_VERBOSE__ --Xpreproc=-D__OUTPUT_INTER__ /home/ruien/workspace/onnx2versal/design/aie_src/graph_hls4ml_jettag.cpp 2>&1 | tee -a aiecompiler.log
aietools : /tools/Xilinx/Vitis/2022.2/aietools
INFO: [aiecompiler 77-3355] ###[

### System performance test
Compiles the AIE kernels and PL kernels into SystemC models and uses the existing SystemC models of the NoC, DDR, CIPS, PS, AXI4, and other components of the Versal platform. Runs in QEMU.