# Parsing Tutorial

## Hailo Parsing Examples from TensorFlow/PyTorch to HAR

This tutorial describes the steps for parsing models from various frameworks to the HAR format (Hailo Archive).  
HAR is a tar.gz archive file that contains the representation of the graph structure and the weights that are deployed to Hailo's runtime.

Note:
**Running this code in a Jupyter notebook is recommended**; see the Introduction tutorial for more details.

Note:
This section demonstrates the Python APIs of the Hailo Parser.
The CLI can be used instead: try `hailo parser {tf, onnx} --help`.  
More details are available in the Dataflow Compiler User Guide / Building Models / Profiler and other command line tools.

In [2]:
# General imports used throughout the tutorial
import tensorflow as tf
from IPython.display import SVG

# import the ClientRunner class from the hailo_sdk_client package
from hailo_sdk_client import ClientRunner

Set the hardware architecture to be used throughout the tutorial:

In [3]:
chosen_hw_arch = "hailo8"
# For Hailo-15 devices, use 'hailo15h'
# For Mini PCIe modules or Hailo-8R devices, use 'hailo8r'
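
A typo in the architecture string would otherwise only surface later, when the runner is created. As a minimal sketch, the check below validates the value against the three names mentioned in this tutorial. The set `KNOWN_HW_ARCHS` and the helper `check_hw_arch` are hypothetical and not part of the Hailo SDK, and the list is not claimed to be complete.

```python
# Architecture names mentioned in this tutorial only; the SDK may accept others.
KNOWN_HW_ARCHS = {"hailo8", "hailo8r", "hailo15h"}


def check_hw_arch(name: str) -> str:
    """Return the name unchanged if it is listed above, otherwise raise ValueError."""
    if name not in KNOWN_HW_ARCHS:
        raise ValueError(
            f"Unexpected hw_arch {name!r}; expected one of {sorted(KNOWN_HW_ARCHS)}"
        )
    return name


chosen_hw_arch = check_hw_arch("hailo8")
```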

In [9]:
!pip install ultralytics
from ultralytics import YOLO
YOLO('yolo11n.pt').export(format='onnx', opset=14)

Collecting ultralytics
  Downloading ultralytics-8.3.83-py3-none-any.whl.metadata (35 kB)
Collecting torchvision>=0.9.0 (from ultralytics)
  Downloading torchvision-0.21.0-cp310-cp310-manylinux1_x86_64.whl.metadata (6.1 kB)
Collecting seaborn>=0.11.0 (from ultralytics)
  Downloading seaborn-0.13.2-py3-none-any.whl.metadata (5.4 kB)
Collecting ultralytics-thop>=2.0.0 (from ultralytics)
  Downloading ultralytics_thop-2.0.14-py3-none-any.whl.metadata (9.4 kB)
Collecting torch>=1.8.0 (from ultralytics)
  Downloading torch-2.6.0-cp310-cp310-manylinux1_x86_64.whl.metadata (28 kB)
Collecting nvidia-cusparselt-cu12==0.6.2 (from torch>=1.8.0->ultralytics)
  Downloading nvidia_cusparselt_cu12-0.6.2-py3-none-manylinux2014_x86_64.whl.metadata (6.8 kB)
Collecting triton==3.2.0 (from torch>=1.8.0->ultralytics)
  Downloading triton-3.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (1.4 kB)
Downloading ultralytics-8.3.83-py3-none-any.whl (922 kB)
Downloading torchvision-0.21.0-cp310-cp310-manylinux1_x86_64.whl (7.2 MB)
Downloading torch-2.6.0-cp310-cp310-manylinux1_x86_64.whl (766.7 MB)
Downloading nvidia_cusparselt_cu12-0.6.2-py3-none-manylinux2014_x86_64.whl (150.1 MB)
Downloading triton-3.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (253.1 MB)
Downloading ultralytics_thop-2.0.14-py3-none-any.whl (26 kB)
Installing collected packages: triton, nvidia-cusparselt-


Ultralytics 8.3.83 🚀 Python-3.10.12 torch-2.6.0+cu124 CPU (12th Gen Intel Core(TM) i7-12700)
YOLO11n summary (fused): 100 layers, 2,616,248 parameters, 0 gradients, 6.5 GFLOPs

PyTorch: starting from 'yolo11n.pt' with input shape (1, 3, 640, 640) BCHW and output shape(s) (1, 84, 8400) (5.4 MB)
requirements: Ultralytics requirement ['onnxslim'] not found, attempting AutoUpdate...
Collecting onnxslim
  Downloading onnxslim-0.1.48-py3-none-any.whl.metadata (4.6 kB)
Downloading onnxslim-0.1.48-py3-none-any.whl (142 kB)
Installing collected packages: onnxslim
Successfully installed onnxslim-0.1.48

requirements: AutoUpdate success ✅ 1.1s, installed 1 package: ['onnxslim']
requirements: ⚠️ Restart runtime or rerun command for updates to take effect

ONNX: starting export with onnx 1.16.0 opset 14...

[notice] A new release of pip is available: 24.3.1 -> 25.0.1
[notice] To update, run: pip install --upgrade pip

ONNX: slimming with onnxslim 0.1.48...
ONNX: export success ✅ 1.9s, saved as 'yolo11n.onnx' (10.2 MB)

Export complete (3.2s)
Results saved to /local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_tutorials/notebooks
Predict:         yolo predict task=detect model=yolo11n.onnx imgsz=640  
Validate:        yolo val task=detect model=yolo11n.onnx imgsz=640 data=/usr/src/ultralytics/ultralytics/cfg/datasets/coco.yaml  
Visualize:       https://netron.app


'yolo11n.onnx'

## Parsing Example from ONNX to HAR

Choose the ONNX file to be used throughout the example:

In [10]:
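onnx_model_name = "yolo11n"
# The export cell above writes yolo11n.onnx to its working directory;
# copy the file to ../models/ or adjust this path to match.
onnx_path = "../models/yolo11n.onnx"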

The main API of the Dataflow Compiler that the user interacts with is the `ClientRunner` class (see the API Reference section in the Dataflow Compiler User Guide for more information).  

Initialize a `ClientRunner` and use its `translate_onnx_model` method.

Arguments:

* model_path: Path to the ONNX file.
* model_name: Name to use for the parsed model.
* start_node_names (list of str, optional): Names of the first ONNX nodes to parse.
* end_node_names (list of str, optional): Names of the ONNX nodes at which parsing ends; the parser stops once all of them have been parsed.
* net_input_shapes (dict, optional): A dictionary mapping each start node name given in start_node_names to its input shape. Use it only when the original model has dynamic input shapes (described with a wildcard denoting each dynamic axis, e.g. [b, c, h, w]).

As a first attempt, try translating the ONNX model without supplying the optional arguments.
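
The relationship between `start_node_names` and `net_input_shapes` described above can be checked before the parser is called. The following is a minimal sketch, assuming a hypothetical helper `check_net_input_shapes` that is not part of the Hailo SDK. It verifies that the dictionary keys are exactly the start node names and that every axis is a concrete positive integer.

```python
from typing import Dict, List


def check_net_input_shapes(start_node_names: List[str],
                           net_input_shapes: Dict[str, List[int]]) -> None:
    """Raise ValueError unless every start node has exactly one fully static shape."""
    missing = set(start_node_names) - set(net_input_shapes)
    extra = set(net_input_shapes) - set(start_node_names)
    if missing or extra:
        raise ValueError(
            f"missing shapes for {sorted(missing)}, unknown keys {sorted(extra)}"
        )
    for name, shape in net_input_shapes.items():
        # A dynamic axis (e.g. the wildcard 'b') must be replaced by a positive integer.
        if not all(isinstance(dim, int) and dim > 0 for dim in shape):
            raise ValueError(f"shape for {name!r} is not fully static: {shape}")


# Values taken from the translate_onnx_model call in this tutorial.
check_net_input_shapes(
    ["/model.0/conv/Conv"],
    {"/model.0/conv/Conv": [1, 3, 640, 640]},
)
```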

In [13]:
runner = ClientRunner(hw_arch=chosen_hw_arch)
hn, npz = runner.translate_onnx_model(
    onnx_path,
    onnx_model_name,
    start_node_names=["/model.0/conv/Conv"],
    end_node_names=["/model.23/cv3.2/cv3.2.2/Conv", "/model.23/cv2.2/cv2.2.2/Conv",
                   "/model.23/cv2.1/cv2.1.2/Conv", "/model.23/cv3.1/cv3.1.2/Conv",
                   "/model.23/cv2.0/cv2.0.2/Conv", "/model.23/cv3.0/cv3.0.2/Conv"],
    net_input_shapes={"/model.0/conv/Conv": [1, 3, 640, 640]},
)

[info] Translation started on ONNX model yolo11n
[info] Restored ONNX model yolo11n (completion time: 00:00:00.04)
[info] Extracted ONNXRuntime meta-data for Hailo model (completion time: 00:00:00.15)
[info] NMS structure of yolov8 (or equivalent architecture) was detected.
[info] In order to use HailoRT post-processing capabilities, these end node names should be used: /model.23/cv3.0/cv3.0.2/Conv /model.23/cv2.0/cv2.0.2/Conv /model.23/cv3.1/cv3.1.2/Conv /model.23/cv2.1/cv2.1.2/Conv /model.23/cv3.2/cv3.2.2/Conv /model.23/cv2.2/cv2.2.2/Conv.
[info] Start nodes mapped from original model: 'images': 'yolo11n/input_layer1'.
[info] End nodes mapped from original model: '/model.23/cv3.2/cv3.2.2/Conv', '/model.23/cv2.2/cv2.2.2/Conv', '/model.23/cv2.1/cv2.1.2/Conv', '/model.23/cv3.1/cv3.1.2/Conv', '/model.23/cv2.0/cv2.0.2/Conv', '/model.23/cv3.0/cv3.0.2/Conv'.
[info] Translation completed on ONNX model yolo11n (completion time: 00:00:00.73)
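
The parser log above recommends a set of end node names for HailoRT post-processing. As a hedged sketch, the snippet below extracts those names from that log line with a regular expression so they can be reused as `end_node_names`. The helper `recommended_end_nodes` is hypothetical, and the log wording may differ between SDK versions.

```python
import re

# The [info] line printed by the parser in the run above, copied verbatim.
log_line = (
    "[info] In order to use HailoRT post-processing capabilities, these end node "
    "names should be used: /model.23/cv3.0/cv3.0.2/Conv /model.23/cv2.0/cv2.0.2/Conv "
    "/model.23/cv3.1/cv3.1.2/Conv /model.23/cv2.1/cv2.1.2/Conv "
    "/model.23/cv3.2/cv3.2.2/Conv /model.23/cv2.2/cv2.2.2/Conv."
)


def recommended_end_nodes(line: str) -> list:
    """Extract the space-separated node names that follow 'should be used:'."""
    match = re.search(r"should be used:\s*(.+?)\.?\s*$", line)
    return match.group(1).split() if match else []


end_node_names = recommended_end_nodes(log_line)
print(end_node_names)
```

The extracted names are the same six nodes passed as `end_node_names` in the call above, in a different order.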


## Hailo Archive

Hailo Archive is a tar.gz archive file that captures the "state" of the model: the files and attributes used in a given stage, from parsing to compilation.
Use the `save_har` method to save the runner's state at any stage and the `load_har` method to load a saved state into an uninitialized runner.

The initial HAR file includes:
- HN file, which is a JSON-like representation of the graph structure that is deployed to the Hailo hardware.
- NPZ file, which includes the weights of the model.

Save the parsed model in a Hailo Archive file:

In [14]:
hailo_model_har_name = f"{onnx_model_name}_hailo_model.har"
runner.save_har(hailo_model_har_name)

[info] Saved HAR to: /local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_tutorials/notebooks/yolo11n_hailo_model.har
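
Because a HAR is described here as a tar.gz archive, its contents can in principle be listed with Python's standard `tarfile` module. The sketch below builds a small stand-in archive so that it runs without the SDK. The helper `list_archive_members` and the member names `demo.hn` and `demo.npz` are hypothetical; the actual member names inside a real HAR are not specified in this tutorial.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def list_archive_members(path):
    """Return the member names of a tar archive (compression is auto-detected)."""
    with tarfile.open(path, "r:*") as archive:
        return archive.getnames()


# Stand-in archive so the sketch runs without the SDK; member names are made up.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.har"
    with tarfile.open(demo_path, "w:gz") as archive:
        for name in ("demo.hn", "demo.npz"):
            payload = b"placeholder"
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    members = list_archive_members(demo_path)

print(members)
```

With a real archive, the same helper could be called as `list_archive_members(hailo_model_har_name)` once the file has been saved.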


Visualize the graph with Hailo’s visualizer tool:

In [None]:
!hailo visualizer {hailo_model_har_name} --no-browser
SVG(f"{onnx_model_name}.svg")  # assumes the visualizer names the SVG after the model

## Parsing Example from TensorFlow Lite
The Hailo parser takes inference models as input, so the TensorFlow Lite representation is recommended for TensorFlow 2 models (the TF2 SavedModel format is commonly used for training models).  

Parsing the TensorFlow Lite format is similar to parsing ONNX models.  
The parser identifies the input format automatically.

The following example shows how to parse a TensorFlow Lite model, using a different model.

In [None]:
model_name = "dense_example"
model_path = "../models/v3-large-minimalistic_224_1.0_float.tflite"

runner = ClientRunner(hw_arch=chosen_hw_arch)
hn, npz = runner.translate_tf_model(model_path, model_name)
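
This tutorial uses two `ClientRunner` methods, `translate_onnx_model` for ONNX files and `translate_tf_model` for TensorFlow Lite files. As a convenience sketch for user code, the hypothetical helper `translate_method_for` below picks the method name from the file extension. It says nothing about how the Hailo parser itself identifies formats.

```python
from pathlib import Path

# Mapping from file extension to the ClientRunner method used in this tutorial.
TRANSLATE_METHOD_BY_SUFFIX = {
    ".onnx": "translate_onnx_model",
    ".tflite": "translate_tf_model",
}


def translate_method_for(model_path: str) -> str:
    """Return the name of the tutorial's translate method for the given file."""
    suffix = Path(model_path).suffix.lower()
    try:
        return TRANSLATE_METHOD_BY_SUFFIX[suffix]
    except KeyError:
        raise ValueError(
            f"No parsing example in this tutorial for {suffix!r} files"
        ) from None


print(translate_method_for("../models/yolo11n.onnx"))
print(translate_method_for("../models/v3-large-minimalistic_224_1.0_float.tflite"))
```

In a notebook, the result could be used as `getattr(runner, translate_method_for(path))(path, name)`.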

## Common Conversion Methods from TensorFlow to TensorFlow Lite
The following examples focus on the TFLite converter's support for various TensorFlow formats, showing
how older TF formats can be converted to TFLite, which can then be used in Hailo's parsing stage.

In [None]:
# Build a simple Keras model and convert it to tflite

# Building a simple Keras model
def build_small_example_net():
    inputs = tf.keras.Input(shape=(24, 24, 96), name="img")
    x = tf.keras.layers.Conv2D(24, 1, name="conv1")(inputs)
    x = tf.keras.layers.BatchNormalization(momentum=0.9, name="bn1")(x)
    outputs = tf.keras.layers.ReLU(max_value=6.0, name="relu1")(x)
    model = tf.keras.Model(inputs, outputs, name="small_example_net")
    return model


# Converting the Model to tflite
model = build_small_example_net()
model_name = "small_example"
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,  # enable TensorFlow Lite ops.
    tf.lite.OpsSet.SELECT_TF_OPS,  # enable TensorFlow ops.
]
tflite_model = converter.convert()  # may emit warnings in a Jupyter notebook; they can be ignored.
tflite_model_path = "../models/small_example.tflite"
with tf.io.gfile.GFile(tflite_model_path, "wb") as f:
    f.write(tflite_model)

# Parsing the model to Hailo format
runner = ClientRunner(hw_arch=chosen_hw_arch)
hn, npz = runner.translate_tf_model(tflite_model_path, model_name)

In [None]:
# Alternatively, convert an already saved SavedModel to tflite
model_path = "../models/dense_example_tf2/"
model_name = "dense_example_tf2"
converter = tf.lite.TFLiteConverter.from_saved_model(model_path)
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,  # enable TensorFlow Lite ops.
    tf.lite.OpsSet.SELECT_TF_OPS,  # enable TensorFlow ops.
]
tflite_model = converter.convert()  # may emit warnings in a Jupyter notebook; they can be ignored.
tflite_model_path = "../models/dense_example_tf2.tflite"
with tf.io.gfile.GFile(tflite_model_path, "wb") as f:
    f.write(tflite_model)

# Parsing the model to Hailo format
runner = ClientRunner(hw_arch=chosen_hw_arch)
hn, npz = runner.translate_tf_model(tflite_model_path, model_name)

In [None]:
# Third option, convert h5 file to tflite.
model_path = "../models/ew_sub_v0.h5"
model_name = "ew_sub_example"
model = tf.keras.models.load_model(model_path)
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,  # enable TensorFlow Lite ops.
    tf.lite.OpsSet.SELECT_TF_OPS,  # enable TensorFlow ops.
]
tflite_model = converter.convert()
tflite_model_path = "../models/ew_sub_example.tflite"
with tf.io.gfile.GFile(tflite_model_path, "wb") as f:
    f.write(tflite_model)

# Parsing the model to Hailo format
runner = ClientRunner(hw_arch=chosen_hw_arch)
hn, npz = runner.translate_tf_model(tflite_model_path, model_name)
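
The three conversion cells above repeat the same steps: set the supported op sets, convert, and write the result to disk. As a sketch only, those steps could be factored into one helper. The function `convert_and_save` is hypothetical, it writes with `pathlib` rather than `tf.io.gfile.GFile`, and a stand-in converter object is used here so the example runs without TensorFlow installed.

```python
import tempfile
from pathlib import Path
from types import SimpleNamespace


def convert_and_save(converter, supported_ops, out_path):
    """Set the allowed op sets, run the conversion and write the bytes to out_path."""
    converter.target_spec.supported_ops = list(supported_ops)
    tflite_model = converter.convert()
    Path(out_path).write_bytes(tflite_model)
    return str(out_path)


# Stand-in converter so the sketch runs without TensorFlow installed.
fake_converter = SimpleNamespace(
    target_spec=SimpleNamespace(supported_ops=[]),
    convert=lambda: b"not a real model",
)

with tempfile.TemporaryDirectory() as tmp:
    saved_path = convert_and_save(
        fake_converter, ["BUILTINS", "SELECT_TF"], Path(tmp) / "demo.tflite"
    )
    saved_bytes = Path(saved_path).read_bytes()

print(saved_bytes)
```

In the notebook, a real `tf.lite.TFLiteConverter` instance would be passed together with `[tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]`, and the returned path handed to `runner.translate_tf_model`.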