# Setup

## Install, imports, names

Run the notebook from a virtualenv prepared like so:

1. Install CUDA and PyTorch. Tested configs:
    1. `pip install torch==1.12.1+cu113` (assuming CUDA 11.3) **OR**
    1. `pip install torch==1.12.1+cu102` (assuming CUDA 10.2)
1. Clone & install OpenPCDet (tested with commit a68aaa656, 04-Apr-23):
    ```
    pip install -r requirements.txt
    pip install spconv kornia
    python setup.py develop
    ```
1. Install Mayavi for 3D visualization of point clouds and 3D boxes. Installing and using Mayavi and its dependencies (PyQt5) can be tricky, especially when working remotely on a server, so the notebook code writes visual results to png files without opening any windows. It may still help to prefix the *jupyter-notebook* launch command with `ETS_TOOLKIT=null`, which skips the check for a GUI toolkit (such as Qt), and, on a machine with no screen ("headless"), to run it under `xvfb-run` so that it works against a virtual display, like so:
    ```
    ETS_TOOLKIT=null xvfb-run --server-args="-screen 0 1024x768x24" jupyter notebook ...
    ```

1. Download the pretrained PointPillar PyTorch model from the link below into your *openpcdet-clone-location*:
    https://drive.google.com/file/d/1wMxWTpU1qUoY3DsCH31WJmvJxcjFXKlm/view?usp=sharing
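
Once the installation steps are done, a quick check of what is actually installed can save a debugging round later. The snippet below is a minimal sketch using only the standard library. The distribution names in `PACKAGES` are assumptions: `spconv`, for example, is often published under CUDA-specific names, so adjust the list to your environment.

```python
from importlib import metadata

# Assumed distribution names; edit to match how the packages were installed.
PACKAGES = ("torch", "pcdet", "spconv", "kornia", "mayavi")


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name:10s} {version or 'NOT INSTALLED'}")
```

A `None` entry only means no distribution with that exact name was found; the package may still be importable under a different distribution name.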

In [1]:
# openpcdet clone path - replace with yours
openpcdet_clonedir = '/home/tauproj1/Innoviz_Project/OpenPCDet-master_New_copy'  

In [2]:
cd /home/tauproj1/Innoviz_Project/OpenPCDet-master_New_copy

/home/tauproj1/Innoviz_Project/OpenPCDet-master_New_copy


In [3]:
import pcdet, torch, sys
sys.path.append(openpcdet_clonedir+'/tools/')
pcdet.__version__, torch.__version__

('0.6.0+0000000', '2.3.0+cu118')

Make sure you are in the correct folder (pointpillars):

In [4]:
cd /home/tauproj1/Innoviz_Project/OpenPCDet-master_New_copy/3D_Object_Detector/mmlab_hailo/Hailo-Innoviz-HW-Offload/model_compilation/pointpillars

/home/tauproj1/Innoviz_Project/OpenPCDet-master_New_copy/3D_Object_Detector/mmlab_hailo/Hailo-Innoviz-HW-Offload/model_compilation/pointpillars


In [5]:
from pathlib import Path

from pcdet.config import cfg, cfg_from_yaml_file
from pcdet.models import build_network
from pcdet.utils import common_utils

import os
from importlib import reload
import openpcdet2hailo_utils as ohu; reload(ohu)

<module 'openpcdet2hailo_utils' from '/home/tauproj1/Innoviz_Project/OpenPCDet-master_New_copy/3D_Object_Detector/mmlab_hailo/Hailo-Innoviz-HW-Offload/model_compilation/pointpillars/openpcdet2hailo_utils.py'>

Specify the path to the relevant model's .yaml file and the path to the model's .pth weights file: 

In [6]:
yaml_name = openpcdet_clonedir+'/tools/cfgs/custom_models/point_pillar_best.yaml'
har_name = 'pp_bev_w_head.har'
hef_name = har_name.replace('.har', '.hef')
logger = common_utils.create_logger()

# (!) pre-trained model:
pth_name ='/home/tauproj1/Innoviz_Project/OpenPCDet-master_New_copy/3D_Object_Detector/Evaluation/pointpillar_evaluations/models_outputs/point_pillar_best/default/ckpt/checkpoint_epoch_80.pth'  

The next cell loads the point cloud samples.

This can be done in any way; what matters is having a directory with the point clouds under the 'pointpillars' folder. By default the code expects .npy files, but other formats may work too; just remember to change the pc_file_extention variable below.

Also specify the path to the point cloud samples (sample_pointclouds) and one example point cloud (demo_pointcloud).
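
If you prefer to collect the samples without the utils helper, the sketch below shows one way to do it. It is not the implementation of `ohu.get_pc_samples`; the function name `copy_listed_samples` is hypothetical, and it assumes the list file holds one sample name per line without the file extension.

```python
import shutil
from pathlib import Path


def copy_listed_samples(list_file, source_dir, dest_dir, ext=".npy"):
    """Copy every sample named in list_file from source_dir to dest_dir.

    Assumption: list_file has one sample name per line, without extension.
    Returns the copied destination paths; names with no matching file are skipped.
    """
    source_dir, dest_dir = Path(source_dir), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for line in Path(list_file).read_text().splitlines():
        name = line.strip()
        if not name:
            continue  # ignore blank lines
        src = source_dir / (name + ext)
        if src.is_file():
            copied.append(Path(shutil.copy2(src, dest_dir / src.name)))
    return copied
```

Skipping missing files silently keeps the sketch short; for a real run it may be better to log or raise on names that cannot be found.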

In [7]:
# (!) get sample pointclouds (..running once..) 
# specify the path to the pointclouds:
file_names_path = '/home/tauproj1/Innoviz_Project/OpenPCDet-master_New_copy/data/custom/ImageSets/val.txt'
source_folder_npy = '/home/tauproj1/Innoviz_Project/OpenPCDet-master_New_copy/data/custom/points'
destination_folder = '/home/tauproj1/Innoviz_Project/OpenPCDet-master_New_copy/3D_Object_Detector/mmlab_hailo/Hailo-Innoviz-HW-Offload/model_compilation/pointpillars/results'
pc_file_extention= '.npy'

# creates the samples in a new folder named "pc_samples"
ohu.get_pc_samples(file_names_path, source_folder_npy, destination_folder)

sample_pointclouds = './pc_samples/testing/innoviz/'
demo_pointcloud = sample_pointclouds+'00001.npy'

In [8]:
def get_model(cfg, pth_name, demo_dataset):    
    model = build_network(model_cfg=cfg.MODEL, num_class=len(cfg.CLASS_NAMES), dataset=demo_dataset)
    model.load_params_from_file(filename=pth_name, logger=logger, to_cpu=True)
    model.eval()
    return model

def cfg_from_yaml_file_wrap(yaml_name, cfg):
    cwd = os.getcwd()
    os.chdir(openpcdet_clonedir+'/tools/')
    cfg_from_yaml_file(yaml_name, cfg)
    os.chdir(cwd)
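
The wrapper above changes the working directory and changes it back, but if loading the config raises, the notebook stays in the tools folder. A context manager with `try`/`finally` avoids that. The following is a minimal sketch; the name `working_directory` is made up for illustration.

```python
import os
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Temporarily change the current working directory to `path`."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Runs on normal exit and on exceptions alike.
        os.chdir(previous)
```

With it, the body of the wrapper could shrink to a `with working_directory(openpcdet_clonedir + '/tools/'):` block around the `cfg_from_yaml_file(yaml_name, cfg)` call. On Python 3.11 and later, `contextlib.chdir` offers the same behavior out of the box.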

In [9]:
import numpy as np
import hailo_sdk_client
print(hailo_sdk_client.__version__)

3.24.0



TensorFlow Addons (TFA) has ended development and introduction of new features.
TFA has entered a minimal maintenance and release mode until a planned end of life in May 2024.
Please modify downstream libraries to take dependencies from other repositories in our TensorFlow community (e.g. Keras, Keras-CV, and Keras-NLP). 

For more information see: https://github.com/tensorflow/addons/issues/2807 



# Running end2end with 2D part offloaded to Hailo HW

We will be using HailoRT asynchronous send/receive to demonstrate readiness for fully pipelined processing.

Therefore, we integrate the Torch and Hailo parts differently from what we did for the emulation testing. We lift some code from the original forward() to build two torch models that encapsulate everything that happens before and after the module offloaded to Hailo. See the code for the 'PP_Pre_Bev_w_Head' and 'PP_Post_Bev_w_Head' classes in the accompanying utils file.
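
The shape of such a pipeline can be tried out without any hardware. The sketch below is a toy stand-in, not the HailoRT flow: it uses threads instead of processes, and `fake_device` (a hypothetical function that doubles its input) takes the place of the accelerator. Only the queue wiring of pre-processing, device and post-processing mirrors the idea described above.

```python
import queue
import threading


def fake_device(send_q, recv_q, num_items):
    """Hypothetical stand-in for the accelerator: one output per input, in order."""
    for _ in range(num_items):
        recv_q.put(send_q.get() * 2)  # placeholder for real inference


def run_pipeline(inputs):
    """Feed inputs through pre -> device -> post using two queues."""
    send_q, recv_q = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=fake_device, args=(send_q, recv_q, len(inputs)))
    worker.start()
    for x in inputs:
        send_q.put(x + 1)  # placeholder pre-processing
    # The device thread works while we are still feeding; collect results in order.
    outputs = [recv_q.get(timeout=3) - 1 for _ in inputs]  # placeholder post-processing
    worker.join()
    return outputs


if __name__ == "__main__":
    print(run_pipeline([1, 2, 3]))
```

Because a single worker consumes a FIFO queue, outputs come back in the order the inputs were sent, which is what lets results be matched to inputs by position.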

In [10]:
reload(ohu);  # PP_Pre_Bev_w_Head, PP_Post_Bev_w_Head, 

In [11]:
""" ================================================
    Standard wrapping of HailoRT API with send/receive processes, 
    similar to other Hailo examples (or HailoRT tutorial)
    ================================================
"""
import time  # used by the send/receive/post-processing helpers below
from multiprocessing import Process, Queue
from hailo_platform import (HEF, PcieDevice, VDevice, HailoStreamInterface, ConfigureParams,
 InputVStreamParams, OutputVStreamParams, InputVStreams, OutputVStreams, FormatType)

def send_from_queue(configured_network, read_q, num_images, start_time):
    """ Bridging a queue into Hailo platform FEED. To be run as a separate process. 
        Reads (preprocessed) images from a given queue, and sends them serially to Hailo platform.        
    """    
    configured_network.wait_for_activation(1000)
    vstreams_params = InputVStreamParams.make(configured_network, quantized=False, format_type=FormatType.FLOAT32)
    print('Starting sending input images to HW inference...\n')
    with InputVStreams(configured_network, vstreams_params) as vstreams:
        vstream_to_buffer = {vstream: np.ndarray([1] + list(vstream.shape), dtype=vstream.dtype) for vstream in vstreams}
        for i in range(num_images):
            hailo_inp = read_q.get()
            for vstream, _ in vstream_to_buffer.items():                                
                vstream.send(hailo_inp)
            print(f'sent img #{i}')
    print(F'Finished send after {(time.time()-start_time) :.1f}')
    return 0

def recv_to_queue(configured_network, write_q, num_images, start_time):
    """ Bridging Hailo platform OUTPUT into a queue. To be run as a separate process. 
        Reads output data from Hailo platform and sends them serially to a given queue.
    """
    configured_network.wait_for_activation(1000)
    vstreams_params = OutputVStreamParams.make_from_network_group(configured_network, quantized=False, format_type=FormatType.FLOAT32)
    print('Starting receving HW inference output..\n')
    with OutputVStreams(configured_network, vstreams_params) as vstreams:
        # print('vstreams_params', vstreams_params)
        for i in range(num_images):            
            hailo_out = {vstream.name: np.expand_dims(vstream.recv(), 0) for vstream in vstreams}    
            
            print("hailo_out keys:", hailo_out.keys())
                      
            write_q.put(hailo_out)
            print(f'received img #{i}')
    print(F'Finished recv after {time.time()-start_time :.1f}')
    return 0

""" ==================================
    Some final wrapping of pre and post 
    ==================================
"""
# Modify generate_data_dicts to include sample_name
def generate_data_dicts(demo_dataset, num_images, pp_pre_bev_w_head):
    for idx, data_dict in enumerate(demo_dataset):
        if idx >= num_images:  # stop after exactly num_images samples
            break
        data_dict = demo_dataset.collate_batch([data_dict])
        ohu.load_data_to_CPU(data_dict)
        # Add sample_name to data_dict with only the file name
        data_dict['sample_name'] = os.path.basename(demo_dataset.sample_file_list[idx])
        # ------ (!) Applying torch PRE-processing -------
        data_dict = pp_pre_bev_w_head.forward(data_dict)
        # ------------------------------------------------
        logger.info(f'preprocessed sample #{idx}')
        yield data_dict

# Pass the sample_name through the pipeline
def generate_hailo_inputs(demo_dataset, num_images, pp_pre_bev_w_head):
    """ generator-style encapsulation for preprocessing inputs for Hailo HW feed
    """
    for data_dict in generate_data_dicts(demo_dataset, num_images, pp_pre_bev_w_head):
        spatial_features = data_dict['spatial_features']
        spatial_features_hailoinp = np.transpose(spatial_features.cpu().detach().numpy(), (0, 2, 3, 1))
        yield data_dict, spatial_features_hailoinp

# Attach the sample_name to the prediction dictionaries in post_proc_from_queue
def post_proc_from_queue(recv_queue, num_images, pp_post_bev_w_head,
                         output_layers_order=('model/concat1', 'model/conv19', 'model/conv18', 'model/conv20')):
    results = []
    for i in range(num_images):
        t_ = time.time()
        while(recv_queue.empty() and time.time()-t_ < 3):
            time.sleep(0.01)
        if recv_queue.empty():
            print("RECEIVE TIMEOUT!")
            break
        hailo_out = recv_queue.get(0)
        bev_out = (hailo_out[lname] for lname in output_layers_order)
        
        # ------ (!) Applying torch POST-processing -------
        pred_dicts, _ = pp_post_bev_w_head(bev_out)
        # ------------------------------------------------
        # Add sample_name to each prediction dictionary
        sample_name = recv_queue.sample_names[i]
        for pred_dict in pred_dicts:
            pred_dict['sample_name'] = sample_name
        results.append(pred_dicts)
    return results
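
The polling loop in `post_proc_from_queue` waits for output by checking `empty()` and sleeping. A blocking `get` with a timeout expresses the same wait more directly, and the Python documentation notes that `empty()` on a `multiprocessing.Queue` is not reliable. The sketch below uses `queue.Queue` for brevity; `multiprocessing.Queue.get` accepts the same `timeout` argument and raises the same `queue.Empty`. The helper name `get_or_none` is hypothetical.

```python
import queue


def get_or_none(q, timeout_s=3.0):
    """Return the next item from q, or None if nothing arrives within timeout_s."""
    try:
        return q.get(timeout=timeout_s)
    except queue.Empty:
        return None


if __name__ == "__main__":
    q = queue.Queue()
    q.put({"model/conv18": "dummy output"})  # made-up payload for illustration
    print(get_or_none(q, timeout_s=0.1))
    print(get_or_none(q, timeout_s=0.1))
```

A `None` result then plays the role of the "RECEIVE TIMEOUT!" branch.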

In [12]:
import time, onnxruntime

data_source = demo_pointcloud  # replace by a folder for a more serious test
num_images = 1
# pc_samples = './pc_samples/testing/innoviz/'
# data_source = pc_samples
# num_images = 51

cfg_from_yaml_file_wrap(yaml_name, cfg)
logger = common_utils.create_logger()
demo_dataset = ohu.DemoDataset(
    dataset_cfg=cfg.DATA_CONFIG, class_names=cfg.CLASS_NAMES, training=False,
    root_path=Path(data_source), ext=pc_file_extention, logger=logger
)
model = get_model(cfg, pth_name, demo_dataset)

# Library creates the anchors in cuda by default (applying .cuda() in internal implementation)
model.dense_head.anchors = [anc.cpu() for anc in model.dense_head.anchors]

""" (!) Slicing off the torch model all that happens before and after Hailo
"""

pp_pre_bev_w_head = ohu.PP_Pre_Bev_w_Head(model)
pp_post_bev_w_head = ohu.PP_Post_Bev_w_Head(model)
    
# Adjusting the processing loop to handle sample_name
with VDevice() as target:
    hef = HEF(hef_name)
    configure_params = ConfigureParams.create_from_hef(hef, interface=HailoStreamInterface.PCIe)
    network_group = target.configure(hef, configure_params)[0]
    network_group_params = network_group.create_params()
    recv_queue = Queue()
    send_queue = Queue()
    start_time = time.time()
    results = []
    hw_send_process = Process(target=send_from_queue, args=(network_group, send_queue, num_images, start_time))
    hw_recv_process = Process(target=recv_to_queue, args=(network_group, recv_queue, num_images, start_time))

    # List to keep track of sample names
    sample_names = []

    with network_group.activate(network_group_params):
        hw_recv_process.start()
        hw_send_process.start()

        # Start timing before the loop
        tik = time.time()

        for data_dict, hailo_inp in generate_hailo_inputs(demo_dataset, num_images, pp_pre_bev_w_head):
            send_queue.put(hailo_inp)
            # Add sample_name to the list
            sample_names.append(data_dict['sample_name'])

        # Attach sample names to the queue
        recv_queue.sample_names = sample_names

        results = post_proc_from_queue(recv_queue, num_images, pp_post_bev_w_head)

        # Stop timing after processing
        tok = time.time()
        elapsed_time = tok - tik
        average_time_per_image = elapsed_time / num_images
        inference_rate_hz = num_images / elapsed_time

        print(f"Total elapsed time: {elapsed_time:.4f} seconds")
        print(f"Average time per image: {average_time_per_image:.4f} seconds")
        print(f"Inference rate: {inference_rate_hz:.2f} Hz")
                             
    hw_recv_process.join(10)
    hw_send_process.join(10)
    
    pred_dicts = results[-1]
    print(pred_dicts[0]['pred_scores'])


  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
2024-10-22 15:02:30,537   INFO  ==> Loading parameters from checkpoint /home/tauproj1/Innoviz_Project/OpenPCDet-master_New_copy/3D_Object_Detector/Evaluation/pointpillar_evaluations/models_outputs/point_pillar_best/default/ckpt/checkpoint_epoch_80.pth to CPU
2024-10-22 15:02:30,560   INFO  ==> Checkpoint trained from version: pcdet+0.6.0+0000000
2024-10-22 15:02:30,565   INFO  ==> Done (loaded 127/127)


Starting receving HW inference output..

Starting sending input images to HW inference...



2024-10-22 15:02:30,850   INFO  preprocessed sample #0


sent img #0
Finished send after 0.3
hailo_out keys: dict_keys(['model/conv20', 'model/conv19', 'model/conv18', 'model/concat1'])
received img #0
Finished recv after 0.4
(1, 248, 216, 6) <class 'numpy.ndarray'> (1, 248, 216, 42)


AssertionError: 

In [None]:
print(results)

Create a 'predictions' folder under output_path containing the results in .txt format for further analysis:

In [36]:
def save_predictions(results, output_path):
    # Create the prediction folder if it doesn't exist
    prediction_folder = os.path.join(output_path, "predictions")
    os.makedirs(prediction_folder, exist_ok=True)
    
    for sample_predictions in results:
        for prediction in sample_predictions:
            sample_name = prediction['sample_name']
            file_name = os.path.splitext(sample_name)[0] + ".txt"
            file_path = os.path.join(prediction_folder, file_name)
            
            with open(file_path, 'w') as file:
                pred_boxes = prediction['pred_boxes'].cpu().numpy()
                pred_scores = prediction['pred_scores'].cpu().numpy()
                for box, score in zip(pred_boxes, pred_scores):
                    line = ' '.join(map(str, box)) + f' {score}\n'
                    file.write(line)

In [53]:
output_path = '/home/tauproj1/Innoviz_Project/OpenPCDet-master_New_copy/3D_Object_Detector/mmlab_hailo/Hailo-Innoviz-HW-Offload/model_compilation/pointpillars'  # Specify your desired output path here
save_predictions(results, output_path)
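
For the further analysis, the saved files can be read back with a few lines of plain Python. This is a minimal sketch that assumes the layout written by `save_predictions` above: one detection per line, the box values first and the score last, separated by spaces. The function name `load_predictions` is hypothetical.

```python
from pathlib import Path


def load_predictions(file_path):
    """Parse one predictions file into a list of (box, score) tuples.

    Assumption: each line holds space-separated floats, with the score last.
    """
    predictions = []
    for line in Path(file_path).read_text().splitlines():
        values = [float(v) for v in line.split()]
        if not values:
            continue  # skip blank lines
        *box, score = values
        predictions.append((box, score))
    return predictions
```

The parser does not fix the number of box values, so it keeps working if the box encoding of the model changes.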