
# Python inference tutorial - Multi Process Service and Model Scheduler

This tutorial walks you through running inference with the Model Scheduler.

**Requirements:**

* Start the HailoRT Multi-Process Service before running inference. See the installation steps in [Multi-Process Service](../../inference/inference.rst).
* Run the notebook inside the Python virtual environment: ```source hailo_virtualenv/bin/activate```
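
Because the notebook must run inside the virtual environment, a quick check from a cell can catch a wrong kernel early. The sketch below uses only the Python standard library; the helper name `in_virtualenv` is hypothetical and is not part of HailoRT.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter appears to run inside a virtual environment."""
    # Legacy virtualenv releases set sys.real_prefix; venv and newer virtualenv
    # make sys.prefix differ from sys.base_prefix instead.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("virtualenv active:", in_virtualenv())
```

If this prints `False`, the kernel is most likely running on the system interpreter rather than the activated environment.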

We recommend running ``hailo tutorial`` (from inside the virtualenv) to open a Jupyter server that contains the tutorials.

## Running Inference using HailoRT

In this example we use the Model Scheduler to run inference on multiple models.
Each model is represented by an HEF, Hailo's binary format for neural networks, which is built using the Hailo Dataflow Compiler.
An HEF file contains:

* Target HW configuration
* Weights
* Metadata for HailoRT (e.g. input/output scaling)
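
To make the scaling metadata more concrete: quantized networks commonly map real values to integers with an affine transform defined by a scale and a zero point. The sketch below illustrates that general idea in plain Python. The function names, the uint8 range, and the example parameters are assumptions chosen for illustration; they are not read from an HEF and are not part of the HailoRT API.

```python
def quantize(values, scale, zero_point, qmin=0, qmax=255):
    """Map real values to integers: q = round(x / scale) + zero_point, clamped to [qmin, qmax]."""
    return [min(qmax, max(qmin, round(x / scale) + zero_point)) for x in values]


def dequantize(quantized, scale, zero_point):
    """Map integers back to real values: x ~= (q - zero_point) * scale."""
    return [(q - zero_point) * scale for q in quantized]


# Hypothetical scaling parameters, chosen only for this example
scale, zero_point = 0.5, 10
q = quantize([0.0, 1.0, 2.5, 200.0], scale, zero_point)
x = dequantize(q, scale, zero_point)
print(q)  # the last value saturates at qmax
print(x)  # the saturated value does not round-trip
```

Values that fit the representable range round-trip exactly here, while the out-of-range input is clamped, which is why the scaling parameters have to travel with the model.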

The Model Scheduler is a HailoRT component that simplifies sharing a single Hailo device
among multiple networks. HailoRT takes responsibility for activating and deactivating the network
groups, and does so **automatically**, without intervention from the user application.
To use the Model Scheduler, create the VDevice with the scheduler enabled, configure all models on the device, and start inference on all models:

import numpy as np
from multiprocessing import Process
from hailo_platform import (HEF, VDevice, HailoStreamInterface, InferVStreams, ConfigureParams,
    InputVStreamParams, OutputVStreamParams, InputVStreams, OutputVStreams, FormatType, HailoSchedulingAlgorithm)


# Define the function to run inference on the model
def infer(network_group, input_vstreams_params, output_vstreams_params, input_data):
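    rep_count = 100  # Number of times the whole dataset is sent through the model
    with InferVStreams(network_group, input_vstreams_params, output_vstreams_params) as infer_pipeline:
        for _ in range(rep_count):
            # The results are not used further in this tutorial
            infer_results = infer_pipeline.infer(input_data)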


# Loading compiled HEFs:
first_hef_path = '../hefs/resnet_v1_18.hef'
second_hef_path = '../hefs/shortcut_net.hef'
first_hef = HEF(first_hef_path)
second_hef = HEF(second_hef_path)
hefs = [first_hef, second_hef]

# Creating the VDevice target with scheduler enabled
params = VDevice.create_params()
params.scheduling_algorithm = HailoSchedulingAlgorithm.ROUND_ROBIN
with VDevice(params) as target:
    infer_processes = []

    # Configure network groups
    for hef in hefs:
        configure_params = ConfigureParams.create_from_hef(hef=hef, interface=HailoStreamInterface.PCIe)
        network_groups = target.configure(hef, configure_params)
        network_group = network_groups[0]

        # Create input and output virtual streams params
        # The quantized argument indicates whether the incoming data is already quantized.
        # HailoRT quantizes the data itself if and only if quantized == False.
        input_vstreams_params = InputVStreamParams.make(network_group, quantized=False, format_type=FormatType.FLOAT32)
        output_vstreams_params = OutputVStreamParams.make(network_group, quantized=True, format_type=FormatType.UINT8)

        # Define dataset params
        input_vstream_info = hef.get_input_vstream_infos()[0]
        image_height, image_width, channels = input_vstream_info.shape
        num_of_frames = 10
        low, high = 2, 20

        # Generate random dataset
        dataset = np.random.randint(low, high, (num_of_frames, image_height, image_width, channels)).astype(np.float32)
        input_data = {input_vstream_info.name: dataset}

        # Create infer process
        infer_process = Process(target=infer, args=(network_group, input_vstreams_params, output_vstreams_params, input_data))
        infer_processes.append(infer_process)

    print('Starting streaming on multiple models using scheduler')
    for infer_process in infer_processes:
        infer_process.start()
    for infer_process in infer_processes:
        infer_process.join()

    print('Done inference')
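
Conceptually, a round-robin scheduler gives each model a turn on the device in a fixed cyclic order. The plain-Python sketch below simulates that idea for queues of pending frames. It is a conceptual illustration only, not HailoRT's implementation: the function name `round_robin_order`, the one-frame-per-turn policy, and the frame counts are assumptions, and the model names are borrowed from the HEF paths above purely as labels.

```python
from collections import deque


def round_robin_order(pending):
    """Return the order in which frames are served, one frame per model per turn.

    `pending` maps a model name to its number of queued frames.
    """
    queue = deque((name, count) for name, count in pending.items() if count > 0)
    order = []
    while queue:
        name, remaining = queue.popleft()
        order.append(name)  # this model gets the device for one frame
        if remaining > 1:
            queue.append((name, remaining - 1))  # rejoin at the back of the line
    return order


schedule = round_robin_order({"resnet_v1_18": 3, "shortcut_net": 2})
print(schedule)
```

The two models alternate until the shorter queue is drained, after which the remaining model runs alone. With a scheduler doing this switching, the application code only needs to start and join its inference processes.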