# PeopleNet Deployment for Infineon PSOC EDGE Devices

This notebook demonstrates how to deploy NVIDIA's pre-trained PeopleNet model to Infineon PSOC EDGE devices. PeopleNet is a pre-trained object detection model that detects people (along with bags and faces) in images and video frames, which makes it a good fit for edge applications such as occupancy monitoring, crowd analysis, and security systems.

By following this workflow, you'll learn how to obtain a quantized PeopleNet model and convert it for efficient execution on Infineon's PSOC EDGE hardware with Arm Ethos-U55 NPU acceleration.

**Note:** This notebook requires an NVIDIA GPU for the conversion process.

## 1. Environment Setup

First, we'll set up the environment variables needed for our workflow. These variables define the paths for data and experiment outputs and specify the number of GPUs to use.

The following cell configures:
- Number of GPUs to use (1 is sufficient for deployment)
- Working directories for the TAO environment
- Local directories for data storage and experiment outputs

**Important:** Replace the `FIXME` placeholder assigned to `LOCAL_PROJECT_DIR` with a path on your system before running the cell.

In [None]:
# Setting up env variables for cleaner command line commands.
import os

%env NUM_GPUS=1
%env USER_EXPERIMENT_DIR=/workspace/tao-experiments/peoplenet_onnx
%env DATA_DOWNLOAD_DIR=/workspace/tao-experiments/data

# Set this path if you don't run the notebook from the samples directory.
# %env NOTEBOOK_ROOT=~/tao-samples/detectnet_v2

# Define the local project directory that is mapped into the TAO docker session.
# The dataset is expected to be present in $LOCAL_PROJECT_DIR/data, while the results of the steps
# in this notebook are stored in $LOCAL_PROJECT_DIR/peoplenet.
# Replace FIXME below with the path to your project directory.

os.environ["LOCAL_PROJECT_DIR"] = FIXME

os.environ["LOCAL_DATA_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "data"
)
os.environ["LOCAL_EXPERIMENT_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "peoplenet"
)

# Make the experiment directory 
! mkdir -p $LOCAL_EXPERIMENT_DIR
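
Before moving on, it can help to confirm that the variables above resolved to directories that actually exist. The following is a minimal sketch; the helper name `check_env_dirs` is hypothetical and not part of TAO or the Infineon tooling.

```python
import os
from pathlib import Path


def check_env_dirs(names, environ=os.environ):
    """Return {name: (path, exists)} for each environment variable in `names`.

    An unset variable is reported as (None, False).
    """
    report = {}
    for name in names:
        value = environ.get(name)
        if value is None:
            report[name] = (None, False)
        else:
            path = Path(value).expanduser()
            report[name] = (str(path), path.is_dir())
    return report


# Variable names taken from the setup cell above.
for name, (path, exists) in check_env_dirs(
    ["LOCAL_PROJECT_DIR", "LOCAL_DATA_DIR", "LOCAL_EXPERIMENT_DIR"]
).items():
    status = "ok" if exists else "missing"
    print(f"{name}: {path} [{status}]")
```

A `missing` entry for `LOCAL_DATA_DIR` is expected at this point if you have not created the data directory yet.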


## 2. Installing NGC Command Line Interface

To download the pre-trained PeopleNet model, we need NVIDIA's NGC CLI (Command Line Interface) tool. NGC (NVIDIA GPU Cloud) hosts optimized deep learning models, frameworks, and software.

The following cell:
1. Sets up the environment for NGC CLI installation
2. Downloads the NGC CLI package
3. Extracts and configures the tool for use
4. Updates the system path to include NGC CLI

This gives us command-line access to the pre-trained models hosted on NGC.

In [None]:
# Installing NGC CLI on the local machine.
## Download and install
%env CLI=ngccli_cat_linux.zip
!mkdir -p $LOCAL_PROJECT_DIR/ngccli

# Remove any previously existing CLI installations
!rm -rf $LOCAL_PROJECT_DIR/ngccli/*
!wget "https://ngc.nvidia.com/downloads/$CLI" -P $LOCAL_PROJECT_DIR/ngccli
!unzip -u "$LOCAL_PROJECT_DIR/ngccli/$CLI" -d $LOCAL_PROJECT_DIR/ngccli/
!rm $LOCAL_PROJECT_DIR/ngccli/*.zip 
os.environ["PATH"]="{}/ngccli/ngc-cli:{}".format(os.getenv("LOCAL_PROJECT_DIR", ""), os.getenv("PATH", ""))
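
To check that the `PATH` update took effect, you can look up the executable from Python. This is a small sketch using the standard library's `shutil.which`; the wrapper name `find_cli` is an assumption introduced for illustration, and the executable name `ngc` is taken from the download command used later in this notebook.

```python
import os
import shutil


def find_cli(name="ngc", path=None):
    """Return the location of `name` on the search path, or None if it is absent."""
    return shutil.which(name, path=path)


location = find_cli("ngc", os.environ.get("PATH"))
if location is None:
    print("ngc was not found on PATH; re-run the installation cell.")
else:
    print(f"ngc found at {location}")
```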

## 3. Preparing Output Directory

Before downloading the model, we'll create a dedicated directory to store the quantized ONNX model files. This keeps our project organized and ensures we have a clean location for the downloaded assets.

Quantization matters on resource-constrained edge devices such as the Infineon PSOC EDGE: storing values as INT8 instead of FP32 takes one byte per value instead of four, which reduces both the memory footprint and the computational cost of inference.

In [4]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/quantized_onnx_model
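
To make the INT8 versus FP32 trade-off concrete, here is a minimal sketch of symmetric per-tensor quantization in plain Python. It illustrates the general idea only: the scheme actually used to produce the PeopleNet INT8 model may differ, and the weight values are made up for the example.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integer representation."""
    return [q * scale for q in quantized]


weights = [0.42, -1.27, 0.003, 0.9]  # illustrative values, not taken from PeopleNet
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

print("quantized:", q)
print("restored: ", restored)
print("storage as FP32:", 4 * len(weights), "bytes; as INT8:", len(weights), "bytes")
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which is the precision given up in exchange for the fourfold reduction in storage.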

## 4. Downloading Pre-trained and Optimized PeopleNet Model

Now we'll download NVIDIA's pre-trained, pruned, and quantized PeopleNet model from NGC. This model has already undergone several optimization steps:

1. **Training** - Trained on large datasets for person detection
2. **Pruning** - Removal of redundant connections to reduce model size
3. **Quantization** - Precision reduction from FP32 to INT8 

These optimizations make it ideal for edge deployment without requiring us to perform the time-consuming training process.

The model is downloaded in ONNX format, which is an open standard for representing deep learning models that allows interoperability between different frameworks.

In [None]:
!ngc registry model download-version "nvidia/tao/peoplenet:pruned_quantized_decrypted_v2.3.4" \
    --dest $LOCAL_EXPERIMENT_DIR/quantized_onnx_model
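
After the download finishes, it is worth listing the ONNX files that actually arrived, since the conversion step later refers to a specific file name. The sketch below uses only the standard library; `find_onnx_models` is a hypothetical helper name.

```python
import os
from pathlib import Path


def find_onnx_models(root):
    """Return all .onnx files below `root`, sorted by path (empty if root is absent)."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(root.rglob("*.onnx"))


model_root = Path(os.environ.get("LOCAL_EXPERIMENT_DIR", ".")) / "quantized_onnx_model"
for model in find_onnx_models(model_root):
    size_mib = model.stat().st_size / (1024 * 1024)
    print(f"{model}  ({size_mib:.1f} MiB)")
```

If nothing is printed, the download did not complete or was written to a different directory.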

## 5. Converting the Model for Infineon PSOC EDGE Deployment

In this critical step, we'll convert the quantized ONNX model to a format optimized for Infineon PSOC EDGE devices with Arm Ethos-U55 NPU acceleration.

The conversion process:
1. Uses Infineon's IFX tooling to analyze the model architecture
2. Configures the target hardware (Ethos-U55-128 NPU)
3. Sets system parameters specific to PSOC EDGE (PSE84_M55_U55_400MHz)
4. Optimizes memory usage with SRAM-only configuration
5. Preserves INT8 quantization for maximum efficiency

This step transforms the general-purpose ONNX model into a highly optimized deployment package specifically for the Infineon PSOC EDGE hardware architecture.

**Note:** The input shape parameter [1, 3, 544, 960] represents:
- Batch size: 1 (processing one image at a time)
- Channels: 3 (RGB color inputs)
- Height: 544 pixels
- Width: 960 pixels

These dimensions must match the expected input for the PeopleNet model and should be maintained in your application deployment.
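
The shape also determines how much memory the input tensor occupies and how camera frames need to be scaled. The following sketch works through that arithmetic; the helper names are hypothetical, and the 1920x1080 source frame is only an example.

```python
from math import prod

INPUT_SHAPE = [1, 3, 544, 960]  # batch, channels, height, width, as used in this notebook


def tensor_bytes(shape, bytes_per_element=1):
    """Size of a dense tensor in bytes (1 byte per element for INT8, 4 for FP32)."""
    return prod(shape) * bytes_per_element


def scale_to_fit(src_w, src_h, dst_w=960, dst_h=544):
    """Uniform scale factor that fits a source frame inside the network input."""
    return min(dst_w / src_w, dst_h / src_h)


print("elements:   ", prod(INPUT_SHAPE))
print("INT8 bytes: ", tensor_bytes(INPUT_SHAPE, 1))
print("FP32 bytes: ", tensor_bytes(INPUT_SHAPE, 4))
print("scale for a 1920x1080 frame:", scale_to_fit(1920, 1080))
```

At one byte per element the input tensor alone takes 1,566,720 bytes (about 1.5 MiB), which is worth keeping in mind when budgeting on-chip memory.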

In [None]:
import os

from ifx_tooling import run_ifx_tooling, ModelConversionError

# Path to the quantized ONNX model downloaded from NGC in the previous step.
qat_onnx_model_path = os.path.join(
    os.environ['LOCAL_EXPERIMENT_DIR'],
    "quantized_onnx_model",
    "peoplenet_vpruned_quantized_decrypted_v2.3.4",
    "resnet34_peoplenet_int8.onnx",
)
# Directory that receives the converted artifacts.
ifx_tooling_output_path = os.path.join(os.environ['LOCAL_EXPERIMENT_DIR'], "ifx_tooling")

config = {
    'vela_accelerator': 'ethos-u55-128',
    'vela_system_config': 'PSE84_M55_U55_400MHz',
    'vela_memory_mode': 'Sram_Only',
    'compress_to_fp16': False,
    'vela_ini_file_path': os.path.join(os.environ['LOCAL_PROJECT_DIR'], "vela.ini")
}

try:
    output_paths = run_ifx_tooling(
        onnx_model_path=qat_onnx_model_path,
        input_shape=[1, 3, 544, 960],
        output_dir=ifx_tooling_output_path,
        config=config
    )
    print("Generated artifacts:", output_paths)
except ModelConversionError as e:
    print(f"Conversion failed: {e}")
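
Once the conversion has run, you can inspect what was written to the output directory. This sketch makes no assumption about the structure of the value returned by `run_ifx_tooling`; it simply walks the `ifx_tooling` directory used above. `list_artifacts` is a hypothetical helper name.

```python
import os
from pathlib import Path


def list_artifacts(output_dir):
    """Return (relative_path, size_in_bytes) for every file below `output_dir`."""
    root = Path(output_dir)
    if not root.is_dir():
        return []
    return sorted(
        (str(p.relative_to(root)), p.stat().st_size)
        for p in root.rglob("*")
        if p.is_file()
    )


artifact_dir = Path(os.environ.get("LOCAL_EXPERIMENT_DIR", ".")) / "ifx_tooling"
for name, size in list_artifacts(artifact_dir):
    print(f"{size:>12,d}  {name}")
```

The file sizes give a first impression of whether the converted model fits the memory available on your target.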

## 6. Next Steps for Deployment

After successfully converting the model, you'll find the deployment artifacts in the `ifx_tooling` output directory. To deploy this model on your Infineon PSOC EDGE device:

1. Transfer the generated artifacts to your PSOC EDGE development environment
2. Use Infineon ModusToolbox™ to incorporate these files into your application
3. Implement pre-processing to format input data correctly (resizing, normalization)
4. Add post-processing to interpret model outputs (bounding box rendering, threshold filtering)
5. Optimize the frame capture and display pipeline for your specific use case
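
The threshold filtering mentioned in step 4 can be prototyped in Python before porting it to firmware. The sketch below combines a confidence threshold with greedy non-maximum suppression. It assumes the raw model outputs have already been decoded into `(box, score)` pairs with boxes given as `(x1, y1, x2, y2)`; that decoding step is model-specific and not shown, and the threshold values are placeholders rather than recommended settings.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(detections, score_threshold=0.5, iou_threshold=0.5):
    """Drop low-confidence detections, then suppress overlapping duplicates.

    `detections` is a list of (box, score) pairs. The result is ordered by
    descending score.
    """
    candidates = sorted(
        (d for d in detections if d[1] >= score_threshold),
        key=lambda d: d[1],
        reverse=True,
    )
    kept = []
    for box, score in candidates:
        # Keep a box only if it does not overlap strongly with a better one.
        if all(iou(box, kept_box) < iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept


# Made-up detections: two overlapping boxes, one separate box, one weak box.
detections = [
    ((100, 100, 200, 300), 0.9),
    ((105, 110, 205, 310), 0.8),
    ((400, 120, 480, 330), 0.7),
    ((10, 10, 50, 50), 0.2),
]
for box, score in filter_detections(detections):
    print(score, box)
```

In this example the second box is suppressed because it overlaps the first, and the last box is dropped for falling below the score threshold.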

For optimal performance on PSOC EDGE devices:
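- Consider reducing the input resolution for faster inference if your application and the model permit it; any change must also be reflected in the `input_shape` passed to the conversion step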
- Implement frame skipping for video inputs to balance performance and power consumption
- Use the Arm CMSIS-NN library for any neural-network processing that can't be accelerated by the NPU and therefore runs on the CPU
- Profile your application to identify bottlenecks and optimize accordingly
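
Frame skipping, as suggested above, amounts to running inference on every n-th frame only. A minimal Python sketch of that logic follows; `every_nth` is a hypothetical helper, and on the device the same idea would be expressed in your firmware's capture loop.

```python
def every_nth(frames, n):
    """Yield (index, frame) for every n-th frame, starting with the first."""
    if n < 1:
        raise ValueError("n must be at least 1")
    for index, frame in enumerate(frames):
        if index % n == 0:
            yield index, frame


# Stand-in for a video stream: ten numbered frames, inference on every third one.
frames = [f"frame-{i}" for i in range(10)]
for index, frame in every_nth(frames, 3):
    print(index, frame)
```

A larger `n` lowers the average compute load and power draw at the cost of slower reaction to changes in the scene.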

Refer to Infineon's ModusToolbox™ documentation for detailed integration instructions specific to your target hardware.