A high-performance, multi-threaded C++ pipeline for real-time multi-camera keypoint detection.
Developed as part of my PhD thesis, this module enables 3D human pose estimation from bounding box proposals generated by my detection pipeline.
This module supports deployment in robotic systems for real-time tracking and perception. It is part of my ROS/ROS2 real-time 3D tracker and its Docker implementation.
- Intel(R) Xeon(R) W-2145 CPU @ 3.70GHz, NVIDIA RTX 2080 Super, Ubuntu 20.04, CUDA 11.8, TensorRT 8.6.1.6, OpenCV 4.10.0, with RTMPose and a BATCH_SIZE of 5 -> Preprocess: ~1 ms, NN inference: ~4 ms, Postprocess: ~1 ms (1000 samples)
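Per-stage numbers like the ones above can be collected with a small timing helper. The following is a minimal sketch using std::chrono, not the repository's benchmark code: `average_ms` and `frames_per_second` are hypothetical names, and each stage is assumed to be a blocking callable (GPU work would need to be synchronized inside the callable before it returns).

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>

// Hypothetical helper: average wall-clock time of `stage` in milliseconds
// over `n_samples` consecutive runs. Returns 0 when no samples are requested.
inline double average_ms(const std::function<void()>& stage, std::size_t n_samples) {
    using clock = std::chrono::steady_clock;
    if (n_samples == 0) return 0.0;
    const auto start = clock::now();
    for (std::size_t i = 0; i < n_samples; ++i) stage();
    const std::chrono::duration<double, std::milli> elapsed = clock::now() - start;
    return elapsed.count() / static_cast<double>(n_samples);
}

// Frames per second implied by one batch latency, assuming the stages run
// back to back without overlapping.
inline double frames_per_second(double batch_latency_ms, std::size_t batch_size) {
    assert(batch_latency_ms > 0.0);
    return 1000.0 / batch_latency_ms * static_cast<double>(batch_size);
}
```

If the three reported stages are simply summed (~6 ms per batch of 5), `frames_per_second(6.0, 5)` gives roughly 833 frames per second across all cameras. This is arithmetic on the figures above, not an additional measurement.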
If you use this software, please use the GitHub “Cite this repository” button at the top(-right) of this page.
This repository is designed to run inside the Docker 🐳 container provided here:
OpenCV-TRT-DEV
It includes all necessary dependencies (CUDA, cuDNN, OpenCV, TensorRT, CMake).
In addition to the libraries installed in the container, this project relies on:
- 📦 tensorrt-cpp-api (fork), originally by cyrusbehr
- 🧵 cpp-utils (handles multithreading, JSON config parsing, and utility tools)
Set the required variables (usually done via .env or your shell):
OPENCV_VERSION=4.10.0 # Your installed OpenCV version
N_CAMERAS=5 # Optional: sets system-wide batch size
If N_CAMERAS is not set, CMake will default to a batch size of 5.
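On the C++ side, a build-time default of this kind is often mirrored with a preprocessor fallback. The sketch below assumes the build forwards the camera count as a compile definition named N_CAMERAS. That forwarding, the constant `kBatchSize`, and the guard `check_frame_count` are illustrative assumptions, not the repository's actual code.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: the build system passes the camera count as a compile
// definition. If it does not, fall back to the documented default of 5.
#ifndef N_CAMERAS
#define N_CAMERAS 5
#endif

// Fixed batch size, known at compile time.
constexpr std::size_t kBatchSize = N_CAMERAS;
static_assert(kBatchSize > 0, "N_CAMERAS must be a positive integer");

// Hypothetical guard: one inference call expects exactly one frame per camera.
inline void check_frame_count(std::size_t n_frames) {
    assert(n_frames == kBatchSize && "expected one frame per camera");
}
```

Fixing the value at compile time matches the idea that the batch size is set by the hardware rather than chosen per run.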
Use the trt.sh script in ./scripts to convert your .onnx model to a fixed batch size.
- The batch size is treated as a hardware constraint, defined by the number of connected cameras.
- You can change the default batch size in CMakeLists.txt to fit your system.
- Although this repo is optimized for YOLOv8 models, you can modify the post-processing stage to support any ONNX-compatible detection model.
Run the build and installation script:
sudo ./build_install.sh
This will configure the build system, compile the inference pipeline, and generate the binaries.
Before using the pipeline, ensure the following:
The following environment variables should be defined in your .env file or shell environment:
OPENCV_VERSION=4.10.0 # Your installed OpenCV version
N_CAMERAS=5 # Optional: sets batch size (defaults to 5)
If N_CAMERAS is not set, the system assumes a default of 5 cameras.
This repo is designed for trained RTMPose models exported as .onnx.
The model must be exported with a fixed batch size matching your multi-camera setup.
Adapt the configuration files in the cfg/ folder to reflect your system and model setup.
You can change the default batch size in CMakeLists.txt if needed.
After configuring your setup:
./build/inference_benchmark
This runs the inference pipeline, processes multi-camera input, and saves images with overlaid bounding boxes and labels to the inputs/ folder.
This executable iterates over a directory of synchronized .mp4 videos and saves the result for each video in a .json file.
This example usage assumes <BATCH_SIZE> .mp4 videos in an arbitrary ./test directory:
./build/video_inference_export test
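The directory convention above (exactly <BATCH_SIZE> .mp4 files in one folder) can be checked before starting a run. This is a minimal sketch using std::filesystem. The helpers `filter_mp4`, `collect_mp4` and `matches_batch_size` are hypothetical, and the executable may discover its inputs differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <utility>
#include <vector>

namespace fs = std::filesystem;

// Keep only paths with a ".mp4" extension (case-sensitive) and sort them so
// that the camera order is deterministic across runs.
inline std::vector<fs::path> filter_mp4(std::vector<fs::path> paths) {
    paths.erase(std::remove_if(paths.begin(), paths.end(),
                               [](const fs::path& p) { return p.extension() != ".mp4"; }),
                paths.end());
    std::sort(paths.begin(), paths.end());
    return paths;
}

// List the regular files directly inside `dir` (non-recursive) and filter them.
inline std::vector<fs::path> collect_mp4(const fs::path& dir) {
    assert(fs::is_directory(dir) && "expected an existing directory");
    std::vector<fs::path> entries;
    for (const auto& entry : fs::directory_iterator(dir)) {
        if (entry.is_regular_file()) entries.push_back(entry.path());
    }
    return filter_mp4(std::move(entries));
}

// True when there is exactly one video per camera.
inline bool matches_batch_size(const std::vector<fs::path>& videos, std::size_t batch_size) {
    return videos.size() == batch_size;
}
```

A caller could run `collect_mp4("test")` and stop early when `matches_batch_size` returns false, instead of failing later inside the inference step.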
This executable iterates over a directory of synchronized .mp4 videos and exported inference results (from ./build/video_inference_export). It generates new .mp4 videos with detections and a tiled video similar to the .gif in this readme.
This example usage assumes <BATCH_SIZE> .mp4 videos and .json files in an arbitrary ./test directory:
./build/bbox_overlay test
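One common way to arrange a tiled video is a near-square grid with one tile per camera. The sketch below shows that layout arithmetic only. The types `Grid` and `Offset` and the functions `tile_grid` and `tile_offset` are hypothetical, and the layout the executable actually uses may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

struct Grid {
    std::size_t rows;
    std::size_t cols;
};

struct Offset {
    std::size_t x;
    std::size_t y;
};

// Near-square grid: columns = ceil(sqrt(n)), rows = ceil(n / columns).
inline Grid tile_grid(std::size_t n_tiles) {
    assert(n_tiles > 0);
    const auto cols = static_cast<std::size_t>(
        std::ceil(std::sqrt(static_cast<double>(n_tiles))));
    const std::size_t rows = (n_tiles + cols - 1) / cols;
    return {rows, cols};
}

// Top-left pixel of tile `index` in row-major order, for equally sized tiles.
inline Offset tile_offset(std::size_t index, Grid grid,
                          std::size_t tile_w, std::size_t tile_h) {
    return {(index % grid.cols) * tile_w, (index / grid.cols) * tile_h};
}
```

For five cameras this gives a 2 x 3 grid with one empty cell. The fifth tile (index 4) lands in the second row, second column.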
This inference module is optimized for:
- 3D multi-camera human pose estimation
- Online tracking and interaction
- Real-time robotics perception pipelines