seanavery/async-tensorrt

async-tensorrt

(figure: async overview diagram)

Overview

The challenge is maintaining a 30 fps camera stream when inference alone takes 40 milliseconds per frame. To solve this, you can create a child thread for processing frames, which lets the main thread keep running without being blocked by CUDA execution.
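A minimal sketch of the idea, not the repo's actual code: `infer()` is a stand-in for the real TensorRT engine call (which takes roughly 40 ms on the GPU), and the capture loop is simulated. The main loop never blocks; a one-slot queue simply drops frames the worker can't keep up with.

```python
import threading
import time
from queue import Queue, Full

# Stand-in for TensorRT inference: the real call would execute a CUDA
# engine and take ~40 ms per frame.
def infer(frame):
    time.sleep(0.04)                  # simulate 40 ms of GPU work
    return f"detections-for-{frame}"

frame_queue = Queue(maxsize=1)        # hold at most the latest frame
results = []

def inference_worker():
    while True:
        frame = frame_queue.get()
        if frame is None:             # sentinel: shut down
            break
        results.append(infer(frame))

worker = threading.Thread(target=inference_worker, daemon=True)
worker.start()

# Main thread: simulated 30 fps capture loop.
for frame_id in range(10):
    try:
        frame_queue.put_nowait(frame_id)  # drop frame if worker is busy
    except Full:
        pass                              # capture is never blocked
    time.sleep(1 / 30)

frame_queue.put(None)
worker.join()
print(len(results))  # number of frames actually inferred
```

Because inference (40 ms) is slower than the frame interval (~33 ms), some frames are skipped rather than queued up, which keeps the display loop at full rate.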

Materials

  1. Linux machine with Cuda (see tutorial 0)
  2. JK Jung's TensorRT Demos
  3. Multithreaded Python
  4. TensorRT Documentation
  5. PyCuda Documentation

The code is a modification of the async execution in JK Jung's TensorRT Demos. In my code the main thread is responsible for video capture and display, and the child thread handles inference and post-processing. This lets inference run on every Nth incoming frame instead of blocking on all of them.
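The "every Nth frame" throttling reduces to a frame-counter modulus. A tiny illustration, with an assumed ratio of one inference per three frames:

```python
# Illustrative only: INFER_EVERY is an assumed ratio, and `frames`
# stands in for successive cv2.VideoCapture reads.
INFER_EVERY = 3

frames = range(12)
to_infer = [f for f in frames if f % INFER_EVERY == 0]
print(to_infer)  # → [0, 3, 6, 9]
```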

  1. Create Python thread
  2. Initialize CUDA context inside thread
  3. Camera capture in main thread
  4. Queue and modulus to throttle inference
  5. Locking and global variables
  6. Visualize inference results from shared state
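The steps above can be sketched end to end. This is a hedged outline, not the repo's implementation: the PyCUDA/TensorRT calls are shown as comments because they need a GPU, a string stub stands in for inference, and all names are illustrative.

```python
import threading
import time

latest_result = None                 # shared state (step 5)
result_lock = threading.Lock()
stop = threading.Event()
frame_queue = []                     # simplistic stand-in for a Queue

def inference_thread():
    # Step 2: the CUDA context must be created inside the thread that
    # uses it, e.g. with PyCUDA (commented out here, needs a GPU):
    #   import pycuda.driver as cuda
    #   cuda.init()
    #   ctx = cuda.Device(0).make_context()
    global latest_result
    while not stop.is_set():
        if frame_queue:
            frame = frame_queue.pop()
            detections = f"boxes-for-frame-{frame}"  # stub inference
            with result_lock:                        # step 5: lock writes
                latest_result = detections
        else:
            time.sleep(0.001)
    # ctx.pop()  # release the context before the thread exits

# Step 1: create the Python thread.
worker = threading.Thread(target=inference_thread, daemon=True)
worker.start()

# Steps 3 and 4: main-thread capture loop, inference on every 2nd frame.
for frame_id in range(6):
    if frame_id % 2 == 0:            # modulus throttling
        frame_queue.append(frame_id)
    time.sleep(0.01)
    with result_lock:                # step 6: read shared state to draw
        overlay = latest_result     # e.g. cv2 would draw these boxes

stop.set()
worker.join()
print(overlay)
```

The main loop only ever reads `latest_result` under the lock, so the display always shows the most recent completed inference rather than waiting for the current one.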

About

async cuda inference from realtime camera stream
