TensorRT-LightNet: High-Efficiency and Real-Time CNN Implementation on Edge AI

trt-lightnet is a CNN implementation optimized for edge AI devices that combines the advantages of LightNet [1] and TensorRT [2]. LightNet is a lightweight, high-performance neural network framework designed for edge devices, while TensorRT is a high-performance deep learning inference engine developed by NVIDIA for optimizing and running deep learning models on GPUs. trt-lightnet uses the Network Definition API provided by TensorRT to integrate LightNet into TensorRT, allowing it to run efficiently and in real time on edge devices. This is a reproduction of lightNet-TR [6], which generates a TensorRT engine from the ONNX format.

Key Improvements

2:4 Structured Sparsity

trt-lightnet utilizes 2:4 structured sparsity [3] to further optimize the network. 2:4 structured sparsity means that two values must be zero in each contiguous block of four values, resulting in a 50% reduction in the number of weights. This technique allows the network to use fewer weights and computations while maintaining accuracy.
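The pruning pattern can be illustrated with a short sketch. This is a conceptual illustration in NumPy, not the actual TensorRT implementation: for every contiguous block of four weights, the two smallest-magnitude values are zeroed, leaving exactly 50% of the weights.

```python
import numpy as np

def prune_2_4(weights: np.ndarray) -> np.ndarray:
    """Zero out the two smallest-magnitude values in each contiguous
    block of four weights, yielding a 2:4 structured-sparse tensor.
    Assumes the total number of elements is a multiple of four."""
    flat = weights.reshape(-1, 4).copy()
    # Indices of the two smallest |w| per block of four.
    drop = np.argsort(np.abs(flat), axis=1)[:, :2]
    np.put_along_axis(flat, drop, 0.0, axis=1)
    return flat.reshape(weights.shape)

w = np.array([[0.9, -0.1, 0.4, 0.05],
              [-0.3, 0.8, 0.02, -0.6]])
sparse = prune_2_4(w)
print(sparse)  # each block of four now contains exactly two zeros
```

On sparsity-capable hardware (NVIDIA Ampere and later), this fixed 2:4 pattern is what allows the sparse tensor cores to skip the zeroed weights.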

[Figure: Sparsity]

NVDLA Execution

trt-lightnet also supports executing the neural network on the NVIDIA Deep Learning Accelerator (NVDLA) [4], a free and open architecture that provides high performance and low power consumption for deep learning inference on edge devices. By using NVDLA, trt-lightnet can further improve the efficiency and performance of the network on edge devices.

[Figure: NVDLA]

Multi-Precision Quantization

In addition to post-training quantization [5], trt-lightnet supports multi-precision quantization, which allows the network to use different precisions for weights and activations. By using mixed precision, trt-lightnet can further reduce the memory usage and computational requirements of the network while maintaining accuracy. The precision of each layer of the CNN can be set in the CFG file.
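As a rough illustration of what per-layer mixed precision means, the sketch below applies symmetric INT8 quantization only to layers that a precision table marks as "int8". The layer names and table layout are hypothetical (the real per-layer settings live in the CFG file), and this is not TensorRT's calibration path.

```python
import numpy as np

def quantize_int8(x: np.ndarray) -> tuple:
    """Symmetric per-tensor INT8 quantization: the scale maps the
    maximum absolute value onto the int8 range [-127, 127]."""
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

# Hypothetical per-layer precision table, mimicking a CFG that assigns
# a precision to each layer (names are illustrative only).
layer_precision = {"conv1": "fp16", "conv2": "int8", "head": "fp32"}

w = np.linspace(-1.0, 1.0, 8)
for name, prec in layer_precision.items():
    if prec == "int8":
        q, scale = quantize_int8(w)
        print(name, q, f"scale={scale:.4f}")
    else:
        print(name, f"kept in {prec}")
```

Keeping sensitive layers (such as the first layer or the output head) in higher precision while quantizing the rest is the usual trade-off this mechanism enables.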

[Figure: Quantization]

Multitask Execution (Detection/Segmentation)

trt-lightnet also supports multitask execution, allowing the network to perform both object detection and segmentation tasks simultaneously. This enables the network to perform multiple tasks efficiently on edge devices, saving computational resources and power.
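Conceptually, multitask execution amortizes the backbone: the feature map is computed once and consumed by both the detection head and the segmentation head. The NumPy sketch below illustrates this sharing; the shapes and head designs are illustrative only, not the trt-lightnet architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32, 3))

# Stand-in weights for one shared backbone and two task heads.
W_backbone = rng.standard_normal((3, 16))
W_det = rng.standard_normal((16, 5))   # e.g. box coords + objectness per cell
W_seg = rng.standard_normal((16, 4))   # e.g. 4 segmentation classes

# The expensive part runs once...
features = np.maximum(image @ W_backbone, 0.0)  # ReLU feature map (32, 32, 16)

# ...and both lightweight heads reuse it.
boxes = features @ W_det                        # detection output (32, 32, 5)
masks = (features @ W_seg).argmax(axis=-1)      # per-pixel class map (32, 32)

print(boxes.shape, masks.shape)
```

Running two separate networks would compute the backbone twice; sharing it is where the computational and power savings on edge devices come from.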

Installation

Requirements

For Local Installation

  • CUDA 11.0 or later

  • TensorRT 8.5 or 8.6

  • cnpy for debugging tensors

This repository has been tested with the following environments:

  • CUDA 11.7 + TensorRT 8.5.2 on Ubuntu 22.04

  • CUDA 12.2 + TensorRT 8.6.0 on Ubuntu 22.04

  • CUDA 11.4 + TensorRT 8.6.0 on Jetson JetPack5.1

  • CUDA 11.8 + TensorRT 8.6.1 on Ubuntu 22.04

For Docker Installation

  • Docker
  • NVIDIA Container Toolkit

This repository has been tested with the following environments:

  • Docker 24.0.7 + NVIDIA Container Toolkit 1.14.3 on Ubuntu 20.04

Steps for Local Installation

  1. Clone the repository.
$ git clone git@github.com:tier4/trt-lightnet.git
$ cd trt-lightnet
  2. Install libraries.
$ sudo apt update
$ sudo apt install libgflags-dev
$ sudo apt install libboost-all-dev
$ sudo apt install libopencv-dev

Install cnpy from the following repository.

https://github.com/rogersce/cnpy

  3. Compile the TensorRT implementation.
$ mkdir build && cd build
$ cmake ../
$ make -j

Steps for Docker Installation

  1. Clone the repository.
$ git clone git@github.com:tier4/trt-lightnet.git
$ cd trt-lightnet
  2. Build the docker image.
$ docker build -t trt-lightnet:latest .
  3. Run the docker container.
$ docker run -it --gpus all trt-lightnet:latest

Model

T.B.D.

Usage

Converting a LightNet model to a TensorRT engine

Build FP32 engine

$ ./trt-lightnet --flagfile ../configs/CONFIGS.txt --precision fp32

Build FP16 (half-precision) engine

$ ./trt-lightnet --flagfile ../configs/CONFIGS.txt --precision fp16

Build INT8 engine
(Prepare a list of calibration images in "models/calibration_images.txt".)

$ ./trt-lightnet --flagfile ../configs/CONFIGS.txt --precision int8 --first true

The first layer is much more sensitive to quantization than the others. Therefore, it is left unquantized when "--first true" is specified.

Build DLA engine (supported only on Xavier and Orin)

$ ./trt-lightnet --flagfile ../configs/CONFIGS.txt --precision int8 --first true --dla [0/1]

Inference with the TensorRT engine

Inference from images

$ ./trt-lightnet --flagfile ../configs/CONFIGS.txt --precision [fp32/fp16/int8] --first true {--dla [0/1]} --d DIRECTORY

Inference from video

$ ./trt-lightnet --flagfile ../configs/CONFIGS.txt --precision [fp32/fp16/int8] --first true {--dla [0/1]} --v VIDEO

Implementation

trt-lightnet is built on the LightNet framework and integrates with TensorRT using the Network Definition API. The implementation is based on the repositories listed in the References below.

Conclusion

trt-lightnet is a powerful and efficient implementation of CNNs using Edge AI. With its advanced features and integration with TensorRT, it is an excellent choice for real-time object detection and semantic segmentation applications on edge devices.

References

[1] LightNet
[2] TensorRT
[3] Accelerating Inference with Sparsity Using the NVIDIA Ampere Architecture and NVIDIA TensorRT
[4] NVDLA
[5] Achieving FP32 Accuracy for INT8 Inference Using Quantization Aware Training with NVIDIA TensorRT
[6] lightNet-TR
