yolov5-triton

YOLO v5 Object Detection on Triton Inference Server

What does this application do?

This application demonstrates how to:

- Prepare a TensorRT model for NVIDIA Triton Inference Server
- Launch NVIDIA Triton Inference Server
- Form a pipeline with the model ensemble
- Implement client applications for Triton Inference Server

Model Pipeline

The model ensemble chains the following three models into a single pipeline.

| Order | Model Name | Backend | Input Type | Input Dimension | Output Type | Output Dimension | Description |
|---|---|---|---|---|---|---|---|
| 1 | preprocess | Python | UINT8 | [3, 384, 640] | FP32 | [3, 384, 640] | Type conversion, normalization |
| 2 | yolov5s_trt | TensorRT | FP32 | [3, 384, 640] | FP32 | [15120, 85] | Object detection |
| 3 | postprocess | Python | FP32 | [15120, 85] | FP32 | [1, -1, 6] | Bounding box generation, non-maximum suppression |
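
To make the two Python stages in the table more concrete, here is a minimal NumPy sketch of what `preprocess` and `postprocess` might do. It is not the repository's code. Scaling by 255, the raw row layout `[cx, cy, w, h, objectness, 80 class scores]` (the usual YOLOv5 export layout), the thresholds, and the per-class greedy NMS are all assumptions made for illustration.

```python
import numpy as np


def preprocess(image_u8):
    """UINT8 [3, 384, 640] -> FP32 [3, 384, 640], scaled to [0, 1] (assumed normalization)."""
    return image_u8.astype(np.float32) / 255.0


def iou_one_to_many(box, boxes):
    """IoU between one [x0, y0, x1, y1] box and an array of such boxes."""
    x0 = np.maximum(box[0], boxes[:, 0])
    y0 = np.maximum(box[1], boxes[:, 1])
    x1 = np.minimum(box[2], boxes[:, 2])
    y1 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x1 - x0, 0, None) * np.clip(y1 - y0, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter + 1e-9)


def postprocess(pred, conf_thres=0.25, iou_thres=0.45):
    """FP32 [15120, 85] -> FP32 [1, N, 6] rows of [x0, y0, x1, y1, score, class].

    Assumes each input row is [cx, cy, w, h, objectness, class scores...].
    The threshold defaults are illustrative, not taken from the repository.
    """
    cls = pred[:, 5:].argmax(axis=1)
    score = pred[:, 4] * pred[:, 5:].max(axis=1)

    # Drop low-confidence candidates before the pairwise IoU work.
    keep = score >= conf_thres
    pred, cls, score = pred[keep], cls[keep], score[keep]

    # Convert center/size boxes to corner boxes.
    half = pred[:, 2:4] / 2
    boxes = np.concatenate([pred[:, 0:2] - half, pred[:, 0:2] + half], axis=1)

    # Greedy NMS: keep the best box, drop same-class boxes that overlap it.
    order = score.argsort()[::-1]
    selected = []
    while order.size > 0:
        i = int(order[0])
        selected.append(i)
        rest = order[1:]
        overlap = iou_one_to_many(boxes[i], boxes[rest])
        suppress = (overlap > iou_thres) & (cls[rest] == cls[i])
        order = rest[~suppress]

    if not selected:
        return np.zeros((1, 0, 6), dtype=np.float32)

    sel = np.array(selected)
    out = np.concatenate(
        [boxes[sel], score[sel][:, None], cls[sel][:, None].astype(np.float32)],
        axis=1,
    )
    return out[None].astype(np.float32)
```

In this sketch the variable-length dimension `-1` in the output shape corresponds to the number of boxes that survive the score filter and NMS.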

The pipeline output [1, -1, 6] has shape 1 × N × 6; each of the N rows is [x0, y0, x1, y1, score, class].
N : The number of detected bounding boxes
(x0, y0) : The coordinates of the top-left corner of the detected bounding box
(x1, y1) : The coordinates of the bottom-right corner of the detected bounding box
score : The confidence score of the detection
class : The class index of the detected object
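
As an illustration of this layout, the following sketch turns a `[1, N, 6]` array into a list of Python records. The function name `decode_detections`, the `min_score` parameter, and the record keys are hypothetical and not part of the repository's client code.

```python
import numpy as np


def decode_detections(output, min_score=0.0):
    """Convert a [1, N, 6] array into a list of detection records.

    Each row is read as [x0, y0, x1, y1, score, class], as described above.
    """
    output = np.asarray(output)
    if output.ndim != 3 or output.shape[0] != 1 or output.shape[2] != 6:
        raise ValueError(f"expected shape [1, N, 6], got {list(output.shape)}")

    detections = []
    for x0, y0, x1, y1, score, cls in output[0]:
        if score < min_score:
            continue
        detections.append(
            {
                "box": (float(x0), float(y0), float(x1), float(y1)),
                "score": float(score),
                "class_id": int(cls),
            }
        )
    return detections
```

An empty result (N = 0) simply yields an empty list, so callers do not need a special case for frames without detections.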

Prerequisites

Server

Jetson Xavier/Orin or x86_64 Linux with NVIDIA GPU
For Jetson, JetPack 5.0.2 or later
For x86_64, NGC account

Client

Linux (x86_64/ARM64) or Windows (x86_64)
No GPU is needed for the client

Server Installation (for Jetson)

Clone this repository

git clone https://github.com/MACNICA-CLAVIS-NV/yolov5-triton

cd yolov5-triton/server

Launch PyTorch container
```
./torch_it.sh
```

Obtain YOLO v5 ONNX model

pip3 install -U \
	'protobuf<4,>=3.20.2' \
	numpy \
	onnx \
	pandas \
	PyYAML \
	tqdm \
	matplotlib \
	seaborn \
	psutil \
	gitpython \
	scipy \
	setuptools

python3 torch2onnx.py yolov5s

Convert ONNX model to TensorRT engine

/usr/src/tensorrt/bin/trtexec \
	--onnx=yolov5s.onnx \
	--saveEngine=model.plan \
	--workspace=4096 \
	--exportProfile=profile.json

Copy TensorRT engine to model repository

cp model.plan ./model_repository/yolov5s_trt/1/
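
Before leaving the container, it can help to confirm that the engine landed where the server will look for it. The sketch below checks the path used in the `cp` command above. Treating the model names from the pipeline table (`preprocess`, `yolov5s_trt`, `postprocess`) as directory names inside `model_repository` is an assumption, and `check_repository` is a hypothetical helper, not a script shipped with this repository.

```python
from pathlib import Path

# Model names from the pipeline table; assumed to match directory names.
PIPELINE_MODELS = ("preprocess", "yolov5s_trt", "postprocess")
# Destination of the `cp model.plan ...` step.
ENGINE_PATH = Path("yolov5s_trt") / "1" / "model.plan"


def check_repository(repo):
    """Return a list of human-readable problems; an empty list means OK."""
    repo = Path(repo)
    problems = []
    for name in PIPELINE_MODELS:
        if not (repo / name).is_dir():
            problems.append(f"missing model directory: {name}")

    engine = repo / ENGINE_PATH
    if not engine.is_file():
        problems.append(f"missing TensorRT engine: {ENGINE_PATH.as_posix()}")
    elif engine.stat().st_size == 0:
        problems.append(f"empty TensorRT engine: {ENGINE_PATH.as_posix()}")
    return problems
```

For example, `print(check_repository("./model_repository"))` run from the `server` directory would print `[]` when everything is in place.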

Exit from PyTorch container
```
exit
```

Build a Docker image for Triton Inference Server
```
./triton_build.sh
```

Server Installation (for x86_64)

An NGC account is required.

Clone this repository

git clone https://github.com/MACNICA-CLAVIS-NV/yolov5-triton

cd yolov5-triton/server

Launch PyTorch container
```
./torch_it_x86.sh
```

Obtain YOLO v5 ONNX model

pip3 install \
	protobuf \
	pandas \
	PyYAML \
	tqdm \
	matplotlib \
	seaborn \
	gitpython

python3 torch2onnx.py yolov5s

Convert ONNX model to TensorRT engine

/usr/src/tensorrt/bin/trtexec \
	--onnx=yolov5s.onnx \
	--saveEngine=model.plan \
	--workspace=4096 \
	--exportProfile=profile.json

Copy TensorRT engine to model repository

cp model.plan ./model_repository/yolov5s_trt/1/

Exit from PyTorch container
```
exit
```

Run Server (for Jetson)

sudo jetson_clocks

./triton_start_grpc.sh

Run Server (for x86_64)

./triton_start_grpc_x86.sh
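
Once the server is running on either platform, a short readiness check from Python can confirm that it is reachable and that the models have loaded. This sketch uses the `tritonclient.grpc` calls `is_server_live`, `is_server_ready`, and `is_model_ready`. The helper names are hypothetical, and the model names passed in are assumed to match the pipeline table.

```python
def make_grpc_client(url):
    """Create a Triton gRPC client; imported lazily so the helper below
    can be exercised without tritonclient installed."""
    import tritonclient.grpc as grpcclient

    return grpcclient.InferenceServerClient(url=url)


def server_status(client, model_names):
    """Collect liveness and readiness flags from a Triton client object."""
    status = {
        "live": client.is_server_live(),
        "ready": client.is_server_ready(),
    }
    for name in model_names:
        status[name] = client.is_model_ready(name)
    return status
```

A call might look like `server_status(make_grpc_client("localhost:8000"), ["preprocess", "yolov5s_trt", "postprocess"])`. The address here is copied from the client examples in this README; use whichever host and port the start script actually exposes.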

Install Client

The client application does not need a GPU, so it can be deployed to a Windows or Linux machine without a GPU card. A virtual Python environment such as conda or venv is recommended.

Clone this repository

git clone https://github.com/MACNICA-CLAVIS-NV/yolov5-triton

cd yolov5-triton/client

Install Python dependencies

pip install tritonclient[all] Pillow opencv-python

Run Client

Image Input Inference

python infer_image.py [-h] [--url SERVER_URL] IMAGE_FILE

Example:

python infer_image.py --url localhost:8000 test.jpg
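
For readers writing their own client, the request that such a script sends can be sketched with `tritonclient.grpc` as follows. This is not the code of `infer_image.py`. The ensemble model name and the tensor names (`MODEL_NAME`, `INPUT_NAME`, `OUTPUT_NAME`) are placeholders that must be replaced with the values from the server's ensemble configuration, and the caller is assumed to have already resized the image to 640 × 384.

```python
import numpy as np

# Hypothetical names; look up the real ones in the ensemble's configuration.
MODEL_NAME = "ensemble_model_name"
INPUT_NAME = "input_tensor_name"
OUTPUT_NAME = "output_tensor_name"


def to_model_input(image_hwc):
    """Reorder an HWC uint8 image [384, 640, 3] into CHW [3, 384, 640]."""
    image_hwc = np.asarray(image_hwc)
    if image_hwc.shape != (384, 640, 3) or image_hwc.dtype != np.uint8:
        raise ValueError("expected a uint8 image of shape [384, 640, 3]")
    return np.ascontiguousarray(image_hwc.transpose(2, 0, 1))


def infer(url, image_hwc):
    """Send one image to the ensemble and return the [1, N, 6] result array."""
    # Imported lazily so to_model_input works without tritonclient installed.
    import tritonclient.grpc as grpcclient

    client = grpcclient.InferenceServerClient(url=url)
    tensor = to_model_input(image_hwc)

    # If the ensemble is configured with batching, a leading batch
    # dimension would also be needed here (not known from this README).
    infer_input = grpcclient.InferInput(INPUT_NAME, list(tensor.shape), "UINT8")
    infer_input.set_data_from_numpy(tensor)
    requested = grpcclient.InferRequestedOutput(OUTPUT_NAME)

    result = client.infer(
        model_name=MODEL_NAME, inputs=[infer_input], outputs=[requested]
    )
    return result.as_numpy(OUTPUT_NAME)
```

Because the ensemble performs type conversion, normalization, and non-maximum suppression on the server, the client only has to send raw `UINT8` pixels and read back the box array.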

Camera Input Inference

python infer_camera.py [-h] [--camera CAMERA_ID] [--width CAPTURE_WIDTH] [--height CAPTURE_HEIGHT] [--url SERVER_URL]

Example:

python infer_camera.py --camera 1 --width 640 --height 480 --url 192.168.XXX.XXX:8000
