QuickStart for Deploying a Basic Model on the Triton Inference Server
Updated Aug 26, 2023 · Python
Adds some extra features to transformers
A template for deploying an AI server using Django with TF Serving or Triton Inference Server
This repository utilizes the Triton Inference Server Client, which reduces the complexity of model deployment.
The purpose of this repository is to create a DeepStream/Triton-Server sample application that uses the yolov7, yolov7-qat, and yolov9 models to perform inference on video files or RTSP streams.
Provides an ensemble model to deploy a YoloV8 ONNX model to Triton
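For context, deploying an ONNX model to Triton typically means placing it in a model repository alongside a `config.pbtxt`. The following is a minimal sketch for a YOLOv8-style ONNX model; the model name, tensor names, and dimensions are illustrative assumptions, not taken from the repository above:

```protobuf
# Hypothetical config.pbtxt -- model name, tensor names, and dims are assumptions
name: "yolov8_onnx"
platform: "onnxruntime_onnx"
max_batch_size: 8
input [
  {
    name: "images"
    data_type: TYPE_FP32
    dims: [ 3, 640, 640 ]
  }
]
output [
  {
    name: "output0"
    data_type: TYPE_FP32
    dims: [ 84, 8400 ]
  }
]
```

An ensemble model, as offered above, would additionally chain pre- and post-processing steps around this model in a separate `ensemble` platform configuration.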
Set up CI in DL/ cuda/ cudnn/ TensorRT/ onnx2trt/ onnxruntime/ onnxsim/ Pytorch/ Triton-Inference-Server/ Bazel/ Tesseract/ PaddleOCR/ NVIDIA-docker/ minIO/ Supervisord on AGX or PC from scratch.
Deploy DL/ ML inference pipelines with minimal extra code.
A more readable and flexible yolov5 with more backbones (gcn, resnet, shufflenet, mobilenet, efficientnet, hrnet, swin-transformer, etc.), additional modules (cbam, dcn, and so on), and TensorRT support