My implementation of BiSeNet
ClearML - Model-Serving Orchestration and Repository Solution
Deploy DL/ML inference pipelines with minimal extra code.
Set up CI for DL from scratch on AGX or PC: CUDA, cuDNN, TensorRT, onnx2trt, onnxruntime, onnxsim, PyTorch, Triton Inference Server, Bazel, Tesseract, PaddleOCR, NVIDIA Docker, MinIO, and Supervisord.
Tiny configuration for Triton Inference Server
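For reference, a "tiny" Triton model configuration usually boils down to a short config.pbtxt like the sketch below; the model name, backend, and tensor shapes are illustrative assumptions, not taken from that repository.

```
# Hypothetical minimal config.pbtxt for an ONNX classifier served by Triton;
# all names and dims below are illustrative.
name: "my_onnx_model"
platform: "onnxruntime_onnx"
max_batch_size: 8
input [
  { name: "input__0", data_type: TYPE_FP32, dims: [ 3, 224, 224 ] }
]
output [
  { name: "output__0", data_type: TYPE_FP32, dims: [ 1000 ] }
]
```

Triton can auto-complete much of this for ONNX and TensorRT models, so a truly tiny config may need even less.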
Build a recommender system with PyTorch + Redis + Elasticsearch + Feast + Triton + Flask: vector recall, DeepFM ranking, and a web application.
Advanced inference pipeline using NVIDIA Triton Inference Server for CRAFT text detection (PyTorch), including a PyTorch -> ONNX -> TensorRT converter and inference pipelines (TensorRT, multi-format Triton server). Supported model formats for Triton inference: TensorRT engine, TorchScript, ONNX.
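The PyTorch -> ONNX leg of such a converter can be sketched in a few lines; the ResNet stand-in, shapes, and file names below are illustrative assumptions, not the repository's actual CRAFT code.

```python
# Minimal PyTorch -> ONNX export sketch; the model and shapes are placeholders.
import torch
import torchvision

model = torchvision.models.resnet18(weights=None).eval()  # stand-in for the detector
dummy = torch.randn(1, 3, 224, 224)                       # example input tensor

torch.onnx.export(
    model, dummy, "model.onnx",
    input_names=["input"], output_names=["output"],
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
    opset_version=13,
)
# The ONNX file can then be compiled to a TensorRT engine, for example:
#   trtexec --onnx=model.onnx --saveEngine=model.plan --fp16
```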
Provides an ensemble model to deploy a YOLOv8 ONNX model to Triton
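A Triton ensemble is itself declared in a config.pbtxt that chains the sub-models; the sketch below shows the general preprocess -> YOLOv8 -> postprocess shape with made-up model and tensor names.

```
# Hypothetical ensemble config.pbtxt; every model and tensor name is illustrative.
name: "yolov8_ensemble"
platform: "ensemble"
max_batch_size: 8
input [
  { name: "raw_image", data_type: TYPE_UINT8, dims: [ -1 ] }
]
output [
  { name: "detections", data_type: TYPE_FP32, dims: [ -1, 6 ] }
]
ensemble_scheduling {
  step [
    {
      model_name: "preprocess"
      model_version: -1
      input_map { key: "raw_image" value: "raw_image" }
      output_map { key: "preprocessed" value: "preprocessed" }
    },
    {
      model_name: "yolov8_onnx"
      model_version: -1
      input_map { key: "images" value: "preprocessed" }
      output_map { key: "output0" value: "raw_detections" }
    },
    {
      model_name: "postprocess"
      model_version: -1
      input_map { key: "raw_detections" value: "raw_detections" }
      output_map { key: "detections" value: "detections" }
    }
  ]
}
```

Each input_map/output_map key is the sub-model's own tensor name, while the value is the ensemble-level tensor it connects to.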
Serving Example of CodeGen-350M-Mono-GPTJ on Triton Inference Server with Docker and Kubernetes
FastAPI middleware for comparing different ML model serving approaches
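A common way to make such comparisons measurable is an HTTP middleware that times every request; the snippet below is a generic sketch, not the repository's middleware, and the header name is made up.

```python
# Minimal FastAPI latency-measuring middleware sketch.
import time
from fastapi import FastAPI, Request

app = FastAPI()

@app.middleware("http")
async def measure_latency(request: Request, call_next):
    start = time.perf_counter()
    response = await call_next(request)
    elapsed_ms = (time.perf_counter() - start) * 1000
    # Surface per-request latency so different serving backends can be compared.
    response.headers["X-Latency-Ms"] = f"{elapsed_ms:.2f}"
    return response
```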
Python wrapper class for OpenVINO Model Server. Users can submit inference requests to OVMS with just a few lines of code.
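For comparison, a raw call through the official ovmsclient package looks roughly like this; the endpoint, model name, and tensor name are assumptions for illustration.

```python
# Rough sketch of a gRPC inference request to OpenVINO Model Server.
import numpy as np
from ovmsclient import make_grpc_client

client = make_grpc_client("localhost:9000")           # assumed OVMS gRPC endpoint
batch = np.zeros((1, 3, 224, 224), dtype=np.float32)  # dummy input batch
result = client.predict(inputs={"input": batch}, model_name="resnet")
print(result)
```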
A DeepStream/Triton Server sample application that uses YOLOv7, YOLOv7-QAT, and YOLOv9 models to perform inference on video files or RTSP streams.
Triton-PyTorch custom operator tutorial
Provides an ensemble model to deploy a YOLOv8 TensorRT model to Triton
Triton backend that enables pre-processing, post-processing, and other logic to be implemented in Python. The repository's stack includes YOLOv8, ONNX, EasyOCR, Triton Inference Server, OpenCV, MinIO, Docker, and K8s, all deployed on a K80 GPU with CUDA 11.4.
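For context, Triton's Python backend expects each model directory to contain a model.py implementing the interface sketched below; the tensor names and logic are illustrative, not the repository's actual pipeline.

```python
# Skeleton of a Triton Python-backend model.py; tensor names are illustrative.
# triton_python_backend_utils is provided inside the Triton container at runtime.
import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def initialize(self, args):
        # Load detectors, OCR engines, etc. here (called once per model instance).
        pass

    def execute(self, requests):
        responses = []
        for request in requests:
            image = pb_utils.get_input_tensor_by_name(request, "IMAGE").as_numpy()
            # Placeholder post-processing; a real pipeline would run YOLOv8/EasyOCR.
            text = np.array([b"detected text"], dtype=np.object_)
            out = pb_utils.Tensor("TEXT", text)
            responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
        return responses
```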
📸 YOLO Serving Cookbook based on Triton Inference Server 📸
An image-to-text model/pipeline using ViT and Transformers, deployed with NVIDIA's PyTriton and a Streamlit app.
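PyTriton's core idea is binding a plain Python inference function to a Triton endpoint; the sketch below shows that pattern with made-up names, returning token IDs rather than decoded text to keep the dtypes simple.

```python
# Rough PyTriton binding sketch; model name, tensors, and logic are illustrative.
import numpy as np
from pytriton.decorators import batch
from pytriton.model_config import ModelConfig, Tensor
from pytriton.triton import Triton

@batch
def infer_fn(image):
    # Placeholder: a real pipeline would run ViT plus a text decoder here.
    token_ids = np.zeros((image.shape[0], 16), dtype=np.int64)
    return {"token_ids": token_ids}

with Triton() as triton:
    triton.bind(
        model_name="image_captioner",
        infer_func=infer_fn,
        inputs=[Tensor(name="image", dtype=np.uint8, shape=(-1, -1, 3))],
        outputs=[Tensor(name="token_ids", dtype=np.int64, shape=(16,))],
        config=ModelConfig(max_batch_size=8),
    )
    triton.serve()
```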