FastDeploy Serving Deployment

Introduction

FastDeploy provides end-to-end serving deployment built on Triton Inference Server. The backend uses the high-performance FastDeploy Runtime module and integrates the FastDeploy pre- and post-processing modules, so one service covers the whole pipeline. The process is designed to be quick to set up and easy to use while delivering excellent performance.

FastDeploy also provides an easy-to-use Python service deployment method; refer to the PaddleSeg deployment example for its usage.

Prepare the environment

Environment requirements

  • Linux
  • If using a GPU image, NVIDIA Driver >= 470 is required (for NVIDIA data center (Tesla-series) GPUs, such as the T4, driver releases 418.40+, 440.33+, 450.51+, or 460.27+ can also be used)
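
The driver requirement above can be checked before pulling the GPU image. The following is a minimal sketch, not part of FastDeploy: `driver_at_least` is a hypothetical helper that compares only the major version number, and it does not cover the older driver releases accepted for data center GPUs.

```shell
#!/usr/bin/env bash
# Sketch: compare an NVIDIA driver version string against a minimum major version.
# driver_at_least is a hypothetical helper, not a FastDeploy or NVIDIA command.
driver_at_least() {
  local version="$1" minimum="$2"
  local major="${version%%.*}"   # "470.82.01" -> "470"
  [ "$major" -ge "$minimum" ] 2>/dev/null
}

# Query the installed driver only when nvidia-smi is available on this host.
if command -v nvidia-smi >/dev/null 2>&1; then
  installed="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
  if driver_at_least "$installed" 470; then
    echo "driver $installed satisfies >= 470"
  else
    echo "driver $installed is below 470; check the exceptions for data center GPUs"
  fi
else
  echo "nvidia-smi not found; skipping driver check"
fi
```

On a host without an NVIDIA driver the script only prints a notice, so it is safe to run before deciding between the CPU and GPU images.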

Obtain Image

CPU Image

The CPU image supports serving deployment of Paddle/ONNX models on CPU only. Supported inference backends are OpenVINO, Paddle Inference, and ONNX Runtime.

docker pull registry.baidubce.com/paddlepaddle/fastdeploy:1.0.7-cpu-only-21.10

GPU Image

The GPU image supports serving deployment of Paddle/ONNX models on both GPU and CPU. Supported inference backends are OpenVINO, TensorRT, Paddle Inference, and ONNX Runtime.

docker pull registry.baidubce.com/paddlepaddle/fastdeploy:1.0.7-gpu-cuda11.4-trt8.5-21.10
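
Choosing between the two images can be scripted. This is a rough sketch under stated assumptions: `fd_image_for` is a hypothetical helper, the image names are copied from this README, and the presence of `nvidia-smi` is used as a simple stand-in for "a GPU is available". The script prints the pull command instead of running it.

```shell
#!/usr/bin/env bash
# Sketch: choose a FastDeploy serving image by target device.
# fd_image_for is a hypothetical helper; image names are copied from this README.
fd_image_for() {
  case "$1" in
    cpu) echo "registry.baidubce.com/paddlepaddle/fastdeploy:1.0.7-cpu-only-21.10" ;;
    gpu) echo "registry.baidubce.com/paddlepaddle/fastdeploy:1.0.7-gpu-cuda11.4-trt8.5-21.10" ;;
    *)   echo "usage: fd_image_for cpu|gpu" >&2; return 1 ;;
  esac
}

# Assumption: treat a visible nvidia-smi binary as "GPU host".
if command -v nvidia-smi >/dev/null 2>&1; then
  device=gpu
else
  device=cpu
fi

image="$(fd_image_for "$device")"
# Print the command for review rather than executing it.
echo "docker pull $image"
```

Because the GPU image also serves models on CPU, calling `fd_image_for gpu` directly is a reasonable choice when one image must work on both kinds of hosts.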

You can also build the image yourself to suit your own needs, referring to the following documents:

Note: Proxy settings are pre-configured in this image. If your environment does not need them, disable the proxy by running the commands unset https_proxy and unset http_proxy.
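
A minimal sketch of clearing the proxy inside a running container. The lower-case variable names come from the note above; clearing the upper-case variants as well is an assumption, since some tools read those instead.

```shell
# Run inside the container: clear the proxy variables preset in the image.
unset https_proxy http_proxy
# Assumption: also clear the upper-case variants in case they are set.
unset HTTPS_PROXY HTTP_PROXY

# Confirm the lower-case variables are gone for the current shell session.
if [ -z "${https_proxy:-}${http_proxy:-}" ]; then
  echo "proxy disabled for this shell"
fi
```

`unset` only affects the current shell, so the commands need to be repeated in each new session started in the container.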

Other Tutorials

Serving Deployment Demo

| Task           | Model                   |
|----------------|-------------------------|
| Classification | PaddleClas              |
| Detection      | PaddleDetection         |
| Detection      | ultralytics/YOLOv5      |
| NLP            | PaddleNLP/ERNIE-3.0     |
| NLP            | PaddleNLP/UIE           |
| Speech         | PaddleSpeech/PP-TTS     |
| OCR            | PaddleOCR/PP-OCRv3      |