47cddc5 · Feb 5, 2025

简体中文 | English

FastDeploy Serving Deployment

Introduction

FastDeploy provides end-to-end serving deployment built on Triton Inference Server. The backend uses the FastDeploy high-performance Runtime module and integrates the FastDeploy pre- and post-processing modules, so a single service covers the whole pipeline from raw input to final result. The process is easy to use, quick to deploy, and delivers excellent performance.

FastDeploy also provides an easy-to-use Python service deployment method. Refer to the PaddleSeg deployment example for its usage.

Prepare the environment

Environment requirements

Linux
If using a GPU image, NVIDIA Driver >= 470 is required (for Tesla-series data center GPUs, such as the T4, driver versions 418.40+, 440.33+, 450.51+, or 460.27+ also work)
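The driver requirement can be checked before pulling a GPU image. The following is a minimal sketch, assuming GNU `sort -V` is available and that `nvidia-smi` is on the `PATH` when a driver is installed. The helper names `version_ge`, `driver_version`, and `check_driver` are hypothetical and not part of FastDeploy.

```shell
#!/usr/bin/env bash

# Succeeds if version $1 >= version $2, using GNU sort's version ordering.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Prints the installed NVIDIA driver version, or nothing if nvidia-smi is unavailable.
driver_version() {
  command -v nvidia-smi >/dev/null 2>&1 || return 0
  nvidia-smi --query-gpu=driver_version --format=csv,noheader 2>/dev/null | head -n1
}

# Compares the installed driver against a minimum (470 by default, per the requirement above).
check_driver() {
  local required="${1:-470}" found
  found="$(driver_version)"
  if [ -z "$found" ]; then
    echo "no NVIDIA driver detected"
    return 1
  fi
  if version_ge "$found" "$required"; then
    echo "driver $found satisfies >= $required"
  else
    echo "driver $found is older than $required"
    return 1
  fi
}
```

Run `check_driver` for the default minimum, or pass one of the lower minimums listed above (for example `check_driver 450.51`) on a data center GPU.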

Obtain Image

CPU Image

The CPU image supports serving deployment of Paddle/ONNX models on CPU only. Supported inference backends are OpenVINO, Paddle Inference, and ONNX Runtime.

docker pull registry.baidubce.com/paddlepaddle/fastdeploy:1.0.7-cpu-only-21.10

GPU Image

The GPU image supports serving deployment of Paddle/ONNX models on both GPU and CPU. Supported inference backends are OpenVINO, TensorRT, Paddle Inference, and ONNX Runtime.

docker pull registry.baidubce.com/paddlepaddle/fastdeploy:1.0.7-gpu-cuda11.4-trt8.5-21.10
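Once an image is chosen, the pull and container start can be wrapped in a small script. The sketch below is illustrative only: the function names `select_image` and `start_container`, the `/serving` mount point, and the `--net=host` setting are assumptions rather than requirements from this README, and `--gpus all` assumes the NVIDIA Container Toolkit is installed on the host.

```shell
#!/usr/bin/env bash

# Image tags as listed in this README.
CPU_IMAGE="registry.baidubce.com/paddlepaddle/fastdeploy:1.0.7-cpu-only-21.10"
GPU_IMAGE="registry.baidubce.com/paddlepaddle/fastdeploy:1.0.7-gpu-cuda11.4-trt8.5-21.10"

# Prints the image for the requested device ("cpu" or "gpu").
select_image() {
  case "$1" in
    cpu) echo "$CPU_IMAGE" ;;
    gpu) echo "$GPU_IMAGE" ;;
    *) echo "usage: select_image cpu|gpu" >&2; return 2 ;;
  esac
}

# Pulls the image and opens an interactive shell in a container,
# with the current directory mounted at /serving (hypothetical mount point).
start_container() {
  local device="$1" image
  image="$(select_image "$device")" || return
  docker pull "$image" || return
  if [ "$device" = "gpu" ]; then
    # --gpus all exposes every host GPU to the container.
    docker run -it --rm --gpus all --net=host -v "$PWD:/serving" "$image" bash
  else
    docker run -it --rm --net=host -v "$PWD:/serving" "$image" bash
  fi
}
```

Calling `start_container cpu` or `start_container gpu` from the directory that holds your models makes them visible inside the container under `/serving`.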

You can also build the image yourself to suit your own needs. See the following document:

FastDeploy Serving Deployment Image Compilation

Note: The proxy settings have been pre-configured for this image. If these settings are not required for your environment, you can disable the proxy by executing the commands unset https_proxy and unset http_proxy.
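Inside the container, the proxy can be cleared and the result verified as follows. This is a small sketch: the function names are hypothetical, and clearing the uppercase `HTTP_PROXY` and `HTTPS_PROXY` variants is an extra precaution that the note above does not mention.

```shell
#!/usr/bin/env bash

# Removes the proxy variables from the current shell session.
clear_proxy() {
  unset http_proxy https_proxy HTTP_PROXY HTTPS_PROXY
}

# Lists any HTTP(S) proxy variables still set, or reports that none remain.
show_proxy() {
  env | grep -iE '^https?_proxy=' || echo "no proxy variables set"
}
```

Run `clear_proxy` followed by `show_proxy` in the shell that will start the server. The change applies only to that shell session and its child processes.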

Other Tutorials

Serving Deployment Demo

Task	Model
Classification	PaddleClas
Detection	PaddleDetection
Detection	ultralytics/YOLOv5
NLP	PaddleNLP/ERNIE-3.0
NLP	PaddleNLP/UIE
Speech	PaddleSpeech/PP-TTS
OCR	PaddleOCR/PP-OCRv3
