
Releases: PaddlePaddle/FastDeploy

FastDeploy 1.0.0

28 Nov 06:20
d38aa45

1.0.0 Release Note

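⚡️FastDeploy 1.0.0, an all-scenario, high-performance AI deployment toolkit, is officially released! 🎉 It supports high-performance deployment of 150+ models from PaddlePaddle and the open source community on multiple hardware platforms, giving developers a simple, easy-to-use, and highly efficient deployment experience across all scenarios.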

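Multiple Inference Backend and Hardware Support

FastDeploy supports inference deployment on multiple hardware platforms through different backends. Each backend module can be compiled and integrated as needed; to build it yourself, see the FastDeploy compilation documentation.

| Backend | Platform | Supported model formats | Supported hardware |
| --- | --- | --- | --- |
| Paddle Inference | Linux(x64)/Windows(x64) | Paddle | x86 CPU/NVIDIA GPU/Jetson/GraphCore IPU |
| Paddle Lite | Linux(aarch64/armhf)/Android | Paddle | Arm CPU/Kunlun R200/RV1126 |
| Poros | Linux(x64) | TorchScript | x86 CPU/NVIDIA GPU |
| OpenVINO | Linux(x64)/Windows(x64)/OSX(x86) | Paddle/ONNX | x86 CPU/Intel GPU |
| TensorRT | Linux(x64/aarch64)/Windows(x64) | Paddle/ONNX | NVIDIA GPU/Jetson |
| ONNX Runtime | Linux(x64/aarch64)/Windows(x64)/OSX(x86/arm64) | Paddle/ONNX | x86 CPU/Arm CPU/NVIDIA GPU |

In addition, FastDeploy supports deploying models to web front ends and smart mini programs through Paddle.js; see Web Deployment for more details.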

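Rich End-to-End Inference for AI Models

FastDeploy supports end-to-end deployment of the following PaddlePaddle model suites:

Beyond the PaddlePaddle development suites, FastDeploy also supports deploying popular deep learning models from the open source community. v1.0 supports 150+ models in total; the table below lists some of the key ones. See the deployment examples for more details.

| Task | Supported models |
| --- | --- |
| Image classification | ResNet/MobileNet/PP-LCNet/YOLOv5-Clas and other model series |
| Object detection | PP-YOLOE/PicoDet/RCNN/PP-YOLOE/YOLOv5/YOLOv6/YOLOv7/YOLOX/NanoDet and other model series |
| Semantic segmentation | PP-LiteSeg/PP-HumanSeg/DeepLabv3p/UNet and other model series |
| Image/video matting | PP-Matting/PP-Mattingv2/ModNet/RobustVideoMatting |
| OCR | PP-OCRv2/PP-OCRv3 |
| Video super-resolution | PP-MSVSR/BasicVSR/EDVR |
| Object tracking | PP-Tracking |
| Pose/keypoint recognition | PP-TinyPose/HeadPose-FSANet |
| Face alignment | PFLD/FaceLandmark1000/PIPNet and other model series |
| Face detection | RetinaFace/UltraFace/YOLOv5-Face/SCRFD and other model series |
| Face recognition | ArcFace/CosFace/PartialFC/VPL/AdaFace and other model series |
| Text-to-speech | PaddleSpeech streaming speech synthesis model |
| Semantic representation | PaddleNLP ERNIE 3.0 Tiny model series |
| Information extraction | PaddleNLP Universal Information Extraction (UIE) model |
| Text-to-image generation | Stable Diffusion |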

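High-Performance Serving Deployment

FastDeploy provides serving deployment built on Triton Inference Server. It supports high-performance serving of Paddle/ONNX models on different hardware and with different backends.

Automated Compression and Model Conversion

PaddleSlim Automated Compression

Based on PaddleSlim, FastDeploy provides a one-click quantization tool. The following command compresses and accelerates a model with near-lossless accuracy.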

fastdeploy compress --config_path=./configs/detection/yolov5s_quant.yaml \
                    --method='PTQ' --save_dir='./yolov5s_ptq_model/'  
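To give a feel for what post-training quantization (PTQ) does to a tensor, here is a minimal sketch of symmetric int8 quantization with an abs-max scale. This is an illustration only, not PaddleSlim's algorithm: real PTQ also calibrates activation ranges on sample data, and the function names `quantize_int8` and `dequantize` are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the int8 representation."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# Each restored value differs from the original by at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

The rounding error per value is bounded by `scale / 2`, which is why accuracy can stay close to the float model when the value range is well calibrated.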

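FastDeploy has completed adaptation testing of quantized models with the following backends:

| Hardware / Inference backend | ONNX Runtime | Paddle Inference | TensorRT | Paddle Inference TensorRT | Paddle Lite |
| --- | --- | --- | --- | --- | --- |
| CPU | Supported | Supported | - | - | Supported |
| GPU | - | - | Supported | Supported | - |
| RV1126 | - | - | - | - | Supported |

The accuracy and performance comparison for automated compression shows nearly lossless accuracy and a performance improvement of up to 400%.
(Image: accuracy and performance comparison table for automated compression)

For more details on one-click compression and how to use it, see FastDeploy One-Click Compression.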

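Model Conversion

To make it easier to deploy models from multiple frameworks, FastDeploy ships with built-in X2Paddle conversion. After installing FastDeploy, the following command converts a model so that it can be deployed with FastDeploy.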

fastdeploy convert --framework onnx --model yolov5s.onnx --save_dir yolov5s_paddle_model

For more usage, see FastDeploy Model Conversion.

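End-to-End Deployment Performance Optimization

Across all supported models, FastDeploy focuses on the end-to-end deployment experience and performance. Version 1.0 includes the following end-to-end optimizations:

  • On the server side, preprocessing steps are fused to reduce memory allocation overhead and computation
  • On mobile, FlyCV, a high-performance image processing library developed in-house by Baidu's Visual Technology Department, is integrated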
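As a rough illustration of what fusing preprocessing steps can mean, the sketch below folds a scale-to-[0,1] step and a mean/std normalization into a single multiply-add with precomputed coefficients. This is one common form of fusion, shown under the assumption that it resembles the server-side optimization; it is not FastDeploy's implementation, and the function names are hypothetical.

```python
def normalize_unfused(pixels, mean, std):
    """Three separate passes, each creating a temporary buffer."""
    scaled = [p / 255.0 for p in pixels]       # pass 1: scale to [0, 1]
    centered = [s - mean for s in scaled]      # pass 2: subtract the mean
    return [c / std for c in centered]         # pass 3: divide by the std


def normalize_fused(pixels, mean, std):
    """One pass: (p / 255 - mean) / std rewritten as p * alpha + beta."""
    alpha = 1.0 / (255.0 * std)
    beta = -mean / std
    return [p * alpha + beta for p in pixels]


pixels = [0, 128, 255]
unfused = normalize_unfused(pixels, mean=0.485, std=0.229)
fused = normalize_fused(pixels, mean=0.485, std=0.229)
```

Both functions produce the same values up to floating-point rounding, but the fused version allocates one output buffer instead of three and performs one multiply-add per pixel.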

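Combined with FastDeploy's multi-backend support, end-to-end performance of all models improves substantially over the original deployment code. Test data for some of these models is shown below.
(Image: end-to-end performance test data for selected models)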

1.0.0 Release Note

We are excited to announce the release of ⚡️FastDeploy 1.0.0! 🎉 FastDeploy supports high-performance end-to-end deployment of over 150 AI models from PaddlePaddle and the open source community on multiple hardware platforms.

Multiple Inference Backend and Hardware Support

FastDeploy supports inference deployment on multiple hardware platforms with different backends. Each backend module can be compiled and integrated as needed; to build it yourself, refer to the FastDeploy compilation documentation.

| Backend | Platform | Model Format | Supported Hardware in FastDeploy |
| --- | --- | --- | --- |
| Paddle Inference | Linux(x64)/Windows(x64) | Paddle | x86 CPU/NVIDIA GPU/GraphCore IPU |
| Paddle Lite | Linux(aarch64/armhf)/Android | Paddle | Arm CPU/Kunlun R200/RV1126 |
| Poros | Linux(x64)/Windows(x64) | TorchScript | x86 CPU/NVIDIA GPU |
| OpenVINO | Linux(x64)/Windows(x64)/OSX(x86) | Paddle/ONNX | x86 CPU/Intel GPU |
| TensorRT | Linux(x64/aarch64)/Windows(x64) | Paddle/ONNX | NVIDIA GPU/Jetson |
| ONNX Runtime | Linux(x64/aarch64)/Windows(x64)/OSX(x86/arm64) | Paddle/ONNX | x86 CPU/Arm CPU/NVIDIA GPU |
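The table above can be read as a lookup: given a platform, a model format, and target hardware, which backends are candidates? The sketch below transcribes the rows into plain Python data. The names `Backend`, `BACKENDS`, and `candidate_backends` are hypothetical helpers for illustration and are not part of the FastDeploy API. It also treats each row as a full cross product of platform and hardware, which the table itself does not guarantee.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    name: str
    platforms: frozenset
    formats: frozenset
    hardware: frozenset


# Rows transcribed from the release-note table; grouped platforms are expanded.
BACKENDS = [
    Backend("Paddle Inference",
            frozenset({"Linux(x64)", "Windows(x64)"}),
            frozenset({"Paddle"}),
            frozenset({"x86 CPU", "NVIDIA GPU", "GraphCore IPU"})),
    Backend("Paddle Lite",
            frozenset({"Linux(aarch64)", "Linux(armhf)", "Android"}),
            frozenset({"Paddle"}),
            frozenset({"Arm CPU", "Kunlun R200", "RV1126"})),
    Backend("Poros",
            frozenset({"Linux(x64)", "Windows(x64)"}),
            frozenset({"TorchScript"}),
            frozenset({"x86 CPU", "NVIDIA GPU"})),
    Backend("OpenVINO",
            frozenset({"Linux(x64)", "Windows(x64)", "OSX(x86)"}),
            frozenset({"Paddle", "ONNX"}),
            frozenset({"x86 CPU", "Intel GPU"})),
    Backend("TensorRT",
            frozenset({"Linux(x64)", "Linux(aarch64)", "Windows(x64)"}),
            frozenset({"Paddle", "ONNX"}),
            frozenset({"NVIDIA GPU", "Jetson"})),
    Backend("ONNX Runtime",
            frozenset({"Linux(x64)", "Linux(aarch64)", "Windows(x64)",
                       "OSX(x86)", "OSX(arm64)"}),
            frozenset({"Paddle", "ONNX"}),
            frozenset({"x86 CPU", "Arm CPU", "NVIDIA GPU"})),
]


def candidate_backends(platform, model_format, hardware):
    """Return names of backends whose table row lists all three values."""
    return [b.name for b in BACKENDS
            if platform in b.platforms
            and model_format in b.formats
            and hardware in b.hardware]


onnx_on_gpu = candidate_backends("Linux(x64)", "ONNX", "NVIDIA GPU")
```

For an ONNX model on a Linux x64 machine with an NVIDIA GPU, the lookup yields TensorRT and ONNX Runtime, matching a manual reading of the table.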

In addition, FastDeploy supports deploying models on the web and in mini programs through Paddle.js; see Web Deployment for more details.

AI Model End-to-end Inference Support

FastDeploy supports end-to-end deployment of the following PaddlePaddle model suites:

In addition, FastDeploy supports the deployment of popular deep learning models from the open source community. Over 150 models are supported in release 1.0; the table below shows some of the key ones. Refer to the deployment examples for more details.

| Task | Supported Models |
| --- | --- |
| Classification | ResNet/MobileNet/PP-LCNet/YOLOv5-Clas and other series models |
| Object Detection | PP-YOLOE/PicoDet/RCNN/PP-YOLOE/YOLOv5/YOLOv6/YOLOv7/YOLOX/NanoDet and other series models |
| Segmentation | PP-LiteSeg/PP-HumanSeg/DeepLabv3p/UNet and other series models |
| Image/Video Matting | PP-Matting/PP-Mattingv2/ModNet/RobustVideoMatting |
| OCR | PP-OCRv2/PP-OCRv3 |
| Video Super-Resolution | PP-MSVSR/BasicVSR/EDVR |
| Object Tracking | PP-Tracking |
| Posture/Key-point Recognition | PP-TinyPose/HeadPose-FSANet |
| Face Alignment | PFLD/FaceLandmark1000/PIPNet and other series models |
| Face Detection | RetinaFace/UltraFace/YOLOv5-Face/SCRFD and other series models |
| Face Recognition | ArcFace/CosFace/PartialFC/VPL/AdaFace and other series models |
| Text-to-Speech | PaddleSpeech Streaming Speech Synthesis Model |
| Semantic Representation | PaddleNLP ERNIE 3.0 series models |
| Information Extraction | PaddleNLP Universal Information Extraction UIE model |
| Content Generation | Stable Diffusion |

High Performance Serving Deployment

⚡️FastDeploy provides hig...


FastDeploy 0.8.0

22 Nov 09:54
38e9645

0.8.0 Release Note

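  • Added deployment support for the PIPNet and FaceLandmark1000 face alignment models (Details)
  • Added the video super-resolution model series PP-MSVSR, EDVR, and BasicVSR (Details)
  • Upgraded the YOLOv7 deployment code with batch prediction support #611
  • Added a UIE serving deployment example (Details)
  • Fixed the Cosine Similarity calculation in the ArcFace example code #648
  • [Experimental] Added a Device setting for the OpenVINO backend, enabling integrated and discrete GPUs #472
  • Added Android APK projects and examples for image classification, object detection, semantic segmentation, OCR, and face detection

| Image Classification | Object Detection | Semantic Segmentation | OCR | Face Detection |
| --- | --- | --- | --- | --- |
| Project code | Project code | Project code | Project code | Project code |
| Scan the code or click the link to install and try it | Scan the code or click the link to install and try it | Scan the code or click the link to install and try it | Scan the code or click the link to install and try it | Scan the code or click the link to install and try it |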

0.8.0 Release Note

  • Support deployment of the PIPNet and FaceLandmark1000 face alignment models (Details)
  • Support the video super-resolution model series PP-MSVSR, EDVR, and BasicVSR (Details)
  • Upgrade the YOLOv7 deployment code to add batch_predict deployment #611
  • Support UIE serving deployment (Details)
  • Fix a bug in the Cosine Similarity calculation in the ArcFace sample code #648
  • [Experimental] Support a Device setting for the OpenVINO backend, enabling integrated and discrete graphics cards #472
  • Support Android APK projects and examples for image classification, object detection, semantic segmentation, OCR, and face detection

| Image Classification | Object Detection | Semantic Segmentation | OCR | Face Detection |
| --- | --- | --- | --- | --- |
| Project Code | Project Code | Project Code | Project Code | Project Code |
| Scan the code or click on the link to install and try out | Scan the code or click on the link to install and try out | Scan the code or click on the link to install and try out | Scan the code or click on the link to install and try out | Scan the code or click on the link to install and try out |
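For readers unfamiliar with the metric mentioned in the ArcFace fix, cosine similarity compares two face embeddings by the angle between them rather than by their magnitudes. The sketch below is a generic reference implementation; it does not reproduce the change made in #648, whose details are not given in these notes, and `cosine_similarity` is a hypothetical helper name.

```python
import math


def cosine_similarity(a, b):
    """Return dot(a, b) / (|a| * |b|) for two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

A common mistake is to skip the division by both norms, which silently turns the result into a plain dot product unless the embeddings were normalized beforehand.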

New Contributors

Full Changelog: release/0.7.0...release/0.8.0

FastDeploy 0.7.0 Release Note

16 Nov 02:31
2f73857

0.7.0 Release Note

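  • Integrated Paddle Lite TIM-VX, supporting deployment on chips such as the Rockchip RV1126 (Details)
  • Added RKNPU2 deployment support for the SCRFD face detection model (Deployment example)
  • Added a Stable Diffusion model deployment example (Deployment example)
  • Upgraded the PaddleClas, PaddleDetection, and YOLOv5 deployment code to support predict and batch_predict
  • Added support for converting Paddle models larger than 2 GB to ONNX for deployment
  • Added a PaddleClas model serving deployment example (Deployment example)
  • Added the Pad function operator for FDTensor, which pads inputs during batch prediction
  • Added the Python API to_dlpack for FDTensor, enabling zero-copy transfer of FDTensor between frameworks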

0.7.0 Release Note

  • Integrate Paddle Lite TIM-VX to support hardware such as the Rockchip RV1126 (Details).
  • Support the SCRFD face detection model on Rockchip RK3588, RK3568, and other hardware.
  • Support Stable Diffusion model deployment.
  • Upgrade the PaddleClas, PaddleDetection, and YOLOv5 deployment code to support predict and batch_predict.
  • Support converting Paddle models larger than 2 GB to ONNX for deployment.
  • Support PaddleClas model serving deployment.
  • Add the Pad function operator for FDTensor to support padding the input during batch prediction.
  • Add the Python API to_dlpack for FDTensor to support zero-copy transfer of FDTensor between frameworks.
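Padding matters for batch prediction because a batch tensor needs every sample to share one shape. The sketch below pads 2-D inputs of different sizes on the bottom and right up to the largest height and width in the batch. It uses plain nested lists and a hypothetical helper `pad_to_common_shape`; it is not the FDTensor Pad operator, whose signature is not described in these notes.

```python
def pad_to_common_shape(images, fill=0):
    """Pad each 2-D image (list of rows) to the batch's max height and width."""
    max_h = max(len(img) for img in images)
    max_w = max(len(row) for img in images for row in img)
    batch = []
    for img in images:
        # Extend every existing row on the right, then append full filler rows.
        rows = [list(row) + [fill] * (max_w - len(row)) for row in img]
        rows += [[fill] * max_w for _ in range(max_h - len(img))]
        batch.append(rows)
    return batch


images = [
    [[1, 2],
     [3, 4]],        # 2 x 2
    [[5, 6, 7]],     # 1 x 3
]
batch = pad_to_common_shape(images)
```

After padding, both samples are 2 x 3, so they can be stacked into a single batch. The original inputs are left unmodified.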

New Contributors

Full Changelog: release/0.6.0...release/0.7.0

FastDeploy 0.6.0 Release Note

08 Nov 12:32
c7277ef

0.6.0 Release Note

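Models

  • Added the FSANet head pose recognition model (Details)
  • Added the PFLD face alignment model (Details)
  • Added track visualization to the PP-Tracking model (Details)
  • Added the ERNIE text classification model (Details)

Serving Deployment

  • FastDeploy Runtime adds a Clone interface, reducing memory and GPU memory usage of the Paddle Inference, TensorRT, and OpenVINO backends when running multiple instances

Edge Deployment

  • Added RKNPU2(3588) deployment support (Details)

Performance Optimization

  • Optimized memory allocation logic in preprocessing and postprocessing for the YOLO series, PaddleClas, and PaddleDetection
  • Fused visual preprocessing operations to speed up PaddleClas and PaddleDetection preprocessing
  • Integrated the TensorRT BatchedNMSDynamic_TRT plugin to improve TensorRT end-to-end deployment performance

Others

  • Fixed several documentation issues
  • Added FastDeploy Runtime C++ usage examples (Details)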

0.6.0 Release Note

Model

  • Support FSANet head pose recognition model Details
  • Support PFLD face alignment model Details
  • PP-Tracking model adds track visualisation Details
  • Support ERNIE text classification model Details

Service-based Deployment

  • FastDeploy Runtime adds a Clone interface for serving deployment, reducing memory and GPU memory usage of the Paddle Inference, TensorRT, and OpenVINO backends when running multiple instances.

Edge Deployment

  • Support RKNPU2(3588) deployment (Details).

Performance Optimisation

  • Optimize memory allocation logic in preprocessing and postprocessing for the YOLO series, PaddleClas, and PaddleDetection.
  • Fuse visual preprocessing operations to speed up PaddleClas and PaddleDetection preprocessing and improve end-to-end performance.
  • Integrate the TensorRT BatchedNMSDynamic_TRT plugin to improve the performance of TensorRT end-to-end deployments.
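Non-maximum suppression (NMS) is the detection postprocessing step that the plugin named above runs inside the TensorRT engine. For orientation, here is a plain single-class CPU reference of greedy NMS. It only illustrates the technique; it is not the plugin's implementation, and the helper names `iou` and `nms` are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping a kept one."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

The first two boxes overlap with an IoU of 81/119, so the lower-scoring one is suppressed and indices 0 and 2 are kept. Running this step on the GPU avoids copying all raw candidate boxes back to the host.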

Others

  • Fix several documentation issues
  • Add FastDeploy Runtime C++ usage examples (Details)

New Contributors

Full Changelog: release/0.4.0...release/0.6.0

FastDeploy 0.5.0

31 Oct 12:45
e2dd8e2

What's Changed

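Backends

  • Added inference support through Paddle Inference TensorRT
  • Added inference support on IPU hardware through Paddle Inference
  • Resolved native TensorRT's lack of support for INT64 input and output data
  • Added multi-stream support to the ONNX Runtime, Paddle Inference, and TensorRT backends

Models

  • Added the PP-Tracking tracking model (Example)
  • Added the RobustVideoMatting video model (Example)
  • Added documentation on the development workflow for integrating models into FastDeploy (Documentation)

Others

  • Fixed PP-Matting prediction with non-fixed input shapes
  • Fixed the Python visualization function for semantic segmentation models
  • Fixed the usage documentation for some models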

New Contributors

Full Changelog: release/0.4.0...release/0.5.0

FastDeploy 0.4.0

23 Oct 06:24
9bc5b11

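Version 0.4.0 adds Android mobile deployment support!

What's Changed

Mobile Deployment

  • Added the FastDeploy Android C++ inference library, supporting the arm64-v8a and armeabi-v7a architectures; see Prebuilt Library Download
  • Added Android deployment for the PicoDet object detection model; see the example
  • Added Android deployment for the PaddleClas image classification model series; see the example

Models

  • Optimized end-to-end GPU deployment performance for YOLOv5/6/7: with GPU preprocessing enabled via YOLOv5::UseCudaPreprocessing(), performance on a T4 GPU (TensorRT) improves by 30%~50%; see PR #370
  • Added 7 web js deployment examples; see the js deployment examples
  • Added deployment support for TinyPose and for the chained PicoDet + TinyPose pipeline; see the example
  • Added deployment support for the Torch Vision ResNet model series; see the example
  • Renamed PPOCRSystemv2 & PPOCRSystemv3 to PPOCRv2 & PPOCRv3
  • Improved the warning messages of some PaddleSeg & PaddleOCR models

Serving Deployment

  • Added serving deployment for the TTS (text-to-speech) model; see the example
  • Added ERNIE 3.0 serving deployment; see the example
  • Fixed a core dump issue in the CPU serving deployment image

Inference Backends

  • Added the EnablePinedMemory interface for GPU deployment, which lets Paddle Inference and TensorRT use pinned memory during inference to speed up data transfer from GPU to CPU; see PR #403

Documentation (still being improved)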

New Contributors

Full Changelog: release/0.3.0...release/0.4.0

FastDeploy v0.3.0

15 Oct 14:02
3ff562a

What's Changed

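Models

  • Added deployment support for PaddleSeg's PP-ModNet and PP-HumanMatting (Deployment example)
  • Added deployment support for the YOLOv5-Classification model (Deployment example)

Quantization Acceleration

  • Provides a one-click quantization tool based on PaddleSlim, delivering multi-fold deployment speedups on CPU/GPU (Details)
  • Supports one-click quantization acceleration for the YOLO series and the PaddleClas image classification model series (Details)

Build

  • Supports building against OpenCV, OpenVINO, and ONNX Runtime dependencies at user-specified custom paths
  • Added build support for the OpenVINO backend on Mac x86
  • Added Paddle-Lite backend support on Arm
  • Supports building and installing on Jetson (Reference documentation)

Serving Deployment

  • Released FastDeploy-Triton CPU/GPU serving deployment images, supporting high-performance multi-backend serving of Paddle/ONNX models (Details)
  • Added a YOLOv5 serving deployment example (Details)

Code Improvements

  • Fixed Predict modifying the input image passed to the model
  • Added an interface for setting max_workspace_size on the TensorRT backend
  • Improved the messages shown for PaddleSeg deployment models under dynamic shapes
  • Fixed a failure to load serialized TensorRT files on Windows
  • Added fastdeploy_init.sh and fastdeploy_init.bat to help developers quickly import the FastDeploy dependency libraries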

New Contributors

Full Changelog: release/0.2.1...release/0.3.1

FastDeploy v0.2.1

17 Sep 14:49
79c3dcc

What's Changed

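Models

Inference Backends

  • Added the OpenVINO inference backend; thanks to support from the OpenVINO team, most Paddle models can now use OpenVINO for accelerated inference on CPU
  • Improved the TensorRT experience: manually calling SetTrtInputShape to set the input range is no longer needed, because the range is now set dynamically during inference by default
    See the document How to Switch Inference Backends for more details

Usability

  • Added usage documentation covering building, SDK usage, and more
  • Fixed several usability issues when building and using FastDeploy on Windows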

New Contributors

Full Changelog: release/0.2.0...release/0.2.1

FastDeploy v0.2.0

18 Aug 13:49
b9b733b

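Multiple Inference Backend Support

  • Integrated the Paddle Inference, ONNX Runtime, and TensorRT backends, with automatic selection of the best inference backend for a given model.
  • Supports building from source for more flexible backend selection; see the FastDeploy compilation documentation

More Vision Model Support

Documentation Improvements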

FastDeploy v0.1.0

27 Jun 11:59

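⚡️FastDeploy v0.1.0 beta released! 🎉
💎 Released SDKs supporting 40 key models across 8 key software and hardware environments
😊 Available for download in two ways: from the web page or as a pip package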