Release v0.7.0

TeslaZhao released this 16 Nov 15:02

e734712

新特性

集成Intel MKLDNN加速推理 #1264，#1266, #1277
C++ Serving支持HTTP 请求 #1321
C++ Serving支持gRPC 和HTTP + Proto请求 #1345
新增C++ Client SDK #1370

性能优化

C++ Serving优化Pybind传递数据方法 #1268, #1269
C++ Serving增加GPU多流、异步任务队列，删除冗余加锁 #1289
C++ Serving WebServer使用连接池和数据压缩 #1348
C++ Serving框架新增异步批量合并，支持变长LOD输入 #1366
C++ Serving stage并发执行 #1376
C++ Serving增加各阶段处理耗时日志 #1390

功能变更

重写模型保存方案和命名规则，兼容旧版本 #1354，#1358
支持更多数据类型float64，int16, float16, uint16, uint8, int8, bool , complex64 , complex128 #1338
重写GPU id设置device的逻辑 #1303
指定Fetch list返回部分推理结果 #1359
设置XPU ID #1436
服务优雅关闭 #1470
C++ Serving Client端pybind支持uint8、int8数据读写 #1378
C++ Serving Client端pybind支持uint16、int16数据读写 #1420
C++ Serving支持异步参数设置 #1483
Python Pipeline增加While OP控制循环 #1338
Python pipeline之间可使用gRPC交互 #1358
Python Pipeline 支持Proto结构Tensor数据格式交互 #1369， #1384
Python Pipeline仅获取最快的前置OP结果 #1380
Python Pipeline 支持LoD类型输入 #1472
Cube服务新增python http方式请求样例 #1399
Cube服务增加读取RecordFile工具 #1336
Cube-server和Cube-transfer上线部署优化 #1337
删除multi-lang相关代码 #1321

文档和示例变更

修改Doc目录结构，新增子目录 #1473, #1475
迁移Serving/python/examples路径到Serving/examples，重新设计目录 #1487
修改doc文件名称 #1487
新增C++ Serving Benchmark #1176
新增PaddleClas/DarkNet 加密模型部署示例 #1352
新增Model Zoo文档 #1492
新增Install文档 #1473
新增Quick_Start文档 #1473
新增Serving_Configure文档 #1495
新增C++_Serving/Inference_Protocols_CN.md #1500
新增C++_Serving/Introduction_CN.md #1497
新增C++_Serving/Performance_Tuning_CN.md #1497
新增Python_Pipeline/Performance_Tuning_CN.md #1503
更新Java SDK文档 #1357
更新Compile文档 #1502
更新Readme文档 #1473
更新Latest_Package_CN.md #1513
更新Run_On_Kubernetes_CN.md #1520

Bug修复

修复内存池使用问题 #1283
修复多线程中错误加锁问题 #1289
修复C++ Serving多模型组合场景，无法加载第二个模型问题 #1294
修复请求数据大时越界问题 #1308
修复Detection模型结果偏离问题 #1413
修复use_calib设置错误问题 #1414
修复C++ OCR示例结果不正确问题 #1415
修复并行推理出core问题 #1417

For English:

New Features

Integrate Intel MKLDNN to accelerate inference #1264, #1266, #1277
C++ Serving supports HTTP requests #1321
C++ Serving supports gRPC and HTTP + Proto requests #1345
Add C++ Client SDK #1370

Performance optimization

C++ Serving optimizes how data is passed through Pybind #1268, #1269
C++ Serving adds GPU multi-stream support and an asynchronous task queue, and removes redundant locking #1289
C++ Serving WebServer uses a connection pool and data compression #1348
C++ Serving framework adds asynchronous batch merging and supports variable-length LoD input #1366
C++ Serving executes stages concurrently #1376
C++ Serving logs the processing time of each stage #1390
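
As a rough illustration of the batch-merge item above (#1366), the sketch below shows one way variable-length inputs could be merged into a single flat batch with LoD-style offsets and split back afterwards. It is a minimal Python sketch of the general idea, not the C++ Serving implementation; `merge_with_lod` and `split_by_lod` are hypothetical names introduced here.

```python
from typing import List, Sequence, Tuple


def merge_with_lod(requests: Sequence[Sequence[float]]) -> Tuple[List[float], List[int]]:
    """Concatenate variable-length inputs and record LoD-style offsets.

    Hypothetical helper: offsets[i]..offsets[i + 1] marks request i
    inside the flat batch.
    """
    flat: List[float] = []
    offsets: List[int] = [0]
    for req in requests:
        flat.extend(req)
        offsets.append(len(flat))
    return flat, offsets


def split_by_lod(flat: Sequence[float], offsets: Sequence[int]) -> List[List[float]]:
    """Recover the per-request slices from the flat batch."""
    return [list(flat[offsets[i]:offsets[i + 1]]) for i in range(len(offsets) - 1)]


# Three requests of different lengths merged into one batch.
batch, lod = merge_with_lod([[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]])
print(lod)  # [0, 2, 3, 6]
```

The offsets let a server run one inference call over the merged batch and still return each caller only its own slice.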

Function changes

Rewrite the model saving scheme and naming rules while remaining compatible with older versions #1354, #1358
Support more data types: float64, int16, float16, uint16, uint8, int8, bool, complex64, complex128 #1338
Rewrite the logic for setting the device from the GPU id #1303
Specify a fetch list to return only part of the inference results #1359
Support setting the XPU ID #1436
Support graceful service shutdown #1470
C++ Serving Client pybind supports reading and writing uint8 and int8 data #1378
C++ Serving Client pybind supports reading and writing uint16 and int16 data #1420
C++ Serving supports asynchronous parameter setting #1483
Python Pipeline adds a While OP for loop control #1338
Python pipelines can interact with each other over gRPC #1358
Python Pipeline supports exchanging Tensor data in Proto structure format #1369, #1384
Python Pipeline can take only the result of the fastest preceding OP #1380
Python Pipeline supports LoD type input #1472
Cube service adds a Python HTTP request example #1399
Cube service adds a tool for reading RecordFile #1336
Optimize online deployment of Cube-server and Cube-transfer #1337
Delete multi-lang related code #1321
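
To make the "fastest preceding OP" item above (#1380) concrete, here is a minimal sketch of the general pattern: run several predecessors concurrently and keep whichever result arrives first. It uses only the Python standard library and is not the Python Pipeline API; `slow_op`, `fast_op`, and `first_result` are hypothetical names introduced here.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def slow_op(x):
    # Hypothetical predecessor that takes longer to respond.
    time.sleep(0.5)
    return ("slow", x)


def fast_op(x):
    # Hypothetical predecessor that responds quickly.
    time.sleep(0.01)
    return ("fast", x)


def first_result(ops, x):
    """Run all predecessor ops concurrently and return the first finished result."""
    pool = ThreadPoolExecutor(max_workers=len(ops))
    futures = [pool.submit(op, x) for op in ops]
    done, _ = wait(futures, return_when=FIRST_COMPLETED)
    pool.shutdown(wait=False)  # do not block on the slower ops
    return next(iter(done)).result()


result = first_result([slow_op, fast_op], 7)
print(result)
```

In this sketch the slower ops still run to completion in the background; their results are simply ignored.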

Documentation and example changes

Restructure the doc directory and add subdirectories #1473, #1475
Move Serving/python/examples to Serving/examples and redesign the directory layout #1487
Rename doc files #1487
Add C++ Serving Benchmark #1176
Add a PaddleClas/DarkNet encrypted model deployment example #1352
Add Model Zoo doc #1492
Add Install doc #1473
Add Quick Start doc #1473
Add Serving Configure doc #1495
Add C++_Serving/Inference_Protocols_CN.md #1500
Add C++_Serving/Introduction_CN.md #1497
Add C++_Serving/Performance_Tuning_CN.md #1497
Add Python_Pipeline/Performance_Tuning_CN.md #1503
Update Java SDK doc #1357
Update Compile doc #1502
Update Readme doc #1473
Update Latest_Package_CN.md #1513
Update Run_On_Kubernetes_CN.md #1520

Bug fixes

Fix a memory pool usage problem #1283
Fix incorrect locking in multi-threaded code #1289
Fix failure to load the second model in C++ Serving multi-model combination scenarios #1294
Fix an out-of-bounds problem when the request data is large #1308
Fix deviation in Detection model results #1413
Fix the wrong setting of use_calib #1414
Fix incorrect results in the C++ OCR example #1415
Fix a core dump in parallel inference #1417
