Releases: open-mmlab/mmdeploy
MMDeploy Release V0.12.0
Features
- Support Torch JIT Modulated Deformable Conv (#1508)
- Support TorchAllocator as TensorRT GPU memory allocator (#1493)
- Support TVM backend (#1216)
- Support probability output for segmentation (#1379)
Improvements
- Add pip source in dockerfile (#1492)
- Reformat multi-line logs and strings (#1489)
- Refactor backend manager (#1475, #1522, #1540)
- Add stale workflow to check issue and PR (#1504, #1510)
- Update ppl.nn v0.9.1 and ppl.cv v0.7.1 (#1356)
- Add is_batched argument to pipeline.json (#1560)
- Build monolithic SDK by default (#1577)
Bug fixes
- Fix conversion and inference support for torch 1.13 (#1488)
- Remove cudnn dependency for transform 'mmaction2::format_shape' (#1509)
- Add build-arch option to build script (#1530)
- Fix 'mmaction2::transpose.cu' build failed on cuda-10.2 (#1539)
- Fix 'cannot seek vector iterator' in debug windows build (#1555)
- Fix ops unittest seg-fault error (#1556)
Document
- Add mmaction2 sphinx-doc link (#1541)
- Update FAQ about copying onnxruntime dll to 'mmdeploy/lib' (#1554)
- Update support_new_backend.md (#1574)
Contributors
@PeterH0323 @grimoire @RunningLeon @irexyc @ouonline @tpoisonooo @antoszy @BuxianChen @AllentDan @lzhangzz @hanrui1sensetime
MMDeploy Release V1.0.0rc0
We are excited to announce the release of MMDeploy 1.0.0rc0. MMDeploy 1.0.0rc0 is the first version of MMDeploy 1.x, a part of the OpenMMLab 2.0 projects. Up to the release, MMDeploy 1.x supports OpenMMLab 2.0 based projects: MMCls 1.x, MMDet 3.x, MMDet3d 1.x, MMSeg 1.x, MMEdit 1.x, MMOCR 1.x, MMPose 1.x, MMAction2 1.x.
Features
- Support mmaction2 (#1012)
- Support SimCC from mmpose (#1187)
- Support RTMDet from MMDet (#1104)
- Support CenterNet from MMDet (#1219)
- Support MobileOne from MMCls (#1268)
- Support external usage of MMYOLO (#1088)
Improvements
- Update dockerfiles (#1296)
Bug fixes
Document
Contributors
@xin-li-67 @liu-mengyang @doufengqi @PeterH0323 @triple-Mu @MambaWong @isLinXu @francis0407 @sanbuphy @vansin @SsTtOoNnEe @RangiLyu @lvhan028 @grimoire @AllentDan @RunningLeon @lzhangzz @tpoisonooo @hanrui1sensetime
MMDeploy Release V0.11.0
Features
- Support MMAction2 `TSN` and `SlowFast` deployment with ONNXRuntime and TensorRT (#1183, #1410, #1455)
- Support Rockchip device `RV1126`
- Add SDK profiler (#1274)
- Support end2end deployment for pointpillars & centerpoint (pillar) from MMDet3d (#1178)
Improvements
- Support loading TensorRT libnvinfer plugins (#1275)
- Avoid copying dense arrays in SDK C API and Python API (#1261, #1349)
- Add Core ML common configuration (#1308)
- Refactor SDK registry (#1368)
- Update regression test to serialize eval result into json (#1310)
- Support onnxruntime-1.13 API (#1407)
- Decouple preprocess operation and transformation (#1353)
Bug fixes
- Set stream argument when using async memcpy (#1314)
- Use OpenCV with `videoio` enabled for aarch64 platform (#1343)
- Fix(tools/scripts): find env file failed (#1385)
- Fix ncnn-int8 config path (#1380)
- Fix out-of-boundary issue in SDK when `topk` is larger than `class_num` (#1420)
- Fix yolohead trt8.2 (#1433)
- Fix `pad_to_square` (#1436)
- Fix `det_pose` demo (#1419)
- Relax module adapter template constraints (#1366)
- Fix ncnn torch 1.12 master (#1430)
- Avoid gpu topk const-fold (#1439)
- Support .NET Framework 4.8 and fix batch inference error (#1370)
- Upgrade ncnn to `20221128` to resolve build error (#1459)
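The `topk` out-of-boundary fix (#1420) concerns a request for more top results than the model has classes. The sketch below illustrates the general guard in Python; it is not the SDK's C++ code, and `top_k_labels` is a hypothetical helper introduced only for illustration.

```python
def top_k_labels(scores, topk):
    """Return (label, score) pairs for the k highest scores.

    k is clamped to the number of classes so that asking for more results
    than there are classes can never read past the end of the score array.
    Python slicing would clamp silently; the explicit min() mirrors the
    check a lower-level implementation has to make.
    """
    class_num = len(scores)
    k = max(0, min(topk, class_num))
    ranked = sorted(range(class_num), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in ranked[:k]]


scores = [0.1, 0.7, 0.2]        # output of a 3-class classifier
print(top_k_labels(scores, 5))  # topk (5) exceeds class_num (3)
```

With the clamp in place, an oversized `topk` simply yields every class once, ordered by score.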
Document
- Add more images for demos and user guides (#1339)
- Improve mmdet3d doc (#1394)
- Display CI results in README (#1452)
- Fix dead links in `write_config.md` (#1396)
Contributors
@xin-li-67 @sunjiahao1999 @francis0407 @Typiqally @triple-Mu @lvhan028 @grimoire @AllentDan @RunningLeon @lzhangzz @tpoisonooo @hanrui1sensetime
MMDeploy Release V0.10.0
Features
- Support Monocular 3D Detection and FCOS3D Deployment (#1047)
- Support MMEdit `EDSR` deployment with ncnn-int8 (#1111)
- Rewrite Conv2dAdaptiveOps to support EfficientNet deployment (#1045)
- Add installation scripts for Jetson Orin (#1105)
- Support aarch64 cross compiler (#1126)
Improvements
- Support Fast-SCNN deployment with ncnn backend (#1094)
- Ease rewriter import (#1166)
- Support TensorRT 8.4 (#1144)
- Remove extra domains after model extraction (#1207)
- Add batch inference demos (#986)
- Update symbolic rewriter for latest PyTorch API (#1122)
- Detect filesystem library in CMake (#1190)
- Compute per-sample statistics when profiling in batch mode (#1158)
- Add a device field for `mmdeploy_mat_t` (#1176)

  Before v0.10.0:

      typedef struct mmdeploy_mat_t {
        uint8_t* data;
        int height;
        int width;
        int channel;
        mmdeploy_pixel_format_t format;
        mmdeploy_data_type_t type;
      } mmdeploy_mat_t;

  In v0.10.0:

      typedef struct mmdeploy_mat_t {
        uint8_t* data;
        int height;
        int width;
        int channel;
        mmdeploy_pixel_format_t format;
        mmdeploy_data_type_t type;
        mmdeploy_device_t device;
      } mmdeploy_mat_t;
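Because the new `device` member changes the size of `mmdeploy_mat_t`, code built against the older header needs to be recompiled. The following is a minimal sketch that mirrors both layouts with Python's `ctypes` to make the difference visible. It rests on assumptions: the two enum fields are modelled as C `int`, the device handle is modelled as an opaque pointer, and the class names `MatBefore` and `MatV010` are hypothetical.

```python
import ctypes


class MatBefore(ctypes.Structure):
    # Layout before v0.10.0. The enum fields are modelled as C ints (assumption).
    _fields_ = [
        ("data", ctypes.POINTER(ctypes.c_uint8)),
        ("height", ctypes.c_int),
        ("width", ctypes.c_int),
        ("channel", ctypes.c_int),
        ("format", ctypes.c_int),
        ("type", ctypes.c_int),
    ]


class MatV010(ctypes.Structure):
    # Layout in v0.10.0: the same fields followed by the device handle,
    # modelled here as an opaque pointer (assumption).
    _fields_ = MatBefore._fields_ + [("device", ctypes.c_void_p)]


# The struct grows, so binaries compiled against the old header no longer match.
print("size before v0.10.0:", ctypes.sizeof(MatBefore))
print("size in v0.10.0:    ", ctypes.sizeof(MatV010))
print("offset of device:   ", MatV010.device.offset)

# A zero-initialised instance leaves the new field as a null handle.
print("default device:     ", MatV010().device)
```

The exact sizes depend on the platform's pointer width and alignment, so the sketch prints them rather than assuming particular values.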
Bug fixes
- Fix `test_windows_onnxruntime` workflow error in circleci (#1254)
- Fix build error when the target device is 'cuda' and the inference backend is 'onnxruntime-gpu' (#1253)
- Fix `layer_norm` symbol error when exporting it with torch>=1.12 (#1168)
- Fix regression test script errors (#1217, #1146)
Document
- Update supported backend logos in the cover of README (#1252)
- Add a link to MMYOLO in README (#1235)
Contributors
@doufengqi @Qingrenn @liu-mengyang @SsTtOoNnEe @OldDreamInWind @sunjiahao1999 @LiuYi-Up @isLinXu @lansfair @lvhan028 @grimoire @AllentDan @RunningLeon @lzhangzz @tpoisonooo @hanrui1sensetime
MMDeploy Release V0.9.0
Features
- Add Rust API for mmdeploy SDK. Project: https://github.com/liu-mengyang/rust-mmdeploy
- Support MMOCR TextSnake and MMPose Hourglass model deployment with ncnn-int8 (#1074, #1064, #1066)
- Rewrite `torch.Tensor.__mod__` to support TensorRT (#1024)
Improvements
- Separate C++ API demos from C API demos (#1099)
- Refactor SDK pipeline (#938)
- Check upstream libopencv-dev version before adding apt repository (#1068)
- Make inference still available on headless device (#1041)
- Validate installation in building scripts (#1036)
Bug fixes
- Set `size_divisor` of `Pad` transform to `1` for static shape model (#1049)
- Fix `LayerNorm` shape issue when exporting to onnx with `torch <= 1.12` (#1015)
- Fix calibration error when converting model to TensorRT-int8 (#1050)
- Synchronize cuda stream after inference with onnxruntime-gpu (#1053)
- Add `GatherTopk` TensorRT plugin as a workaround to fix dynamic shape issue (#1033)
- Fix `RoiAlignFunction` error for CoreML (#1029)
- Resolve two-stage detector deployment error with CoreML (#1044)
- Fix two-stage detector TensorRT deployment error with dynamic shape (#1046)
Document
- Update supported backends table in README (#1109)
- Correct examples in tutorial - `how to develop TensorRT plugin` (#1021)
- Fix broken links and typos (#1078, #1025, #1061)
Contributors
@liu-mengyang @BrokenArrow1404 @jinwonkim93 @Qingrenn @JingweiZhang12 @ichitaka @Typiqally @lvhan028 @irexyc @tpoisonooo @lzhangzz @grimoire @AllentDan @hanrui1sensetime
MMDeploy Release V0.8.0
Highlight
- Support more platforms and devices: `RISC-V`, `Apple M1`, `Huawei Ascend310` and `Rockchip RK3588`
Features
- Support more models on ONNX Runtime and TensorRT
- Support more platforms and devices:
- Add `TorchScript` SDK inference backend (#890)
- Experimental support for fusing transformations in preprocess pipeline by CVFusion (#741)
Improvements
- Support multi-label classification in SDK (#950)
- Add the following scripts to simplify mmdeploy installation for some scenarios: (#919)

  | script | OS version |
  | --- | --- |
  | build_ubuntu_x64_ncnn.py | 18.04/20.04 |
  | build_ubuntu_x64_ort.py | 18.04/20.04 |
  | build_ubuntu_x64_pplnn.py | 18.04/20.04 |
  | build_ubuntu_x64_torchscript.py | 18.04/20.04 |

- Add scaled dot-product attention operator for TensorRT (#949)
- Support model batch inference profiling (#868)

      # profile the latency of resnet18-tensorrt model with batch size 4
      python tools/profiler.py \
          configs/mmcls/classification_tensorrt_dynamic-224x224-224x224.py \
          ../mmclassification/configs/resnet/resnet18_8xb32_in1k.py \
          {/the/path/of/an/image/directory} \
          --model {work-dirs}/mmcls/resnet/trt/end2end.engine \
          --device cuda \
          --shape 224x224 \
          --num-iter 100 \
          --warmup 10 \
          --batch-size 4
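When profiling with a batch size greater than 1, each measured latency covers a whole batch, so per-image figures have to be derived from it. The sketch below shows one way to do that conversion. It is an illustration under stated assumptions, not code taken from `tools/profiler.py`; the function name `per_sample_stats` and the returned keys are hypothetical.

```python
from statistics import mean


def per_sample_stats(batch_latencies_ms, batch_size):
    """Convert per-batch latencies (milliseconds) into per-sample figures."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    if not batch_latencies_ms:
        raise ValueError("at least one measurement is required")
    # Each measurement covers batch_size images, so divide to get per-image cost.
    per_sample = [t / batch_size for t in batch_latencies_ms]
    avg = mean(per_sample)
    return {
        "mean_ms": avg,
        "min_ms": min(per_sample),
        "max_ms": max(per_sample),
        "fps": 1000.0 / avg,  # images per second at the mean latency
    }


# Three timed batches of 4 images each (made-up numbers for illustration).
print(per_sample_stats([8.0, 8.4, 7.6], batch_size=4))
```

Reporting per-sample values keeps results comparable across runs that use different `--batch-size` settings.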
Bug fixes
- Fix CI errors (#985, #983, #977, #987, #966, #945)
- Fix missing `sqrt` in `PAAHead` (#984)
- Fix `nms_rotated` logic when no bbox is detected (#976)
- Fix rewrite for `torch.Tensor.__setitem__` in some corner cases (#964, #941)
- Disable ONNX optimizer when converting model to ncnn (#961)
- Fix regression test (#958)
- Disable cublaslt for CUDA 10.2 (#947)
- Stop sorting dataset by default & set `test_mode` for mmdet pipelines (#920)
- Resolve the issue (#909) - `ValueError: cpu is invalid for the backend tensorrt.` when exporting SDK meta info (#912)
- Validate the device id when the inference backend is TensorRT or OpenVINO (#886)
- Fix mmdeploy_pplnn_net build error when target device is CPU (#896)
- Replace `adaptive_avg_pool2d` with `avg_pool2d` to support exporting ONNX with dynamic shape (#857)
Document
- Clarify arguments in model conversion (#956, #940)
- Add tutorial in Chinese about "How to write a customized TensorRT plugin" (#290)
- Keep cmake build option in a separate document cmake_option. (#832)
- Add project architecture (#882)
- Sync English and Chinese documents (#842)
- Correct build-demo commands in prebuilt_package_windows.md (#879)
- Fix the wrong argument in model quantization document (#866)
Known issues
- `DETR` deployment failed both via ONNX Runtime and TensorRT (#1011, pytorch 84563)
Contributors
@OldDreamInWind @liu-mengyang @gy-7 @Groexhy @munhou @miraclezqc @VVsssssk @hanrui1sensetime @tpoisonooo @grimoire @irexyc @RunningLeon @AllentDan @lzhangzz @lvhan028
MMDeploy Release V0.7.0
Highlight
- Support SNPE (#789)
- Please refer to Build for SNPE to get started with SNPE deployment
- Add C++ API for SDK (#831)
Features
- Support SNPE (#789)
- Add C++ API for SDK (#831)
- Support MMRotate model with le135 angle format (#788)
- Support RoI Transformer and `Gliding Vertex` model deployment from `MMRotate` (#713, #650)
- Add inference latency test script `tools/profile.py` (#655)

  Here is an example to profile `TensorRT_fp32-resnet18` inference latency:

      python tools/profile.py \
          configs/mmcls/classification_tensorrt_dynamic-224x224-224x224.py \
          ../mmclassification/configs/resnet/resnet18_8xb32_in1k.py \
          ../mmdetection/demo \
          --model work-dirs/mmcls/resnet/trt/end2end.engine \
          --device cuda \
          --shape 224x224 \
          --num-iter 100 \
          --warmup 10
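The `--warmup` and `--num-iter` options follow a common latency-measurement pattern: run a few untimed iterations first, then average over the timed ones. A minimal sketch of that pattern is shown below. It is not the implementation of the profiling script; `measure_latency` and `run_once` are hypothetical names, and the timed callable here is a trivial stand-in for an inference call.

```python
import time


def measure_latency(run_once, num_iter=100, warmup=10):
    """Return the mean latency of run_once() in milliseconds.

    Warm-up calls are executed but not timed, so one-off costs such as
    lazy initialisation do not skew the average.
    """
    if num_iter <= 0:
        raise ValueError("num_iter must be positive")
    for _ in range(warmup):
        run_once()
    start = time.perf_counter()
    for _ in range(num_iter):
        run_once()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / num_iter


calls = []
latency_ms = measure_latency(lambda: calls.append(1), num_iter=100, warmup=10)
print(f"{latency_ms:.6f} ms per call, {len(calls)} calls in total")
```

For asynchronous devices such as GPUs, `run_once` has to block until the work has actually finished, otherwise only the launch overhead is measured.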
Improvements
- Optimize prebuilt process for Python SDK (#810)
- Upgrade `ppl.nn` and `ppl.cv` to `v0.8.1` and `v0.7.0` respectively (#793, #564)
- Support batch image test in test script `test.py` (#829)
- Install onnx optimizer by setuptools instead of cmake build (#690, #811, #843)
- Add SDK code coverage (#808)
- Support kwargs in SDK Python bindings (#794, #844, #852)
- Support building SDK into a single library by enabling `MMDEPLOY_BUILD_SDK_MONOLITHIC` (#806)
- Add a new option `MMDEPLOY_BUILD_EXAMPLES` to build and install SDK examples (#822)
- Reduce log verbosity and improve error reporting (#755)
- Upgrade GPU Dockerfile to use TensorRT 8.2.4.2 (#706)
- Optimize ONNX graph
- [BC Breaking] Standardize C API (#634)
- Rename `onnx2ncnn` to `mmdeploy_onnx2ncnn` (#694)
Bug fixes
- Fix build error on macOS platform (#762)
- Fix `torch.triu` function rewriter error when exporting to onnx (#792)
- Resolve Cascade R-CNN, `YOLOX` and `SATRN` deployment failure (#787, #758, #753)
- Fix `check_env.py` about checking whether custom ops are available (#785)
- Fix export for TopK operator in PyTorch 1.12 (#715)
- Fix export for padding operators in PyTorch<1.10 (#754)
- Add default `topk` in SDK model meta info when it is not explicitly specified in `mmclassification` model configs (#702)
- Fix SingleRoIExtractor for TorchScript backend (#724)
- Fix export for DistancePointBBoxCoder.decode (#687)
- Fix wrong backend type when doing calibration (#719)
- Set exit code to 1 when error happens (#715)
- Fix build error on android platform (#698)
- Pass `img_metas` while exporting to onnx (#681, #700, #707)
Document
- Update build document for android platform (#817)
- Fix rendering issues of `get_started` documents in readthedocs (#740)
- Add prebuilt package usage on Windows platform (#816)
- Simplify `get_started` guide (#813)
Contributors
@nijkah @dwSun @lvhan028 @lzhangzz @irexyc @RunningLeon @grimoire @tpoisonooo @AllentDan @hanrui1sensetime
MMDeploy Release V0.6.0
Highlight
- Support Swin Transformer deployment with TensorRT and ONNX Runtime (#652)
- Support Segmenter deployment with all backends (#587)
- Add Java API for SDK (#563)
Features
- Support Swin Transformer deployment with TensorRT and ONNX Runtime (#652)
- Add Java API for SDK (#563)
- Support `Segmenter` deployment with all backends (#587)
- Support two-stage rotated detector deployment with TensorRT (#530)
Improvements
- Add onnx pass to fuse `select-assign` graph pattern (#589)
- Add more CircleCI workflows on Linux, Windows and Linux-GPU platforms (#368)
- Add documentation and sample code for model partitioning (#599)
- Add `GridPriorsTRT` plugin to speed up TensorRT anchor generation from `155us` to `13us` (#646)
- Add `MMDEPLOY_TASKS` variable in cmake scripts to remove duplicated code (#606)
- Improve ncnn patch embed (#592)
- Support compute capability 87 for Jetson Orin (#601)
- Adjust `csrc` structure (#594)
Bug fixes
- Add `build` to TensorRT plugin candidate path list (#672)
- Fix missing "image shape" when exporting mmpose models (#667)
- Fix ncnn unittest error (#626)
- Fix bugs when deploying ShuffleNetV2 with TensorRT (#645)
- Relax `mmcls` version constraint (#653)
- Eliminate illegal memory access for object detector C# API (#613)
- Add dim param for `Tensor::Squeeze` (#603)
- Fix link missed issue in `index.rst` (#607)
- Add support for MMOCR 0.5+ (#604)
- Fix output tensor shape of ncnn backend (#605)
Documentation
- Fix errors and typos in user documents (#676, #675, #655, #654, #621, #588, #586)
- Update deployment benchmark for ViT (#624)
- Replace markdown lint with `mdformat` and configure `myst-parser` (#610)
Contributors
@zambranohally @bgsuello @triple-Mu @DrRyanHuang @liuqc11 @Yosshi999 @zytx121 @RunningLeon @AllentDan @lzhangzz @irexyc @grimoire @lvhan028 @hanrui1sensetime @tpoisonooo
MMDeploy Release V0.5.0
Highlight
- Provide prebuilt packages since v0.5.0
- Decouple `pytorch2onnx` and `onnx2backends`
- Support text detection models PANet, PSENet and DBNet, with CUDA accelerated postprocessing in SDK
- Support MMRotate
Features
- Add prebuild tools (#545, #347)
- Experimental executor support in SDK (#497)
- Support ViT on ncnn (#477, #403)
- Support LiteHRNet on ncnn (#316)
- Support more text detection models PANet, PSENet and DBNet, with CUDA accelerated postprocessing in SDK (#446, #526, #534)
- Add C# API for SDK (#388, #535)
- Support ncnn quantization (#476)
- Support RepPoints on TensorRT (#457)
- Support MMRotate on ONNX Runtime and TensorRT (#277, #312, #422, #450, #428, #473)
- Support MMRazor (#220, #467)
Improvements
- Remove `spdlog` manual installation but still keep it as an option (#423, #544)

  Users can turn on the following option to use the external spdlog:

      cmake .. -DMMDEPLOY_SPDLOG_EXTERNAL=ON

- Add SDK python demos (#554)
- Add ONNX passes support (#390)
- Decouple `pytorch2onnx` and `onnx2backends` (#529, #540)
- Add scripts and configs to test metrics of deployed model with all inference backends (#425, #302, #551, #542)
- Support MDCN and DeformConv TensorRT FP16 (#503, #468)
- Add interactive build script for Linux and NVIDIA platform (#399)
- Optimize global average pooling when exporting ONNX (#478)
- Refactor `onnx2ncnn`, add test cases and simplify code (#436)
- Remove `expand` operation from mmdet rewrite (#371)
Bug fixes
- Update CMake scripts to fix building problems (#544, #553)
- Make ONNXRuntime wrapper work both for cpu and cuda execution (#438, #532)
- Fix PSPNet-TorchScript conversion error (#538)
- Resolve the incompatible issue when upgrading MMPose from v0.25.0 to v0.26.0 (#518, #527)
- Fix mismatched device issue when testing Mask R-CNN deployed model (#511)
- Remove redundant `resize` in mmseg EncoderDecoder rewrite (#480)
- Fix display bugs on headless devices (#451)
- Fix MMDet3D `pillarencode` deployment failure (#331)
- Make the latest `spdlog` compatible (#423)
- Fix CI (#462, #447, #440, #426, #441)
- Fix a bug that causes exporting to onnx to fail with static shape and batch size > 1 (#501)
- Make `--work-dir` default to `$pwd` in `tools/deploy.py` (#483)
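Defaulting `--work-dir` to the current directory means the option can be omitted on the command line. The sketch below shows how such a default is typically expressed with `argparse`. It is a hypothetical stand-alone parser for illustration only and does not reproduce the argument handling of `tools/deploy.py`.

```python
import argparse
import os


def build_parser():
    # Hypothetical parser; only the --work-dir default is of interest here.
    parser = argparse.ArgumentParser(description="deploy-style CLI sketch")
    parser.add_argument(
        "--work-dir",
        default=os.getcwd(),  # fall back to the current working directory
        help="directory in which output files are saved",
    )
    return parser


# Without the flag, the current directory is used.
print(build_parser().parse_args([]).work_dir)
# An explicit value still takes precedence.
print(build_parser().parse_args(["--work-dir", "out"]).work_dir)
```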
Documentation
- Fix user document errors, reorganize them, update README and rewrite the GET_STARTED chapters (#418, #482, #509, #531, #547, #543)
- Rewrite the get_started for Jetson platforms (#484, #449, #415, #381)
- Fix APIs rendering failure in readthedocs (#443)
- Remove '' in API docstring (#495)
- More tutorials in Chinese are checked in - Tutorial 05: ONNX Model Editing and Tutorial 04: onnx custom op (#508, #517)
Contributors
@sanjaypavo @PeterH0323 @tehkillerbee @zytx121 @triple-Mu @zhiqwang @gyf304 @lakshanthad @Dchaoqun @zhouzaida @NagatoYuki0943 @VVsssssk @irexyc @RunningLeon @hanrui1sensetime @lzhangzz @grimoire @tpoisonooo @AllentDan @SingleZombie
MMDeploy Release V0.4.1
Improvements
- Add IPython notebook tutorial (#234)
- Support detecting TensorRT from `CUDA_TOOLKIT_ROOT_DIR` (#357)
- Build onnxruntime backend in GPU dockerfile (#366)
- Add CircleCI workflow for linting (#348)
- Support saving results when testing the deployed model of MMEdit (#336)
- Support GPU postprocessing for instance segmentation (#276)
Bug fixes
- Make empty bounding box list allowed in text recognizer and pose detector C API (#310, #396)
- Fix the logic of extracting model name from config (#394)
- Fix feature test for std::source_location (#416)
- Add missing codegen for `sm_53` to support Jetson Nano (#407)
- Fix crash caused by accessing the wrong tensor in segmentor C API (#363)
- Fix reading mat type from the wrong image in a batch (#362)
- Fix missing binary flag when saving temp OpenVINO model (#353)
- Fix Windows build for pose demo (#307)
Documents
- Refine documents by fixing typos, correcting build commands, and removing redundant doc tree (#352, #360, #378, #398)
- Add a tutorial about torch2onnx in Chinese (#365)
Contributors
@irexyc @VVsssssk @AllentDan @lzhangzz @PeterH0323 @RunningLeon @zly19540609 @triple-Mu @grimoire @hanrui1sensetime @SingleZombie @Adenialzz @tpoisonooo @lvhan028 @xizi