
YOLOv9 with End2End (Efficient NMS) #130

Closed
levipereira opened this issue Feb 29, 2024 · 15 comments

@levipereira commented Feb 29, 2024

Thank you for your wonderful work!

YOLOv9 with End2End (Efficient NMS)
Note: The primary purpose of employing End2End is to utilize ONNX models on TensorRT. If you choose not to use TensorRT, you should proceed with the standard ONNX export process.

I've created a forked repository from the original, adding End-to-End support for ONNX export. The changes can be found in export.py and models/experimental.py. Both files remain fully compatible with all current export operations.
Check it out at https://github.com/levipereira/yolov9

  • Support for End-to-End ONNX Export: Added support for end-to-end ONNX export in export.py and models/experimental.py.

  • Model Compatibility: this functionality currently works with all DetectionModel models.

  • Configuration Variables: Use the following flags to configure the model:

    • --include onnx_end2end: Enable End2End export.
    • --simplify: ONNX/ONNX END2END: Simplify the model.
    • --topk-all: ONNX END2END/TF.js NMS: Top-k for all classes to keep (default: 100).
    • --iou-thres: ONNX END2END/TF.js NMS: IoU threshold (default: 0.45).
    • --conf-thres: ONNX END2END/TF.js NMS: Confidence threshold (default: 0.25).

Example:

$ python3 export.py --weights ./yolov9-c.pt --imgsz 640 --simplify --include onnx_end2end

export: data=data/coco.yaml, weights=['./yolov9-c.pt'], imgsz=[640], batch_size=1, device=cpu, half=False, inplace=False, keras=False, optimize=False, int8=False, dynamic=False, simplify=True, opset=12, verbose=False, workspace=4, nms=False, agnostic_nms=False, topk_per_class=100, topk_all=100, iou_thres=0.45, conf_thres=0.25, include=['onnx_end2end']
YOLOv5 🚀 v0.1-27-g86b0667 Python-3.8.10 torch-1.14.0a0+44dac51 CPU

Fusing layers...
Model summary: 604 layers, 50880768 parameters, 0 gradients, 237.6 GFLOPs

PyTorch: starting from ./yolov9-c.pt with output shape (1, 84, 8400) (98.4 MB)

ONNX END2END: starting export with onnx 1.13.0...
/yolov9/models/experimental.py:102: FutureWarning: 'torch.onnx._patch_torch._graph_op' is deprecated in version 1.13 and will be removed in 1.14. Please note 'g.op()' is to be removed from torch.Graph. Please open a GitHub issue if you need this functionality..
  out = g.op("TRT::EfficientNMS_TRT",
[W shape_type_inference.cpp:1913] Warning: The shape inference of TRT::EfficientNMS_TRT type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. (function UpdateReliable)
[W shape_type_inference.cpp:1913] Warning: The shape inference of TRT::EfficientNMS_TRT type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. (function UpdateReliable)
[W shape_type_inference.cpp:1913] Warning: The shape inference of TRT::EfficientNMS_TRT type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. (function UpdateReliable)
[W shape_type_inference.cpp:1913] Warning: The shape inference of TRT::EfficientNMS_TRT type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. (function UpdateReliable)
========== Diagnostic Run torch.onnx.export version 1.14.0a0+44dac51 ===========
verbose: False, log level: Level.ERROR
======================= 0 NONE 0 NOTE 4 WARNING 0 ERROR ========================
4 WARNING were not printed due to the log level.


Starting to simplify ONNX...
ONNX export success, saved as ./yolov9-c_end2end.onnx
ONNX END2END: export success ✅ 11.5s, saved as ./yolov9-c_end2end.onnx (129.3 MB)

Export complete (13.6s)
Results saved to /yolov9/experiments/models
Visualize:       https://netron.app
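For context on the warnings above: the custom node is emitted from an ONNX symbolic function in models/experimental.py. Below is a minimal, hypothetical sketch of that pattern (the attribute names follow the EfficientNMS_TRT plugin documentation, and the placeholder forward exists only so tracing can succeed; this is not the literal code from the fork):

```python
import torch

class TRT_NMS(torch.autograd.Function):
    """Sketch: emits a TRT::EfficientNMS_TRT node during torch.onnx.export."""

    @staticmethod
    def forward(ctx, boxes, scores, iou_thres=0.45, conf_thres=0.25, topk_all=100):
        # Dummy outputs with plausible shapes/dtypes so tracing succeeds;
        # TensorRT substitutes the real plugin at engine build time.
        batch = boxes.shape[0]
        num_dets = torch.randint(0, topk_all, (batch, 1), dtype=torch.int32)
        det_boxes = torch.randn(batch, topk_all, 4)
        det_scores = torch.randn(batch, topk_all)
        det_classes = torch.randint(0, 80, (batch, topk_all), dtype=torch.int32)
        return num_dets, det_boxes, det_scores, det_classes

    @staticmethod
    def symbolic(g, boxes, scores, iou_thres=0.45, conf_thres=0.25, topk_all=100):
        # This is the g.op() call referenced by the deprecation warning; ONNX
        # has no shape inference for custom TRT ops, hence the four
        # UpdateReliable warnings in the log.
        return g.op("TRT::EfficientNMS_TRT", boxes, scores,
                    background_class_i=-1,
                    box_coding_i=0,
                    iou_threshold_f=iou_thres,
                    score_threshold_f=conf_thres,
                    max_output_boxes_i=topk_all,
                    score_activation_i=0,
                    plugin_version_s="1",
                    outputs=4)
```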
@WongKinYiu (Owner)

Added to readme.

@WongKinYiu (Owner)

https://github.com/levipereira/yolov9/blob/main/models/experimental.py#L140

There may be a bug here:
output[0] is the prediction of the aux branch, and output[1] is the prediction of the main branch.

@levipereira (Author)

Thanks. Fixed by setting output[1] as the prediction for the main branch instead of output[0].
levipereira@20f921f
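For readers following along, the gist of the change is selecting the main-branch prediction before it reaches the NMS head. A hypothetical excerpt, not the literal diff:

```python
def main_branch_prediction(outputs):
    """Dual-branch (aux + main) models return two predictions during export;
    per the discussion above, the NMS head must consume outputs[1] (main),
    not outputs[0] (aux). Single-branch models return a plain tensor."""
    return outputs[1] if isinstance(outputs, (list, tuple)) else outputs
```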

@laugh12321 commented Mar 3, 2024

Hello everyone!

I would like to introduce my open-source project, TensorRT-YOLO, a tool for deploying the YOLO series with Efficient NMS in TensorRT.

Key Features

  • Support for YOLOv3, YOLOv5, YOLOv6, YOLOv7, YOLOv8, YOLOv9, YOLOv10, PP-YOLOE and PP-YOLOE+
  • Support for ONNX static and dynamic export, as well as TensorRT inference
  • Integration of EfficientNMS TensorRT plugin for accelerated post-processing
  • Utilization of CUDA kernel functions for accelerated preprocessing
  • Support for inference in both C++ and Python
  • Command-line interface for quick export and inference
  • One-click Docker deployment

Performance test using an RTX 2080Ti 2GB GPU on an AMD Ryzen 7 5700X 8-core CPU with 128 GB RAM.

All models were converted to ONNX models with the EfficientNMS plugin. The conversion was done using the TensorRT-YOLO tool, with the trtyolo CLI installed via pip install tensorrt-yolo==3.0.1. The batch size is 1 and the image size is 640.

Model Export and Performance Testing

Use the following commands to export the model and perform performance testing with trtexec:

trtyolo export -v yolov9 -w yolov9-converted.pt --imgsz 640 -o ./
trtexec --onnx=yolov9-converted.onnx --saveEngine=yolov9-converted.engine --fp16
trtexec --fp16 --avgRuns=1000 --useSpinWait --loadEngine=yolov9-converted.engine

Performance testing was conducted using TensorRT-YOLO inference on the coco128 dataset.

YOLOv9 Series

All values below are mean latency in ms.

| Tool | YOLOv9-T-Converted | YOLOv9-S-Converted | YOLOv9-M-Converted | YOLOv9-C-Converted | YOLOv9-E-Converted |
| --- | --- | --- | --- | --- | --- |
| trtexec (infer) | 3.51857 | 3.67899 | 4.19460 | 4.25964 | 8.95429 |
| TensorRT-YOLO Python (infer) | 10.19576 | 10.15226 | 9.29918 | 9.60093 | 21.85042 |
| TensorRT-YOLO C++ (pre + infer) | 3.44162 | 3.66080 | 4.10519 | 4.12471 | 8.98964 |

| Tool | Gelan-S2 | Gelan-S | Gelan-M | Gelan-C | Gelan-E |
| --- | --- | --- | --- | --- | --- |
| trtexec (infer) | 3.42082 | 3.78578 | 4.16447 | 4.27485 | 8.91479 |
| TensorRT-YOLO Python (infer) | 9.96435 | 10.35934 | 9.14044 | 9.33843 | 21.42764 |
| TensorRT-YOLO C++ (pre + infer) | 3.60857 | 3.93528 | 4.25084 | 4.35533 | 9.23654 |

YOLOv8 Series

| Tool | YOLOv8n | YOLOv8s | YOLOv8m | YOLOv8l | YOLOv8x |
| --- | --- | --- | --- | --- | --- |
| trtexec (infer) | 1.90273 | 2.34166 | 3.58595 | 4.83306 | 7.12179 |
| TensorRT-YOLO Python (infer) | 7.03217 | 7.52751 | 8.75298 | 10.56914 | 12.45605 |
| TensorRT-YOLO C++ (pre + infer) | 2.02848 | 2.15021 | 3.57631 | 4.78318 | 6.96686 |

@radandreicristian

Hey @levipereira, first of all thanks for this work!

The warning in the ONNX export step results in a run-time error when trying to do the inference.

onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Load model from gelan-c-end2end.onnx failed:Fatal error: TRT:EfficientNMS_TRT(-1) is not a registered function/op

Any idea what could cause this? I could provide more code/context if needed.

@levipereira (Author)

> The warning in the ONNX export step results in a run-time error when trying to do the inference.
> Any idea what could cause this? I could provide more code/context if needed.

Inference with an End2End model is not supported by the YOLOv9 source code, as it is not implemented there.
Instead, use the triton-server repository for this purpose.
A triton-client repository will be released, enabling users to run inference against the triton-server.
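For reference, a minimal sketch of the intended TensorRT path (assumptions: TensorRT 8.x binding API, pycuda installed, a static batch-1 engine built with trtexec from the End2End ONNX, and output names following the common num_dets/det_boxes/det_scores/det_classes convention):

```python
# Build the engine first, e.g.:
#   trtexec --onnx=yolov9-c_end2end.onnx --saveEngine=yolov9-c_end2end.engine --fp16
import numpy as np
import pycuda.autoinit  # noqa: F401 (creates a CUDA context)
import pycuda.driver as cuda
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
trt.init_libnvinfer_plugins(logger, "")  # makes EfficientNMS_TRT resolvable

with open("yolov9-c_end2end.engine", "rb") as f:
    engine = trt.Runtime(logger).deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

# One host/device buffer pair per binding (the input image plus the NMS outputs).
host, device, bindings = {}, {}, []
for i in range(engine.num_bindings):
    name = engine.get_binding_name(i)
    shape = tuple(context.get_binding_shape(i))
    dtype = trt.nptype(engine.get_binding_dtype(i))
    host[name] = np.empty(shape, dtype=dtype)
    device[name] = cuda.mem_alloc(host[name].nbytes)
    bindings.append(int(device[name]))

image = np.zeros((1, 3, 640, 640), dtype=np.float32)  # your preprocessed input
inp = engine.get_binding_name(0)  # assumes binding 0 is the image input
host[inp][...] = image
cuda.memcpy_htod(device[inp], host[inp])

context.execute_v2(bindings)

for i in range(engine.num_bindings):
    if not engine.binding_is_input(i):
        name = engine.get_binding_name(i)
        cuda.memcpy_dtoh(host[name], device[name])
print(host["num_dets"])  # assumed output name; check your export's bindings
```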

@berkgungor

Hi, do we have to reparameterize the fine-tuned .pt file before exporting to ONNX format? When I run the reparameterization Python code, it throws an error like "AttributeError: 'DetectionModel' object has no attribute 'nc'".

@laugh12321

@berkgungor Hi, you can try TensorRT-YOLO, which also supports exporting ONNX with Efficient NMS and does not require reparameterizing the fine-tuned .pt.

@mdciri commented Mar 6, 2024

@levipereira

I exported my YOLOv9 model to ONNX using your End2End class, but when I try to load it for inference as:

```python
import onnxruntime as ort

onnx_model = "./best-end2end.onnx"
session = ort.InferenceSession(onnx_model, None)
```

it returns this error:
Fail: [ONNXRuntimeError] : 1 : FAIL : Load model from ./yolov9/runs/train/yolov9/weights/best-end2end.onnx failed:Fatal error: TRT:EfficientNMS_TRT(-1) is not a registered function/op

These are my versions:
onnx 1.15.0
onnx-graphsurgeon 0.3.27
onnxruntime-gpu 1.16.1
onnxsim 0.4.35

@levipereira (Author) commented Mar 6, 2024

@mdciri @radandreicristian The primary purpose of employing End2End is to utilize ONNX models on TensorRT. If you choose not to use TensorRT, you should proceed with the standard ONNX export process.

Use case: https://github.com/levipereira/triton-server-yolo/tree/master
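For anyone hitting the same ONNX Runtime error: it is expected, because the exported graph contains a TensorRT-only node. A quick way to confirm this with plain onnx (the file name is a placeholder):

```python
import onnx

model = onnx.load("best-end2end.onnx")
print([node.op_type for node in model.graph.node if "EfficientNMS" in node.op_type])
# ['EfficientNMS_TRT'] -> the model targets TensorRT, not stock onnxruntime
```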

@gl94 commented Mar 7, 2024

> Hi, do we have to reparameterize the fine-tuned .pt file before exporting to ONNX format? When I run the reparameterization Python code, it throws an error like "AttributeError: 'DetectionModel' object has no attribute 'nc'".

When you encounter the "AttributeError: 'DetectionModel' object has no attribute 'nc'" error, you can manually change model.nc to your number of classes (in my case, model.nc = 4). Also, in gelan-c.yaml, change line 4 from nc: 80 to your nc, and change line 79 from [[15, 18, 21], 1, DDetect, [nc]] to [[15, 18, 21], 1, DDetect, [your nc]]. A sketch of the patched script follows below.

@PrinceP commented Mar 10, 2024

Hi all,
https://github.com/PrinceP/tensorrt-cpp-for-onnx

Here's the dynamic-batch version of YOLOv9 inference in TensorRT in C++, building on @levipereira's work for dynamic support.

Any batch size with any image size is supported. Reference code for batching data is also included.
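As a hedged illustration of building a dynamic-shape engine from such an export (the flags are standard trtexec options; the input binding name images, the file names, and the shape ranges are assumptions, not taken from that repository):

trtexec --onnx=yolov9-c-dynamic.onnx --saveEngine=yolov9-c-dynamic.engine --fp16 \
        --minShapes=images:1x3x640x640 \
        --optShapes=images:4x3x640x640 \
        --maxShapes=images:8x3x640x640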

@berkgungor

@gl94 I already changed the class numbers in the yaml; nothing changed, same error.

@levipereira I exported the model to ONNX using just --include onnx, without specifying end2end, and then converted it to a TensorRT engine. It works fine. I also did not reparameterize, since that threw the error.

@mdciri commented Mar 12, 2024

@levipereira yes, in your implementation End2End is meant for running the ONNX model on TensorRT. Still, I would like to convert my model to ONNX (with NMS) so that it works on ONNX Runtime.

At the moment, the ONNX export does not include NMS.
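Until NMS is part of the export, one workable path is doing the NMS yourself on the standard export's raw (1, 84, 8400) output. A hedged sketch using onnxruntime plus torchvision (file name and thresholds are placeholders; this is class-agnostic NMS, use torchvision.ops.batched_nms for per-class behavior):

```python
import numpy as np
import onnxruntime as ort
import torch
import torchvision

# Standard (non-End2End) export, e.g. from `--include onnx`
session = ort.InferenceSession("best.onnx", providers=["CPUExecutionProvider"])
img = np.zeros((1, 3, 640, 640), dtype=np.float32)  # your preprocessed image
pred = session.run(None, {session.get_inputs()[0].name: img})[0]  # (1, 84, 8400)

pred = torch.from_numpy(pred)[0].T               # (8400, 84): cx,cy,w,h + 80 class scores
boxes_cxcywh, class_scores = pred[:, :4], pred[:, 4:]
scores, classes = class_scores.max(dim=1)
keep = scores > 0.25                             # cf. --conf-thres
boxes_cxcywh, scores, classes = boxes_cxcywh[keep], scores[keep], classes[keep]

# Convert cx,cy,w,h to x1,y1,x2,y2; torchvision.ops.nms expects corner format.
boxes = torch.cat([boxes_cxcywh[:, :2] - boxes_cxcywh[:, 2:] / 2,
                   boxes_cxcywh[:, :2] + boxes_cxcywh[:, 2:] / 2], dim=1)
idx = torchvision.ops.nms(boxes, scores, iou_threshold=0.45)[:100]  # cf. --iou-thres, --topk-all
print(boxes[idx], scores[idx], classes[idx])
```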

@gl94 commented Mar 26, 2024

> @gl94 I already changed the class numbers in the yaml; nothing changed, same error.
> @levipereira I exported the model to ONNX using just --include onnx, without specifying end2end, and then converted it to a TensorRT engine. It works fine. I also did not reparameterize, since that threw the error.

You should also make the same change in the reparameterization Python code: change model.nc = ckpt['model'].nc to model.nc = your nc.
