Performance Loss from OpenCV 4.5.5 to 4.7.0 using CUDA backend #23278
Comments
Hi @cesarpgouveia, thanks for your detailed performance test. Please try the latest 4.x (dev) branch. Duplicate of #23234
So I just built with OpenCV 4.x (4.7.0-dev) and this is the updated table.

Test 3
ERROR: terminate called after throwing an instance of 'cv::Exception'

These are the architectures of Models 2 and 4, obtained from Netron:

So Models 1, 2, and 3 are crashing now (on branch 4.7.0-dev: 4.x), and although Model 4 now performs better than on release 4.7.0 (from 52.2 ms down to 34.7 ms), its execution time is still worse than on release 4.5.5 (from 19.9 ms up to 34.7 ms). Do you know why Models 1-3 are now crashing (they were working perfectly fine in the 4.5.5 and 4.7.0 releases)? Is there an issue with a certain layer?
Thanks for your report! This information is important for us. Could you paste your models? I will test each layer in a few days.
Sorry for the late response @WanliZhong, here they are:
@cesarpgouveia I ran model1 with onnxruntime and it also fails on the PRelu node:

```
2023-04-19 13:23:11.0813207 [E:onnxruntime:, sequential_executor.cc:368 onnxruntime::SequentialExecutor::Execute] Non-zero status code returned while running PRelu node. Name:'conv_1_relu' Status Message: D:\a\_work\1\s\onnxruntime\core/providers/cpu/math/element_wise_ops.h:503 onnxruntime::BroadcastIterator::Init axis == 1 || axis == largest was false. Attempting to broadcast an axis by a dimension other than 1. 56 by 64
Traceback (most recent call last):
  File "c:\Users\Zoom\Desktop\New folder\test.py", line 13, in <module>
    outputs = ort_sess.run(None, {'data': input})
  File "C:\Software\miniconda3\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 200, in run
    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running PRelu node. Name:'conv_1_relu' Status Message: D:\a\_work\1\s\onnxruntime\core/providers/cpu/math/element_wise_ops.h:503 onnxruntime::BroadcastIterator::Init axis == 1 || axis == largest was false. Attempting to broadcast an axis by a dimension other than 1. 56 by 64
```

The code is:

```python
import onnxruntime as ort
import onnx
import numpy as np  # missing from the original snippet

model_path = "model1.onnx"
input = np.random.rand(1, 3, 112, 112).astype(np.float32)

onnx_model = onnx.load(model_path)
onnx.checker.check_model(onnx_model)

ort_sess = ort.InferenceSession(model_path)
outputs = ort_sess.run(None, {'data': input})
```
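For reference, the PRelu failure above is a plain broadcasting mismatch. Here is a minimal numpy sketch of the same error; the 56-vs-64 sizes come from the error message, but the exact tensor shapes are assumptions for illustration, not taken from the real model:

```python
# Minimal numpy illustration of the broadcast failure reported above.
# The 56-vs-64 mismatch is from the error message; the tensor shapes
# below are assumed for illustration only.
import numpy as np

x = np.random.rand(1, 64, 56, 56).astype(np.float32)  # conv output, 64 channels
slope = np.random.rand(56, 1, 1).astype(np.float32)   # mis-shaped PRelu slope

try:
    # PRelu is elementwise: max(x, 0) + slope * min(x, 0)
    y = np.maximum(x, 0) + slope * np.minimum(x, 0)
except ValueError as e:
    print("broadcast error:", e)  # 56 cannot align with the 64-channel axis
```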
@cesarpgouveia After testing,
This issue is confirmed that
Can you show the detailed performance test, layer by layer?
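For layer-by-layer timings, OpenCV's dnn module exposes Net.getPerfProfile(). A minimal sketch; the model file name and input shape are placeholder assumptions:

```python
# Per-layer timing sketch using OpenCV DNN's built-in profiler.
# "model1.onnx" and the 1x3x112x112 input shape are assumptions.
import cv2
import numpy as np

net = cv2.dnn.readNetFromONNX("model1.onnx")
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)

net.setInput(np.random.rand(1, 3, 112, 112).astype(np.float32))
net.forward()  # run once so getPerfProfile() has timings to report

total, layer_times = net.getPerfProfile()  # values are in clock ticks
ms = 1000.0 / cv2.getTickFrequency()       # tick -> millisecond factor
for name, t in zip(net.getLayerNames(), layer_times):
    print(f"{name}: {t[0] * ms:.3f} ms")
print(f"total: {total * ms:.3f} ms")
```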
Test with the latest OpenCV dev version.
System Information
OpenCV versions tested: 4.5.5, 4.7.0
Operating System / Platform: Ubuntu 18.04
Device: NVIDIA Jetson TX2 DevKit
CUDA version: 10.2
CUDNN version: 8.2.1
Detailed description
Hi,
I was using OpenCV 4.5.5 with the CUDA backend on an NVIDIA Jetson TX2 DevKit with the specs listed above. A couple of days ago I decided to update to OpenCV 4.7.0 to check whether I would get a performance boost for the models I'm currently using. However, what I actually saw was a performance loss (in terms of execution time) for the majority of the models. Do you know what the reason for this loss of performance is?
These are the execution times obtained with both versions of OpenCV:
Test 1
Test 2
Note: both tests were built with the same OpenCV flags and requirements; the only thing that changed was the version of opencv and opencv_contrib. All execution times in these tables are in ms.
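When comparing two builds like this, it is also worth confirming that both actually enabled CUDA/cuDNN. A small generic sketch, nothing here is specific to this issue:

```python
# Sanity check that a given OpenCV build has CUDA/cuDNN enabled.
import cv2

print(cv2.__version__)
print(cv2.cuda.getCudaEnabledDeviceCount())  # > 0 if CUDA devices are visible
info = cv2.getBuildInformation()
print("\n".join(line for line in info.splitlines()
                if "CUDA" in line or "cuDNN" in line))
```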
Steps to reproduce
You can use this piece of code to reproduce this issue/loss of performance:
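The original snippet was not captured in this copy of the thread. Below is a minimal sketch of a comparable benchmark; the model file name and input shape are placeholders, not taken from the report:

```python
# Hedged benchmark sketch: "model1.onnx" and the 1x3x112x112 input
# shape are placeholder assumptions; the timing loop itself is generic.
import time
import cv2
import numpy as np

net = cv2.dnn.readNetFromONNX("model1.onnx")
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)

blob = np.random.rand(1, 3, 112, 112).astype(np.float32)

# Warm-up run so one-time CUDA initialization is not counted.
net.setInput(blob)
net.forward()

runs = 100
start = time.perf_counter()
for _ in range(runs):
    net.setInput(blob)
    net.forward()
print(f"avg: {(time.perf_counter() - start) * 1000 / runs:.1f} ms")
```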