New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GSoC Add ONNX Support for GatherElements #24092
GSoC Add ONNX Support for GatherElements #24092
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good work! Please read reviewers' comments and give responses.
@Aser-Abdelfatah , what about PR #23975? Can be closed? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other looks good to me! 👍 Good job!
Hello @Aser-Abdelfatah , could you solve the patch size limitation problem with using existing images from https://github.com/opencv/opencv_extra/tree/4.x/testdata/gpu/opticalflow? |
Hi, |
@Aser-Abdelfatah , there is no need to create npy in C++. Inputs are images so use imread and blobFromImage as mentioned here: opencv/opencv_extra#1082 (comment) |
I think @Aser-Abdelfatah is going to reuse opencv/modules/dnn/test/test_onnx_importer.cpp Lines 2470 to 2494 in 157b0e7
blobFromNPY .
|
const T* tmp_p_data = p_data + inp_offset; | ||
T* tmp_p_out = p_out + ind_offset; | ||
for (size_t i = 0; i < total; i++) { | ||
*tmp_p_out = *tmp_p_data; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds like a bug here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Solved.
The raft model has a GatherElements operator which has a non-constant |
net.setInput(blob0, "0"); | ||
net.setInput(blob1, "1"); | ||
auto out0 = net.forward("12007"); | ||
auto out1 = net.forward("12006"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forward model only once and request both outputs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
std::string img0_path = findDataFile(std::string("gpu/opticalflow/frame0.png")); | ||
std::string img1_path = findDataFile(std::string("gpu/opticalflow/frame1.png")); | ||
|
||
Size target_size{480, 360}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fengyuentau, can we reduce input resolution as minimul as possible to cover required checks for a new layer? 240x180
or lower.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may be feasible, but the model may be sensitive to the input shape. The intention was we use the model in opencv_zoo for accuracy or performance testing. If the model is shape-sensitive in the end, we have to replace the model in the zoo with one with a smaller input shape. That could be some trouble.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please reduce precision to FP16? Some changes in blobFromNPY
will be required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will handle this problem after we can run this model. Currently we are facing some other issues running this model.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we just check the smaller output? This model has two outputs, one of which has shape [1, 2, 45, 60] and file size 21KB, the other of which has shape [1, 2, 360, 480] and file size 1.32MB. The bigger output is calculated from the smaller output. I think it also makes sense if we just check the smaller one. In this case, we can avoid replacing models.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This RAFT model takes nearly 2s for a single inference on my mac m1. I think we still need to replace with a smaller input scale version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested with a smaller input one but the time is almost the same. Lets keep what we have. Also I found that we have almost the same inference time compared to ONNX Runtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to add a tag to this test applyTestTag(CV_TEST_TAG_LONG, CV_TEST_TAG_DEBUG_VERYLONG, CV_TEST_TAG_MEMORY_2GB);
to skip as long. PR is ready and might be merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to add a tag to this test
applyTestTag(CV_TEST_TAG_LONG, CV_TEST_TAG_DEBUG_VERYLONG, CV_TEST_TAG_MEMORY_2GB);
to skip as long. PR is ready and might be merged.
Thank you for the tips. I almost forgot these flags!
} | ||
|
||
template<typename T> | ||
void forward_impl(const Mat& data, const Mat& indices, Mat& out) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to refactor this implementation because it is just tooooooo slow, leading to a very long inference with the RAFT model (it has 160!!! GatherElements operators). Perf with GatherElements(data:FLOAT[2700, 1, 2914], indices:INT64[2700, 1, 81])
:
ort_benchmark.zip
this impl | ort |
---|---|
15415.07 ms | 0.15ms |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Improvement:
Previous impl by Aser: 15415.07 ms
After removing meaningless for loop: 1ms
Speedup: 0.49ms
ort: 0.15 ms
dnn: fix inconsistent input dtype for nary eltwise layers #24386 Resolves #24385 Merge with opencv/opencv_extra#1107 Relates #24092 (comment) ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake
… to Test_ONNX_nets
Signed-off-by: Yuantao Feng <yuantao.feng@opencv.org.cn>
47901d3
to
ac3ea42
Compare
Signed-off-by: Yuantao Feng <yuantao.feng@opencv.org.cn>
…ents_ONNX Merge with opencv/opencv#24092 This pull request serves as an addition to the pull request "GSoC Add ONNX Support for GatherElements #24092", which is submitted to the original OpenCV repository. It contains three tests to the GatherElements operator. The three tests are sourced from: * https://github.com/onnx/onnx/tree/main/onnx/backend/test/data/node/test_gather_elements_0 * https://github.com/onnx/onnx/tree/main/onnx/backend/test/data/node/test_gather_elements_1 * https://github.com/onnx/onnx/tree/main/onnx/backend/test/data/node/test_gather_elements_negative_indices And the models were configured using the code provided by Yuantao Feng at https://github.com/fengyuentau/onnx_utils to confirm output shape in pre-run. --- Reference outputs for `optical_flow_estimation_raft_2023aug.onnx` is generated with the following script: ```python import cv2 as cv import numpy as np img1 = cv.imread("./opencv_extra/testdata/gpu/opticalflow/frame0.png") img2 = cv.imread("./opencv_extra/testdata/gpu/opticalflow/frame1.png") img1_blob = cv.cvtColor(img1, cv.COLOR_BGR2RGB) img1_blob = cv.resize(img1_blob, (480, 360)) # source: (1, 3, 388, 584) # target: 1, 3, 360, 480 img1_blob = cv.dnn.blobFromImage(img1_blob) # source: (1, 3, 388, 584) # target: 1, 3, 360, 480 img2_blob = cv.cvtColor(img2, cv.COLOR_BGR2RGB) img2_blob = cv.resize(img2_blob, (480, 360)) img2_blob = cv.dnn.blobFromImage(img2_blob) print(img1_blob.shape, img2_blob.shape) import onnxruntime as ort net = ort.InferenceSession("optical_flow_estimation_raft_2023aug.onnx", providers=["CPUExecutionProvider"]) output = net.run(["12007", "12006"], {"0": img1_blob, "1": img2_blob}) np.save("output_optical_flow_estimation_raft_2023aug_0.npy", output[0]) np.save("output_optical_flow_estimation_raft_2023aug_1.npy", output[1]) print(output[0].shape) print(output[1].shape) ```
@opencv-alalek please update models on Buildbot. |
CV_TRACE_FUNCTION(); | ||
CV_TRACE_ARG_VALUE(name, "name", name.c_str()); | ||
|
||
std::vector<Mat> inputs, outputs; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implementation should handle FP16 datatype too
[ FAILED ] 3 tests, listed below:
[ FAILED ] Test_ONNX_conformance.Layer_Test/test_gather_elements_0_OCV_OCL_FP16, where GetParam() = (test_gather_elements_0, OCV/OCL_FP16)
[ FAILED ] Test_ONNX_conformance.Layer_Test/test_gather_elements_1_OCV_OCL_FP16, where GetParam() = (test_gather_elements_1, OCV/OCL_FP16)
[ FAILED ] Test_ONNX_conformance.Layer_Test/test_gather_elements_negative_indices_OCV_OCL_FP16, where GetParam() = (test_gather_elements_negative_indices, OCV/OCL_FP16)
https://pullrequest.opencv.org/buildbot/builders/4_x-lin64/builds/100283
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I will do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #24427
@asmorkalov Why is it merged with failed pre-commit? https://pullrequest.opencv.org/buildbot/builders/precommit_opencl_linux/builds/100255 |
dnn onnx: add instance norm layer #24378 Resolves #24377 Relates #24092 (comment) | Perf | multi-thread | single-thread | | - | - | - | | x: [2, 64, 180, 240] | 3.95ms | 11.12ms | Todo: - [x] speed up by multi-threading - [x] add perf - [x] add backend: OpenVINO - [x] add backend: CUDA - [x] add backend: OpenCL (no fp16) - [ ] add backend: CANN (will be done via #24462) ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake ``` force_builders=Linux OpenCL,Win64 OpenCL,Custom buildworker:Custom=linux-4 build_image:Custom=ubuntu:18.04 modules_filter:Custom=none disable_ipp:Custom=ON ```
dnn: fix inconsistent input dtype for nary eltwise layers opencv#24386 Resolves opencv#24385 Merge with opencv/opencv_extra#1107 Relates opencv#24092 (comment) ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake
…herElements_ONNX GSoC Add ONNX Support for GatherElements opencv#24092 Merge with: opencv/opencv_extra#1082 Adds support to the ONNX operator GatherElements [operator docs](https://github.com/onnx/onnx/blob/main/docs/Operators.md#GatherElements) Added tests to opencv_extra at pull request opencv/opencv_extra#1082 ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake
dnn onnx: add instance norm layer opencv#24378 Resolves opencv#24377 Relates opencv#24092 (comment) | Perf | multi-thread | single-thread | | - | - | - | | x: [2, 64, 180, 240] | 3.95ms | 11.12ms | Todo: - [x] speed up by multi-threading - [x] add perf - [x] add backend: OpenVINO - [x] add backend: CUDA - [x] add backend: OpenCL (no fp16) - [ ] add backend: CANN (will be done via opencv#24462) ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake ``` force_builders=Linux OpenCL,Win64 OpenCL,Custom buildworker:Custom=linux-4 build_image:Custom=ubuntu:18.04 modules_filter:Custom=none disable_ipp:Custom=ON ```
dnn: fix inconsistent input dtype for nary eltwise layers opencv#24386 Resolves opencv#24385 Merge with opencv/opencv_extra#1107 Relates opencv#24092 (comment) ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake
…herElements_ONNX GSoC Add ONNX Support for GatherElements opencv#24092 Merge with: opencv/opencv_extra#1082 Adds support to the ONNX operator GatherElements [operator docs](https://github.com/onnx/onnx/blob/main/docs/Operators.md#GatherElements) Added tests to opencv_extra at pull request opencv/opencv_extra#1082 ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake
dnn onnx: add instance norm layer opencv#24378 Resolves opencv#24377 Relates opencv#24092 (comment) | Perf | multi-thread | single-thread | | - | - | - | | x: [2, 64, 180, 240] | 3.95ms | 11.12ms | Todo: - [x] speed up by multi-threading - [x] add perf - [x] add backend: OpenVINO - [x] add backend: CUDA - [x] add backend: OpenCL (no fp16) - [ ] add backend: CANN (will be done via opencv#24462) ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake ``` force_builders=Linux OpenCL,Win64 OpenCL,Custom buildworker:Custom=linux-4 build_image:Custom=ubuntu:18.04 modules_filter:Custom=none disable_ipp:Custom=ON ```
Merge with: opencv/opencv_extra#1082
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.
Adds support to the ONNX operator GatherElements operator docs
Added tests to opencv_extra at pull request opencv/opencv_extra#1082