Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GSoC Add ONNX Support for GatherElements #24092

Merged

Conversation

Aser-Abdelfatah
Copy link
Contributor

@Aser-Abdelfatah Aser-Abdelfatah commented Aug 2, 2023

Merge with: opencv/opencv_extra#1082

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

Adds support to the ONNX operator GatherElements operator docs
Added tests to opencv_extra at pull request opencv/opencv_extra#1082

Copy link
Member

@fengyuentau fengyuentau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good work! Please read reviewers' comments and give responses.

modules/dnn/src/layers/gather_elements_layer.cpp Outdated Show resolved Hide resolved
modules/dnn/src/layers/gather_elements_layer.cpp Outdated Show resolved Hide resolved
@dkurt
Copy link
Member

dkurt commented Aug 3, 2023

@Aser-Abdelfatah , what about PR #23975? Can be closed?

@fengyuentau fengyuentau added category: dnn category: dnn (onnx) ONNX suport issues in DNN module labels Aug 4, 2023
Copy link
Member

@fengyuentau fengyuentau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other looks good to me! 👍 Good job!

modules/dnn/src/onnx/onnx_importer.cpp Show resolved Hide resolved
modules/dnn/src/layers/gather_elements_layer.cpp Outdated Show resolved Hide resolved
@fengyuentau fengyuentau self-assigned this Aug 25, 2023
@fengyuentau fengyuentau added this to the 4.9.0 milestone Aug 25, 2023
@fengyuentau
Copy link
Member

Hello @Aser-Abdelfatah , could you solve the patch size limitation problem with using existing images from https://github.com/opencv/opencv_extra/tree/4.x/testdata/gpu/opticalflow?

@Aser-Abdelfatah
Copy link
Contributor Author

Hi,
Sorry for getting back late to this. I was looking for a way to convert the pre-processed image data to npy file using C++. In this OpenCV forum thread that I opened https://forum.opencv.org/t/how-to-export-an-opencv-mat-to-npy-pb-file-in-c/14464/4, it's mentioned that the size of the data will be the same, which makes perfect sense, since we are storing the same data in the same format.

@dkurt
Copy link
Member

dkurt commented Sep 19, 2023

@Aser-Abdelfatah , there is no need to create npy in C++. Inputs are images so use imread and blobFromImage as mentioned here: opencv/opencv_extra#1082 (comment)

@fengyuentau
Copy link
Member

I think @Aser-Abdelfatah is going to reuse testONNXModels right? In case of using real images as inputs, I think you should not use testONNXModels, but can refer to (skip the part of creating reference data)

TEST_P(Test_ONNX_layers, YOLOv7)
{
std::string weightPath = _tf("models/yolov7_not_simplified.onnx", false);
std::string imgPath = _tf("../dog_orig_size.png");
Size targetSize{640, 640};
float conf_threshold = 0.3;
float iou_threshold = 0.5;
// Reference, which is collected with input size of 640x640
std::vector<int> refClassIds{1, 16, 7};
std::vector<float> refScores{0.9614331f, 0.9589417f, 0.8679074f};
// [x1, y1, x2, y2] x 3
std::vector<Rect2d> refBoxes{Rect2d(105.973236f, 150.16716f, 472.59012f, 466.48834f),
Rect2d(109.97953f, 246.17862f, 259.83676f, 600.76624f),
Rect2d(385.96185f, 83.02809f, 576.07355f, 189.82793f)};
Mat img = imread(imgPath);
Mat inp = blobFromImage(img, 1/255.0, targetSize, Scalar(0, 0, 0), true, false);
Net net = readNet(weightPath);
net.setInput(inp);
std::vector<Mat> outs;
net.forward(outs, net.getUnconnectedOutLayersNames());
. You will need to run with ONNX Runtime to save reference output first, then load to ref mat with blobFromNPY.

const T* tmp_p_data = p_data + inp_offset;
T* tmp_p_out = p_out + ind_offset;
for (size_t i = 0; i < total; i++) {
*tmp_p_out = *tmp_p_data;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds like a bug here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solved.

@fengyuentau
Copy link
Member

fengyuentau commented Sep 29, 2023

The raft model has a GatherElements operator which has a non-constant indices input. We need it constant so as to allocate memory in pre-run stage. This operator is in between conv layers, most probably not pre- or post-processing and cannot be removed. It may work even with non-constant indices if its shape is known.

net.setInput(blob0, "0");
net.setInput(blob1, "1");
auto out0 = net.forward("12007");
auto out1 = net.forward("12006");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forward model only once and request both outputs

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

std::string img0_path = findDataFile(std::string("gpu/opticalflow/frame0.png"));
std::string img1_path = findDataFile(std::string("gpu/opticalflow/frame1.png"));

Size target_size{480, 360};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fengyuentau, can we reduce input resolution as minimul as possible to cover required checks for a new layer? 240x180 or lower.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be feasible, but the model may be sensitive to the input shape. The intention was we use the model in opencv_zoo for accuracy or performance testing. If the model is shape-sensitive in the end, we have to replace the model in the zoo with one with a smaller input shape. That could be some trouble.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please reduce precision to FP16? Some changes in blobFromNPY will be required.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will handle this problem after we can run this model. Currently we are facing some other issues running this model.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we just check the smaller output? This model has two outputs, one of which has shape [1, 2, 45, 60] and file size 21KB, the other of which has shape [1, 2, 360, 480] and file size 1.32MB. The bigger output is calculated from the smaller output. I think it also makes sense if we just check the smaller one. In this case, we can avoid replacing models.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally dnn can achieve the same accuracy with #24386 (greater fix), with or without #24378 (instancenorm).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This RAFT model takes nearly 2s for a single inference on my mac m1. I think we still need to replace with a smaller input scale version.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested with a smaller input one but the time is almost the same. Lets keep what we have. Also I found that we have almost the same inference time compared to ONNX Runtime.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to add a tag to this test applyTestTag(CV_TEST_TAG_LONG, CV_TEST_TAG_DEBUG_VERYLONG, CV_TEST_TAG_MEMORY_2GB); to skip as long. PR is ready and might be merged.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to add a tag to this test applyTestTag(CV_TEST_TAG_LONG, CV_TEST_TAG_DEBUG_VERYLONG, CV_TEST_TAG_MEMORY_2GB); to skip as long. PR is ready and might be merged.

Thank you for the tips. I almost forgot these flags!

}

template<typename T>
void forward_impl(const Mat& data, const Mat& indices, Mat& out)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to refactor this implementation because it is just tooooooo slow, leading to a very long inference with the RAFT model (it has 160!!! GatherElements operators). Perf with GatherElements(data:FLOAT[2700, 1, 2914], indices:INT64[2700, 1, 81]):
ort_benchmark.zip

this impl ort
15415.07 ms 0.15ms

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improvement:

Previous impl by Aser: 15415.07 ms
After removing meaningless for loop: 1ms
Speedup: 0.49ms
ort: 0.15 ms

asmorkalov pushed a commit that referenced this pull request Oct 13, 2023
dnn: fix inconsistent input dtype for nary eltwise layers #24386

Resolves #24385
Merge with opencv/opencv_extra#1107
Relates #24092 (comment)

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
@asmorkalov
Copy link
Contributor

@vpisarev @dkurt Could you take a look on the last changes?

@fengyuentau fengyuentau force-pushed the GSoC_Support_GatherElements_ONNX branch from 47901d3 to ac3ea42 Compare October 13, 2023 09:46
Signed-off-by: Yuantao Feng <yuantao.feng@opencv.org.cn>
asmorkalov pushed a commit to opencv/opencv_extra that referenced this pull request Oct 17, 2023
…ents_ONNX

Merge with opencv/opencv#24092

This pull request serves as an addition to the pull request "GSoC Add ONNX Support for GatherElements #24092", which is submitted to the original OpenCV repository. It contains three tests to the GatherElements operator. 
The three tests are sourced from:
* https://github.com/onnx/onnx/tree/main/onnx/backend/test/data/node/test_gather_elements_0
* https://github.com/onnx/onnx/tree/main/onnx/backend/test/data/node/test_gather_elements_1
* https://github.com/onnx/onnx/tree/main/onnx/backend/test/data/node/test_gather_elements_negative_indices
And the models were configured using the code provided by Yuantao Feng at https://github.com/fengyuentau/onnx_utils to confirm output shape in pre-run.

---

Reference outputs for `optical_flow_estimation_raft_2023aug.onnx` is generated with the following script:

```python
import cv2 as cv
import numpy as np

img1 = cv.imread("./opencv_extra/testdata/gpu/opticalflow/frame0.png")
img2 = cv.imread("./opencv_extra/testdata/gpu/opticalflow/frame1.png")

img1_blob = cv.cvtColor(img1, cv.COLOR_BGR2RGB)
img1_blob = cv.resize(img1_blob, (480, 360))
# source: (1, 3, 388, 584)
# target: 1, 3, 360, 480
img1_blob = cv.dnn.blobFromImage(img1_blob)
# source:  (1, 3, 388, 584)
# target: 1, 3, 360, 480
img2_blob = cv.cvtColor(img2, cv.COLOR_BGR2RGB)
img2_blob = cv.resize(img2_blob, (480, 360))
img2_blob = cv.dnn.blobFromImage(img2_blob)
print(img1_blob.shape, img2_blob.shape)

import onnxruntime as ort
net = ort.InferenceSession("optical_flow_estimation_raft_2023aug.onnx", providers=["CPUExecutionProvider"])
output = net.run(["12007", "12006"], {"0": img1_blob, "1": img2_blob})
np.save("output_optical_flow_estimation_raft_2023aug_0.npy", output[0])
np.save("output_optical_flow_estimation_raft_2023aug_1.npy", output[1])
print(output[0].shape)
print(output[1].shape)
```
@asmorkalov
Copy link
Contributor

@opencv-alalek please update models on Buildbot.

@asmorkalov asmorkalov merged commit 240b245 into opencv:4.x Oct 18, 2023
24 of 26 checks passed
CV_TRACE_FUNCTION();
CV_TRACE_ARG_VALUE(name, "name", name.c_str());

std::vector<Mat> inputs, outputs;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation should handle FP16 datatype too

[  FAILED  ] 3 tests, listed below:
[  FAILED  ] Test_ONNX_conformance.Layer_Test/test_gather_elements_0_OCV_OCL_FP16, where GetParam() = (test_gather_elements_0, OCV/OCL_FP16)
[  FAILED  ] Test_ONNX_conformance.Layer_Test/test_gather_elements_1_OCV_OCL_FP16, where GetParam() = (test_gather_elements_1, OCV/OCL_FP16)
[  FAILED  ] Test_ONNX_conformance.Layer_Test/test_gather_elements_negative_indices_OCV_OCL_FP16, where GetParam() = (test_gather_elements_negative_indices, OCV/OCL_FP16)

https://pullrequest.opencv.org/buildbot/builders/4_x-lin64/builds/100283

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I will do this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #24427

@opencv-alalek
Copy link
Contributor

@asmorkalov asmorkalov mentioned this pull request Nov 3, 2023
asmorkalov pushed a commit that referenced this pull request Nov 7, 2023
dnn onnx: add instance norm layer #24378

Resolves #24377
Relates #24092 (comment)

| Perf | multi-thread | single-thread |
| - | - | - |
| x: [2, 64, 180, 240] | 3.95ms | 11.12ms |

Todo:

- [x] speed up by multi-threading
- [x] add perf
- [x] add backend: OpenVINO
- [x] add backend: CUDA
- [x] add backend: OpenCL (no fp16)
- [ ] add backend: CANN (will be done via #24462)


### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake

```
force_builders=Linux OpenCL,Win64 OpenCL,Custom
buildworker:Custom=linux-4
build_image:Custom=ubuntu:18.04
modules_filter:Custom=none
disable_ipp:Custom=ON
```
IskXCr pushed a commit to Haosonn/opencv that referenced this pull request Dec 20, 2023
dnn: fix inconsistent input dtype for nary eltwise layers opencv#24386

Resolves opencv#24385
Merge with opencv/opencv_extra#1107
Relates opencv#24092 (comment)

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
IskXCr pushed a commit to Haosonn/opencv that referenced this pull request Dec 20, 2023
…herElements_ONNX

GSoC Add ONNX Support for GatherElements opencv#24092

Merge with: opencv/opencv_extra#1082
Adds support to the ONNX operator GatherElements [operator docs](https://github.com/onnx/onnx/blob/main/docs/Operators.md#GatherElements)
Added tests to opencv_extra at pull request opencv/opencv_extra#1082

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
IskXCr pushed a commit to Haosonn/opencv that referenced this pull request Dec 20, 2023
dnn onnx: add instance norm layer opencv#24378

Resolves opencv#24377
Relates opencv#24092 (comment)

| Perf | multi-thread | single-thread |
| - | - | - |
| x: [2, 64, 180, 240] | 3.95ms | 11.12ms |

Todo:

- [x] speed up by multi-threading
- [x] add perf
- [x] add backend: OpenVINO
- [x] add backend: CUDA
- [x] add backend: OpenCL (no fp16)
- [ ] add backend: CANN (will be done via opencv#24462)


### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake

```
force_builders=Linux OpenCL,Win64 OpenCL,Custom
buildworker:Custom=linux-4
build_image:Custom=ubuntu:18.04
modules_filter:Custom=none
disable_ipp:Custom=ON
```
thewoz pushed a commit to thewoz/opencv that referenced this pull request Jan 4, 2024
dnn: fix inconsistent input dtype for nary eltwise layers opencv#24386

Resolves opencv#24385
Merge with opencv/opencv_extra#1107
Relates opencv#24092 (comment)

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
thewoz pushed a commit to thewoz/opencv that referenced this pull request Jan 4, 2024
…herElements_ONNX

GSoC Add ONNX Support for GatherElements opencv#24092

Merge with: opencv/opencv_extra#1082
Adds support to the ONNX operator GatherElements [operator docs](https://github.com/onnx/onnx/blob/main/docs/Operators.md#GatherElements)
Added tests to opencv_extra at pull request opencv/opencv_extra#1082

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
thewoz pushed a commit to thewoz/opencv that referenced this pull request Jan 4, 2024
dnn onnx: add instance norm layer opencv#24378

Resolves opencv#24377
Relates opencv#24092 (comment)

| Perf | multi-thread | single-thread |
| - | - | - |
| x: [2, 64, 180, 240] | 3.95ms | 11.12ms |

Todo:

- [x] speed up by multi-threading
- [x] add perf
- [x] add backend: OpenVINO
- [x] add backend: CUDA
- [x] add backend: OpenCL (no fp16)
- [ ] add backend: CANN (will be done via opencv#24462)


### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake

```
force_builders=Linux OpenCL,Win64 OpenCL,Custom
buildworker:Custom=linux-4
build_image:Custom=ubuntu:18.04
modules_filter:Custom=none
disable_ipp:Custom=ON
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: dnn (onnx) ONNX suport issues in DNN module category: dnn GSoC
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants