Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GAPI Fluid: SIMD for DivRC kernel. #21530

Merged
merged 4 commits into from
Mar 2, 2022
Merged

Conversation

anna-khakimova
Copy link
Member

@anna-khakimova anna-khakimova commented Jan 28, 2022

Add SIMD for GAPI Fluid DivRC kernel.

Performance report:
FluidDivRCSIMD.xlsx

force_builders=Linux AVX2,Custom,Custom Win,Custom Mac
build_gapi_standalone:Linux x64=ade-0.1.1f
build_gapi_standalone:Win64=ade-0.1.1f
Xbuild_gapi_standalone:Mac=ade-0.1.1f
build_gapi_standalone:Linux x64 Debug=ade-0.1.1f

build_image:Custom=centos:7
buildworker:Custom=linux-1
build_gapi_standalone:Custom=ade-0.1.1f

Xbuild_image:Custom=ubuntu-openvino-2021.3.0:20.04
build_image:Custom Win=openvino-2021.4.1
build_image:Custom Mac=openvino-2021.2.0

buildworker:Custom Win=windows-3

test_modules:Custom=gapi,python2,python3,java
test_modules:Custom Win=gapi,python2,python3,java
test_modules:Custom Mac=gapi,python2,python3,java

buildworker:Custom=linux-1
# disabled due high memory usage: test_opencl:Custom=ON
Xtest_opencl:Custom=OFF
Xtest_bigdata:Custom=1
Xtest_filter:Custom=*

CPU_BASELINE:Custom Win=AVX512_SKX
CPU_BASELINE:Custom=SSE4_2

Copy link
Contributor

@sivanov-work sivanov-work left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anna-khakimova
I checked out 4 files from 6 and have understood that it takes too much time then i supposed.
If this is urgent then I suggest you to find another reviewer: I'm very sorry but I have much load on this week. Furthermore we need to finish another one PR for DX11/OpenCL which had started before DivRC and got several unresolved questions.

Sorry, i cannot proceed with this PR on this week

@@ -936,8 +936,8 @@ CV_ALWAYS_INLINE void run_arithm_s(Buffer &dst, const View &src, const float sca
}

template<typename DST, typename SRC>
static void run_arithm_rs(Buffer &dst, const View &src, const float scalar[4], Arithm arithm,
float scale=1)
CV_ALWAYS_INLINE void run_arithm_rs(Buffer &dst, const View &src, const float scalar[],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is scalar size changed from 4?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The scalar array is a pointer to scratch buffer that was filled in a special way.

GMatDesc bufdesc = { CV_32F, 1, bufsize };
Buffer buffer(bufdesc);
scratch = std::move(buffer);
scratch = std::move(Buffer(GMatDesc(CV_32F, 1, cv::Size(buflen, 1))));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggest to remove std::mode or swap it if available

template<typename scalar_t>
using univ_zero_vec_type_of_t = typename univ_zero_vec_type_of<scalar_t>::type;

template<> struct univ_zero_vec_type_of<uchar> { using type = v_uint8; };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does it mean - univ?

template<typename scalar_t>
using zero_vec_type_of_t = typename zero_vec_type_of<scalar_t>::type;

template<> struct zero_vec_type_of<uchar> { using type = v_int16; };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

int16 for uchar? is it not mistake?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not mistake it's implementation specifics.

} \
} \
default: \
GAPI_Assert(chan <= 4); \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think there may be warning for walking through switch-cases

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no warnings.

Copy link
Member

@alexgiving alexgiving left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor

@rgarnov rgarnov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

@@ -528,6 +528,10 @@ PERF_TEST_P_(DivRCPerfTest, TestPerformance)

// FIXIT Unstable input data for divide
initMatsRandU(type, sz, dtype, false);
//This condition need as workaround the bug in the OpenCV.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, It makes sense to add reference to the mentioned bug. #21044

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alalek I propose to apply this comment in the next PR in order not to restart the CI checks once again.

Copy link
Contributor

@dmatveev dmatveev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved but please address Alexander's comment

@alalek alalek merged commit 9c7adb7 into opencv:4.x Mar 2, 2022
@opencv-pushbot opencv-pushbot mentioned this pull request Apr 23, 2022
a-sajjad72 pushed a commit to a-sajjad72/opencv that referenced this pull request Mar 30, 2023
* GAPI Fluid: SIMD for DivRC kernel.

* Fluid: Div kernel's SIMD refactoring

* SIMD for DivRC 3 channel case

* Applied comments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants