GAPI Fluid: SIMD Div kernel. #20914

anna-khakimova · 2021-10-20T22:16:25Z

SIMD for Divide kernel.

Performance report:

force_builders=Linux AVX2,Custom,Custom Win,Custom Mac
build_gapi_standalone:Linux x64=ade-0.1.1f
build_gapi_standalone:Win64=ade-0.1.1f
Xbuild_gapi_standalone:Mac=ade-0.1.1f
build_gapi_standalone:Linux x64 Debug=ade-0.1.1f

build_image:Custom=centos:7
buildworker:Custom=linux-1
build_gapi_standalone:Custom=ade-0.1.1f

Xbuild_image:Custom=ubuntu-openvino-2021.3.0:20.04
build_image:Custom Win=openvino-2021.4.1
build_image:Custom Mac=openvino-2021.2.0

buildworker:Custom Win=windows-3

test_modules:Custom=gapi,python2,python3,java
test_modules:Custom Win=gapi,python2,python3,java
test_modules:Custom Mac=gapi,python2,python3,java

buildworker:Custom=linux-1
# disabled due high memory usage: test_opencl:Custom=ON
Xtest_opencl:Custom=OFF
Xtest_bigdata:Custom=1
Xtest_filter:Custom=*

CPU_BASELINE:Custom Win=AVX512_SKX
CPU_BASELINE:Custom=SSE4_2

sivanov-work

LGTM (besides assert for unsupported types comment in div_hal)

modules/gapi/perf/cpu/gapi_core_perf_tests_fluid.cpp

modules/gapi/src/backends/fluid/gfluidcore.cpp

sivanov-work · 2021-10-21T06:16:16Z

modules/gapi/src/backends/fluid/gfluidcore.cpp

@@ -743,6 +782,7 @@ GAPI_FLUID_KERNEL(GFluidDiv, cv::gapi::core::GDiv, false)
        BINARY_(uchar ,  short,  short, run_arithm, dst, src1, src2, ARITHM_DIVIDE, scale);
        BINARY_(uchar ,  float,  float, run_arithm, dst, src1, src2, ARITHM_DIVIDE, scale);
        BINARY_( short,  short,  short, run_arithm, dst, src1, src2, ARITHM_DIVIDE, scale);
+        BINARY_(ushort, ushort, ushort, run_arithm, dst, src1, src2, ARITHM_DIVIDE, scale);


Comment not for this PR:
maybe in the future it would be nice to have something like:

GAPI_KERNEL(...)
...
#define KERNEL_XXX_SUPPORTED_TYPES ushort, int, char,...
static void run(...) { run_dispatcher<KERNEL_XXX_SUPPORTED_TYPES>(...); }
template <class ...Types> run_dispatcher(...) { bool expander[] { (run_impl<Types>(), true)...} }
template<class SupportedType> run_impl(...) { BINARY_(SupportedType, ...); }

Probably yes; however, that would require reworking the entire Fluid backend, which will take a certain amount of time. As you know, the allocation of resources is handled by @dmatveev, so this should be discussed with him.
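
For readers following the suggestion above, here is a minimal, self-contained sketch of the type-list dispatch idea. It is not code from the Fluid backend: `Depth`, `depth_of`, `try_run`, and the body of `run_impl` are hypothetical names invented for illustration, and only the `run_dispatcher`/`run_impl` naming is borrowed from the comment.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

// Hypothetical runtime tag describing the element type of a buffer.
enum class Depth { U8, S16, U16, F32 };

// Maps a compile-time type to its runtime tag.
template<typename T> struct depth_of;
template<> struct depth_of<std::uint8_t>  { static constexpr Depth value = Depth::U8;  };
template<> struct depth_of<std::int16_t>  { static constexpr Depth value = Depth::S16; };
template<> struct depth_of<std::uint16_t> { static constexpr Depth value = Depth::U16; };
template<> struct depth_of<float>         { static constexpr Depth value = Depth::F32; };

// Hypothetical per-type kernel body: integer division, 0 when the divisor is 0.
template<typename T>
void run_impl(const void* a, const void* b, void* dst, int n)
{
    const T* pa = static_cast<const T*>(a);
    const T* pb = static_cast<const T*>(b);
    T* pd = static_cast<T*>(dst);
    for (int i = 0; i < n; ++i)
        pd[i] = pb[i] ? static_cast<T>(pa[i] / pb[i]) : T(0);
}

// Runs run_impl<T> only if T matches the runtime tag.
template<typename T>
bool try_run(Depth depth, const void* a, const void* b, void* dst, int n)
{
    if (depth_of<T>::value != depth)
        return false;
    run_impl<T>(a, b, dst, n);
    return true;
}

// Expands the supported type list and stops at the first matching type.
// Returns false when the runtime type is not in the list.
template<typename... Types>
bool run_dispatcher(Depth depth, const void* a, const void* b, void* dst, int n)
{
    bool handled = false;
    // Elements of a braced list are evaluated left to right.
    (void)std::initializer_list<bool>{
        (handled = handled || try_run<Types>(depth, a, b, dst, n))... };
    return handled;
}
```

A call such as `run_dispatcher<std::uint8_t, std::int16_t, std::uint16_t>(Depth::U16, a, b, d, n)` would pick the `std::uint16_t` instantiation. The `false` return value gives the caller one place to assert on unsupported types.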

modules/gapi/perf/cpu/gapi_core_perf_tests_fluid.cpp

alalek · 2021-10-28T22:44:45Z

modules/gapi/perf/common/gapi_core_perf_tests_inl.hpp

-    cv::divide(in_mat1, in_mat2, out_mat_ocv, dtype);
-
+    cv::divide(in_mat1, in_mat2, out_mat_ocv, scale, dtype);
+    //out_mat_ocv.size().


//out_mat_ocv.size().

?

alalek · 2021-10-28T22:58:59Z

modules/gapi/src/backends/fluid/gfluidcore_func.simd.hpp

@@ -0,0 +1,427 @@
+// This file is part of OpenCV project.


.simd.hpp

These files are used for dynamic dispatching between multiple instruction sets.

Check:

documentation / tutorial here: https://github.com/opencv/opencv/wiki/CPU-optimizations-build-options

some of these PRs

how gfluidimgproc_func.simd.hpp is used in G-API

OK, reworked. I've applied dynamic dispatching to gfluidcore_func.simd.hpp in the same way as it is applied to gfluidimgproc_func.simd.hpp.

alalek

Looks good to me! Thank you for the update 👍

alalek · 2021-11-08T11:03:58Z

modules/gapi/src/backends/fluid/gfluidcore_func.simd.hpp

+CV_ALWAYS_INLINE void v_store_select(short* dst, const v_int16&, const v_int16&,
+                                     const v_int32& res1, const v_int32& res2)
+{
+    vx_store(dst, v_pack(res1, res2));


v_store_select
v_pack

Do we need select somewhere here?

Yes. Reworked. These lines were added deliberately as a workaround for a known issue in OCV.
Since one more workaround has been added to the DivPerfTestFluid test, this workaround can be reworked.
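
As background for the `v_pack(res1, res2)` call in the hunk above: OpenCV's `v_pack` narrows two wider vectors into one narrower vector with saturation. The scalar sketch below models that behavior for int32 to int16. The helper names `saturate_s16` and `pack_store_s16` are hypothetical and do not appear in the PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Scalar model of a saturating narrow from int32 to int16.
inline std::int16_t saturate_s16(std::int32_t v)
{
    const std::int32_t lo = std::numeric_limits<std::int16_t>::min();
    const std::int32_t hi = std::numeric_limits<std::int16_t>::max();
    return static_cast<std::int16_t>(std::min(hi, std::max(lo, v)));
}

// Hypothetical helper: writes the saturated lanes of res1 followed by the
// saturated lanes of res2, i.e. 2 * lanes int16 values in total.
inline void pack_store_s16(std::int16_t* dst,
                           const std::int32_t* res1,
                           const std::int32_t* res2,
                           int lanes)
{
    for (int i = 0; i < lanes; ++i)
        dst[i] = saturate_s16(res1[i]);
    for (int i = 0; i < lanes; ++i)
        dst[lanes + i] = saturate_s16(res2[i]);
}
```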

alalek · 2021-11-08T11:05:07Z

modules/gapi/src/backends/fluid/gfluidcore_func.simd.hpp

+template<> struct vector_type_of<ushort> { using type = v_uint16; };
+template<> struct vector_type_of<short> { using type = v_int16; };
+
+CV_ALWAYS_INLINE v_float32 v_load_f32(const float* in)


v_load_f32

It makes sense to add "gapi_v_" prefix to avoid confusion with OpenCV SIMD functions.

OK. However, "gapi_v_" is too long. I replaced the "v" prefix with a "vg" prefix.

alalek · 2021-11-08T11:07:28Z

modules/gapi/perf/common/gapi_core_perf_tests_inl.hpp

+    //This condition need to workaround bug in OpenCV.
+    //It reinitializes divider matrix without zero values.
+    if (dtype == CV_16S && type != CV_16S)
+        cv::randu(in_mat2, cv::Scalar::all(1), cv::Scalar::all(255));


BTW, there is a similar problem in cv::divide() with dst=8S (e.g., src=8U).

The accurate condition should probably be dtype != type (&& dtype != CV_32F).

Fortunately, Fluid's Div kernel doesn't support the 8S type at all.

For supported types:
Since the bug is observed only when DST_Type == 16S and SRC_Type != 16S (both conditions together), I think the accurate condition is
(dtype == 16S) && (dtype != type)

This condition has already been applied in the latest update of the PR.

For all other combinations of DST and SRC types, this test must still check how division by zero is handled.
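
To make the difference between the two proposed conditions concrete, the sketch below writes both as predicates. The `Depth` enum and the function names are hypothetical stand-ins for OpenCV depth codes such as CV_8U and CV_16S. Nothing here is taken from the test source.

```cpp
#include <cassert>

// Hypothetical stand-ins for OpenCV depth codes.
enum Depth { D_8U, D_8S, D_16U, D_16S, D_32F };

// Narrow condition from the reply: only a 16S destination with a different
// source type gets a divisor matrix regenerated without zeros.
constexpr bool narrow_condition(Depth type, Depth dtype)
{
    return dtype == D_16S && dtype != type;
}

// Broader condition suggested in review: any type conversion except one
// that targets 32F.
constexpr bool broad_condition(Depth type, Depth dtype)
{
    return dtype != type && dtype != D_32F;
}
```

For example, with src=8U and dst=16U the broad condition holds and the narrow one does not. Under the narrow condition that combination therefore still exercises division by zero in the test.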

alalek · 2021-11-11T23:46:47Z

modules/gapi/include/opencv2/gapi/core.hpp

@@ -770,7 +770,10 @@ GAPI_EXPORTS GMat mulC(const GScalar& multiplier, const GMat& src, int ddepth =
 The function divides one matrix by another:
 \f[\texttt{dst(I) = saturate(src1(I)*scale/src2(I))}\f]

-When src2(I) is zero, dst(I) will also be zero. Different channels of
+For integer types when src2(I) is zero, dst(I) will also be zero.
+Floating point case returns Inf/NaN (according to IEEE).


Does this PR resolve #13147 ?

Unfortunately, I hadn't heard about this issue before. However, I can say with confidence that, in testing with the DivPerfTestFluid test, the behavior of Fluid's Div kernel is fully consistent with the behavior of cv::divide(). All CI checks passed. The one exception is the behavior caused by the known issue in OpenCV (dtype == CV_16S && type != dtype); a workaround for that issue has already been added to the test.

Since all CI checks passed, I assume that the issue mentioned in #13147 is resolved.

alalek · 2021-11-11T23:47:03Z

modules/gapi/src/backends/fluid/gfluidcore.cpp

+static inline
+typename std::enable_if<!std::is_same<DST, float>::value, DST>::type
+div(SRC1 x, SRC2 y, float scale=1)
 {
-    // like OpenCV: returns 0, if y=0
+    // like OpenCV: returns 0, if DST type=uchar/short/ushort and divider(y)=0
    auto result = y? scale * x / y: 0;


BTW, We saw tests failures for "Linux Debug" builder only.
This is strange because SIMD optimized builders should fail too.

At the beginning I saw test failures not only for "Linux Debug". I added a workaround for the SIMD part a long time ago. We have already discussed this workaround here, and I told you about it when we had a call. It has already been removed as part of applying this comment.
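
The division semantics discussed in this thread (integer destinations return 0 for a zero divisor, float destinations follow IEEE) can be sketched as below. This is an illustration under stated assumptions, not the code from gfluidcore.cpp: `div_sketch` and `saturate_round` are hypothetical names, and the float overload assumes an IEEE 754 floating-point environment without fast-math style optimizations.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>
#include <type_traits>

// Rounds to nearest and clamps to the range of the integer type DST.
template<typename DST>
DST saturate_round(float v)
{
    const float lo = static_cast<float>(std::numeric_limits<DST>::min());
    const float hi = static_cast<float>(std::numeric_limits<DST>::max());
    const float r = std::rint(v);
    if (r <= lo) return std::numeric_limits<DST>::min();
    if (r >= hi) return std::numeric_limits<DST>::max();
    return static_cast<DST>(r);
}

// Integer destination: a zero divisor yields 0.
template<typename DST, typename SRC1, typename SRC2>
typename std::enable_if<!std::is_same<DST, float>::value, DST>::type
div_sketch(SRC1 x, SRC2 y, float scale = 1.f)
{
    if (y == 0)
        return DST(0);
    return saturate_round<DST>(scale * x / y);
}

// Float destination: no special case, so x/0 gives +-Inf and 0/0 gives NaN
// on IEEE 754 platforms.
template<typename DST, typename SRC1, typename SRC2>
typename std::enable_if<std::is_same<DST, float>::value, DST>::type
div_sketch(SRC1 x, SRC2 y, float scale = 1.f)
{
    return scale * x / y;
}
```

Splitting the two cases with `std::enable_if` mirrors the shape of the hunk above: the destination type alone selects which overload, and therefore which zero-divisor behavior, applies.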

alalek · 2021-11-11T23:48:45Z

modules/gapi/src/backends/fluid/gfluidcore.cpp

+    auto result = scale * x / y;
+    return saturate<DST>(result, rintf);
+}
+
 template<typename DST, typename SRC1, typename SRC2>
 static inline DST divr(SRC1 x, SRC2 y, float scale=1)


divr

has the same flow, but it is still not updated.

This operation was not updated within the current task and won't be. I've added SIMD for the Div kernel only.

BTW, the divr() function is declared but not used anywhere in gfluidcore.cpp.

dmatveev

LGTM but not sure if everything is ok about .simd.hpp

GAPI Fluid: SIMD Div kernel.

* HAL implementation for Div kernel
* Removed dbg lines
* Applied comments.
* Reworked
* Final version

Anna Khakimova added 2 commits October 21, 2021 00:46

HAL implementation for Div kernel

be56584

Removed dbg lines

00d7a3a

anna-khakimova requested review from TolyaTalamanov and sivanov-work October 20, 2021 22:25

asmorkalov added category: g-api / gapi optimization labels Oct 21, 2021

sivanov-work approved these changes Oct 21, 2021

alalek reviewed Oct 21, 2021

modules/gapi/perf/cpu/gapi_core_perf_tests_fluid.cpp

Anna Khakimova added 2 commits October 22, 2021 00:59

Applied comments.

099a520

Reworked

a595cce

anna-khakimova changed the title ~~GAPI Fluid: Reworked Div kernel.~~ GAPI Fluid: SIMD Div kernel. Oct 28, 2021

anna-khakimova force-pushed the ak/simd_div branch from 525c9a3 to eace9c7 Compare October 28, 2021 22:31

anna-khakimova requested a review from alalek October 28, 2021 22:33

alalek reviewed Oct 28, 2021

anna-khakimova requested a review from terfendail October 29, 2021 08:11

anna-khakimova force-pushed the ak/simd_div branch from eace9c7 to 54e5744 Compare October 29, 2021 08:56

anna-khakimova force-pushed the ak/simd_div branch 2 times, most recently from 6e7af53 to fd6a2fe Compare November 6, 2021 21:32

anna-khakimova requested a review from alalek November 6, 2021 22:48

anna-khakimova force-pushed the ak/simd_div branch from fd6a2fe to d69ae64 Compare November 6, 2021 22:49

alalek reviewed Nov 8, 2021

Final version

e54fafe

anna-khakimova force-pushed the ak/simd_div branch from d69ae64 to e54fafe Compare November 10, 2021 09:40

anna-khakimova requested a review from alalek November 10, 2021 09:59

alalek reviewed Nov 11, 2021

anna-khakimova requested a review from alalek November 12, 2021 09:30

dmatveev self-assigned this Nov 12, 2021

dmatveev added this to the 4.5.5 milestone Nov 12, 2021

dmatveev approved these changes Nov 15, 2021

alalek merged commit b19697e into opencv:4.x Nov 15, 2021

alalek mentioned this pull request Dec 30, 2021

(5.x) Merge 4.x #21371

Merged

alalek mentioned this pull request Feb 22, 2022

(5.x) Merge 4.x #21651

Merged

opencv-alalek mentioned this pull request Aug 30, 2024

Workaround for division by zero, issue #25795 #26030

Merged

6 tasks
