[RFC] rounding rule for cv::divide() #24213

Kumataro · 2023-09-01T09:53:35Z

System Information

OpenCV version: 4.x branch
Operating System / Platform: Ubuntu 20.04 (Raspi4, arm64)
Compiler & compiler version: GCC 9.3.0

Detailed description

This issue is about cv::divide() and is related to #24074.
The rounding rule is not described in the cv::divide() documentation.

On arm64, it rounds to nearest, ties away from zero.
On x86-64, it rounds to nearest, ties to even.
(The behavior may also depend on the rounding mode configured for the floating-point unit.)
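
To make the difference between the two rules concrete, here is a minimal scalar sketch using only the C++ standard library. It is not OpenCV code: the helper names round_ties_away, round_ties_even and check_reproducer_values are hypothetical, and std::nearbyint gives ties-to-even only under the assumption that the floating-point environment is in its default FE_TONEAREST mode.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: round to nearest, ties away from zero.
// std::round always uses this rule, regardless of the FPU rounding mode.
inline int round_ties_away(double x)
{
    return static_cast<int>(std::round(x));
}

// Hypothetical helper: round to nearest, ties to even.
// Assumption: the current rounding mode is the default FE_TONEAREST,
// because std::nearbyint follows whatever mode is active.
inline int round_ties_even(double x)
{
    return static_cast<int>(std::nearbyint(x));
}

// The two values from the reproducer in this issue: 25 / 2 and 23 / 2.
inline void check_reproducer_values()
{
    assert(round_ties_away(25 / 2.0) == 13); // 12.5 -> 13
    assert(round_ties_even(25 / 2.0) == 12); // 12.5 -> 12 (12 is even)
    assert(round_ties_away(23 / 2.0) == 12); // 11.5 -> 12
    assert(round_ties_even(23 / 2.0) == 12); // 11.5 -> 12 (12 is even)
}
```

Only exact halves are affected: the two rules agree on 11.5 because the nearest even integer is also the one farther from zero, and they disagree on 12.5.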

Q. The results differ between the two platforms. Which behaviour is correct (or preferable)?

If arm64 behaviours should be fixed...

For arm64, I think the following patch fixes it.
However, v_round() seems to be used in many places,
so I am concerned about breaking backward compatibility.

I would appreciate it if you could comment on this issue.

before : https://developer.arm.com/architectures/instruction-sets/intrinsics/vcvtaq_s64_f64
after : https://developer.arm.com/architectures/instruction-sets/intrinsics/vcvtnq_s64_f64

diff --git a/modules/core/include/opencv2/core/hal/intrin_neon.hpp b/modules/core/include/opencv2/core/hal/intrin_neon.hpp
index 6f8973231b..14eb180819 100644
--- a/modules/core/include/opencv2/core/hal/intrin_neon.hpp
+++ b/modules/core/include/opencv2/core/hal/intrin_neon.hpp
@@ -1997,12 +1997,12 @@ inline v_int32x4 v_trunc(const v_float32x4& a)
 inline v_int32x4 v_round(const v_float64x2& a)
 {
     static const int32x2_t zero = vdup_n_s32(0);
-    return v_int32x4(vcombine_s32(vmovn_s64(vcvtaq_s64_f64(a.val)), zero));
+    return v_int32x4(vcombine_s32(vmovn_s64(vcvtnq_s64_f64(a.val)), zero));
 }

 inline v_int32x4 v_round(const v_float64x2& a, const v_float64x2& b)
 {
-    return v_int32x4(vcombine_s32(vmovn_s64(vcvtaq_s64_f64(a.val)), vmovn_s64(vcvtaq_s64_f64(b.val))));
+    return v_int32x4(vcombine_s32(vmovn_s64(vcvtnq_s64_f64(a.val)), vmovn_s64(vcvtnq_s64_f64(b.val))));
 }

Steps to reproduce

#include <opencv2/core.hpp>
#include <iostream>

int main(void)
{
  cv::Mat src1 = (cv::Mat_<uchar>(3,3) << 25,23,0, 0,0,0, 0,0,0 );
  std::cout << src1 << std::endl;
  cv::Mat dst;
  cv::divide(src1, 2, dst );
  std::cout << dst << std::endl;
  return 0;
}

[x86-64]
kmtr@kmtr-VMware-Virtual-Platform:~/work/studyT2$ ./a.out
[ 25,  23,   0;
   0,   0,   0;
   0,   0,   0]
[ 12,  12,   0;
   0,   0,   0;
   0,   0,   0]

[arm64]
kmtr@ubuntu:~/work/build4-main/study$ ./a.out
[ 25,  23,   0;
   0,   0,   0;
   0,   0,   0]
[ 13,  12,   0;
   0,   0,   0;
   0,   0,   0]

Issue submission checklist

I report the issue, it's not a question
I checked the problem with documentation, FAQ, open issues, forum.opencv.org, Stack Overflow, etc and have not found any solution
I updated to the latest OpenCV version and the issue is still there
There is reproducer code and related data files (videos, images, onnx, etc)


opencv-alalek · 2023-09-02T06:43:04Z

Perhaps it makes sense to update the intrinsic tests for v_round(): https://github.com/opencv/opencv/blob/4.8.0/modules/core/test/test_intrin_utils.hpp#L1483

Kumataro · 2023-09-02T10:18:48Z

Thank you for your comment!

I felt that the existing tests were sufficient to confirm the impact and the side effects.
I ran the existing tests, including the core module, on 64-bit ARM Ubuntu on a Raspi4 and found no new issues.
I created a pull request with new test code.

This fix should also affect macOS on ARM64, but I don't have the hardware and couldn't try it. Sorry about that.

Supplement.
The same NEON instruction is used in v_trunc().
For v_trunc(), I think the result should move toward zero when the fractional part is 0.5.
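
As a rough scalar illustration of the point above (not the OpenCV or NEON implementation), truncation always moves toward zero, so a fractional part of exactly 0.5 is simply dropped. The helper names trunc_toward_zero and check_trunc_examples are hypothetical, and the sketch uses only the C++ standard library.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: discard the fractional part, moving toward zero.
inline int trunc_toward_zero(double x)
{
    return static_cast<int>(std::trunc(x));
}

// For comparison: std::round moves exact halves away from zero,
// which is not what a truncating conversion should do.
inline void check_trunc_examples()
{
    assert(trunc_toward_zero(12.5) == 12);
    assert(trunc_toward_zero(-12.5) == -12);
    assert(static_cast<int>(std::round(12.5)) == 13);
    assert(static_cast<int>(std::round(-12.5)) == -13);
}
```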

Kumataro added the bug label Sep 1, 2023

opencv-alalek added optimization category: core RFC platform: arm ARM boards related issues: RPi, NVIDIA TK/TX, etc labels Sep 2, 2023

Kumataro mentioned this issue Sep 2, 2023

core: arm64: v_round() works with round to nearest, ties to even. #24215

Merged

6 tasks

opencv-alalek added this to the 4.9.0 milestone Sep 3, 2023

asmorkalov closed this as completed in #24215 Sep 4, 2023

opencv-alalek removed the RFC label Sep 4, 2023

asmorkalov mentioned this issue Sep 4, 2023

Carotene (ARM HAL) uses wrong rounding in some places #24163

Closed

4 tasks
