Added quantized division for uint8 #26570

Merged

mwtarnowski
Contributor

Added a simple, non-optimized implementation of quantized division for unsigned 8-bit integers to the reference ops. Also added a few tests, together with templates that can be reused to test quantized Div for other data types.
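
For readers unfamiliar with the operation, here is a minimal sketch of what dividing two affine-quantized uint8 values involves: dequantize both operands, divide, then requantize with saturation. This is not the code from this PR; `QuantParams` and `QuantizedDivUint8` are hypothetical names chosen for illustration, and the sketch assumes the usual affine scheme `real = scale * (q - zero_point)`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper type: affine quantization parameters,
// where real_value = scale * (quantized_value - zero_point).
struct QuantParams {
  float scale;
  int32_t zero_point;
};

// Sketch only: dequantize both inputs, divide in float, requantize,
// and saturate the result to the uint8 range [0, 255].
inline uint8_t QuantizedDivUint8(uint8_t q1, const QuantParams& p1,
                                 uint8_t q2, const QuantParams& p2,
                                 const QuantParams& out) {
  const int32_t centered_divisor = static_cast<int32_t>(q2) - p2.zero_point;
  assert(centered_divisor != 0);  // the divisor must not represent real 0
  const float real1 = p1.scale * (static_cast<int32_t>(q1) - p1.zero_point);
  const float real2 = p2.scale * centered_divisor;
  const float real_out = real1 / real2;
  const int32_t q_out =
      static_cast<int32_t>(std::lround(real_out / out.scale)) + out.zero_point;
  return static_cast<uint8_t>(std::min<int32_t>(255, std::max<int32_t>(0, q_out)));
}
```

With `scale = 1/128` and `zero_point = 128` for all three tensors, the quantized pair (192, 64) represents 0.5 / -0.5, and the result requantizes to 0, which represents -1.0.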

@googlebot

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here (e.g. I signed it!) and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

@rthadur rthadur requested a review from haozha111 March 11, 2019 21:54
@rthadur rthadur self-assigned this Mar 11, 2019
@rthadur rthadur added this to Assigned Reviewer in PR Queue via automation Mar 11, 2019
@rthadur rthadur added the size:L CL Change Size: Large label Mar 11, 2019
@rthadur
Contributor

rthadur commented Mar 11, 2019

@mwtarnowski please sign CLA

@herbakamil

@rthadur Michał and @w-adamski are members of https://github.com/TCLResearchEurope, and this organization has already signed the CLA. Does @bbiskupski (our CEO, who signed the CLA) need to do something so that they can send PRs?

@mwtarnowski
Contributor Author

I signed it!

@googlebot

CLAs look good, thanks!

ℹ️ Googlers: Go here for more info.

@haozha111 haozha111 removed their request for review April 5, 2019 18:38
@mwtarnowski
Contributor Author

@bjacob, thanks for your feedback and the comprehensive explanation of the problem. In fact, I had already implemented an integer-only version of this operation with the gemmlowp library. However, in several benchmarks I found that, even on mobile devices with limited floating-point support, an unoptimized (no LUT, no NEON) integer-only implementation performs poorly compared to the approach I proposed here. For my purposes, this implementation provided the required functionality with fair performance while staying pretty simple, which made it a good candidate for a reference.

But indeed, when it comes to ensuring maximum portability, it makes perfect sense to avoid floats. So I am updating this PR to eliminate all floating-point operations from the code. Any further suggestions are welcome.
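
To illustrate what "integer-only" can mean here, the following is a rough sketch that replaces the float division with a precomputed fixed-point multiplier and a rounded 64-bit integer division. It is not the updated PR code and it does not use gemmlowp; `RoundingDivide`, `QuantizedDivUint8IntOnly`, the 16-bit fractional format, and the `multiplier_q16` parameter are all assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>

// Assumed fixed-point format: 16 fractional bits for the combined scale.
constexpr int kFractionalBits = 16;

// Rounds n / d to the nearest integer, ties away from zero. Requires d != 0.
inline int64_t RoundingDivide(int64_t n, int64_t d) {
  assert(d != 0);
  if (d < 0) {
    n = -n;
    d = -d;
  }
  return n >= 0 ? (n + d / 2) / d : -((-n + d / 2) / d);
}

// Sketch only. `multiplier_q16` is assumed to be
// round(scale1 / (scale2 * scale_out) * 2^16), computed once ahead of time
// so that the per-element work uses integers exclusively.
inline uint8_t QuantizedDivUint8IntOnly(uint8_t q1, int32_t zero1,
                                        uint8_t q2, int32_t zero2,
                                        int32_t multiplier_q16,
                                        int32_t zero_out) {
  const int64_t num = static_cast<int64_t>(q1 - zero1) * multiplier_q16;
  // Multiply rather than left-shift: the centered divisor may be negative.
  const int64_t den =
      static_cast<int64_t>(q2 - zero2) * (int64_t{1} << kFractionalBits);
  int64_t q_out = RoundingDivide(num, den) + zero_out;
  if (q_out < 0) q_out = 0;      // saturate to the uint8 range
  if (q_out > 255) q_out = 255;
  return static_cast<uint8_t>(q_out);
}
```

With all three scales equal to 1/128 and all zero points equal to 128, the combined multiplier is 128, so `multiplier_q16` is `128 << 16`. Note that the centered operands can be negative even though the stored values are unsigned, which is why the sketch widens to signed 64-bit before any arithmetic.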

@mwtarnowski
Contributor Author

@bjacob, @suharshs, do you know when this code will be reviewed? Is there anything I should fix/change?

@bjacob
Contributor

bjacob commented May 14, 2019

Sorry, I had forgotten. This looks good to me.

bjacob
bjacob previously approved these changes May 14, 2019
Contributor

@bjacob bjacob left a comment

This looks good.

PR Queue automation moved this from Assigned Reviewer to Approved by Reviewer May 14, 2019
@tensorflow-bot tensorflow-bot bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels May 14, 2019
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label May 14, 2019
@herbakamil

@bjacob what is the procedure from here? The PR has been approved; what further steps are needed to merge it?

@bjacob
Contributor

bjacob commented May 15, 2019

@herbakamil The normal procedure is that this PR has been converted to a Google-internal "change" that has been entered into integration tests. Unfortunately, there are some real test failures, so it is blocked at that stage. The test failures are in this test target:

//tensorflow/lite/kernels:div_test

You should be able to reproduce this test failure with this command:

bazel test //tensorflow/lite/kernels:div_test

Please update this PR so as to make this test pass, then we can let this go through integration again.

The other failures reported above here ("Windows Bazel") look like spurious/flaky failures not actually caused by your PR, at least from a cursory look.

@mwtarnowski
Contributor Author

@bjacob, can you provide more details? I am not able to reproduce this test failure on Ubuntu 18.04 with Bazel 0.24.1 and GCC 7.4.

@bjacob
Contributor

bjacob commented May 16, 2019

(Edit: this is on a Linux setup fairly similar to yours, Debian-based. The compiler is a recent clang.)

It seems to be failing in the new tests which this PR adds:

[==========] Running 12 tests from 3 test suites.
[----------] Global test environment set-up.
[----------] 4 tests from FloatDivOpTest
[ RUN      ] FloatDivOpTest.NoActivation
INFO: Initialized TensorFlow Lite runtime.
[       OK ] FloatDivOpTest.NoActivation (7 ms)
[ RUN      ] FloatDivOpTest.ActivationRELU_N1_TO_1
[       OK ] FloatDivOpTest.ActivationRELU_N1_TO_1 (2 ms)
[ RUN      ] FloatDivOpTest.VariousInputShapes
[       OK ] FloatDivOpTest.VariousInputShapes (2 ms)
[ RUN      ] FloatDivOpTest.WithBroadcast
[       OK ] FloatDivOpTest.WithBroadcast (2 ms)
[----------] 4 tests from FloatDivOpTest (13 ms total)

[----------] 4 tests from IntegerDivOpTest
[ RUN      ] IntegerDivOpTest.NoActivation
[       OK ] IntegerDivOpTest.NoActivation (2 ms)
[ RUN      ] IntegerDivOpTest.ActivationRELU_N1_TO_1
[       OK ] IntegerDivOpTest.ActivationRELU_N1_TO_1 (2 ms)
[ RUN      ] IntegerDivOpTest.VariousInputShapes
[       OK ] IntegerDivOpTest.VariousInputShapes (2 ms)
[ RUN      ] IntegerDivOpTest.WithBroadcast
[       OK ] IntegerDivOpTest.WithBroadcast (2 ms)
[----------] 4 tests from IntegerDivOpTest (8 ms total)

[----------] 4 tests from QuantizedDivOpTest
[ RUN      ] QuantizedDivOpTest.QuantizedNoActivationUInt8
third_party/tensorflow/lite/kernels/div_test.cc:199: Failure
Value of: m.GetDequantizedOutput<integer_dtype>()
Expected: has 4 elements where
element #0 is approximately 1 (absolute error <= 0.015747791),
element #1 is approximately -0.5 (absolute error <= 0.015747791),
element #2 is approximately 0.375 (absolute error <= 0.015747791),
element #3 is approximately 0.69999999 (absolute error <= 0.015747791)
  Actual: { -0.25098, 0.133333, 0.368627, 0.698039 }, whose element #0 doesn't match, which is -1.25098 from 1
Stack trace:
0x7f7778d3b915: tflite::(anonymous namespace)::QuantizedDivOpTest_QuantizedNoActivationUInt8_Test::TestBody() @ ??:??
0x7f7777ae4ead: testing::Test::Run() @ ??:??
0x7f7777ae6794: testing::TestInfo::Run() @ ??:??
... Google Test internal frames ...

[  FAILED  ] QuantizedDivOpTest.QuantizedNoActivationUInt8 (34 ms)
[ RUN      ] QuantizedDivOpTest.QuantizedActivationRELU_N1_TO_1UInt8
third_party/tensorflow/lite/kernels/div_test.cc:225: Failure
Value of: m.GetDequantizedOutput<integer_dtype>()
Expected: has 4 elements where
element #0 is approximately -1 (absolute error <= 0.015747791),
element #1 is approximately 0.5 (absolute error <= 0.015747791),
element #2 is approximately 1 (absolute error <= 0.015747791),
element #3 is approximately -0.875 (absolute error <= 0.015747791)
  Actual: { 0.337255, 0.509804, 0.996078, -0.870588 }, whose element #0 doesn't match, which is 1.33725 from -1
With test number 0
Stack trace:
0x7f7778d3c766: tflite::(anonymous namespace)::QuantizedDivOpTest_QuantizedActivationRELU_N1_TO_1UInt8_Test::TestBody() @ ??:??
0x7f7777ae4ead: testing::Test::Run() @ ??:??
0x7f7777ae6794: testing::TestInfo::Run() @ ??:??
... Google Test internal frames ...

[  FAILED  ] QuantizedDivOpTest.QuantizedActivationRELU_N1_TO_1UInt8 (3 ms)
[ RUN      ] QuantizedDivOpTest.QuantizedVariousInputShapesUInt8
third_party/tensorflow/lite/kernels/div_test.cc:252: Failure
Value of: m.GetDequantizedOutput<integer_dtype>()
Expected: has 6 elements where
element #0 is approximately -1.538 (absolute error <= 0.047612458),
element #1 is approximately 0.667 (absolute error <= 0.047612458),
element #2 is approximately 1.545 (absolute error <= 0.047612458),
element #3 is approximately 2.25 (absolute error <= 0.047612458),
element #4 is approximately -0.36399999 (absolute error <= 0.047612458),
element #5 is approximately 1.053 (absolute error <= 0.047612458)
  Actual: { 0.776471, 0.682353, 1.52941, 2.23529, -0.352941, 1.05882 }, whose element #0 doesn't match, which is 2.31447 from -1.538
With shape number 0
Stack trace:
0x7f7778d3d34d: tflite::(anonymous namespace)::QuantizedDivOpTest_QuantizedVariousInputShapesUInt8_Test::TestBody() @ ??:??
0x7f7777ae4ead: testing::Test::Run() @ ??:??
0x7f7777ae6794: testing::TestInfo::Run() @ ??:??
... Google Test internal frames ...

third_party/tensorflow/lite/kernels/div_test.cc:252: Failure
Value of: m.GetDequantizedOutput<integer_dtype>()
Expected: has 6 elements where
element #0 is approximately -1.538 (absolute error <= 0.047612458),
element #1 is approximately 0.667 (absolute error <= 0.047612458),
element #2 is approximately 1.545 (absolute error <= 0.047612458),
element #3 is approximately 2.25 (absolute error <= 0.047612458),
element #4 is approximately -0.36399999 (absolute error <= 0.047612458),
element #5 is approximately 1.053 (absolute error <= 0.047612458)
  Actual: { 0.776471, 0.682353, 1.52941, 2.23529, -0.352941, 1.05882 }, whose element #0 doesn't match, which is 2.31447 from -1.538
With shape number 1
Stack trace:
0x7f7778d3d34d: tflite::(anonymous namespace)::QuantizedDivOpTest_QuantizedVariousInputShapesUInt8_Test::TestBody() @ ??:??
0x7f7777ae4ead: testing::Test::Run() @ ??:??
0x7f7777ae6794: testing::TestInfo::Run() @ ??:??
... Google Test internal frames ...

third_party/tensorflow/lite/kernels/div_test.cc:252: Failure
Value of: m.GetDequantizedOutput<integer_dtype>()
Expected: has 6 elements where
element #0 is approximately -1.538 (absolute error <= 0.047612458),
element #1 is approximately 0.667 (absolute error <= 0.047612458),
element #2 is approximately 1.545 (absolute error <= 0.047612458),
element #3 is approximately 2.25 (absolute error <= 0.047612458),
element #4 is approximately -0.36399999 (absolute error <= 0.047612458),
element #5 is approximately 1.053 (absolute error <= 0.047612458)
  Actual: { 0.776471, 0.682353, 1.52941, 2.23529, -0.352941, 1.05882 }, whose element #0 doesn't match, which is 2.31447 from -1.538
With shape number 2
Stack trace:
0x7f7778d3d34d: tflite::(anonymous namespace)::QuantizedDivOpTest_QuantizedVariousInputShapesUInt8_Test::TestBody() @ ??:??
0x7f7777ae4ead: testing::Test::Run() @ ??:??
0x7f7777ae6794: testing::TestInfo::Run() @ ??:??
... Google Test internal frames ...

third_party/tensorflow/lite/kernels/div_test.cc:252: Failure
Value of: m.GetDequantizedOutput<integer_dtype>()
Expected: has 6 elements where
element #0 is approximately -1.538 (absolute error <= 0.047612458),
element #1 is approximately 0.667 (absolute error <= 0.047612458),
element #2 is approximately 1.545 (absolute error <= 0.047612458),
element #3 is approximately 2.25 (absolute error <= 0.047612458),
element #4 is approximately -0.36399999 (absolute error <= 0.047612458),
element #5 is approximately 1.053 (absolute error <= 0.047612458)
  Actual: { 0.776471, 0.682353, 1.52941, 2.23529, -0.352941, 1.05882 }, whose element #0 doesn't match, which is 2.31447 from -1.538
With shape number 3
Stack trace:
0x7f7778d3d34d: tflite::(anonymous namespace)::QuantizedDivOpTest_QuantizedVariousInputShapesUInt8_Test::TestBody() @ ??:??
0x7f7777ae4ead: testing::Test::Run() @ ??:??
0x7f7777ae6794: testing::TestInfo::Run() @ ??:??
... Google Test internal frames ...

[  FAILED  ] QuantizedDivOpTest.QuantizedVariousInputShapesUInt8 (3 ms)
[ RUN      ] QuantizedDivOpTest.QuantizedWithBroadcastUInt8
third_party/tensorflow/lite/kernels/div_test.cc:277: Failure
Value of: m.GetDequantizedOutput<integer_dtype>()
Expected: has 6 elements where
element #0 is approximately -2.8570001 (absolute error <= 0.047612458),
element #1 is approximately 0.28600001 (absolute error <= 0.047612458),
element #2 is approximately 1 (absolute error <= 0.047612458),
element #3 is approximately 1.143 (absolute error <= 0.047612458),
element #4 is approximately -0.71399999 (absolute error <= 0.047612458),
element #5 is approximately 1.571 (absolute error <= 0.047612458)
  Actual: { 1.43529, 0.305882, 0.988235, 1.12941, 0.376471, 1.57647 }, whose element #0 doesn't match, which is 4.29229 from -2.857
With shape number 0
Stack trace:
0x7f7778d3dd8d: tflite::(anonymous namespace)::QuantizedDivOpTest_QuantizedWithBroadcastUInt8_Test::TestBody() @ ??:??
0x7f7777ae4ead: testing::Test::Run() @ ??:??
0x7f7777ae6794: testing::TestInfo::Run() @ ??:??
... Google Test internal frames ...

third_party/tensorflow/lite/kernels/div_test.cc:277: Failure
Value of: m.GetDequantizedOutput<integer_dtype>()
Expected: has 6 elements where
element #0 is approximately -2.8570001 (absolute error <= 0.047612458),
element #1 is approximately 0.28600001 (absolute error <= 0.047612458),
element #2 is approximately 1 (absolute error <= 0.047612458),
element #3 is approximately 1.143 (absolute error <= 0.047612458),
element #4 is approximately -0.71399999 (absolute error <= 0.047612458),
element #5 is approximately 1.571 (absolute error <= 0.047612458)
  Actual: { 1.43529, 0.305882, 0.988235, 1.12941, 0.376471, 1.57647 }, whose element #0 doesn't match, which is 4.29229 from -2.857
With shape number 1
Stack trace:
0x7f7778d3dd8d: tflite::(anonymous namespace)::QuantizedDivOpTest_QuantizedWithBroadcastUInt8_Test::TestBody() @ ??:??
0x7f7777ae4ead: testing::Test::Run() @ ??:??
0x7f7777ae6794: testing::TestInfo::Run() @ ??:??
... Google Test internal frames ...

third_party/tensorflow/lite/kernels/div_test.cc:277: Failure
Value of: m.GetDequantizedOutput<integer_dtype>()
Expected: has 6 elements where
element #0 is approximately -2.8570001 (absolute error <= 0.047612458),
element #1 is approximately 0.28600001 (absolute error <= 0.047612458),
element #2 is approximately 1 (absolute error <= 0.047612458),
element #3 is approximately 1.143 (absolute error <= 0.047612458),
element #4 is approximately -0.71399999 (absolute error <= 0.047612458),
element #5 is approximately 1.571 (absolute error <= 0.047612458)
  Actual: { 1.43529, 0.305882, 0.988235, 1.12941, 0.376471, 1.57647 }, whose element #0 doesn't match, which is 4.29229 from -2.857
With shape number 2
Stack trace:
0x7f7778d3dd8d: tflite::(anonymous namespace)::QuantizedDivOpTest_QuantizedWithBroadcastUInt8_Test::TestBody() @ ??:??
0x7f7777ae4ead: testing::Test::Run() @ ??:??
0x7f7777ae6794: testing::TestInfo::Run() @ ??:??
... Google Test internal frames ...

third_party/tensorflow/lite/kernels/div_test.cc:277: Failure
Value of: m.GetDequantizedOutput<integer_dtype>()
Expected: has 6 elements where
element #0 is approximately -2.8570001 (absolute error <= 0.047612458),
element #1 is approximately 0.28600001 (absolute error <= 0.047612458),
element #2 is approximately 1 (absolute error <= 0.047612458),
element #3 is approximately 1.143 (absolute error <= 0.047612458),
element #4 is approximately -0.71399999 (absolute error <= 0.047612458),
element #5 is approximately 1.571 (absolute error <= 0.047612458)
  Actual: { 1.43529, 0.305882, 0.988235, 1.12941, 0.376471, 1.57647 }, whose element #0 doesn't match, which is 4.29229 from -2.857
With shape number 3
Stack trace:
0x7f7778d3dd8d: tflite::(anonymous namespace)::QuantizedDivOpTest_QuantizedWithBroadcastUInt8_Test::TestBody() @ ??:??
0x7f7777ae4ead: testing::Test::Run() @ ??:??
0x7f7777ae6794: testing::TestInfo::Run() @ ??:??
... Google Test internal frames ...

[  FAILED  ] QuantizedDivOpTest.QuantizedWithBroadcastUInt8 (3 ms)
[----------] 4 tests from QuantizedDivOpTest (43 ms total)

[----------] Global test environment tear-down
[==========] 12 tests from 3 test suites ran. (64 ms total)
[  PASSED  ] 8 tests.
[  FAILED  ] 4 tests, listed below:
[  FAILED  ] QuantizedDivOpTest.QuantizedNoActivationUInt8
[  FAILED  ] QuantizedDivOpTest.QuantizedActivationRELU_N1_TO_1UInt8
[  FAILED  ] QuantizedDivOpTest.QuantizedVariousInputShapesUInt8
[  FAILED  ] QuantizedDivOpTest.QuantizedWithBroadcastUInt8

 4 FAILED TESTS

@bjacob
Contributor

bjacob commented May 16, 2019

In the first of these failures,

element #0 is approximately 1 (absolute error <= 0.015747791),
element #1 is approximately -0.5 (absolute error <= 0.015747791),
element #2 is approximately 0.375 (absolute error <= 0.015747791),
element #3 is approximately 0.69999999 (absolute error <= 0.015747791)
  Actual: { -0.25098, 0.133333, 0.368627, 0.698039 }

Notice that it's the first 2 of these 4 values that are off, while the last 2 are OK (within the stated tolerance).
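
That observation can be checked mechanically. The helper below is a small illustrative sketch (the name `MismatchedIndices` is made up here, and it is not part of `div_test.cc`); it reports which elements differ from the expected value by more than the tolerance printed in the log.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns the indices i where |actual[i] - expected[i]| exceeds `tolerance`.
inline std::vector<std::size_t> MismatchedIndices(
    const std::vector<float>& expected, const std::vector<float>& actual,
    float tolerance) {
  assert(expected.size() == actual.size());
  std::vector<std::size_t> bad;
  for (std::size_t i = 0; i < expected.size(); ++i) {
    if (std::fabs(actual[i] - expected[i]) > tolerance) {
      bad.push_back(i);
    }
  }
  return bad;
}
```

Called with the expected values `{1, -0.5, 0.375, 0.7}`, the actual values `{-0.25098, 0.133333, 0.368627, 0.698039}`, and the tolerance `0.015747791` from the first failure, it returns the indices 0 and 1: the absolute errors are roughly 1.251, 0.633, 0.006, and 0.002.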

PR Queue automation moved this from Approved by Reviewer to Reviewer Requested Changes May 16, 2019
@rthadur rthadur requested a review from bjacob May 16, 2019 19:46
@rthadur rthadur added the comp:lite TF Lite related issues label May 16, 2019
@tensorflow-bot tensorflow-bot bot added the kokoro:force-run Tests on submitted change label May 17, 2019
PR Queue automation moved this from Reviewer Requested Changes to Approved by Reviewer May 17, 2019
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label May 17, 2019
@mwtarnowski
Contributor Author

@bjacob, is everything OK now?

@bjacob
Contributor

bjacob commented May 22, 2019

@mwtarnowski I just inquired again. It's OK as far as I can see; someone needs to submit it into integration again.

@rthadur rthadur added ready to pull PR ready for merge process and removed ready to pull PR ready for merge process labels Jun 21, 2019
@tensorflow-copybara tensorflow-copybara merged commit 769fd05 into tensorflow:master Jun 21, 2019
PR Queue automation moved this from Approved by Reviewer to Merged Jun 21, 2019
tensorflow-copybara pushed a commit that referenced this pull request Jun 21, 2019
@mwtarnowski mwtarnowski deleted the quantized-div branch June 27, 2019 15:02
Labels
cla: yes comp:lite TF Lite related issues ready to pull PR ready for merge process size:L CL Change Size: Large
Projects
PR Queue
  
Merged

7 participants