
Support per-channel quantized INT8 weights unpacking in XNNPACK delegate #50875

Conversation

dev0x13
Contributor

@dev0x13 dev0x13 commented Jul 21, 2021

This PR extends the INT8 weights unpacking for FP32 inference in the XNNPACK delegate, added in a previous PR, to per-channel dynamic range quantized models. The previous change supported only the per-tensor quantization mode, which is deprecated in recent TensorFlow releases.
I have not added proper tests yet, because I'd first like a maintainer's advice on the best way to organize such testing while keeping the codebase clean (specifically, how to change the testers). Thank you in advance!

@google-ml-butler google-ml-butler bot added the size:M CL Change Size: Medium label Jul 21, 2021
@google-cla google-cla bot added the cla: yes label Jul 21, 2021
@gbaned gbaned self-assigned this Jul 21, 2021
@gbaned gbaned added the comp:lite TF Lite related issues label Jul 21, 2021
@gbaned gbaned added this to Assigned Reviewer in PR Queue via automation Jul 21, 2021
@gbaned gbaned requested a review from wangtz July 21, 2021 15:11
@gbaned gbaned added the awaiting review Pull request awaiting review label Jul 23, 2021
@dev0x13
Contributor Author

dev0x13 commented Aug 2, 2021

@multiverse-tf Could you review this please?

@gbaned gbaned requested review from multiverse-tf and removed request for multiverse-tf August 9, 2021 15:31
@gbaned
Contributor

gbaned commented Aug 10, 2021

@dev0x13 Can you please resolve conflicts? Thanks!

@gbaned gbaned added stat:awaiting response Status - Awaiting response from author and removed awaiting review Pull request awaiting review labels Aug 10, 2021
@dev0x13
Contributor Author

dev0x13 commented Aug 10, 2021

@gbaned Done

@tensorflowbutler tensorflowbutler removed the stat:awaiting response Status - Awaiting response from author label Aug 12, 2021
@dev0x13
Contributor Author

dev0x13 commented Aug 17, 2021

@multiverse-tf Could you review this please?

@dev0x13
Contributor Author

dev0x13 commented Aug 25, 2021

It's been more than a month since I created this PR. Should I close it now due to the lack of review?

@multiverse-tf
Contributor

It's been more than a month since I created this PR. Should I close it now due to the lack of review?

Sorry for not catching this review request earlier. In the past month I think we have also added native per-channel quantized INT8 op support in XNNPACK. @Maratyszcza, could you shed more light on that support? Thanks

@multiverse-tf multiverse-tf removed the request for review from wangtz August 25, 2021 08:55
@dev0x13
Contributor Author

dev0x13 commented Aug 27, 2021

@multiverse-tf XNNPACK does not support per-channel dynamic range quantization; this PR adds that support.

@dev0x13
Contributor Author

dev0x13 commented Sep 13, 2021

Closing due to the lack of response from maintainers.

@dev0x13 dev0x13 closed this Sep 13, 2021
PR Queue automation moved this from Assigned Reviewer to Closed/Rejected Sep 13, 2021
@dev0x13 dev0x13 reopened this Oct 17, 2021
PR Queue automation moved this from Closed/Rejected to Assigned Reviewer Oct 17, 2021
PR Queue automation moved this from Assigned Reviewer to Reviewer Requested Changes Nov 15, 2021
…ate (MR review fixes)

1. Minor cosmetic fixes suggested by the PR reviewer are applied to the
XNNPACK delegate test set.
2. Tests for per-channel quantized INT8 weights unpacking in the XNNPACK
delegate are refactored to compute proper per-channel quantization
parameters at runtime instead of using randomly initialized
quantized tensors.
3. Added per-tensor and per-channel quantized weights unpacking
tests for TRANSPOSE_CONV op in XNNPACK delegate.
@dev0x13
Contributor Author

dev0x13 commented Nov 16, 2021

@Maratyszcza Thank you for the review! All the requested changes have been made.

@@ -66,7 +66,7 @@ void BinaryElementwiseTester::Test(tflite::BuiltinOperator binary_op,
if (Input1Static()) {
ASSERT_FALSE(Input2Static());
}
-  if (FP16Weights() || INT8Weights()) {
+  if (FP16Weights() || INT8Weights() || INT8ChannelWiseWeights()) {
ASSERT_TRUE(Input1Static() || Input2Static());
}

Contributor

Check that if channelwise weights are used, the static input has at least one dimension (i.e. isn't a scalar)

Contributor Author

Done

Contributor

Should be if (INT8ChannelWiseWeights() && Input1Static()) ... + the same for input2

Contributor Author

Sorry, my bad. Fixed

std::bind(QuantizeInt8, std::placeholders::_1, 0, input1_scale));
input1_scales.resize(1);
input1_zero_points.resize(1, 0);
input1_scales[0] = GetInt8QuantizationScale(input1_data);
Contributor

Combine with resize

Contributor Author

Done

std::vector<int> current_dim(num_dims, 0);

do {
size_t offset =
Contributor

const size_t

Contributor Author

Fixed

current_dim.data(), 0, nullptr);
const int channel_idx = current_dim[quantized_dimension];
const float val = data[offset];
if (has_min_max_value[channel_idx]) {
Contributor

Initialize min to +std::numeric_limits&lt;float&gt;::infinity() and max to -std::numeric_limits&lt;float&gt;::infinity() and remove these checks

Contributor Author

Done

Contributor

@Maratyszcza Maratyszcza left a comment

LG overall. Please revert the change in README and simplify the code per comments, and we're good to go.

…ate (MR review fixes)

1. Reverted changes in the XNNPACK delegate README.
2. Made some adjustments to the XNNPACK tests code, as suggested by the
   PR reviewer.
@dev0x13
Contributor Author

dev0x13 commented Nov 17, 2021

@Maratyszcza Thank you! All the suggested changes are made.

@tensorflowbutler tensorflowbutler removed the awaiting review Pull request awaiting review label Nov 19, 2021
current_dim.data(), 0, nullptr);
const int channel_idx = current_dim[quantized_dimension];
const float val = data[offset];
if (min[channel_idx] > val) {
Contributor

These lines can be further simplified using std::min, std::max

Contributor Author

Done

@google-ml-butler google-ml-butler bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Nov 19, 2021
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Nov 19, 2021
@Maratyszcza
Contributor

LGTM. @gbaned could you merge?

@copybara-service copybara-service bot merged commit 15d21b8 into tensorflow:master Nov 24, 2021
@google-ml-butler google-ml-butler bot removed the ready to pull PR ready for merge process label Nov 24, 2021
copybara-service bot pushed a commit that referenced this pull request Nov 24, 2021
PiperOrigin-RevId: 411934765
Change-Id: Ida9cd3723742a3b92139345fccc29b60d5383be0
@dev0x13
Contributor Author

dev0x13 commented Nov 26, 2021

I noticed that the PR was rolled back right after merge. Is there something wrong with it?

@Maratyszcza
Contributor

transpose_conv_test failed under sanitizers with UndefinedBehaviorSanitizer: float-cast-overflow tensorflow/lite/delegates/xnnpack/test_util.cc:32:28

@dev0x13
Contributor Author

dev0x13 commented Nov 27, 2021

@Maratyszcza Fixed this in my fork, but it seems this PR cannot be reopened. What do I need to do to submit the fix and get my changes merged again? Thank you in advance!

@dev0x13
Contributor Author

dev0x13 commented Dec 3, 2021

@Maratyszcza Sorry to bother you, but could you clarify what I need to do to submit the fix and get my changes merged again?

@Maratyszcza
Contributor

prelu_test fails under asan too, with the same error

@dev0x13
Contributor Author

dev0x13 commented Dec 6, 2021

@Maratyszcza Fixed

@Maratyszcza
Contributor

Re-landed in a6d352f

@dev0x13
Contributor Author

dev0x13 commented Dec 9, 2021

Thank you!
