
Conversation

@supriyar supriyar (Contributor) commented Jan 22, 2020

Stack from ghstack:

Summary:
Run dynamic quantization on mobile (similar to FBGEMM). Currently this is implemented only for the linear operator.

Test Plan:
python test/test_quantized.py TestDynamicQuantizedLinear.test_qlinear

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: D19542980
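
For reviewers who want to try this locally, here is a minimal usage sketch (mine, not part of this PR) of dynamic quantization applied to a linear module; selecting the 'qnnpack' engine is my assumption about how the mobile backend would be exercised from Python:

```python
# Sketch only: dynamically quantize a single nn.Linear and run it.
# The 'qnnpack' engine selection is an assumption about how the mobile
# path would be picked; on server builds 'fbgemm' is the usual engine.
import torch
import torch.nn as nn

torch.backends.quantized.engine = 'qnnpack'  # assumed mobile backend selection

model = nn.Sequential(nn.Linear(64, 32))
qmodel = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(8, 64)
print(qmodel(x).shape)  # torch.Size([8, 32])
```

In dynamic quantization the weights are quantized ahead of time while the activations are quantized on the fly at run time; the test plan above exercises exactly that path for the linear operator.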

@kostmo kostmo (Member) commented Jan 22, 2020

💊 CircleCI build failures summary and remediations

As of commit febb8b4:

Commit febb8b4 was recently pushed. Waiting for builds...


This comment was automatically generated by Dr. CI.

supriyar added a commit that referenced this pull request Jan 22, 2020
ghstack-source-id: c90df5a
Pull Request resolved: #32479
supriyar added a commit that referenced this pull request Jan 22, 2020
ghstack-source-id: edd8aaa
Pull Request resolved: #32479
@z-a-f z-a-f left a comment

Should we also add benchmarks under r/benchmarks/operator_benchmark/pt/?

double scale =
    (std::max(max, 0.f) - std::min(min, 0.f)) / ((double)qmax - qmin);
if (scale == 0) {
  scale = 0.1;

Is there a reason for this magic number?

Contributor Author

FBGEMM uses this to avoid divide by 0 errors.


The question is why 0.1? Why not 1.0 or qmax - qmin? I don't have a preference one way or the other, but this case happens only for all-zeros tensors, and I wonder if we should standardize it everywhere?
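
A quick sketch (mine, not from the PR) of why the exact fallback value shouldn't matter for correctness: the fallback only triggers for an all-zeros range, and any nonzero scale round-trips an all-zeros tensor exactly, so 0.1 vs 1.0 vs qmax - qmin is purely a convention to standardize on:

```python
# Sketch: for an all-zeros tensor, min == max == 0 and the computed scale is 0.
# Any nonzero fallback scale reproduces the tensor exactly after dequantization.
qmin, qmax = 0, 255

def quant_params(min_val, max_val, fallback=0.1):
    scale = (max(max_val, 0.0) - min(min_val, 0.0)) / (qmax - qmin)
    if scale == 0:
        scale = fallback              # the fallback under discussion
    zero_point = round(qmin - min(min_val, 0.0) / scale)
    return scale, zero_point

for fb in (0.1, 1.0, float(qmax - qmin)):
    scale, zp = quant_params(0.0, 0.0, fallback=fb)
    q = round(0.0 / scale) + zp       # quantize a zero -> always the zero point
    assert (q - zp) * scale == 0.0    # dequantize -> exactly 0, regardless of fb
```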

Comment on lines 247 to 254
int32_t nudged_zero_point = 0;
if (initial_zero_point < qmin) {
  nudged_zero_point = qmin;
} else if (initial_zero_point > qmax) {
  nudged_zero_point = qmax;
} else {
  nudged_zero_point = nearbyint(initial_zero_point);
}

Can you give an example when this part is necessary? I cannot see the necessity of nudging the ZP if we picked the scale properly.

Contributor Author

It is possible for the computed initial_zero_point to be non-integral. Let's say min is 5.5 and max is 555.5 - in this case the computed zero point has to be nudged (rounded) to get an integral value.


Not really; the nudging cases would only be necessary if we didn't have a guarantee that zero is representable, but we do (kinda). Here is my logic:

Suppose initial_zero_point = qmin - min / scale. We guarantee that min is either negative or zero, which makes initial_zero_point < qmin always false. If we instead take initial_zero_point = qmax - max / scale, the logic is the same: max is either positive or zero, so initial_zero_point > qmax is also always false, and the second branch of the if statement will never be taken.

We can discuss it offline :)
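
A small numeric sketch (mine, not from the PR) that tries to reconcile the two points above: because min is clamped to <= 0 and max to >= 0, the initial zero point indeed cannot fall outside [qmin, qmax], so the two clamping branches look unreachable; but it can still be fractional, so the nearbyint branch is the one that actually does work:

```python
# Sketch: compute affine quantization parameters the way the quoted C++ does
# and check which branch of the nudging logic can actually fire.
# (The scale == 0 fallback from the other thread is ignored here.)
qmin, qmax = 0, 255

def choose_params(raw_min, raw_max):
    min_val = min(raw_min, 0.0)       # observed range is extended to include zero
    max_val = max(raw_max, 0.0)
    scale = (max_val - min_val) / (qmax - qmin)
    initial_zero_point = qmin - min_val / scale
    # With min_val <= 0 <= max_val, initial_zero_point is already inside
    # [qmin, qmax]; only the rounding branch is reachable.
    assert qmin <= initial_zero_point <= qmax
    return scale, initial_zero_point, round(initial_zero_point)

scale, zp_raw, zp = choose_params(-0.05, 10.0)
print(round(zp_raw, 3), zp)           # ~1.269 -> rounds to 1
```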

@supriyar (Contributor Author)

> Should we also add benchmarks under r/benchmarks/operator_benchmark/pt/?

It wouldn't run them on mobile, so I don't see the point. Unless you meant to benchmark the mobile operators on server.
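
If benchmarking the mobile kernels on a server ever becomes interesting, a throwaway timing sketch along these lines (mine; not the operator_benchmark harness, and the sizes and iteration counts are arbitrary) would be a starting point:

```python
# Rough micro-benchmark sketch: compare a dynamically quantized linear layer
# against the fp32 original. Sizes, iteration count, and warm-up are arbitrary.
import time
import torch
import torch.nn as nn

def bench(mod, x, iters=200):
    with torch.no_grad():
        mod(x)                                  # warm-up
        start = time.perf_counter()
        for _ in range(iters):
            mod(x)
        return (time.perf_counter() - start) / iters

fp32 = nn.Sequential(nn.Linear(512, 512))
dyn = torch.quantization.quantize_dynamic(fp32, {nn.Linear}, dtype=torch.qint8)
x = torch.randn(64, 512)

print(f"fp32:    {bench(fp32, x) * 1e6:.1f} us/iter")
print(f"dynamic: {bench(dyn, x) * 1e6:.1f} us/iter")
```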

supriyar added a commit that referenced this pull request Jan 23, 2020
ghstack-source-id: 970e0de
Pull Request resolved: #32479
@supriyar supriyar requested a review from z-a-f January 23, 2020 01:11
return
use_channelwise = False
use_multi_dim_input = False
use_relu = False
@raghuramank100 raghuramank100 (Contributor) Jan 23, 2020

Can we add support for LinearRelu, to be consistent with the server implementation?
Also, why is there a restriction on multi_dim_input support?

Contributor Author

I'll add it in a subsequent PR. We will need to modify the kernels as well to add this.

@jianyuh jianyuh (Member) left a comment

Thanks for working on this!

int precision;
};

inline TensorQuantizationParams ChooseQuantizationParams(

supriyar added a commit that referenced this pull request Jan 24, 2020
ghstack-source-id: a8e4127
Pull Request resolved: #32479
@supriyar supriyar requested a review from jianyuh January 24, 2020 04:28
@jianyuh jianyuh (Member) left a comment

Looks good to me!

supriyar added a commit that referenced this pull request Jan 24, 2020
ghstack-source-id: 622fd9e
Pull Request resolved: #32479
supriyar added a commit that referenced this pull request Jan 24, 2020
ghstack-source-id: f408bbc
Pull Request resolved: #32479
@facebook-github-bot (Contributor)

This pull request has been merged in 1695418.

@facebook-github-bot facebook-github-bot deleted the gh/supriyar/52/head branch January 28, 2020 15:17
wuhuikx pushed a commit to wuhuikx/pytorch that referenced this pull request Jan 30, 2020
Summary:
Pull Request resolved: pytorch#32479

Run dynamic quantization on mobile (similar to FBGEMM). Currently this is implemented only for the linear operator.

Test Plan:
python test/test_quantized.py TestDynamicQuantizedLinear.test_qlinear

Imported from OSS

Differential Revision: D19542980

fbshipit-source-id: c9f6e5e8ded4d62ae0f2ed99e478c8307dde22ed
Comment on lines +103 to +110
int32_t nudged_zero_point = 0;
if (initial_zero_point < qmin) {
  nudged_zero_point = qmin;
} else if (initial_zero_point > qmax) {
  nudged_zero_point = qmax;
} else {
  nudged_zero_point = nearbyint(initial_zero_point);
}

Are we still nudging? I thought I explained why this is not necessary

ttumiel pushed a commit to ttumiel/pytorch that referenced this pull request Mar 4, 2020