[Relay] Convert a fake quantized or QAT graph into QNN ops #8126

mbrookhart · 2021-05-25T16:42:48Z

Recently, we discovered that tf2onnx is exporting some int8 graphs as fake quantized/QAT models in ONNX, i.e., int8 ops are exported as dequantize->op->quantize.

This PR introduces a pass to convert those graphs into direct int8 ops inside Relay. I've tested correctness of the resulting models on Inceptionv1 and ssd-mobilenet-v1 from the TensorFlow Lite model zoo imported via ONNX. Follow-up work will analyze further models for more operations to include in this pass.
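To make the dequantize->op->quantize pattern concrete, here is a minimal plain-Python sketch. It is not the code from this PR and does not use TVM: the helper names (`quantize`, `dequantize`, `fake_quantized_max`, `integer_max`) and the scale and zero-point values are hypothetical, chosen only for illustration. It uses `max` as the op because, with a shared positive scale and zero point on input and output, the fake quantized form and the direct integer form give the same result.

```python
# Illustrative sketch only: plain-Python affine quantization, not TVM code.
# All helper names and the scale/zero-point values below are hypothetical.

def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Map a real value to int8: clamp(round(x / scale) + zero_point)."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))

def dequantize(q, scale, zero_point):
    """Map an int8 value back to a real value: scale * (q - zero_point)."""
    return scale * (q - zero_point)

def fake_quantized_max(q_values, scale, zero_point):
    """dequantize -> op -> quantize: the op itself runs in floating point."""
    reals = [dequantize(q, scale, zero_point) for q in q_values]
    return quantize(max(reals), scale, zero_point)

def integer_max(q_values):
    """The same op applied directly to the int8 values."""
    return max(q_values)

scale, zero_point = 0.05, 3
q_values = [-128, -7, 0, 42, 127]

fake_result = fake_quantized_max(q_values, scale, zero_point)
int_result = integer_max(q_values)
print(fake_result, int_result)
```

The two forms agree here only because `max` is monotonic and the input and output share one scale and zero point. Ops such as convolution change the scale, so an integer version of them also has to carry the quantization parameters through the op.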

cc @AndrewZhaoLuo @masahi @jwfromm

masahi · 2021-05-25T19:26:29Z

Very nice!! cc @anijain2305 @electriclilies

I wonder if "quantize" is the best verb for saying "rewrite fake quantized graphs into real integer-quantized ones". But I don't have a better alternative either.

tests/python/relay/test_pass_quantize_fake_quantization.py

src/relay/transforms/quantize_fake_quantization.cc

python/tvm/relay/transform/quantize_fake_quantization.py

mbrookhart · 2021-06-01T16:10:02Z

@masahi @anijain2305 Any thoughts on naming? I still don't love what I have, but I agree with Masa that I haven't been able to come up with anything better...

electriclilies · 2021-06-01T17:38:15Z

I think the repetition of the word quantize in quantize_fake_quantization is confusing -- maybe you could avoid using the word quantize as a verb and name it something like fake_quantize_to_int8? That doesn't have an active verb in it at all, though, which is also sub-optimal. And if this gets expanded to other dtypes, you'd need to change the name.

mbrookhart · 2021-06-03T17:51:42Z

Hmm, that's an interesting idea. To throw other random thoughts out:
fake_quantize_to_integer
fake_quantization_to_affine_space
propagate_integer_ops

Any thoughts? I rebased to get around a weird threading bug in another CI test. If people have a naming preference, I can refactor while the CI runs to make sure the pass is working.

electriclilies · 2021-06-03T19:26:24Z

@mbrookhart I think fake_quantization_to_affine_space and fake_quantization_to_integer are the best options. I slightly prefer fake_quantization_to_integer because it's a bit more concise.

Also, I did a quick google search and I think that the term affine space is used when talking about quantization in physics, but I didn't see any references to it in computer science literature. The only thing that comes up if you search "affine space" is stuff about vector spaces, and if you search "quantization affine space" you get physics papers.

So I think if we do use the term affine space, we should be careful to explain what we mean by it in code comments and documentation since it's not a term that is commonly used.

mbrookhart · 2021-06-03T19:51:43Z

I'm a physicist, that must be why that term makes so much more sense to me :D

mbrookhart · 2021-06-03T19:52:02Z

But I'm happy to use fake_quantization_to_integer

masahi · 2021-06-03T20:33:14Z

I also prefer fake_quantization_to_integer. I usually don't associate the word "affine" with integers, I think it is more commonly used when talking about affine transform.

anijain2305 · 2021-06-03T20:54:53Z

Thanks, this is a nice addition and improves framework coverage very nicely. I agree that fake_quantization_to_integer is more natural. I have typically used affine for loop transformations.

mbrookhart · 2021-06-03T21:28:13Z

Awesome, thanks everyone, I'll refactor to that name.

mbrookhart · 2021-06-04T15:18:32Z

Refactor done. Thanks!

anijain2305 · 2021-06-04T22:51:22Z

@masahi @electriclilies Please approve explicitly when you get a chance. And we can land this.

electriclilies

Overall looks good to me, one nitpick is that you use the word "affine" in a bunch of the documentation and some of the internal names without defining it, which I do think is important since it is a term that isn't used in computer science often. I think at least defining it in the documentation of AffineType and in the documentation of register_fake_quantization_to_integer would be good.
I don't think it's super important though so you could add the definitions in a later PR.

python/tvm/relay/op/op.py

src/relay/transforms/fake_quantization_to_integer.cc

anijain2305 · 2021-06-07T18:13:12Z

@mbrookhart Feel free to merge the PR once you decide whether you want to address Lily's comments in this PR or the next one. All good from my side.

mbrookhart · 2021-06-07T19:12:16Z

@electriclilies Thanks for the suggestions. I added a definition for completeness; I don't want to confuse users. I think that it is a fairly common term in the quantization literature, though; see, for instance:
https://arxiv.org/pdf/1712.05877.pdf
https://arxiv.org/pdf/2004.09602.pdf
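
As a sketch of what "affine" usually means in this context: a real value r is related to its integer representation q by a scale and a zero point. The symbols below (r, q, S, Z, q_min, q_max) are notation chosen for this illustration and are not taken from the definition added in the PR.

```latex
r = S\,(q - Z),
\qquad
q = \operatorname{clamp}\!\left(\operatorname{round}\!\left(\frac{r}{S}\right) + Z,\; q_{\min},\; q_{\max}\right)
```

Here S is a positive real scale, Z is the integer zero point (the integer that represents real 0), and the clamp bounds are the limits of the integer type, for example -128 and 127 for int8.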

mbrookhart · 2021-06-08T19:32:12Z

Thanks @masahi @anijain2305 @electriclilies

* Convert a fake quantized or QAT graph into qnn ops
* fix pylint
* fix typos
* use an identify function for some ops
* rename the pass from quantize_fake_quantization to fake_quantization_to_integer
* add definition for affine

masahi reviewed May 25, 2021

tests/python/relay/test_pass_quantize_fake_quantization.py (outdated, resolved)

masahi reviewed May 25, 2021

src/relay/transforms/quantize_fake_quantization.cc (outdated, resolved)

masahi reviewed May 25, 2021

src/relay/transforms/quantize_fake_quantization.cc (outdated, resolved)

masahi reviewed May 25, 2021

python/tvm/relay/transform/quantize_fake_quantization.py (outdated, resolved)

mbrookhart force-pushed the fake_quantized_to_quantized branch from ad608d4 to 2d52eed Compare June 3, 2021 17:46

tqchen assigned anijain2305 and masahi Jun 4, 2021

mbrookhart force-pushed the fake_quantized_to_quantized branch from 1451a1b to 92952bb Compare June 4, 2021 15:40

anijain2305 approved these changes Jun 4, 2021

masahi approved these changes Jun 4, 2021

electriclilies approved these changes Jun 7, 2021

python/tvm/relay/op/op.py (outdated, resolved)

src/relay/transforms/fake_quantization_to_integer.cc (resolved)

Matthew added 5 commits June 8, 2021 08:38

Convert a fake quantized or QAT graph into qnn ops

2326147

fix pylint

b7ee25d

fix typos

02fd4dc

use an identify function for some ops

5dc1afd

rename the pass from quantize_fake_quantization to fake_quantization_…

734a0d8

…to_integer

add definition for affine

c150e4f

mbrookhart force-pushed the fake_quantized_to_quantized branch from 4617b8c to c150e4f Compare June 8, 2021 14:38

mbrookhart merged commit 9be0f4f into apache:main Jun 8, 2021

mbrookhart deleted the fake_quantized_to_quantized branch June 8, 2021 19:32

junrushao mentioned this pull request Nov 1, 2021

Apache TVM v0.8 Release Note Candidate #9416

Closed
