[QNN, ONNX] Extension of QLinearMatMul in ONNX front-end for all ranks of input tensors #13322
Merged
Conversation
Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

Generated by tvm-bot
vvchernov changed the title from "WIP: [QNN, ONNX] Extension of QLinearMatMul in ONNX front-end for all ranks of input tensors" to "[QNN, ONNX] Extension of QLinearMatMul in ONNX front-end for all ranks of input tensors" on Nov 9, 2022
masahi approved these changes on Nov 10, 2022
xinetzone pushed a commit to daobook/tvm that referenced this pull request on Nov 10, 2022:

…s of input tensors (apache#13322)

* QLinearMatMul was extended for all ranks of a and b
* CI test for QLinearMatMul was implemented (onnx front-end)
* fix after black check
* numpy type fix
* fix weight scale and zero point, output type
* fix after pylint
* resolve different input types in tests
* skip resolved TODO
* update covering of QLinearMatMul by tests
* pylint fixes
* skip test of QLinearMatMul on CUDA

Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
xinetzone pushed a commit to daobook/tvm that referenced this pull request on Nov 25, 2022:

…s of input tensors (apache#13322) (same commit message as above)
QLinearMatMul (Y = X*W) previously supported only rank-2 input tensors on both sides. This PR extends it to all input ranks using `_qnn.op.dense` and `_qnn.op.batch_matmul`.

Note: different types for the two input tensors (int8 for one and uint8 for the other) do not currently work.
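To illustrate the rank dispatch this PR describes, here is a minimal, hypothetical sketch. It is not the actual converter (which lives in `python/tvm/relay/frontend/onnx.py` and also handles batch-dimension broadcasting, flattening of higher ranks to 3-D, and dtype corner cases); the function name, the rank-3 batch case, and the `uint8` output dtype are assumptions for brevity:

```python
# Hypothetical sketch of lowering ONNX QLinearMatMul (Y = X*W) to QNN ops
# by input rank, in the spirit of this PR. Not the actual TVM converter.
from tvm import relay
from tvm.relay import qnn


def qlinear_matmul_sketch(a, a_scale, a_zp, b, b_scale, b_zp,
                          y_scale, y_zp, a_rank, out_units):
    if a_rank == 2:
        # Rank-2 inputs map to qnn.dense, which computes a * transpose(b),
        # so the ONNX weight must be transposed first.
        b_t = relay.transpose(b, axes=[1, 0])
        out = qnn.op.dense(a, b_t, a_zp, b_zp, a_scale, b_scale,
                           units=out_units)
    else:
        # Higher-rank inputs (shown here for rank 3) map to qnn.batch_matmul,
        # which likewise multiplies against the transposed second operand.
        b_t = relay.transpose(b, axes=[0, 2, 1])
        out = qnn.op.batch_matmul(a, b_t, a_zp, b_zp, a_scale, b_scale)
    # Both branches produce an int32 accumulator with scale
    # a_scale * b_scale and zero point 0; requantize it to the requested
    # output quantization parameters (uint8 output assumed here).
    acc_scale = relay.multiply(a_scale, b_scale)
    return qnn.op.requantize(out, acc_scale, relay.const(0, "int32"),
                             y_scale, y_zp, out_dtype="uint8")
```

The key design point is that both `qnn.dense` and `qnn.batch_matmul` multiply against a transposed second operand, so the two rank cases can share the same transpose-then-requantize structure and differ only in which QNN op they call.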