[torch][quant] Quantized `torch.mm` for linalg with end-to-end test #2750

rsuderman · 2024-01-13T06:25:09Z

This includes custom op matching for decomposed operations and fusing
dequantization into dense operations. As validation, the end-to-end test
compares against the dequant+mm torch implementation.
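
As a rough illustration of what fusing dequantization into a dense op means, here is a minimal pure-Python sketch. It assumes per-tensor affine quantization, where `real = scale * (q - zero_point)`. The function names and sample values are hypothetical and are not taken from the patch. The reference path dequantizes both operands and multiplies in floating point; the fused path accumulates in integers and applies the combined scale once.

```python
def dequantize(q, scale, zero_point):
    """Map integer values to reals: scale * (q - zero_point)."""
    return [[scale * (v - zero_point) for v in row] for row in q]


def matmul(a, b):
    """Plain nested-list matrix multiply."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def reference_mm(qa, sa, za, qb, sb, zb):
    """Reference path: dequantize both operands, then multiply in float."""
    return matmul(dequantize(qa, sa, za), dequantize(qb, sb, zb))


def fused_mm(qa, sa, za, qb, sb, zb):
    """Fused path (illustrative): integer accumulation, one scale at the end."""
    a_int = [[v - za for v in row] for row in qa]
    b_int = [[v - zb for v in row] for row in qb]
    acc = matmul(a_int, b_int)   # integer accumulator
    out_scale = sa * sb          # scales factor out of the sum
    return [[out_scale * v for v in row] for row in acc]


# Hypothetical quantized operands with their scales and zero points.
qa = [[130, 120], [128, 140]]
qb = [[3, -2], [0, 5]]
ref = reference_mm(qa, 0.5, 128, qb, 0.25, 1)
fused = fused_mm(qa, 0.5, 128, qb, 0.25, 1)
```

Both paths should agree up to floating-point rounding, which is the kind of comparison the validation above relies on.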

dan-garvey · 2024-01-16T02:48:25Z

Wow what a patch.

rsuderman · 2024-01-17T20:14:57Z

> Wow what a patch.

Meanwhile I have flashbacks of Ben's 20k+ line PRs.

stellaraccident

Nice: Left a few nits but otherwise looks good.

include/torch-mlir/Dialect/Torch/Transforms/Passes.td

lib/Dialect/Torch/Transforms/MatchQuantizedOps.cpp

projects/ltc/csrc/base_lazy_backend/shape_inference.cpp

rsuderman requested a review from stellaraccident January 13, 2024 06:25

rsuderman force-pushed the quant_mm_rebase branch from 2d9d9d4 to 902de39 Compare January 13, 2024 06:28

rsuderman requested a review from vivekkhandelwal1 January 15, 2024 18:38

rsuderman force-pushed the quant_mm_rebase branch from 902de39 to b40060c Compare January 20, 2024 00:05

kumardeepakamd mentioned this pull request Jan 23, 2024

Shark FE : Support bfloat16/int8 opt/laama2-7b Fx and ONNX model nod-ai/SHARK-ModelDev#364

Open

stellaraccident approved these changes Jan 23, 2024

rsuderman force-pushed the quant_mm_rebase branch from b40060c to 8e83f87 Compare January 24, 2024 20:34

rsuderman added 2 commits January 24, 2024 12:37

[torch][quant] Quantized torch.mm for linalg with end-to-end test

ac83dc4

This includes custom op matching for decomposed operations and fusing dequantization into dense operations. As a validation we compare to the dequant+mm torch implementation.

addressed review comments

4cbb02b

rsuderman force-pushed the quant_mm_rebase branch from 8e83f87 to 4cbb02b Compare January 24, 2024 20:37

fix ltc tests

31d36dd

rsuderman merged commit f6f8905 into llvm:main Jan 24, 2024
5 checks passed

rsuderman deleted the quant_mm_rebase branch February 28, 2024 20:41
