
Optimize w8a8 quantized matmul kernel #9412


Merged: 7 commits merged into master on Jul 1, 2025

Conversation


@vanbasten23 (Collaborator) commented on Jun 26, 2025

This PR:

  • updates the block table used by the w8a8 quantized matmul Pallas kernel.
  • falls back to the XLA w8a8 quantized matmul if the block sizes are not found in the table (a sketch of this fallback follows the test plan).

Test plan:

  • pytest pytorch/xla/test/test_quantized_matmul_pallas_kernel.py -s
  • python pytorch/xla/test/test_pallas.py -k test_quantized_matmul_int8
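Below is a minimal sketch, in plain PyTorch, of the dispatch-with-fallback behavior described above. All names here (TUNED_BLOCK_SIZES, quantized_matmul_int8, _xla_w8a8_matmul, pallas_kernel) are hypothetical illustrations rather than the identifiers used in torch_xla; the actual kernel and its block table live in the Pallas sources exercised by the tests above.

```python
import torch

# Hypothetical tuning table mapping a problem shape to Pallas block sizes.
# The entries below are made up for illustration.
TUNED_BLOCK_SIZES = {
    # (batch, out_features, in_features) -> (block_m, block_n, block_k)
    (1024, 4096, 4096): (256, 256, 512),
}


def _xla_w8a8_matmul(x, w_int8, scale):
    # Fallback path: an ordinary matmul on the upcast int8 weight, which XLA
    # can compile without a tuned Pallas kernel.
    return torch.matmul(x, w_int8.to(x.dtype).t()) * scale


def quantized_matmul_int8(x, w_int8, scale, pallas_kernel=None):
    """Use the Pallas kernel when tuned block sizes exist for this shape,
    otherwise fall back to the XLA w8a8 quantized matmul."""
    key = (x.shape[0], w_int8.shape[0], w_int8.shape[1])
    block_sizes = TUNED_BLOCK_SIZES.get(key)
    if block_sizes is None or pallas_kernel is None:
        return _xla_w8a8_matmul(x, w_int8, scale)
    block_m, block_n, block_k = block_sizes
    return pallas_kernel(x, w_int8, scale, block_m, block_n, block_k)
```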

@vanbasten23 marked this pull request as ready for review on June 27, 2025 at 23:20

@yaochengji (Collaborator) left a comment

Thanks, Xiongfei, for your contribution!

@vanbasten23 requested a review from yaochengji on June 28, 2025 at 02:05

@yaochengji (Collaborator) left a comment

LGTM, thanks for the contribution!

@vanbasten23 (Collaborator, Author) commented:

The CI had already been failing before this PR (e.g. #9415), so the failures appear unrelated to this change.

@vanbasten23 (Collaborator, Author) commented:

I also created an empty change (#9430), and the same CI checks fail there as well.

@yaochengji merged commit 4101ea5 into master on Jul 1, 2025
40 of 42 checks passed