[Dev] Dequante SIMT Matmul Implementation. #188

LeiWang1999 · 2024-09-20T06:37:13Z

Introduce an efficient (though not yet optimal) int8 SIMT matmul schedule for T4 cards.

This pull request changes bitblas/gpu/matmul.py, bitblas/gpu/matmul_analysis.py, and integration/BitNet/eval_correctness.py, introducing new scheduling rules and optimizations for GPU operators. The main changes are a new scheduling method for dequantization, type and import adjustments, and a version check for compatibility.

New Scheduling Method:

Added sch_dequantize_in_register_with_config method to handle dequantization scheduling without shared memory prefetch for devices lacking async copy. (bitblas/gpu/matmul.py)
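
To illustrate what "dequantize in register" means, here is a minimal pure-Python sketch, not the BitBLAS schedule itself. It assumes weights are stored as signed 4-bit values packed two per byte, low nibble first; that storage format and the helper names `unpack_int4` and `dot_dequant_in_register` are hypothetical and chosen only for illustration. Each packed byte is decoded right before it is consumed by the multiply-accumulate, so no decoded tile is staged in a separate buffer first.

```python
from typing import List, Tuple


def unpack_int4(packed_byte: int) -> Tuple[int, int]:
    """Decode one byte into two signed 4-bit values (assumed layout: low nibble first)."""
    lo = packed_byte & 0xF
    hi = (packed_byte >> 4) & 0xF

    def to_signed(value: int) -> int:
        # Two's-complement sign extension for a 4-bit field.
        return value - 16 if value >= 8 else value

    return to_signed(lo), to_signed(hi)


def dot_dequant_in_register(a_row: List[int], packed_b_col: List[int]) -> int:
    """Dot product where B is decoded on the fly, one packed byte at a time."""
    assert len(a_row) == 2 * len(packed_b_col)
    acc = 0
    for i, byte in enumerate(packed_b_col):
        # Decode just before use: the decoded values live only in locals,
        # which stands in for per-thread registers on the GPU.
        b0, b1 = unpack_int4(byte)
        acc += a_row[2 * i] * b0 + a_row[2 * i + 1] * b1
    return acc
```

For example, `dot_dequant_in_register([1, 2, 3, 4], [0x21, 0xF3])` decodes the weights as 1, 2, 3, -1 under the assumed layout.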

Type and Import Adjustments:

Updated imports to include List and suppress, and added get_coalesced_veclen from ..base.analysis. (bitblas/gpu/matmul.py)
Added _collect_producers to the list of imports from matmul_analysis. (bitblas/gpu/matmul.py)

Configuration and Typo Fixes:

Fixed a typo in the calculation of thread_row_tiles by using config.thread[0] instead of config.thread[1]. (bitblas/gpu/matmul.py)
Added a check for dequantize_info in the apply_config method to call the new dequantization schedule if present. (bitblas/gpu/matmul.py)
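
The two changes above can be sketched as follows. This is a simplified illustration under stated assumptions: `SketchConfig`, `thread_tile_extents`, and `pick_schedule` are hypothetical names, the real tile formulas in bitblas/gpu/matmul.py are omitted, and treating `thread` as `[row, column]` is inferred from the fix description rather than confirmed.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class SketchConfig:
    """Hypothetical stand-in for the tuning config; only `thread` is modeled."""
    thread: List[int]  # assumed order: [row extent, column extent]


def thread_tile_extents(config: SketchConfig) -> Tuple[int, int]:
    # The row extent must come from index 0. Reading index 1 here is the
    # kind of typo the PR fixes; it goes unnoticed when the tile is square.
    row_extent = config.thread[0]
    col_extent = config.thread[1]
    return row_extent, col_extent


def pick_schedule(dequantize_info: Optional[Dict]) -> str:
    # A non-empty dequantize_info signals that an operand needs decoding,
    # so the dequantize-aware schedule is chosen instead of the default.
    if dequantize_info:
        return "sch_dequantize_in_register_with_config"
    return "default_schedule"
```

With a non-square tile such as `SketchConfig(thread=[8, 16])`, the row and column extents differ, which is exactly the case where using the wrong index would produce an incorrect tiling.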

Analysis Updates:

Modified analysis_tensorcore_tags to return a Union of bool and Dict and added a check for Tensor Core support based on the SM version. (bitblas/gpu/matmul_analysis.py)
Added a threshold for minimal tensorize based on tensor core bit width. (bitblas/gpu/matmul_analysis.py)
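
A rough sketch of how such a gate could look is given below. It is not the code from bitblas/gpu/matmul_analysis.py: the function name, its parameters, the SM cutoff of 70 (Volta, the first NVIDIA generation with Tensor Cores), and the threshold values of 16 and 32 are all assumptions made for illustration.

```python
from typing import Dict, Union


def tensorcore_tags_sketch(sm_version: int, in_dtype_bits: int,
                           extents: Dict[str, int]) -> Union[bool, Dict]:
    """Return False when Tensor Cores should not be used, else a dict of tags."""
    # Assumed cutoff: devices older than SM 70 have no Tensor Cores.
    if sm_version < 70:
        return False

    # Assumed rule: narrower input types need a larger minimum extent
    # before tensorization is considered worthwhile.
    minimal_tensorize_threshold = 16 if in_dtype_bits >= 16 else 32

    if any(extent < minimal_tensorize_threshold for extent in extents.values()):
        return False

    # Hypothetical tag payload; the real function returns richer information.
    return {"minimal_tensorize_threshold": minimal_tensorize_threshold}
```

Returning either `False` or a dict mirrors the `Union[bool, Dict]` return type described above: callers can test the result for truthiness before reading any tags.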

Version Check:

Added a version check for the transformers library to ensure compatibility, asserting the version is <= 4.40.0. (integration/BitNet/eval_correctness.py)
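
A version guard of this kind can be sketched with the standard library alone. The helper names `parse_release` and `assert_transformers_compatible` are hypothetical, and the parsing is deliberately simple (leading numeric components only); the actual check in eval_correctness.py may be written differently.

```python
from typing import Tuple


def parse_release(version: str) -> Tuple[int, ...]:
    """Extract the leading numeric release components, e.g. '4.40.0.dev0' -> (4, 40, 0)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component such as 'dev0'
        parts.append(int(digits))
    return tuple(parts)


def assert_transformers_compatible(installed: str, maximum: str = "4.40.0") -> None:
    """Raise AssertionError when the installed version is newer than the maximum."""
    assert parse_release(installed) <= parse_release(maximum), (
        f"transformers {installed} is newer than the supported maximum {maximum}"
    )
```

In a script, the installed version would typically be passed in as `transformers.__version__`, for example `assert_transformers_compatible(transformers.__version__)`.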

LeiWang1999 added 2 commits September 18, 2024 06:27

fix for int8 gemm

4183ae1

T4

e64c91d

LeiWang1999 merged commit 4106902 into microsoft:main Sep 20, 2024
