[Dev][AMD] Support LDS and Flash Attention for AMD Backend #247
Merged
This pull request updates the benchmarking scripts and the matrix multiplication and multi-head attention implementations, and extends mfma_macro_generator.py to support different thread binding layouts. The most important changes are the new benchmarking scripts, the thread binding layout support in mfma_macro_generator.py, and an updated 3rdparty/tvm submodule commit.

Benchmarking updates:
- benchmark/tilelang/benchmark.sh: added benchmarking commands for multiple matrix dimensions.
- benchmark/tilelang/benchmark_tilelang_matmul.py: new script that benchmarks matrix multiplication across a range of configurations.
- benchmark/tilelang/benchmark_tilelang_mha.py: new script that benchmarks multi-head attention across a range of configurations.
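For context, a minimal sketch of the kind of timing loop such a script typically wraps around a kernel. The shapes, dtype, and use of torch here are illustrative assumptions, not the actual script contents:

```python
# Illustrative matmul benchmark harness (assumed shapes/dtype; not the actual
# benchmark_tilelang_matmul.py). The "cuda" device also targets ROCm on HIP builds.
import torch

def bench_matmul(m, n, k, dtype=torch.float16, warmup=10, iters=100):
    a = torch.randn(m, k, dtype=dtype, device="cuda")
    b = torch.randn(k, n, dtype=dtype, device="cuda")
    for _ in range(warmup):
        torch.matmul(a, b)
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(iters):
        torch.matmul(a, b)
    end.record()
    torch.cuda.synchronize()
    ms = start.elapsed_time(end) / iters
    tflops = 2 * m * n * k / (ms * 1e-3) / 1e12
    print(f"{m}x{n}x{k}: {ms:.3f} ms, {tflops:.2f} TFLOPS")

for shape in [(1024, 1024, 1024), (4096, 4096, 4096)]:
    bench_matmul(*shape)
```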
Matrix multiplication and multi-head attention implementations:
- bitblas/tl/mfma_macro_generator.py: added support for different thread binding layouts by introducing an is_m_first flag and threading it through the relevant methods.
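To illustrate what such a flag controls, here is a hedged sketch of the two binding orders for a flat thread id laid out over an m-by-n tile. This is one plausible reading (m varying fastest vs. n varying fastest); the function name and tile sizes are assumptions, not the actual macro generator code:

```python
# Illustrative sketch of two thread binding layouts (not the actual
# mfma_macro_generator.py code; names and semantics are assumptions).

def thread_binding(thread_id: int, threads_m: int, threads_n: int, is_m_first: bool):
    """Map a flat thread id to (m, n) coordinates within a tile.

    is_m_first=True:  the m dimension varies fastest across consecutive threads,
    is_m_first=False: the n dimension varies fastest.
    """
    if is_m_first:
        return thread_id % threads_m, thread_id // threads_m
    return thread_id // threads_n, thread_id % threads_n

# With a 2x4 thread layout, thread 5 lands on different tile coordinates:
print(thread_binding(5, 2, 4, is_m_first=True))   # (1, 2)
print(thread_binding(5, 2, 4, is_m_first=False))  # (1, 1)
```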
Code simplification and cleanup:
- bitblas/tl/mfma_layout.py: removed an unused import and added new functions for the different thread binding layouts.
- bitblas/tl/utils.py: updated imports and modified the mfma_store_index_map function to use the new thread binding layout function.
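As background on what a store index map computes, the sketch below uses the commonly documented accumulator layout of AMD's mfma_f32_16x16x16 instruction, where a 64-lane wavefront holds a 16x16 tile with four values per lane. This is an assumption about the layout involved, not the actual mfma_store_index_map implementation:

```python
# Illustrative store index map for the mfma_f32_16x16x16 accumulator layout
# (an assumed layout for illustration; not the actual bitblas/tl/utils.py code).

def mfma_16x16_store_index_map(lane_id: int, vgpr_id: int):
    """Map (lane, register) in a 64-lane wavefront to (row, col) of the
    16x16 output tile: 16 lanes per row group, 4 rows per group."""
    row = (lane_id // 16) * 4 + vgpr_id
    col = lane_id % 16
    return row, col

# Each of the 64 lanes holds 4 accumulator values, covering all 256 elements.
covered = {mfma_16x16_store_index_map(l, r) for l in range(64) for r in range(4)}
assert len(covered) == 16 * 16
```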
Submodule update:
- 3rdparty/tvm: updated the submodule to a newer commit.