Refactor the deep tile matmul config and skip the single-iteration loop generation #309

zhczhong · 2024-09-02T02:04:55Z

Track: #288

The extra memref.copy is caused by a write-after-write conflict: the canonicalization pass eliminates the single-iteration loop but preserves its extract slice and insert slice. This PR therefore skips generating a loop when it would have only a single iteration.
Refactor the config and infer dimType according to the contractionOpInterface.
Introduce a padding cost that minimizes the cost of padding and uses a divisible block when possible.
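
To make the padding-cost idea concrete, here is a minimal sketch of how such a cost could steer block selection. It is not the code from this PR: `paddingCost` and `pickBlock` are hypothetical names, the cost is assumed to be the fraction of the padded extent that is padding, and breaking ties toward the larger block is also an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the PR): fraction of the padded extent
// that is padding when `dim` is tiled with blocks of size `block`.
double paddingCost(int64_t dim, int64_t block) {
  assert(dim > 0 && block > 0 && "extents must be positive");
  int64_t numBlocks = (dim + block - 1) / block; // ceil(dim / block)
  int64_t padded = numBlocks * block;
  return static_cast<double>(padded - dim) / static_cast<double>(padded);
}

// Pick the candidate with the lowest padding cost. A divisible block has
// cost 0, so it wins whenever one exists. Assumption: ties go to the
// larger block.
int64_t pickBlock(int64_t dim, const std::vector<int64_t> &candidates) {
  assert(!candidates.empty() && "need at least one candidate");
  int64_t best = candidates.front();
  double bestCost = paddingCost(dim, best);
  for (int64_t c : candidates) {
    double cost = paddingCost(dim, c);
    if (cost < bestCost || (cost == bestCost && c > best)) {
      best = c;
      bestCost = cost;
    }
  }
  return best;
}
```

With candidates 16, 32 and 64, a dimension of 48 selects 16 under this sketch, because 16 divides 48 evenly while 32 and 64 would both pad the extent to 64.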

yifeizh2 · 2024-09-02T02:52:10Z

lib/gc/Analysis/MatmulConfigAnalysis.cpp

+    innerMostKBlockCandidates = {16, 32, 64};
+    innerMostNBlockCandidates = {16, 32, 64};
+    NBlockCandidates = innerMostNBlockCandidates;
+    KBlockCandidates = innerMostKBlockCandidates;


So after the change here, innermost Kblock will only be one of 16/32/64 if allowIndivisibleInnerblock is true?

Yes, your understanding is correct

lib/gc/Analysis/MatmulConfigAnalysis.cpp

lib/gc/Transforms/DeepTileContractionOp.cpp

yifeizh2 · 2024-09-03T03:59:27Z

include/gc/Analysis/MatmulConfigAnalysis.h


+inline void getDimTypeFromIterators(linalg::LinalgOp linalgOp,
+                                    SmallVectorImpl<DimType> &dimTypes) {
+  SmallVector<utils::IteratorType> iteratorTypes =


Can you explicitly specify mlir::utils::IteratorType here?

Added the mlir namespace here.

ciyongch · 2024-09-04T03:05:23Z

Overall LGTM, do you think we should add a single simple case to cover the case of skipping single-iteration loop generation?

zhczhong · 2024-09-04T03:10:51Z

> Overall LGTM, do you think we should add a single simple case to cover the case of skipping single-iteration loop generation?

The existing test below covers this case. Before this PR, the FileCheck pattern matched two scf.forall ops; the single-iteration scf.forall is now skipped.

graph-compiler/test/mlir/test/gc/Transforms/deepTileContractionNamedOp.mlir

Line 127 in 613a333

func.func @matmul_2Dx4D_bf16_with_dlti(%arg0: tensor<4096x4096xbf16>, %arg1: tensor<128x128x16x32x2xbf16>) -> tensor<4096x4096xbf16> {
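
For readers following the thread, the skip condition can be sketched in plain C++ without any MLIR dependency. This is only an illustration under stated assumptions, not the logic in DeepTileContractionOp.cpp: `tripCount` and `shouldEmitLoop` are hypothetical names, and the loop is assumed to run over the half-open range [lb, ub) with a positive step.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of iterations of a loop over [lb, ub)
// with a positive step.
int64_t tripCount(int64_t lb, int64_t ub, int64_t step) {
  assert(step > 0 && "step must be positive");
  if (ub <= lb)
    return 0;
  return (ub - lb + step - 1) / step; // ceil((ub - lb) / step)
}

// Emit the loop unless it would run exactly once. A zero-trip loop is
// still emitted, because dropping it would execute the body one time.
bool shouldEmitLoop(int64_t lb, int64_t ub, int64_t step) {
  return tripCount(lb, ub, step) != 1;
}
```

Under this sketch, a tile size equal to the full extent (for example 4096 over a 4096-wide dimension) gives a trip count of 1, so no loop is generated for that level.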

ciyongch · 2024-09-04T04:38:09Z

Please help to rebase the code base, then we can merge it.

zhczhong · 2024-09-04T04:49:43Z

> Please help to rebase the code base, then we can merge it.

The code has been rebased now

zhczhong added the ready to review label Sep 2, 2024

zhczhong requested review from Menooker, ZhennanQin, ciyongch and yifeizh2 September 2, 2024 02:04

yifeizh2 reviewed Sep 2, 2024

zhczhong force-pushed the zhcong/enhance_config branch 2 times, most recently from 073ec23 to 0c56c7d Compare September 2, 2024 05:13

ciyongch reviewed Sep 3, 2024

zhczhong added 10 commits September 2, 2024 19:59

a351ef1  update config
bd6d2fc  fix extra memref.copy introduced by dummy loop
4abb40c  fix small size issue
5db159c  refactor deep tile code
0f0231c  infer the dimtype from contraction op interface
0df26c3  fix file check
b395f24  add comment
e04ea30  fix
f19e0fb  introduce padding cost
01920b6  fix comment

zhczhong force-pushed the zhcong/enhance_config branch from 0c56c7d to 01920b6 Compare September 3, 2024 02:59

yifeizh2 reviewed Sep 3, 2024

613a333  fix comment
578d820  Merge branch 'main' into zhcong/enhance_config

ciyongch approved these changes Sep 4, 2024

yifeizh2 approved these changes Sep 4, 2024

zhczhong merged commit 3f04dc9 into main Sep 4, 2024

lmontigny mentioned this pull request Sep 6, 2024

Extra memref.copy introduced by bufferization #288

Closed
