[Navi3x] Add Device Operations by aska-0096 · Pull Request #567 · ROCm/composable_kernel

aska-0096 · 2023-01-30T10:39:30Z

Three Device Operations Added

1. DeviceGemmMultipleD_Wmma_Cshuffle

example_gemm_bilinear_wmma_fp16

2. DeviceBatchedContractionMultipleD_Wmma_Cshuffle

example_batched_gemm_bias_e_permute_wmma_fp16

3. DeviceGroupedConvFwdMultipleD_Wmma_Cshuffle

example_grouped_conv_fwd_bias_relu_add_wmma_fp16
Performance of depthwise convolution is very low and needs further optimization.

The above examples passed with the latest version of the amd-stg-open compiler.

asroy · 2023-01-30T18:40:01Z

@aska-0096 Does this PR also pass with the compiler currently used by CI? If not, we may need to update the CI compiler again.

cc @illsilin

illsilin · 2023-01-30T18:50:37Z

Looks like we got a couple of new test failures in CI for this branch:

[2023-01-30T12:16:01.378Z] The following tests FAILED:
[2023-01-30T12:16:01.378Z] 11 - example_gemm_bilinear_wmma_fp16 (Child aborted)
[2023-01-30T12:16:01.378Z] 80 - example_grouped_conv_fwd_bias_relu_add_wmma_fp16 (Child aborted)

aska-0096 · 2023-01-31T07:59:42Z

Hi @asroy @illsilin
I confirmed that the two examples mentioned above fail on 5.3.1. They work on 5.4.1, though with rather lower performance than on 5.5.0 or the latest amd-stg-open.

aska-0096 · 2023-02-06T03:01:55Z

Hi @illsilin @asroy
Is there any progress on upgrading the CI compiler? The AITemplate side needs these operations to enable ResNet50 on Navi3x.

illsilin · 2023-02-07T17:50:45Z

I have updated the CI compiler. Please sync your branch with develop branch.

aska-0096 · 2023-02-08T07:03:43Z

The docker image I triggered is still rocm/composable_kernel:ck_ub20.04_rocm5.3_release, with no difference from the older one. I think the compiler has not been upgraded yet. @illsilin

illsilin · 2023-02-08T15:48:02Z

What happened is that I changed the default compiler values in the Jenkins parameters. It usually takes 20-30 minutes for Jenkins to pick those up after the change has been merged, so if CI is launched before that, it still uses the old defaults. I'll restart your branch manually now, and it will use the new compiler defaults.

illsilin · 2023-02-08T18:23:30Z

OK, so the results are in: there are 3 failures:

[2023-02-08T17:13:42.060Z] The following tests FAILED:
[2023-02-08T17:13:42.060Z] 11 - example_gemm_bilinear_wmma_fp16 (Child aborted)
[2023-02-08T17:13:42.060Z] 80 - example_grouped_conv_fwd_bias_relu_add_wmma_fp16 (Child aborted)
[2023-02-08T17:13:42.060Z] 150 - test_grouped_convnd_bwd_weight (Failed)

Test 150 seems sensitive: I re-ran it locally with your branch and it passed. In CI, the test results differed from the baseline by just 1 (879 vs 880), so it is most likely a round-off error.

The other two tests, however, should not have been launched on MI100/200. So you need to add a check somewhere to make sure those tests are only triggered "#if defined(gfx1100)".
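
As one possible illustration of the kind of architecture check described above, here is a minimal C++ sketch. `supports_wmma` and `kWmmaTargets` are hypothetical names, not composable_kernel APIs, and the list holds only gfx1100 because that is the only WMMA target named in this thread. It assumes the caller has already obtained the device's architecture name as a string.

```cpp
#include <algorithm>
#include <array>
#include <string_view>

// Hypothetical list (an assumption for illustration): architectures treated
// as WMMA-capable. Only gfx1100 is mentioned in this thread.
constexpr std::array<std::string_view, 1> kWmmaTargets = {"gfx1100"};

// Hypothetical helper, not part of composable_kernel: returns true when the
// given architecture name is in kWmmaTargets.
inline bool supports_wmma(std::string_view arch_name)
{
    // Architecture names can carry feature suffixes such as
    // "gfx90a:sramecc+:xnack-", so compare only the part before the first ':'.
    const std::string_view base = arch_name.substr(0, arch_name.find(':'));
    return std::find(kWmmaTargets.begin(), kWmmaTargets.end(), base) != kWmmaTargets.end();
}
```

An example's `main` could call such a helper with the detected architecture name and return early with a "not supported" message rather than aborting on MI100/200.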

illsilin · 2023-02-08T19:00:01Z

One option is to make sure those tests are only built if the appropriate GPU architecture is on the list of targets:

diff --git a/example/02_gemm_bilinear/CMakeLists.txt b/example/02_gemm_bilinear/CMakeLists.txt
index 425029c0..6266af0a 100644
--- a/example/02_gemm_bilinear/CMakeLists.txt
+++ b/example/02_gemm_bilinear/CMakeLists.txt
@@ -1,2 +1,4 @@
 add_example_executable(example_gemm_bilinear_xdl_fp16 gemm_bilinear_xdl_fp16.cpp)
-add_example_executable(example_gemm_bilinear_wmma_fp16 gemm_bilinear_wmma_fp16.cpp)
+if(GPU_TARGETS MATCHES gfx1100)
+    add_example_executable(example_gemm_bilinear_wmma_fp16 gemm_bilinear_wmma_fp16.cpp)
+endif()

aska-0096 · 2023-02-09T03:47:10Z

Interesting. I confirmed that CI failed because the examples ran on an unsupported GPU. However, 2 of the 4 examples that include WMMA passed without compile or runtime errors.
I will first try your suggestion and add an architecture limitation, as I did in the test folder.

aska-0096 · 2023-02-11T06:03:12Z

Hi @illsilin, CI passed.
cc: @asroy

aska-0096 · 2023-02-15T04:07:21Z

@asroy
Just a reminder about this PR. I believe the device ops it adds would be valuable to the AIT team.
I would be grateful if you could take a look and consider merging it.

aska-0096 added 30 commits October 21, 2022 04:04

wmma_op + unit test

36c38ad

add arch limitation to wmma test

7dca846

change arch limitation

049cc8a

Refactor + Add all type unit test(int4 compile failed)

790e21e

Add f32_16x16x16_bf16 unit test

24faa1f

Merge branch 'develop' of https://github.com/ROCmSoftwarePlatform/com…

4fec5ad

…posable_kernel into wmma_op

Merge develop

ab66332

tempsave

98ccb36

tempsave

d16063d

tempsave

b3cc22a

runtime bug, cannot find symbol

9adf2e6

workaround for incorrect HIP warpSize return value

0cd587d

debugging

43a2099

tempsave

7395995

Correctness OK, waiting for optimization

9bd4468

Merge branch 'develop' of https://github.com/ROCmSoftwarePlatform/com…

289f15d

…posable_kernel into wmma_gemm

Tidy up + format

0a80872

temp save

9739ede

temp save, reproduce the v_bfi_b32 issue

e43df26

add inline asm for wmmaop test

13af8cc

tidy up

63f8766

Merge branch 'develop' of https://github.com/ROCmSoftwarePlatform/com…

b741109

…posable_kernel into wmma_gemm

clean some debug purpose code

2a0e543

discard some codes

3941bd1

clang format

cfb397b

clang format

5d5891b

Merge branch 'develop' of https://github.com/ROCmSoftwarePlatform/com…

40ec8e5

…posable_kernel into wmma_gemm

compiler issue fixed + increase tile size

8efd363

navi3x_multipleD+example

ccb94ce

temp save

2963dd9

aska-0096 added the enhancement and urgency_high labels Jan 30, 2023

aska-0096 requested review from asroy, carlushuang and zjing14 January 30, 2023 10:39

aska-0096 self-assigned this Jan 30, 2023

aska-0096 requested a review from illsilin January 31, 2023 08:00

Merge branch 'develop' of https://github.com/ROCmSoftwarePlatform/com…

55a01ee

…posable_kernel into navi3x_mD_batchedGEMM_GroupConvFwd

Merge branch 'develop' of https://github.com/ROCmSoftwarePlatform/com…

68ca5b3

…posable_kernel into navi3x_mD_batchedGEMM_GroupConvFwd

aska-0096 added 2 commits February 9, 2023 03:50

Add arch limitation to all wmma examples

b47e8c4

Merge branch 'develop' of https://github.com/ROCmSoftwarePlatform/com…

e2dd8f0

…posable_kernel into PR567

aska-0096 added the CI - Pass label Feb 11, 2023

aska-0096 added 2 commits February 11, 2023 06:07

Merge branch 'develop' of https://github.com/ROCmSoftwarePlatform/com…

db8efd0

…posable_kernel into navi3x_mD_batchedGEMM_GroupConvFwd

fix bug: example30 input conv args

6eee660

illsilin approved these changes Feb 13, 2023

asroy approved these changes Feb 15, 2023

asroy merged commit 0cfda84 into develop Feb 15, 2023

illsilin deleted the navi3x_mD_batchedGEMM_GroupConvFwd branch December 14, 2023 17:04


asroy merged 43 commits into develop from navi3x_mD_batchedGEMM_GroupConvFwd
