[Relay, TOPI] Make Softmax op fusible with elemwise ops #8909
Merged
Conversation
masahi requested review from anijain2305, areusch, comaniac, Huyuwei, jcf94, jroesch, junrushao, jwfromm, kevinthesun, Laurawly, MarisaKirisame, mbrookhart, merrymercy, slyubomirsky, tqchen, vinx13, wweic, yzhliu, zhiics and ZihengJiang as code owners (September 2, 2021 09:12)
masahi commented (Sep 2, 2021)
comaniac approved these changes (Sep 2, 2021):
Overall LGTM.
masahi force-pushed the softmax-fuse branch 4 times, most recently from ecb05db to c4dd35b (September 3, 2021 08:35)
masahi force-pushed the softmax-fuse branch 2 times, most recently from 6afd482 to 89c947b (September 3, 2021 10:53)
ylc pushed a commit to ylc/tvm that referenced this pull request (Sep 29, 2021):
* Change softmax op pattern to OUT_ELEMWISE_FUSABLE
* Softmax is fused but x86 schedule is suboptimal
* fusion properly done
* Updating GPU schedule for fusion
* update softmax warp shuffle schedule
* fix compute_at
* Bug fix in lower_thread_all_reduce when reduction storage is reused by storage_rewrite
* Temp disable softmax warp reduction schedule when softmax is fused
* Revert "Bug fix in lower_thread_all_reduce when reduction storage is reused by storage_rewrite" (This reverts commit 8aa340e.)
* lint fix
* try make diff smaller
* fix tests
* fixed another broken test
* Fix flaky uTVM templating test
* fix equality check on output op

Co-authored-by: masa <masa@pop-os.localdomain>
Co-authored-by: Gavin Uberti <gavin.uberti@gmail.com>
ylc pushed a commit to ylc/tvm that referenced this pull request (Jan 13, 2022)
Currently, op fusion is not enabled for the `softmax` op. This has been fine for ImageNet models, where `softmax` is only used at the end, but transformer models use a lot of softmax in the middle.

When FP16 conversion is applied, `softmax` is always left in fp32, so there are always a lot of `cast(softmax_output, dtype="float16")` ops after the conversion. Since `softmax` and `cast` cannot be fused, we end up with a lot of inefficient "cast only" kernels: https://gist.github.com/masahi/0d7d96ae88722b616a906cec2054559e#file-transformer-txt-L37-L43
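To see the effect, here is a minimal sketch (assuming a TVM build that includes this PR) that builds a `softmax` followed by a `cast` and runs the `FuseOps` pass; previously the two ops ended up in separate kernels, while with this change they should land in a single fused primitive function:

```python
import tvm
from tvm import relay

# softmax kept in fp32 (as the FP16 conversion pass leaves it), followed by a
# cast of its output to fp16: the pattern that used to produce standalone
# "cast only" kernels.
x = relay.var("x", shape=(8, 128), dtype="float32")
y = relay.cast(relay.nn.softmax(x), "float16")

mod = tvm.IRModule.from_expr(relay.Function([x], y))
mod = relay.transform.InferType()(mod)
mod = relay.transform.FuseOps(fuse_opt_level=2)(mod)

# With this PR, the printed module should contain one fused primitive
# function holding both softmax and cast.
print(mod)
```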
This PR changes the `softmax` op's fuse pattern so that it can be fused with following elemwise / injective ops, just like conv2d etc. TOPI schedules have also been updated to take fused ops into account.

cc @yzhliu @comaniac @AndrewZhaoLuo @mbrookhart
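For reference, the pattern change amounts to roughly the following on the Python side (a minimal sketch, not the actual diff; the real registration lives inside TVM's op registration code, so re-registering from user code needs a higher override level):

```python
from tvm.relay.op import OpPattern, register_pattern

# OUT_ELEMWISE_FUSABLE marks an op whose output can absorb subsequent
# elemwise/injective ops during fusion; it is the same pattern conv2d uses.
# level=11 overrides the default registration level (10), for illustration.
register_pattern("nn.softmax", OpPattern.OUT_ELEMWISE_FUSABLE, level=11)
```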