
Gather to Slice Fusion #13599

Merged
Lafi7e merged 4 commits into main from weicwang/gather_to_slice on Nov 10, 2022
Conversation

Contributor

@Lafi7e commented Nov 9, 2022

This PR optimizes the execution of the following code from Hugging Face's XLNet model.

```
x = torch.index_select(x, 3, torch.arange(klen, device=x.device, dtype=torch.long))
```

This code is exported as a Range->Gather pattern, which can be fused into a single Slice op. The Slice kernel is much faster than Gather, especially in the backward pass. The main reason: Gather's indices can contain duplicates, so its backward pass must sum gradients for repeated indices, whereas a Slice node can never have that case.
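The equivalence the fusion relies on can be sketched in PyTorch (a minimal illustration with made-up tensor shapes, not the ONNX Runtime fusion code itself):

```python
import torch

x = torch.randn(2, 3, 4, 8)
klen = 5

# Exported as Range -> Gather: gathers columns 0..klen-1 along dim 3.
gathered = torch.index_select(x, 3, torch.arange(klen, dtype=torch.long))

# Equivalent Slice: same values, but its backward pass is a simple
# zero-pad instead of a scatter-add over (possibly duplicated) indices.
sliced = x[..., :klen]

assert torch.equal(gathered, sliced)
```

Because the indices produced by Range are always a contiguous, strictly increasing run starting at 0, the Gather is guaranteed to be expressible as a Slice with starts=0, ends=klen on that axis.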

Profiling results with Hugging Face's XLNet model:

  • Before the fusion
    forward: ~753us
    ![image](https://user-images.githubusercontent.com/11661208/200758439-63f2f9b5-9610-4df8-98c8-a1ad4dc62f4e.png)
    backward: ~46101us
    ![image](https://user-images.githubusercontent.com/11661208/200758530-fe16a8ec-ea8f-4b79-b3ac-386b72ba1670.png)

  • After the fusion
    forward: ~627us
    ![image](https://user-images.githubusercontent.com/11661208/200758654-ab9a6068-c45d-40f4-9c71-3862a56732f8.png)
    backward: ~677us
    ![image](https://user-images.githubusercontent.com/11661208/200758833-aab1b8e1-1b5d-4e55-88cf-03c2a1d9d42b.png)
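The backward-pass gap above comes from Gather's gradient needing a scatter-add. A tiny PyTorch sketch (illustrative only, not the profiled model code) shows why duplicated indices force a sum:

```python
import torch

# Gather (index_select) allows duplicated indices, so its backward
# pass must sum the gradients flowing into each repeated index.
x = torch.zeros(4, requires_grad=True)
idx = torch.tensor([1, 1, 1])            # the same index, three times
y = torch.index_select(x, 0, idx)        # shape (3,)
y.sum().backward()
print(x.grad)                            # tensor([0., 3., 0., 0.]) — gradients summed

# A slice's backward is just a zero-pad: no summation needed.
x2 = torch.zeros(4, requires_grad=True)
x2[:3].sum().backward()
print(x2.grad)                           # tensor([1., 1., 1., 0.])
```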

@Lafi7e added the training label (issues related to ONNX Runtime training; typically submitted using template) Nov 9, 2022

/*
Fuse Range->Gather to Slice.
*/
Contributor

Can you add a note here explaining that this fusion is primarily helpful for gradient computation, for context?

Contributor Author

Forward is also faster, though the gain is not as big. The model code also has the comment below showing this for PyTorch. That's why I also put this fusion in the transformer utils file for inference.

        # Note: the tensor-slice form was faster in my testing than torch.index_select
        #       However, tracing doesn't like the nature of the slice, and if klen changes
        #       during the run then it'll fail, whereas index_select will be fine.
        x = torch.index_select(x, 3, torch.arange(klen, device=x.device, dtype=torch.long))
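The trade-off the model comment describes can be sketched with a toy tensor (an illustration, not the model code itself): the slice form is faster but bakes the slice length into a trace, while index_select keeps it dynamic.

```python
import torch

x = torch.randn(1, 2, 3, 8)
klen = 5

# Slice form: fast, but torch.jit.trace records klen as a constant,
# so a model traced at klen=5 would keep slicing to 5 elements even
# if klen changes on a later run.
sliced = x[:, :, :, :klen]

# index_select form: exports as Range -> Gather, so klen stays a
# run-time value — which is exactly the pattern this PR fuses back
# into a Slice inside ONNX Runtime.
selected = torch.index_select(x, 3, torch.arange(klen, dtype=torch.long))

assert torch.equal(sliced, selected)
```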

baijumeswani previously approved these changes Nov 9, 2022
Contributor

@baijumeswani left a comment


Just a small comment to add some context. You can get it in another PR if you like.

Looks good.

baijumeswani previously approved these changes Nov 10, 2022
@Lafi7e merged commit 2bda3fd into main Nov 10, 2022
@Lafi7e deleted the weicwang/gather_to_slice branch November 10, 2022 05:03
simon-moo pushed a commit to simon-moo/onnxruntime that referenced this pull request Dec 21, 2022
