Fuse Attention For One Input bert-base-dynamic Model#3850

Merged
liuziyue merged 4 commits into master from ziyl/attention on May 7, 2020

Conversation

@liuziyue
Contributor

@liuziyue liuziyue commented May 6, 2020

Description:
Fuse the Attention node for one-input BERT models.

  • Remove the mask input data type check and cast the mask input to int32 when it has a different type
  • Match the new mask node pattern

Motivation and Context

  • Why is this change required? What problem does it solve?

This change enables fusing the Attention node, which improves BERT model optimization.

@liuziyue liuziyue requested a review from a team as a code owner May 6, 2020 21:35
@liuziyue liuziyue requested a review from tianleiwu May 6, 2020 21:49
@tianleiwu
Contributor

Please add test cases.

@tianleiwu tianleiwu added the model:transformer issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc. label May 7, 2020
@liuziyue liuziyue merged commit 914aaaa into master May 7, 2020
@liuziyue liuziyue deleted the ziyl/attention branch May 7, 2020 20:40
