
Handle multiple embed nodes in transformer optimizer#4471

Merged
tianleiwu merged 3 commits into master from tlwu/optimizer_multiple_embed_nodes on Jul 10, 2020

Conversation

@tianleiwu
Contributor

Description:
(1) Update the embed layer fusion to handle multiple embed nodes, removing the assumption that a model has only one EmbedLayerNormalization node.
(2) Fix the temp model path in the optimizer, and add a check of whether a temp path is used.
(3) Add a unit test for GPT-2 fusion with past state and mask.
(4) Add a unit test for changing inputs to int32.
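Change (1) can be illustrated with a minimal sketch. The graph representation, node fields, and helper name below are hypothetical stand-ins, not the actual onnxruntime fusion code; the point is only the shift from asserting a single match to looping over all matches:

```python
# Hypothetical sketch of fusing every embed-layer candidate in a graph.
# The dict-based "node" representation and pattern condition are
# illustrative only, not the real onnxruntime optimizer internals.

def fuse_embed_layers(graph_nodes):
    """Fuse each embed-layer pattern match found in the graph.

    The pre-PR behavior assumed exactly one match per graph (an assertion
    failed otherwise); this version handles zero, one, or many matches.
    """
    matches = [n for n in graph_nodes
               if n["op"] == "LayerNormalization" and n.get("follows_gather")]
    # Old behavior (removed): assert len(matches) == 1
    fused = []
    for i, match in enumerate(matches):
        fused.append({"op": "EmbedLayerNormalization",
                      "name": f"EmbedLayerNorm_{i}",
                      "replaces": match["name"]})
    return fused
```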

Motivation and Context


Some customer models have multiple embed nodes. The current optimizer throws an assertion error on such models, since it assumed there is only one embed node per graph. This change makes the optimizer work on them by removing that assumption.

Fix temp model path in optimizer
Add unit test for gpt2 fusion with past state and mask
Add unit test for change input to int32
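The temp-path fix mentioned in the commits above can be sketched as follows. The function and its flag are hypothetical, not the actual optimizer code; the sketch only shows the pattern of tracking whether a temp path was actually created before cleaning it up:

```python
# Hypothetical sketch of the temp-model-path pattern: write to a temp
# file only when required, and check whether one was used before cleanup.
import os
import tempfile

def optimize_model(input_path, use_temp_copy=False):
    """Optimize a model, using a temp file only when requested.

    Tracking `temp_path` lets the cleanup step check whether a temp
    path was used, instead of unconditionally deleting a file that
    may never have been created.
    """
    temp_path = None
    try:
        if use_temp_copy:
            # Only this branch writes a temporary copy of the model.
            fd, temp_path = tempfile.mkstemp(suffix=".onnx")
            os.close(fd)
            source = temp_path
        else:
            source = input_path
        # Stand-in for the real optimization; reports which path was read.
        return f"optimized:{source == input_path}"
    finally:
        # Check that a temp path was used before removing it.
        if temp_path is not None and os.path.exists(temp_path):
            os.remove(temp_path)
```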
@tianleiwu tianleiwu added the model:transformer issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc. label Jul 9, 2020
@tianleiwu tianleiwu requested review from liuziyue and yufenglee July 9, 2020 21:01
@tianleiwu tianleiwu requested a review from a team as a code owner July 9, 2020 21:01
Contributor

@liuziyue liuziyue left a comment


:shipit:

@tianleiwu tianleiwu merged commit e96a829 into master Jul 10, 2020
@tianleiwu tianleiwu deleted the tlwu/optimizer_multiple_embed_nodes branch July 10, 2020 22:28
