Commit 2456b1b
- Allow moe.py to work with a tensor that has a "memory_length" dimension.
- Fix an ExpertsAttention bug in moe.py where it would break during decoding if the input dimension differed from the output dimension.
- Fix a bug in ExpertsEncDecAttention where it was only doing self-attention on the decoder side.
- Factorize the expert_computation code to easily allow using different query and memory antecedents.

PiperOrigin-RevId: 390808726
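The last two points are related: once expert attention is parameterized by separate query and memory antecedents, self-attention (query and memory are the same tensor) and encoder-decoder attention (memory is the encoder output) become two calls into one shared routine, which is exactly what the ExpertsEncDecAttention fix needs. A minimal toy sketch of that factorization, in plain Python rather than Mesh TensorFlow (all names here are hypothetical, not the actual moe.py API):

```python
import math

def matmul(a, b):
    # naive matrix multiply: a is [n][d], b is [d][m]
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def softmax(row):
    m = max(row)
    e = [math.exp(x - m) for x in row]
    s = sum(e)
    return [x / s for x in e]

def expert_attention(query_antecedent, memory_antecedent, w_q, w_k, w_v):
    """Single-expert attention: queries are projected from query_antecedent,
    keys and values from memory_antecedent (toy illustration only)."""
    q = matmul(query_antecedent, w_q)
    k = matmul(memory_antecedent, w_k)
    v = matmul(memory_antecedent, w_v)
    d = len(q[0])
    # scaled dot-product scores between every query and every memory position
    scores = [[sum(qi * ki for qi, ki in zip(qrow, krow)) / math.sqrt(d)
               for krow in k] for qrow in q]
    weights = [softmax(row) for row in scores]
    return matmul(weights, v)

def self_attention(x, w_q, w_k, w_v):
    # query and memory antecedents coincide
    return expert_attention(x, x, w_q, w_k, w_v)

def enc_dec_attention(decoder_x, encoder_out, w_q, w_k, w_v):
    # memory antecedent is the encoder output, not the decoder input --
    # passing decoder_x for both would silently degrade to self-attention,
    # which is the class of bug the commit describes
    return expert_attention(decoder_x, encoder_out, w_q, w_k, w_v)
```

With this shape, the decoder-side bug amounts to calling the shared routine with the wrong memory antecedent, and the fix is a one-argument change rather than a rewrite of the attention code.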
Mesh TensorFlow Team committed Sep 7, 2021 · 1 parent 21c4ef3 · commit 2456b1b
Showing 4 changed files with 99 additions and 48 deletions.