Optimizing the Performance of Decoder fl_asr_decode #709

Open
wants to merge 1 commit into
base: main
@mtmd commented Aug 4, 2021

Original Issue: #707

Summary

This pull request improves the performance of the decoder (fl_asr_decode), targeting the transformer model, by offering the following optimizations:

  1. Adding support for AMP (automatic mixed precision) in the AM (acoustic model).
  2. Adding support for batch processing in the AM.
  3. Offering an efficient CUDA kernel for PositionalEmbedding.
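
To make the third item concrete, here is a minimal CPU reference for the kind of broadcast add a fused positional-embedding kernel would typically compute: every (time, channel) entry of the embedding is added to the input across all batch entries. This is only a sketch and not the kernel from this pull request. The function name `addPositionalEmbedding` and the channel-fastest memory layout are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (not the PR's CUDA kernel).
// Assumed layout: channel-fastest, then time, then batch,
// i.e. flat index = c + C * (t + T * b).
std::vector<float> addPositionalEmbedding(
    const std::vector<float>& input,     // C * T * B values
    const std::vector<float>& embedding, // at least C * T values
    std::size_t C,
    std::size_t T,
    std::size_t B) {
  assert(input.size() == C * T * B);
  assert(embedding.size() >= C * T);

  std::vector<float> out(input.size());
  for (std::size_t b = 0; b < B; ++b) {
    for (std::size_t t = 0; t < T; ++t) {
      for (std::size_t c = 0; c < C; ++c) {
        const std::size_t i = c + C * (t + T * b);
        // The same embedding row is reused for every batch entry.
        out[i] = input[i] + embedding[c + C * t];
      }
    }
  }
  return out;
}
```

In a GPU version, each output element is independent, so the three loops could map onto one thread per element without materializing a tiled copy of the embedding.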

Test Plan (required)

Using the following flag file, the optimizations in this pull request improve the average decoding time on a V100 by more than 4x. The custom kernel for positional embedding may also improve training performance, but that impact has not been measured.

--am=/path/to/am_transformer_ctc_stride3_letters_300Mparams.bin
--tokens=/path/to/tokens.txt
--lexicon=/path/to/lexicon.txt
--lm=/path/to/lm_common_crawl_small_4gram_prun0-6-15_200kvocab.bin
--datadir=/path/to/lists
--test=test-other.lst
--uselexicon=true
--decodertype=wrd
--lmweight=2
--wordscore=0
--beamsize=50
--beamthreshold=100
--nthread_decoder=16
--smearing=max
--show=false
--lmtype=kenlm
--lm_vocab=/path/to/lm_common_crawl_200kvocab.txt
--batchsize=24
--fl_optim_mode=O1
--fl_amp_use_mixed_precision=true
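
As a rough illustration of what a setting like `--batchsize=24` implies for the AM forward pass, the sketch below groups variable-length utterances into fixed-size batches and pads each one to the longest utterance in its batch. It is not the batching code from this pull request. The names `PaddedBatch` and `makeBatches` and the zero padding value are assumptions chosen for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical container for one padded batch (illustration only).
struct PaddedBatch {
  std::size_t maxLen = 0;                   // length every sample is padded to
  std::vector<std::vector<float>> samples;  // each has maxLen values
  std::vector<std::size_t> lengths;         // original, unpadded lengths
};

// Split utterances into consecutive groups of at most batchSize and
// pad each utterance to the longest one in its group.
std::vector<PaddedBatch> makeBatches(
    const std::vector<std::vector<float>>& utterances,
    std::size_t batchSize,
    float padValue = 0.0f) {
  assert(batchSize > 0);
  std::vector<PaddedBatch> batches;
  for (std::size_t start = 0; start < utterances.size(); start += batchSize) {
    const std::size_t end = std::min(start + batchSize, utterances.size());
    PaddedBatch batch;
    for (std::size_t i = start; i < end; ++i) {
      batch.maxLen = std::max(batch.maxLen, utterances[i].size());
    }
    for (std::size_t i = start; i < end; ++i) {
      std::vector<float> padded = utterances[i];
      padded.resize(batch.maxLen, padValue);  // append padValue up to maxLen
      batch.samples.push_back(std::move(padded));
      batch.lengths.push_back(utterances[i].size());
    }
    batches.push_back(std::move(batch));
  }
  return batches;
}
```

Keeping the original lengths alongside the padded samples lets later stages ignore the padded frames when emissions are handed to the decoder.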

@facebook-github-bot added the CLA Signed label Aug 4, 2021
@jacobkahn self-requested a review August 5, 2021 14:47