Commit
Streamable Conformer-Transducer ASR model for LibriSpeech (#2140)
* Introduce DCT+DCConv logic
* DDP fix?
* Batch of changes and things brought back
* Streaming fixes (successfully trains)
* WIP streaming code
* WIP functional streaming code
* Fix left context
* Fix formatting
* Cleanups and docs in streaming utils
* Better comment hparams, change seed back to orig, improve naming
* uncomment averaging stuff; it was some ipython issue
* Remove pin_memory as it was not beneficial
* More cleanups, comments on context stuff
* More comments and TODOs
* encode_streaming docstring
* Dirty TransducerBeamSearcher change for streaming GS
* Fix precommit
* Fix encoders that do not support chunk_size
* Pre-commit again
* Make chunk_size type consistent
* Fix formatting of doctest in split_wav_lens
* Remove outdated TODO
* Add hasattr streaming to retain model backcompat
* Cleanup doc and naming for transducer_greedy_decode
* Cite paper for chunked attention
* Remove lost comment
* Update comment in self-attention
* Don't apply masked fill fix in the non-bool mask case
* Added TODO README update
* Revert change to custom_tgt_module; patching model instead
* Remove added entry in README
* Fix streaming conformer conv mismatch
* More conformer conv adjustments
* Adjust context size
* Remove outdated comment
* Fixed causal conformer decoder
* Fix linting
* Gate `custom_tgt_module` creation behind the presence of decoder layers
* Re-enable checkpoint averaging
* Change averaged ckpt count to 10
* Add new model results to README
* WIP refactor: Introduce DCTConfig dataclass
* Improved notice in README
* Formatting and linting fixes
* Attempt at fixing circular import?
* utils can't depend on core it seems; move dct
* Whoops, missed file
* Add DCT test, fix issues
* Remove now obsolete yaml variables for streaming
* Formatting
* Add dummy dct_config parameter to keep unsupported encoders working
* Linting fix
* Fix typo
* Add note on runtime autocast accuracy
* Fix very bad typo from refactor in YAML
* Fix hasattr streaming check
* Remove legacy comment
* Fix left context size calculation in new mask code
* Fix causal models in TransformerASR
* Remove comment on high-level inference code
* YAML formatting + commenting dynchunktrain stuff
* Remove outdated comment about DCConv left contexts
* Remove commented out debug prints from TransformerASR
* Move DCT into utils again
* Rename all(?) mentions of DCT to explicit dynamic chunk training
* Clarify padding logic
* Remove now-useless _do_conv, fix horrible formatting
* Slightly fix formatting further
* Add docstrings to forward_streaming methods
* Add a reference on Dynamic Chunk Training
* Rework conformer docstring docs
* Update conformer author list, fix doc formatting for authors
* Fix trailing whitespace in conformer
* Improved comments in Conformer.forward
* Added random dynchunktrain sampler example
* More explicit names for mask functions in TransformerASR
* Added docstring example on encode_streaming
* Pre-commit fix
* Fix typo in conformer
* Initial streaming integration test
* Precommit fix
* Fix indent in YAML
* More consistent spelling in streaming integration test
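For readers unfamiliar with the chunked attention and left-context handling that several of the entries above refer to, the following is a minimal sketch of the general idea, not the code from this commit. The function name `chunked_attention_allowed` and its parameters are hypothetical, and it uses plain Python lists rather than tensors. It assumes the usual formulation in which a frame may attend to every frame in its own chunk plus a fixed number of preceding chunks.

```python
def chunked_attention_allowed(num_frames, chunk_size, left_context_chunks):
    """Build a chunk-wise attention mask (illustrative, hypothetical helper).

    Returns a list of rows where allowed[i][j] is True when query frame i
    may attend to key frame j: j must lie in the same chunk as i or in one
    of the `left_context_chunks` chunks immediately before it.
    """
    allowed = []
    for i in range(num_frames):
        chunk_i = i // chunk_size
        # Oldest chunk still visible to this frame, clamped at the start.
        first_chunk = max(0, chunk_i - left_context_chunks)
        row = []
        for j in range(num_frames):
            chunk_j = j // chunk_size
            row.append(first_chunk <= chunk_j <= chunk_i)
        allowed.append(row)
    return allowed


if __name__ == "__main__":
    # 6 frames, chunks of 2 frames, 1 chunk of left context.
    for row in chunked_attention_allowed(6, 2, 1):
        print("".join("x" if ok else "." for ok in row))
```

With a chunk size of 2 and one chunk of left context, frames in the third chunk see only the second and third chunks, which is what bounds the history a streaming encoder has to keep around.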
Showing 14 changed files with 1,812 additions and 76 deletions.