Add Initial support for ContextNet Encoder and CTC Decoder#630
titu1994 merged 19 commits into NVIDIA-NeMo:master
Signed-off-by: smajumdar <titu1994@gmail.com>
logging = nemo.logging

class ContextNetEncoder(TrainableNM):
Should this inherit from JasperEncoder?
On second thought, it probably should not inherit from JasperEncoder. While the two currently share exactly the same functionality, they will not in the future. At that point, the inherited __init__ call would instantiate multiple JasperBlocks before ContextNetEncoder starts to instantiate its own values.
While there is duplication for now, it is cleaner to keep the two modules separate.
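The concern in this comment can be illustrated with a minimal sketch. The classes below are hypothetical stand-ins, not the NeMo implementations: they only assume that the parent encoder builds its blocks eagerly in `__init__`, so a subclass that wants different blocks pays for the parent's construction first.

```python
# Hypothetical stand-ins for illustration only; these are not the NeMo classes.
build_log = []  # records the kind of every block that gets constructed


def make_block(kind, index):
    """Pretend to build a block and record that it happened."""
    build_log.append(kind)
    return f"{kind}[{index}]"


class JasperEncoderSketch:
    def __init__(self, num_blocks):
        # The parent eagerly builds its own blocks in __init__.
        self.blocks = [make_block("JasperBlock", i) for i in range(num_blocks)]


class InheritingEncoderSketch(JasperEncoderSketch):
    def __init__(self, num_blocks):
        # Calling the parent constructor builds Jasper blocks first...
        super().__init__(num_blocks)
        # ...which are then thrown away and replaced by the subclass's own.
        self.blocks = [make_block("ContextNetBlock", i) for i in range(num_blocks)]


class StandaloneEncoderSketch:
    def __init__(self, num_blocks):
        # A separate module builds only what it needs.
        self.blocks = [make_block("ContextNetBlock", i) for i in range(num_blocks)]


def count_builds(encoder_cls, num_blocks=3):
    """Instantiate an encoder and count how many blocks of each kind were built."""
    build_log.clear()
    encoder_cls(num_blocks)
    return {kind: build_log.count(kind) for kind in set(build_log)}
```

Calling `count_builds(InheritingEncoderSketch)` reports both Jasper and ContextNet blocks being built, while `count_builds(StandaloneEncoderSketch)` reports only ContextNet blocks. The extra work only matters once the two encoders diverge; while they share the same blocks, subclassing avoids the duplication.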
This pull request introduces 1 alert when merging 8c81303 into a22d325 - view on LGTM.com new alerts:
# (ContextNet uses the Jasper baseline encoder and decoder)
encoder = nemo_asr.ContextNetEncoder(
    feat_in=contextnet_params["AudioToMelSpectrogramPreprocessor"]["features"],
Just a note that you can add this inside the yaml itself.
See https://confluence.atlassian.com/bitbucket/yaml-anchors-960154027.html
Thanks for the hint!
This pull request introduces 1 alert when merging 81330ba into a22d325 - view on LGTM.com new alerts:
…Mo#630)
* Add SE + context SE support
* Add contextnet components
* Add ContextNet support
* Add config files
* Correct configs
* Add streaming speech command
* Add kernel size factor argument
* Add docstrings
* Update CHANGELOG.md
* Add integration tests
* Style fixes and add docstrings for se_reduction_ratio
* Style fixes in tests
* Correct CHANGELOG.md
* Corrections to docstrings
* Add WandB support to contextnet.py
* Style fixes
* Remove unused import
* Refactor ContextNetEncoder to subclass JasperEncoder
* Remove unused imports
Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: ZeroCool <alejandrogilelias940711@gmail.com>
Use a single jinja template for the prompts with and without a document. Also remove the conditionals checking for the presence of a document.
Fixes NVIDIA-NeMo#629
Signed-off-by: Derek Higgins <derekh@redhat.com>
Changelog
Added
- stride_last flag which allows stride and repeat flags to be used simultaneously. It will perform the strided convolution at the final Conv-BN-ReLU sub-block.
- swish as optional activation function
- zero_infinity flag to CTCLoss, default to False.
Modified
- se_reduction_ratio to 8 instead of 16.
- SpecAugment now supports either an integer or floating point value for time_width.
Note: Currently, examples/asr/contextnet.py uses JasperDecoderForCTC instead of ContextNetDecoderForCTC. This will be updated in a future PR once full support is present.
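To make two of the changelog entries concrete, here is a minimal sketch in plain Python. It is not the NeMo code: `swish` follows the standard definition x * sigmoid(x), and `se_bottleneck_channels` is a hypothetical helper that assumes the usual squeeze-and-excitation layout, where the bottleneck width is the channel count divided by the reduction ratio.

```python
import math


def swish(x: float) -> float:
    """Swish activation, x * sigmoid(x), written to avoid overflow in exp()."""
    if x >= 0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)  # safe: x < 0, so exp(x) is in (0, 1)
    return x * e / (1.0 + e)


def se_bottleneck_channels(channels: int, reduction_ratio: int) -> int:
    """Hypothetical helper: width of the squeeze-and-excitation bottleneck.

    Assumes the standard SE design, where the hidden layer has
    channels // reduction_ratio units.
    """
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    return channels // reduction_ratio
```

Under that assumption, a block with 256 channels gets a 32-unit bottleneck at a ratio of 8, compared with 16 units at a ratio of 16, so the smaller ratio gives the squeeze-and-excitation path more capacity at the cost of more parameters.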