Update EncDecClassificationDatasetConfig Dataclass #1815
Merged
titu1994 approved these changes on Feb 26, 2021
okuchaiev approved these changes on Feb 26, 2021
This pull request introduces 4 alerts when merging f0e045e into 0f9a772 - view on LGTM.com new alerts:
redoctopus pushed a commit that referenced this pull request on Mar 11, 2021
* initial WIP of fs2 Signed-off-by: Jason <jasoli@nvidia.com> * segmentation tutorial dir fix (#1765) * fix dir Signed-off-by: ekmb <ebakhturina@nvidia.com> * dir fix for colab Signed-off-by: ekmb <ebakhturina@nvidia.com> * B4 leftovers (#1766) * Megatron fixes: lazy init moved back to module for inference to work (#1750) * Megatron fixes: lazy init moved back to module, Torch version bumped in Docker for ONNX Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com> * Fixed indent Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com> * Fixed checkpoint-dependent attr Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com> * Format fix, extracted function Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com> * Rolling back container version; Fixing hook reset Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com> * Disabled ONNX unit test, kept Megatron forward test Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com> * Restored lazy init calls from setup() Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com> * Style fix Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com> * refactor lazy init Signed-off-by: ericharper <complex451@gmail.com> * style Signed-off-by: ericharper <complex451@gmail.com> Co-authored-by: ericharper <complex451@gmail.com> * Dev deps cnt (#1732) * added deps on new versions of packages Signed-off-by: Tomasz Kornuta <tkornuta@nvidia.com> * bumped version of EFF to 0.2.6, added nvidia-pypi to setup reqs Signed-off-by: Tomasz Kornuta <tkornuta@nvidia.com> * Using setup.py style fix to fix lack of space style in setup.py Signed-off-by: Tomasz Kornuta <tkornuta@nvidia.com> * removed graph surgeon Signed-off-by: Tomasz Kornuta <tkornuta@nvidia.com> * pinning version of webdataset to 0.1.40 Signed-off-by: Tomasz Kornuta <tkornuta@nvidia.com> * Cleaned up unused exports Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com> * Removing extra requirements Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com> Co-authored-by: ericharper 
<complex451@gmail.com> Co-authored-by: Tomasz Kornuta <56979727+tkornuta-nvidia@users.noreply.github.com> * update Dockerfile Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com> * add new length regulator Signed-off-by: Jason <jasoli@nvidia.com> * add new length regulator Signed-off-by: Jason <jasoli@nvidia.com> * Fix Primer notebook version and typo (#1773) Signed-off-by: smajumdar <titu1994@gmail.com> * use existing modules Signed-off-by: Jason <jasoli@nvidia.com> * use old modules Signed-off-by: Jason <jasoli@nvidia.com> * bug fixes Signed-off-by: Jason <jasoli@nvidia.com> * Tarred Datasets for Monolingual Corpora (#1758) * Initial commit for monolingual tarred dataset for NMT Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Add coverage to BPE Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Initial working commit of monolingual tarred dataset Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Return beam search results when tgt is None in model forward Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Code formatting fixes Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Added parallel dataset translation, detokenization and unused import fixes Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Style fixes Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * More style and unused import fixes Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Allow setting topk value from CLI args Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Code formatting fixes Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Refactor creating monolingual and parallel datasets Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Add batch translate function to NMT model, refactor dpp translate and monolingual webdataset fix Signed-off-by: MaximumEntropy 
<sandeep.subramanian.1@umontreal.ca> Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com> Co-authored-by: Eric Harper <complex451@gmail.com> * durs wip Signed-off-by: Jason <jasoli@nvidia.com> * switch from spec to audio Signed-off-by: Jason <jasoli@nvidia.com> * Ja Source Language Preprocessing (#1781) * japanese preprocessing Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * removing m2m blob Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * fix SentencePieceTokenizer method usage Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * kwarg messup Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * re-order so that ja/zh/else is consistent. switch \' -> \" Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * fixing style Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * register sentencepiece_model so that it gets included in .nemo file when saved Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com> Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * remove commented out line Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> Co-authored-by: Mike Chrzanowski <mchrzanowski@nvidia.com> Co-authored-by: Oleksii Kuchaiev <okuchaiev@nvidia.com> * Update notebooks to RC1 (#1782) * update model primer tutorial Signed-off-by: smajumdar <titu1994@gmail.com> * Update all notebooks to RC1 Signed-off-by: smajumdar <titu1994@gmail.com> * Update all notebooks to RC1 + README.rst Signed-off-by: smajumdar <titu1994@gmail.com> * Update docker instructions Signed-off-by: smajumdar <titu1994@gmail.com> * move the Q/DQ position for better fusion in TRT (#1783) Signed-off-by: Vincent Huang <vincenth@nvidia.com> * audio tb Signed-off-by: Jason <jasoli@nvidia.com> * didnt specify type of argument (#1785) Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> Co-authored-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * Removing attach_onnx_to_onnx (#1790) * Removing attach_onnx_to_onnx Signed-off-by: 
Boris Fomitchev <bfomitchev@nvidia.com> * Removing onnx concatenation references Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com> * handle aux sentencepiece tokenizer Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com> * Notebook fix and modified some scripts (#1793) * Notebook fix and modified some scripts Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * added hi-mia script from earlier nemo 0.x versions Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * add infer script; add prosody info to 2s; switch to log_dur Signed-off-by: Jason <jasoli@nvidia.com> * checkpoint name fix (#1798) Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * add masking Signed-off-by: Jason <jasoli@nvidia.com> * Refactor of tokenization and detokenization within the NMT model (#1789) * Cleanup tokenization and detokenization Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Fixes to merge moses and chinese/japanese call formats Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Style fixes and remove methods not part of rc Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Fix circular imports Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Fix docstring Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Remove unused imports Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Remove unused import Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Remove sentencepiece tokenizer from within model class Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Update nemo/collections/common/tokenizers/japanese_tokenizers.py Co-authored-by: Mike Chrzanowski <mike.chrzanowski0@gmail.com> * Add docstring for JapaneseTokenizer and rename variable Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Co-authored-by: Mike Chrzanowski <mike.chrzanowski0@gmail.com> * transpose Signed-off-by: Jason 
<jasoli@nvidia.com> * bug Signed-off-by: Jason <jasoli@nvidia.com> * add use own predictions; fix mask Signed-off-by: Jason <jasoli@nvidia.com> * Add Transcription script for all ASR models (#1786) * Add CTC transcription scripts Signed-off-by: smajumdar <titu1994@gmail.com> * Add speech transcription script Signed-off-by: smajumdar <titu1994@gmail.com> * Add speech transcription script Signed-off-by: smajumdar <titu1994@gmail.com> * Revert old changes Signed-off-by: smajumdar <titu1994@gmail.com> * Revert old changes Signed-off-by: smajumdar <titu1994@gmail.com> * Add jenkins test to run transcribe_speech.py Signed-off-by: smajumdar <titu1994@gmail.com> * Add missing apostrophe Signed-off-by: smajumdar <titu1994@gmail.com> * Correct duplicate stage name Signed-off-by: smajumdar <titu1994@gmail.com> * Update jenkins Signed-off-by: smajumdar <titu1994@gmail.com> * Update jenkins Signed-off-by: smajumdar <titu1994@gmail.com> * Update jenkins Signed-off-by: smajumdar <titu1994@gmail.com> * Update jenkins Signed-off-by: smajumdar <titu1994@gmail.com> * Update jenkins Signed-off-by: smajumdar <titu1994@gmail.com> * Update jenkins Signed-off-by: smajumdar <titu1994@gmail.com> * Update jenkins Signed-off-by: smajumdar <titu1994@gmail.com> * Update jenkins Signed-off-by: smajumdar <titu1994@gmail.com> * Update jenkins Signed-off-by: smajumdar <titu1994@gmail.com> * temp remove gpu unittests Signed-off-by: smajumdar <titu1994@gmail.com> * Update Jenkinsfile Signed-off-by: smajumdar <titu1994@gmail.com> * Update Jenkinsfile Signed-off-by: smajumdar <titu1994@gmail.com> * Update Jenkinsfile Signed-off-by: smajumdar <titu1994@gmail.com> * Update Jenkinsfile Signed-off-by: smajumdar <titu1994@gmail.com> * Update Jenkinsfile Signed-off-by: smajumdar <titu1994@gmail.com> * Update Jenkinsfile Signed-off-by: smajumdar <titu1994@gmail.com> * Update Jenkinsfile Signed-off-by: smajumdar <titu1994@gmail.com> * Update Jenkinsfile Signed-off-by: smajumdar <titu1994@gmail.com> * Update 
Jenkinsfile Signed-off-by: smajumdar <titu1994@gmail.com> * Update Jenkinsfile Signed-off-by: smajumdar <titu1994@gmail.com> * Update Jenkinsfile Signed-off-by: smajumdar <titu1994@gmail.com> * Update Jenkinsfile Signed-off-by: smajumdar <titu1994@gmail.com> * Update Jenkinsfile Signed-off-by: smajumdar <titu1994@gmail.com> * Update Jenkinsfile Signed-off-by: smajumdar <titu1994@gmail.com> * Update Jenkinsfile Signed-off-by: smajumdar <titu1994@gmail.com> * Update Jenkinsfile Signed-off-by: smajumdar <titu1994@gmail.com> * Update Jenkinsfile Signed-off-by: smajumdar <titu1994@gmail.com> * Update Jenkinsfile Signed-off-by: smajumdar <titu1994@gmail.com> * Give up on Jenkinsfile Signed-off-by: smajumdar <titu1994@gmail.com> * add new lightning trainer properties Signed-off-by: Jason <jasoli@nvidia.com> * share the same pre and post processing pipelines for Ja & En (#1801) * share the same pre and post processing pipelines for Ja & En Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * reorder for ordering Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * change comment for specificity Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * renamed file Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * styling fix Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * undo styling fix. 
Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * unnecessary multiple variable instantiation Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> Co-authored-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * Bio meg dir update (#1796) * update dir name Signed-off-by: ekmb <ebakhturina@nvidia.com> * fix name Signed-off-by: ekmb <ebakhturina@nvidia.com> * proper use Signed-off-by: Jason <jasoli@nvidia.com> * Fix megatron vocab file (#1803) * use user specified vocab_file with megatron Signed-off-by: ericharper <complex451@gmail.com> * updating all examples Signed-off-by: ericharper <complex451@gmail.com> * Adding option to always create .nemo file when writing checkpoint (#1794) * Adding option to always create .nemo file when writing checkpoint Signed-off-by: rprenger <rprenger@nvidia.com> * Fixing an issue where save_best_model=True would have made the trainer start from the best model at every checkpoint save instead of from the latest model Signed-off-by: rprenger <rprenger@nvidia.com> * Caching the path of the best model so we don't re-generate .nemo files when they haven't changed Signed-off-by: rprenger <rprenger@nvidia.com> * style fix Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com> Co-authored-by: rprenger <rprenger@nvidia.com> Co-authored-by: Jason <jasoli@nvidia.com> Co-authored-by: Oleksii Kuchaiev <okuchaiev@nvidia.com> * Hotfix for en/ ja preprocessing (#1804) * share the same pre and post processing pipelines for Ja & En Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * reorder for ordering Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * change comment for specificity Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * renamed file Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * styling fix Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * undo styling fix. 
Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * unnecessary multiple variable instantiation Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * hotfix Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * share the same pre and post processing pipelines for Ja & En Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * reorder for ordering Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * change comment for specificity Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * renamed file Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * styling fix Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * undo styling fix. Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * unnecessary multiple variable instantiation Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * hotfix Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * style fix agai Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> Co-authored-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * update Jenkins file Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com> * cfg.vocab_file was being updated to the .nemo location instead of the actual location (#1808) Signed-off-by: ericharper <complex451@gmail.com> * add proper ifs Signed-off-by: Jason <jasoli@nvidia.com> * set max seq length for inference (#1809) * update inference Signed-off-by: ekmb <ebakhturina@nvidia.com> * update Signed-off-by: ekmb <ebakhturina@nvidia.com> * make params explicit Signed-off-by: ekmb <ebakhturina@nvidia.com> * Applying PR 1794 to r1.0.0rc1 (#1812) * apply ryan's PR to r1.0.0rc1 * Update exp_manager.py add new line Co-authored-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * diarization tutorial (#1814) Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * dataclass (#1815) Signed-off-by: Jason <jasoli@nvidia.com> * log pitch Signed-off-by: Jason <jasoli@nvidia.com> * Run all tests in RC Signed-off-by: Oleksii Kuchaiev 
<okuchaiev@nvidia.com> * check if attr exist (#1817) Signed-off-by: ericharper <complex451@gmail.com> * Remove CTC parts from RNNT transcribe (#1816) Signed-off-by: smajumdar <titu1994@gmail.com> * Rc1 fix bert lm model 2 (#1818) * check if attr exist Signed-off-by: ericharper <complex451@gmail.com> * check if cfg.tokenizer is None Signed-off-by: ericharper <complex451@gmail.com> * Fix language filtering (#1791) Signed-off-by: PeganovAnton <peganoff2@mail.ru> Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com> * HifiGAN finetuning on synthetic mels (#1780) * switched LR annealing to cosine, fixed type checks Signed-off-by: Felix Kreuk <felixkreuk@gmail.com> * added fine-tune dataset, added fine-tuning to hifigan training Signed-off-by: Felix Kreuk <felixkreuk@gmail.com> * added bias denoising Signed-off-by: Felix Kreuk <felixkreuk@gmail.com> * added yaml config specifications Signed-off-by: Felix Kreuk <felixkreuk@gmail.com> * fixed bot checks Signed-off-by: Felix Kreuk <felixkreuk@gmail.com> * max_steps exported to yaml Signed-off-by: Felix Kreuk <felixkreuk@gmail.com> * switch to soundfile for audio loading, set max_steps instead of max_epochs in Trainer Signed-off-by: Felix Kreuk <felixkreuk@gmail.com> Co-authored-by: Jason <jasoli@nvidia.com> * Update exp_manager and callbacks for lightning 1.2.0 (#1774) * update exp_manager and callbacks for lightning 1.2.0 Signed-off-by: Jason <jasoli@nvidia.com> * add back filepath; remove global_rank and local_rank from ASR models Signed-off-by: Jason <jasoli@nvidia.com> * remove more global_rank local_rank Signed-off-by: Jason <jasoli@nvidia.com> * more bug fixes Signed-off-by: Jason <jasoli@nvidia.com> * del not pop Signed-off-by: Jason <jasoli@nvidia.com> * add open_dict Signed-off-by: Jason <jasoli@nvidia.com> * add properties Signed-off-by: Jason <jasoli@nvidia.com> * remove test for now Signed-off-by: Jason <jasoli@nvidia.com> * Add SPE tokenizer.vocab to registered archive (#1821) * Add SPE 
tokenizer.vocab to registered archives Signed-off-by: smajumdar <titu1994@gmail.com> * Add SPE tokenizer.vocab to registered archives Signed-off-by: smajumdar <titu1994@gmail.com> * Add support for computing CTC / RNNT alignments (#1772) * Add logprob calculation support for RNNT (without batching) Signed-off-by: smajumdar <titu1994@gmail.com> * Add batched support for RNNT alignments Signed-off-by: smajumdar <titu1994@gmail.com> * Add docstring Signed-off-by: smajumdar <titu1994@gmail.com> * Add beam=1 decoding support for beam search logit preservation Signed-off-by: smajumdar <titu1994@gmail.com> * Update greedy alignment Signed-off-by: smajumdar <titu1994@gmail.com> * Alignments with beam search working Signed-off-by: smajumdar <titu1994@gmail.com> * Add docstring about computing alignments Signed-off-by: smajumdar <titu1994@gmail.com> * Add full alignment calculation support for ASR Models Signed-off-by: smajumdar <titu1994@gmail.com> * Add hypothesis output tests Signed-off-by: smajumdar <titu1994@gmail.com> * Correct documentation Signed-off-by: smajumdar <titu1994@gmail.com> * Update beam search doc Signed-off-by: smajumdar <titu1994@gmail.com> * Remove old code Signed-off-by: smajumdar <titu1994@gmail.com> * Update configs Signed-off-by: smajumdar <titu1994@gmail.com> * Fix for variable names in tarred dataset creation (#1827) * Fix for variable names in tarred dataset creation Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Another bug in filename variable Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Adding global and local rank that was removed for some reason Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Remove local/global rank again Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com> * Patch ASR Notebooks (#1831) * Patch ASR Notebooks Signed-off-by: smajumdar <titu1994@gmail.com> * Patch ASR Notebooks 
Signed-off-by: smajumdar <titu1994@gmail.com> * Add SPE support for huge dataset corpus (#1822) * Add support for extremely large corpus fitting of SentencePicee Signed-off-by: smajumdar <titu1994@gmail.com> * Add support for extremely large corpus fitting of SentencePicee Signed-off-by: smajumdar <titu1994@gmail.com> * Remove log message Signed-off-by: smajumdar <titu1994@gmail.com> * Remove log message Signed-off-by: smajumdar <titu1994@gmail.com> * Fixing NMT DDP override that no longer works with PTL 1.2 (#1829) * add world size attribute back to constructor Signed-off-by: ericharper <complex451@gmail.com> * replace r1.0.0rc1 with main in Jenkinsfile Signed-off-by: ericharper <complex451@gmail.com> * set find_unused_paramters to True by default for NLP models Signed-off-by: ericharper <complex451@gmail.com> * set find_unused_paramters to True by default for NLP models Signed-off-by: ericharper <complex451@gmail.com> * set find_unused_paramters to True by default for NLP models Signed-off-by: ericharper <complex451@gmail.com> * temporarily remove model parallel jenkins test Signed-off-by: ericharper <complex451@gmail.com> * check hasattr first Signed-off-by: ericharper <complex451@gmail.com> * check hasattr first Signed-off-by: ericharper <complex451@gmail.com> * check hasattr first Signed-off-by: ericharper <complex451@gmail.com> * overriding ddp plugin Signed-off-by: ericharper <complex451@gmail.com> * check if trainer is None Signed-off-by: ericharper <complex451@gmail.com> * add find_unused_parameters to accelerator attribute instead of connector Signed-off-by: ericharper <complex451@gmail.com> * move override to .setup Signed-off-by: ericharper <complex451@gmail.com> * use self.trainer instead of self._trainer Signed-off-by: ericharper <complex451@gmail.com> * set find_unused for non NLPModel Signed-off-by: ericharper <complex451@gmail.com> * style Signed-off-by: ericharper <complex451@gmail.com> * remove unused import Signed-off-by: ericharper 
<complex451@gmail.com> * Fix TTS Notebook bugs (#1837) * ix notebooks Signed-off-by: Jason <jasoli@nvidia.com> * reqs Signed-off-by: Jason <jasoli@nvidia.com> * wanb try catch Signed-off-by: Jason <jasoli@nvidia.com> * typo fixes (#1838) * punct tutorial fix Signed-off-by: ekmb <ebakhturina@nvidia.com> * typos fixed Signed-off-by: ekmb <ebakhturina@nvidia.com> * [WIP] Refactoring translation routines (#1805) * using batch_translate in eval_step Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com> * refactor + some fixes Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com> * Fix chinese, japanese tokenizer imports breaking asr install Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Refactor language specific tokenizers to implement tokenize,detokenize and normalize Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Style fixes Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Bug fix in determining target processor Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Few more fixes Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Detokenization fix for EnJa Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Remove comments for finished work Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Fix enja decoding (#1820) * apply ryan's PR to r1.0.0rc1 Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * changes Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * next Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * * undo detokenization change * update all eval steps to tokenize ja/en correctly. Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * bug Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * undo naming change Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * get types right. annoying. Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * next changes. 
Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * newline Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * change Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * next round Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * next round Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * final fix? Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * switching en<>ja pipeline to integers. which prevents coding issues. Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * unnecessary logging import Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * undo change Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * remove unneeded detokenization file now that the sentencepieceprocessor does the detokenization Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * Update mt_enc_dec_model.py remove line * more succinct * remove sentencepiecedetokenizer * remove ability to not specify a sentencepiecetokenizer path * comment * undo ptl 1.2.0 changes, which break draco training Co-authored-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * fix style Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com> * Move back to PTL 1.2 Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * undo changes to callbacks Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Remove global rank from model constructor Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Fix enja decoding (round 2) (#1835) * apply ryan's PR to r1.0.0rc1 Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * changes Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * next Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * * undo detokenization change * update all eval steps to tokenize ja/en correctly. 
Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * bug Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * undo naming change Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * get types right. annoying. Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * next changes. Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * newline Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * change Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * next round Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * next round Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * final fix? Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * switching en<>ja pipeline to integers. which prevents coding issues. Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * unnecessary logging import Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * undo change Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * remove unneeded detokenization file now that the sentencepieceprocessor does the detokenization Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * Update mt_enc_dec_model.py remove line * more succinct * remove sentencepiecedetokenizer * remove ability to not specify a sentencepiecetokenizer path * comment * undo ptl 1.2.0 changes, which break draco training * removing sentencepiece usage for en<>ja * few more * fix style Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * remove unused re imort Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * normalize only when lang is en Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * styline Co-authored-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * Undo lightning change Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Style fix Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Remove unused imports Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> 
Co-authored-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Co-authored-by: Mike Chrzanowski <mike.chrzanowski0@gmail.com> Co-authored-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * Fix text norm tutorial (#1836) * fix nlp typos in notebooks Signed-off-by: Yang Zhang <yangzhang@nvidia.com> * fix tutorial for jupyter notebook Signed-off-by: Yang Zhang <yangzhang@nvidia.com> * fix path name (#1840) Signed-off-by: Yang Zhang <yangzhang@nvidia.com> * Limit the maximum length of subwords generated from corpus (#1842) * Add support for limiting length of subwords Signed-off-by: smajumdar <titu1994@gmail.com> * Add support for limiting length of subwords Signed-off-by: smajumdar <titu1994@gmail.com> * Update docstring Signed-off-by: smajumdar <titu1994@gmail.com> * Update docstring Signed-off-by: smajumdar <titu1994@gmail.com> * grammar fix (#1843) Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * CI Fixes for Lightning 1.2.1 (#1839) * updates Signed-off-by: Jason <jasoli@nvidia.com> * add back pleasefixme Signed-off-by: Jason <jasoli@nvidia.com> * rmtree Signed-off-by: Jason <jasoli@nvidia.com> * add cleanup_local_folder fixtures instead of rmtree Signed-off-by: Jason <jasoli@nvidia.com> * bugfix? Signed-off-by: Jason <jasoli@nvidia.com> * add back pleasefixme Signed-off-by: Jason <jasoli@nvidia.com> * typo Signed-off-by: Jason <jasoli@nvidia.com> * set melGAN to find_unused = True Signed-off-by: Jason <jasoli@nvidia.com> * force deletion Signed-off-by: Jason <jasoli@nvidia.com> * fix 'DATA_DIR not found'. 
(#1846) Signed-off-by: Hoo Chang Shin <hshin@nvidia.com> Co-authored-by: Hoo Chang Shin <hshin@nvidia.com> * bug removed : onnx file was not getting added to the tarfile (.enemo) (#1832) * bug removed : **what was wrong** : renaming and adding onnx to tar was not working **How solved** : make atemp copy of the file rename and add to tar(.enemo) cleanup extra file Signed-off-by: supatel <supatel@gitlab-master.nvidia.com> * format fixed Signed-off-by: supatel <supatel@gitlab-master.nvidia.com> * code formatting with black Signed-off-by: supatel <supatel@gitlab-master.nvidia.com> * style Signed-off-by: Jason <jasoli@nvidia.com> Co-authored-by: supatel <supatel@gitlab-master.nvidia.com> Co-authored-by: Jason <jasoli@nvidia.com> * fix output path (#1845) Signed-off-by: Yang Zhang <yangzhang@nvidia.com> * fix asr notebooks (#1847) Signed-off-by: fayejf <fayejf07@gmail.com> * Patch SPE tokenizer not being available in older ASR Checkpoints (#1848) Signed-off-by: smajumdar <titu1994@gmail.com> * bumping version to 1.0.0rc2 Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com> * Try TTS Updates Again (#1849) * set find_unused Signed-off-by: Jason <jasoli@nvidia.com> * fix t2 header Signed-off-by: Jason <jasoli@nvidia.com> * add header Signed-off-by: Jason <jasoli@nvidia.com> * more fixes Signed-off-by: Jason <jasoli@nvidia.com> * update headers Signed-off-by: Jason <jasoli@nvidia.com> * headers Signed-off-by: Jason <jasoli@nvidia.com> * undo tacotron2 change Signed-off-by: Jason <jasoli@nvidia.com> * Cleanup save/restore (#1851) * Cleanup save/restore * Remove EFF save/restore routes * Once we can take EFF dependency we will use EFF.Archive directly Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com> * fix copyright headers Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com> * Freeze modules during transcribe to prevent gradient accumulation during loop (#1853) Signed-off-by: smajumdar <titu1994@gmail.com> * ASR with Speaker Diarization noteboook (#1850) * ASR with 
Speaker Diarization noteboook Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * changed format to speaker first Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * wording corrections Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * Update README.rst README is pointing to a container that hasn't been released yet. * Fix qa tutorial (#1860) * fix output path Signed-off-by: Yang Zhang <yangzhang@nvidia.com> * fix Signed-off-by: Yang Zhang <yangzhang@nvidia.com> * remote hard fix path (#1862) Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * EnJa tokenize output format fix (#1863) * tookenization fix Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * better naming of output variable Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * revert changes and fix enja tokenize func Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> * revert chinese changes. break 1-liner into 2 in enja Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com> Co-authored-by: Mike Chrzanowski <mchrzanowski@nvidia.com> Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com> * sync val metrics (#1861) Signed-off-by: ericharper <complex451@gmail.com> Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com> * RC1 NeMo Core Docs Update (#1858) * update docs Signed-off-by: ericharper <complex451@gmail.com> * update docs Signed-off-by: ericharper <complex451@gmail.com> * update Signed-off-by: ericharper <complex451@gmail.com> * update Signed-off-by: ericharper <complex451@gmail.com> * update Signed-off-by: ericharper <complex451@gmail.com> * update Signed-off-by: ericharper <complex451@gmail.com> * update Signed-off-by: ericharper <complex451@gmail.com> * update Signed-off-by: ericharper <complex451@gmail.com> * update Signed-off-by: ericharper <complex451@gmail.com> * update Signed-off-by: ericharper <complex451@gmail.com> * update Signed-off-by: ericharper <complex451@gmail.com> * update Signed-off-by: ericharper 
<complex451@gmail.com> * switch to stft_patch (#1864) Signed-off-by: Jason <jasoli@nvidia.com> Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com> * Update CI container to 21.02 (#1865) * Update CI container to 21.02 Signed-off-by: smajumdar <titu1994@gmail.com> * Correct ffmpeg install Signed-off-by: smajumdar <titu1994@gmail.com> * update squad inference to use correct gpu if list of gpus is passed Signed-off-by: ericharper <complex451@gmail.com> * trainer.test seems to be working properly with ddp now Signed-off-by: ericharper <complex451@gmail.com> Co-authored-by: ericharper <complex451@gmail.com> * Some renaming. Signed-off-by: Stanislav Beliaev <stasbelyaev96@gmail.com> * TalkNet 1.x draft. Signed-off-by: Stanislav Beliaev <stasbelyaev96@gmail.com> * Three pipelines complete. Signed-off-by: Stanislav Beliaev <stasbelyaev96@gmail.com> * Fix some comments. Signed-off-by: Stanislav Beliaev <stasbelyaev96@gmail.com> * Fix style issues. Signed-off-by: Stanislav Beliaev <stasbelyaev96@gmail.com> * TTS Notebook and PR issues. Signed-off-by: Stanislav Beliaev <stasbelyaev96@gmail.com> * TalkNet style issues. Signed-off-by: Stanislav Beliaev <stasbelyaev96@gmail.com> * TalkNet doc. Signed-off-by: Stanislav Beliaev <stasbelyaev96@gmail.com> * Small fix. 
Signed-off-by: Stanislav Beliaev <stasbelyaev96@gmail.com> * round Signed-off-by: Jason <jasoli@nvidia.com> * pin lightning Signed-off-by: Jason <jasoli@nvidia.com> * print more torch stuff Signed-off-by: Jason <jasoli@nvidia.com> * wip of adding talknet durations Signed-off-by: Jason <jasoli@nvidia.com> * wip Signed-off-by: Jason <jasoli@nvidia.com> * training working Signed-off-by: Jason <jasoli@nvidia.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: Boris Fomitchev <borisfom@users.noreply.github.com> Co-authored-by: ericharper <complex451@gmail.com> Co-authored-by: Tomasz Kornuta <56979727+tkornuta-nvidia@users.noreply.github.com> Co-authored-by: Oleksii Kuchaiev <okuchaiev@nvidia.com> Co-authored-by: Somshubra Majumdar <titu1994@gmail.com> Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca> Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com> Co-authored-by: Mike Chrzanowski <mike.chrzanowski0@gmail.com> Co-authored-by: Mike Chrzanowski <mchrzanowski@nvidia.com> Co-authored-by: Xiaodong (Vincent) Huang <vincenth@nvidia.com> Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com> Co-authored-by: Ryan Prenger <ryanprenger@baidu.com> Co-authored-by: rprenger <rprenger@nvidia.com> Co-authored-by: PeganovAnton <peganoff2@mail.ru> Co-authored-by: Felix Kreuk <felixkreuk@gmail.com> Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com> Co-authored-by: khcs <khcs@users.noreply.github.com> Co-authored-by: Hoo Chang Shin <hshin@nvidia.com> Co-authored-by: SUNIL PATEL <snlpatel001213@hotmail.com> Co-authored-by: supatel <supatel@gitlab-master.nvidia.com> Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com> Co-authored-by: Stanislav Beliaev <stasbelyaev96@gmail.com>
mousebaiker pushed a commit to mousebaiker/NeMo that referenced this pull request on Jul 8, 2021
Signed-off-by: Jason <jasoli@nvidia.com>
No description provided.