Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed bug in notebook #5382

Merged
merged 2 commits into from
Nov 10, 2022
Merged

Fixed bug in notebook #5382

merged 2 commits into from
Nov 10, 2022

Conversation

vadam5
Copy link
Contributor

@vadam5 vadam5 commented Nov 9, 2022

Signed-off-by: Virginia Adams vadams@nvidia.com

Fix bug in prompt learning notebook inference

Add a one line overview of what this PR aims to accomplish.

Collection: NLP

Changelog

  • Updated GPT prompt learning model class

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
  • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items you can still open "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)

Signed-off-by: Virginia Adams <vadams@nvidia.com>
@vadam5 vadam5 merged commit f8f31a1 into r1.13.0 Nov 10, 2022
github-actions bot pushed a commit that referenced this pull request Nov 10, 2022
Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>
vadam5 added a commit that referenced this pull request Nov 16, 2022
Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
tango4j pushed a commit that referenced this pull request Nov 17, 2022
Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
nithinraok pushed a commit that referenced this pull request Nov 21, 2022
* first commit on eval_diar_with_asr.py

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Add a standalone diarization-ASR evaluation transcript

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Fixed examples in docstrings

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fixed staticmethod error

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Added description on eval modes

Signed-off-by: Taejin Park <tango4j@gmail.com>

* adding diar_infer_general.yaml

Signed-off-by: Taejin Park <tango4j@gmail.com>

* fix msdd_model in general yaml file

Signed-off-by: Taejin Park <tango4j@gmail.com>

* fixed errors in yaml file

Signed-off-by: Taejin Park <tango4j@gmail.com>

* combine into 1 commit

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Added description on eval modes

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add MoE support for T5 model (w/o expert parallel) (#5409)

* clean

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* kwarg ref

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* fix

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* fix

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* extra args

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* rm prints

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* style

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* review comments

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* review comments

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* review comments

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* fix

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Fix args (#5410) (#5416)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* Fix for concat map dataset (#5133)

* change for concat map dataset

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Exhaust longest dataset

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: 1-800-BAD-CODE <>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* Add temporary fix for CUDA issue in Dockerfile (#5421) (#5422)

Signed-off-by: Yu Yao <yuya@nvidia.com>

Signed-off-by: Yu Yao <yuya@nvidia.com>

Signed-off-by: Yu Yao <yuya@nvidia.com>
Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com>

* Fix GPT generation when using sentencepiece tokenizer (#5413) (#5428)

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Support for finetuning and finetuning inference with .ckpt files & batch size refactoring (#5339)

* Initial refactor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Resolve config before passing to load_from_checkpoint

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes for model parallel and nemo restore

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes for eval

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert config changes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Refactor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix typo

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove comments

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Minor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix validation reconfiguration

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove old comment

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fixes for test_ds

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Revert "Add temporary fix for CUDA issue in Dockerfile (#5421)" (#5431) (#5432)

This reverts commit 0718b17.

Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com>

* [ITN] fix year date graph, cardinals extension for hundreds (#5435)

* wip

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* add lociko's hundreds extension for cardinals

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* add optional end

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* restart ci

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* update doc in terms of get_label for lang id model (#5366)

* reflect PR 5278 ion doc

Signed-off-by: fayejf <fayejf07@gmail.com>

* reflect comment

Signed-off-by: fayejf <fayejf07@gmail.com>

Signed-off-by: fayejf <fayejf07@gmail.com>

* Revert workaround for T5 that sets number of workers to 0 & sync_batch_comm=False (#5420) (#5433)

* Revert workers workaround

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix in config

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Fixed bug in notebook (#5382) (#5394)

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>

* Fixing bug in Megatron BERT when loss mask is all zeros (#5424)

* Fixing bug when loss mask is fully zero

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update megatron_bert_model.py

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>

* Update dataset_utils.py

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update dataset_utils.py

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>

* Update dataset_utils.py

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* Use updated API for overlapping grad sync with pipeline parallelism (#5236)

Signed-off-by: Tim Moon <tmoon@nvidia.com>

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* support to disable sequence length + 1 input tokens for each sample in MegatronGPT (#5363)

* support to disable sequence length + 1 input tokens for MegatronGPT

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: Anmol Gupta <anmolg@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* [TTS] Create script for processing TTS training audio (#5262)

* Create script for processing TTS training audio
* Update VAD trimming logic
* Remove unused import

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] remove useless logic for set_tokenizer. (#5430)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* Fix setting up of `ReduceLROnPlateau` learning rate scheduler (#5444)

* Fix tests

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Add accidentally lost changes

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Create codeql.yml (#5445)

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

* Fix for getting tokenizer in character-based ASR models when using tarred dataset (#5442)

Signed-off-by: Jonghwan Hyeon <hyeon0145@gmail.com>

Signed-off-by: Jonghwan Hyeon <hyeon0145@gmail.com>

* Combine 5 commits

adding diar_infer_general.yaml

Signed-off-by: Taejin Park <tango4j@gmail.com>

Update codeql.yml

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

Update codeql.yml

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

fix msdd_model in general yaml file

Signed-off-by: Taejin Park <tango4j@gmail.com>

fixed errors in yaml file

Signed-off-by: Taejin Park <tango4j@gmail.com>

* moved eval_der function and fixed tqdm options

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Changed minor error in docstrings

Signed-off-by: Taejin Park <tango4j@gmail.com>

* removed score_labels and changed leave=True

Signed-off-by: Taejin Park <tango4j@gmail.com>

Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Yu Yao <yuya@nvidia.com>
Signed-off-by: ekmb <ebakhturina@nvidia.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: Virginia Adams <vadams@nvidia.com>
Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: Jonghwan Hyeon <hyeon0145@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Abhinav Khattar <aklife97@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Shane Carroll <50530592+1-800-BAD-CODE@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com>
Co-authored-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Co-authored-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: anmolgupt <14880251+anmolgupt@users.noreply.github.com>
Co-authored-by: Anmol Gupta <anmolg@nvidia.com>
Co-authored-by: Ryan Langman <rlangman@nvidia.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: PeganovAnton <peganoff2@mail.ru>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Jonghwan Hyeon <jonghwanhyeon93@gmail.com>
1-800-BAD-CODE added a commit to 1-800-BAD-CODE/NeMo that referenced this pull request Nov 26, 2022
* first commit on eval_diar_with_asr.py

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Add a standalone diarization-ASR evaluation transcript

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Fixed examples in docstrings

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fixed staticmethod error

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Added description on eval modes

Signed-off-by: Taejin Park <tango4j@gmail.com>

* adding diar_infer_general.yaml

Signed-off-by: Taejin Park <tango4j@gmail.com>

* fix msdd_model in general yaml file

Signed-off-by: Taejin Park <tango4j@gmail.com>

* fixed errors in yaml file

Signed-off-by: Taejin Park <tango4j@gmail.com>

* combine into 1 commit

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Added description on eval modes

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add MoE support for T5 model (w/o expert parallel) (NVIDIA#5409)

* clean

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* kwarg ref

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* fix

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* fix

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* extra args

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* rm prints

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* style

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* review comments

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* review comments

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* review comments

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* fix

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Fix args (NVIDIA#5410) (NVIDIA#5416)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* Fix for concat map dataset (NVIDIA#5133)

* change for concat map dataset

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Exhaust longest dataset

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: 1-800-BAD-CODE <>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* Add temporary fix for CUDA issue in Dockerfile (NVIDIA#5421) (NVIDIA#5422)

Signed-off-by: Yu Yao <yuya@nvidia.com>

Signed-off-by: Yu Yao <yuya@nvidia.com>

Signed-off-by: Yu Yao <yuya@nvidia.com>
Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com>

* Fix GPT generation when using sentencepiece tokenizer (NVIDIA#5413) (NVIDIA#5428)

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Support for finetuning and finetuning inference with .ckpt files & batch size refactoring (NVIDIA#5339)

* Initial refactor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Resolve config before passing to load_from_checkpoint

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes for model parallel and nemo restore

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes for eval

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert config changes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Refactor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix typo

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove comments

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Minor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix validation reconfiguration

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove old comment

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fixes for test_ds

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Revert "Add temporary fix for CUDA issue in Dockerfile (NVIDIA#5421)" (NVIDIA#5431) (NVIDIA#5432)

This reverts commit 0718b17.

Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com>

* [ITN] fix year date graph, cardinals extension for hundreds (NVIDIA#5435)

* wip

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* add lociko's hundreds extension for cardinals

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* add optional end

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* restart ci

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* update doc in terms of get_label for lang id model (NVIDIA#5366)

* reflect PR 5278 ion doc

Signed-off-by: fayejf <fayejf07@gmail.com>

* reflect comment

Signed-off-by: fayejf <fayejf07@gmail.com>

Signed-off-by: fayejf <fayejf07@gmail.com>

* Revert workaround for T5 that sets number of workers to 0 & sync_batch_comm=False (NVIDIA#5420) (NVIDIA#5433)

* Revert workers workaround

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix in config

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Fixed bug in notebook (NVIDIA#5382) (NVIDIA#5394)

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>

* Fixing bug in Megatron BERT when loss mask is all zeros (NVIDIA#5424)

* Fixing bug when loss mask is fully zero

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update megatron_bert_model.py

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>

* Update dataset_utils.py

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update dataset_utils.py

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>

* Update dataset_utils.py

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* Use updated API for overlapping grad sync with pipeline parallelism (NVIDIA#5236)

Signed-off-by: Tim Moon <tmoon@nvidia.com>

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* support to disable sequence length + 1 input tokens for each sample in MegatronGPT (NVIDIA#5363)

* support to disable sequence length + 1 input tokens for MegatronGPT

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: Anmol Gupta <anmolg@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* [TTS] Create script for processing TTS training audio (NVIDIA#5262)

* Create script for processing TTS training audio
* Update VAD trimming logic
* Remove unused import

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] remove useless logic for set_tokenizer. (NVIDIA#5430)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* Fix setting up of `ReduceLROnPlateau` learning rate scheduler (NVIDIA#5444)

* Fix tests

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Add accidentally lost changes

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Create codeql.yml (NVIDIA#5445)

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

* Fix for getting tokenizer in character-based ASR models when using tarred dataset (NVIDIA#5442)

Signed-off-by: Jonghwan Hyeon <hyeon0145@gmail.com>

Signed-off-by: Jonghwan Hyeon <hyeon0145@gmail.com>

* Combine 5 commits

adding diar_infer_general.yaml

Signed-off-by: Taejin Park <tango4j@gmail.com>

Update codeql.yml

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

Update codeql.yml

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

fix msdd_model in general yaml file

Signed-off-by: Taejin Park <tango4j@gmail.com>

fixed errors in yaml file

Signed-off-by: Taejin Park <tango4j@gmail.com>

* moved eval_der function and fixed tqdm options

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Changed minor error in docstrings

Signed-off-by: Taejin Park <tango4j@gmail.com>

* removed score_labels and changed leave=True

Signed-off-by: Taejin Park <tango4j@gmail.com>

Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Yu Yao <yuya@nvidia.com>
Signed-off-by: ekmb <ebakhturina@nvidia.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: Virginia Adams <vadams@nvidia.com>
Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: Jonghwan Hyeon <hyeon0145@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Abhinav Khattar <aklife97@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Shane Carroll <50530592+1-800-BAD-CODE@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com>
Co-authored-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Co-authored-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: anmolgupt <14880251+anmolgupt@users.noreply.github.com>
Co-authored-by: Anmol Gupta <anmolg@nvidia.com>
Co-authored-by: Ryan Langman <rlangman@nvidia.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: PeganovAnton <peganoff2@mail.ru>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Jonghwan Hyeon <jonghwanhyeon93@gmail.com>
Signed-off-by: shane carroll <shane.carroll@utsa.edu>
hainan-xv pushed a commit to hainan-xv/NeMo that referenced this pull request Nov 29, 2022
Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>
hainan-xv pushed a commit to hainan-xv/NeMo that referenced this pull request Nov 29, 2022
* first commit on eval_diar_with_asr.py

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Add a standalone diarization-ASR evaluation transcript

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Fixed examples in docstrings

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fixed staticmethod error

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Added description on eval modes

Signed-off-by: Taejin Park <tango4j@gmail.com>

* adding diar_infer_general.yaml

Signed-off-by: Taejin Park <tango4j@gmail.com>

* fix msdd_model in general yaml file

Signed-off-by: Taejin Park <tango4j@gmail.com>

* fixed errors in yaml file

Signed-off-by: Taejin Park <tango4j@gmail.com>

* combine into 1 commit

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Added description on eval modes

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add MoE support for T5 model (w/o expert parallel) (NVIDIA#5409)

* clean

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* kwarg ref

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* fix

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* fix

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* extra args

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* rm prints

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* style

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* review comments

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* review comments

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* review comments

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* fix

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Fix args (NVIDIA#5410) (NVIDIA#5416)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* Fix for concat map dataset (NVIDIA#5133)

* change for concat map dataset

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Exhaust longest dataset

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: 1-800-BAD-CODE <>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* Add temporary fix for CUDA issue in Dockerfile (NVIDIA#5421) (NVIDIA#5422)

Signed-off-by: Yu Yao <yuya@nvidia.com>

Signed-off-by: Yu Yao <yuya@nvidia.com>

Signed-off-by: Yu Yao <yuya@nvidia.com>
Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com>

* Fix GPT generation when using sentencepiece tokenizer (NVIDIA#5413) (NVIDIA#5428)

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Support for finetuning and finetuning inference with .ckpt files & batch size refactoring (NVIDIA#5339)

* Initial refactor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Resolve config before passing to load_from_checkpoint

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes for model parallel and nemo restore

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes for eval

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert config changes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Refactor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix typo

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove comments

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Minor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix validation reconfiguration

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove old comment

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fixes for test_ds

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Revert "Add temporary fix for CUDA issue in Dockerfile (NVIDIA#5421)" (NVIDIA#5431) (NVIDIA#5432)

This reverts commit 0718b17.

Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com>

* [ITN] fix year date graph, cardinals extension for hundreds (NVIDIA#5435)

* wip

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* add lociko's hundreds extension for cardinals

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* add optional end

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* restart ci

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* update doc in terms of get_label for lang id model (NVIDIA#5366)

* reflect PR 5278 ion doc

Signed-off-by: fayejf <fayejf07@gmail.com>

* reflect comment

Signed-off-by: fayejf <fayejf07@gmail.com>

Signed-off-by: fayejf <fayejf07@gmail.com>

* Revert workaround for T5 that sets number of workers to 0 & sync_batch_comm=False (NVIDIA#5420) (NVIDIA#5433)

* Revert workers workaround

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix in config

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Fixed bug in notebook (NVIDIA#5382) (NVIDIA#5394)

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>

* Fixing bug in Megatron BERT when loss mask is all zeros (NVIDIA#5424)

* Fixing bug when loss mask is fully zero

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update megatron_bert_model.py

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>

* Update dataset_utils.py

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update dataset_utils.py

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>

* Update dataset_utils.py

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* Use updated API for overlapping grad sync with pipeline parallelism (NVIDIA#5236)

Signed-off-by: Tim Moon <tmoon@nvidia.com>

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* support to disable sequence length + 1 input tokens for each sample in MegatronGPT (NVIDIA#5363)

* support to disable sequence length + 1 input tokens for MegatronGPT

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: Anmol Gupta <anmolg@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* [TTS] Create script for processing TTS training audio (NVIDIA#5262)

* Create script for processing TTS training audio
* Update VAD trimming logic
* Remove unused import

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] remove useless logic for set_tokenizer. (NVIDIA#5430)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* Fix setting up of `ReduceLROnPlateau` learning rate scheduler (NVIDIA#5444)

* Fix tests

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Add accidentally lost changes

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Create codeql.yml (NVIDIA#5445)

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

* Fix for getting tokenizer in character-based ASR models when using tarred dataset (NVIDIA#5442)

Signed-off-by: Jonghwan Hyeon <hyeon0145@gmail.com>

Signed-off-by: Jonghwan Hyeon <hyeon0145@gmail.com>

* Combine 5 commits

adding diar_infer_general.yaml

Signed-off-by: Taejin Park <tango4j@gmail.com>

Update codeql.yml

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

Update codeql.yml

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

fix msdd_model in general yaml file

Signed-off-by: Taejin Park <tango4j@gmail.com>

fixed errors in yaml file

Signed-off-by: Taejin Park <tango4j@gmail.com>

* moved eval_der function and fixed tqdm options

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Changed minor error in docstrings

Signed-off-by: Taejin Park <tango4j@gmail.com>

* removed score_labels and changed leave=True

Signed-off-by: Taejin Park <tango4j@gmail.com>

Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Yu Yao <yuya@nvidia.com>
Signed-off-by: ekmb <ebakhturina@nvidia.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: Virginia Adams <vadams@nvidia.com>
Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: Jonghwan Hyeon <hyeon0145@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Abhinav Khattar <aklife97@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Shane Carroll <50530592+1-800-BAD-CODE@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com>
Co-authored-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Co-authored-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: anmolgupt <14880251+anmolgupt@users.noreply.github.com>
Co-authored-by: Anmol Gupta <anmolg@nvidia.com>
Co-authored-by: Ryan Langman <rlangman@nvidia.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: PeganovAnton <peganoff2@mail.ru>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Jonghwan Hyeon <jonghwanhyeon93@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>
hainan-xv pushed a commit to hainan-xv/NeMo that referenced this pull request Nov 29, 2022
Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>
hainan-xv pushed a commit to hainan-xv/NeMo that referenced this pull request Nov 29, 2022
* first commit on eval_diar_with_asr.py

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Add a standalone diarization-ASR evaluation transcript

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Fixed examples in docstrings

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fixed staticmethod error

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Added description on eval modes

Signed-off-by: Taejin Park <tango4j@gmail.com>

* adding diar_infer_general.yaml

Signed-off-by: Taejin Park <tango4j@gmail.com>

* fix msdd_model in general yaml file

Signed-off-by: Taejin Park <tango4j@gmail.com>

* fixed errors in yaml file

Signed-off-by: Taejin Park <tango4j@gmail.com>

* combine into 1 commit

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Added description on eval modes

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add MoE support for T5 model (w/o expert parallel) (NVIDIA#5409)

* clean

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* kwarg ref

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* fix

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* fix

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* extra args

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* rm prints

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* style

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* review comments

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* review comments

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* review comments

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* fix

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Fix args (NVIDIA#5410) (NVIDIA#5416)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* Fix for concat map dataset (NVIDIA#5133)

* change for concat map dataset

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Exhaust longest dataset

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: 1-800-BAD-CODE <>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* Add temporary fix for CUDA issue in Dockerfile (NVIDIA#5421) (NVIDIA#5422)

Signed-off-by: Yu Yao <yuya@nvidia.com>

Signed-off-by: Yu Yao <yuya@nvidia.com>

Signed-off-by: Yu Yao <yuya@nvidia.com>
Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com>

* Fix GPT generation when using sentencepiece tokenizer (NVIDIA#5413) (NVIDIA#5428)

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Support for finetuning and finetuning inference with .ckpt files & batch size refactoring (NVIDIA#5339)

* Initial refactor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Resolve config before passing to load_from_checkpoint

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes for model parallel and nemo restore

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes for eval

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert config changes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Refactor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix typo

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove comments

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Minor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix validation reconfiguration

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove old comment

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fixes for test_ds

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Revert "Add temporary fix for CUDA issue in Dockerfile (NVIDIA#5421)" (NVIDIA#5431) (NVIDIA#5432)

This reverts commit 0718b17.

Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com>

* [ITN] fix year date graph, cardinals extension for hundreds (NVIDIA#5435)

* wip

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* add lociko's hundreds extension for cardinals

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* add optional end

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* restart ci

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* update doc in terms of get_label for lang id model (NVIDIA#5366)

* reflect PR 5278 ion doc

Signed-off-by: fayejf <fayejf07@gmail.com>

* reflect comment

Signed-off-by: fayejf <fayejf07@gmail.com>

Signed-off-by: fayejf <fayejf07@gmail.com>

* Revert workaround for T5 that sets number of workers to 0 & sync_batch_comm=False (NVIDIA#5420) (NVIDIA#5433)

* Revert workers workaround

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix in config

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Fixed bug in notebook (NVIDIA#5382) (NVIDIA#5394)

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>

* Fixing bug in Megatron BERT when loss mask is all zeros (NVIDIA#5424)

* Fixing bug when loss mask is fully zero

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update megatron_bert_model.py

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>

* Update dataset_utils.py

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update dataset_utils.py

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>

* Update dataset_utils.py

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* Use updated API for overlapping grad sync with pipeline parallelism (NVIDIA#5236)

Signed-off-by: Tim Moon <tmoon@nvidia.com>

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* support to disable sequence length + 1 input tokens for each sample in MegatronGPT (NVIDIA#5363)

* support to disable sequence length + 1 input tokens for MegatronGPT

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: Anmol Gupta <anmolg@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* [TTS] Create script for processing TTS training audio (NVIDIA#5262)

* Create script for processing TTS training audio
* Update VAD trimming logic
* Remove unused import

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] remove useless logic for set_tokenizer. (NVIDIA#5430)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* Fix setting up of `ReduceLROnPlateau` learning rate scheduler (NVIDIA#5444)

* Fix tests

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Add accidentally lost changes

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Create codeql.yml (NVIDIA#5445)

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

* Fix for getting tokenizer in character-based ASR models when using tarred dataset (NVIDIA#5442)

Signed-off-by: Jonghwan Hyeon <hyeon0145@gmail.com>

Signed-off-by: Jonghwan Hyeon <hyeon0145@gmail.com>

* Combine 5 commits

adding diar_infer_general.yaml

Signed-off-by: Taejin Park <tango4j@gmail.com>

Update codeql.yml

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

Update codeql.yml

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

fix msdd_model in general yaml file

Signed-off-by: Taejin Park <tango4j@gmail.com>

fixed errors in yaml file

Signed-off-by: Taejin Park <tango4j@gmail.com>

* moved eval_der function and fixed tqdm options

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Changed minor error in docstrings

Signed-off-by: Taejin Park <tango4j@gmail.com>

* removed score_labels and changed leave=True

Signed-off-by: Taejin Park <tango4j@gmail.com>

Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Yu Yao <yuya@nvidia.com>
Signed-off-by: ekmb <ebakhturina@nvidia.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: Virginia Adams <vadams@nvidia.com>
Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: Jonghwan Hyeon <hyeon0145@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Abhinav Khattar <aklife97@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Shane Carroll <50530592+1-800-BAD-CODE@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com>
Co-authored-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Co-authored-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: anmolgupt <14880251+anmolgupt@users.noreply.github.com>
Co-authored-by: Anmol Gupta <anmolg@nvidia.com>
Co-authored-by: Ryan Langman <rlangman@nvidia.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: PeganovAnton <peganoff2@mail.ru>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Jonghwan Hyeon <jonghwanhyeon93@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>
andrusenkoau pushed a commit to andrusenkoau/NeMo that referenced this pull request Jan 5, 2023
Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Signed-off-by: andrusenkoau <andrusenkoau@gmail.com>
andrusenkoau pushed a commit to andrusenkoau/NeMo that referenced this pull request Jan 5, 2023
* first commit on eval_diar_with_asr.py

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Add a standalone diarization-ASR evaluation transcript

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Fixed examples in docstrings

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fixed staticmethod error

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Added description on eval modes

Signed-off-by: Taejin Park <tango4j@gmail.com>

* adding diar_infer_general.yaml

Signed-off-by: Taejin Park <tango4j@gmail.com>

* fix msdd_model in general yaml file

Signed-off-by: Taejin Park <tango4j@gmail.com>

* fixed errors in yaml file

Signed-off-by: Taejin Park <tango4j@gmail.com>

* combine into 1 commit

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Added description on eval modes

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add MoE support for T5 model (w/o expert parallel) (NVIDIA#5409)

* clean

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* kwarg ref

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* fix

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* fix

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* extra args

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* test

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* rm prints

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* style

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* review comments

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* review comments

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* review comments

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* fix

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Fix args (NVIDIA#5410) (NVIDIA#5416)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* Fix for concat map dataset (NVIDIA#5133)

* change for concat map dataset

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Exhaust longest dataset

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: 1-800-BAD-CODE <>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* Add temporary fix for CUDA issue in Dockerfile (NVIDIA#5421) (NVIDIA#5422)

Signed-off-by: Yu Yao <yuya@nvidia.com>

Signed-off-by: Yu Yao <yuya@nvidia.com>

Signed-off-by: Yu Yao <yuya@nvidia.com>
Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com>

* Fix GPT generation when using sentencepiece tokenizer (NVIDIA#5413) (NVIDIA#5428)

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Support for finetuning and finetuning inference with .ckpt files & batch size refactoring (NVIDIA#5339)

* Initial refactor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Resolve config before passing to load_from_checkpoint

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes for model parallel and nemo restore

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes for eval

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert config changes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Refactor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix typo

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove comments

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Minor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix validation reconfiguration

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove old comment

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fixes for test_ds

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Revert "Add temporary fix for CUDA issue in Dockerfile (NVIDIA#5421)" (NVIDIA#5431) (NVIDIA#5432)

This reverts commit 0718b17.

Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com>

* [ITN] fix year date graph, cardinals extension for hundreds (NVIDIA#5435)

* wip

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* add lociko's hundreds extension for cardinals

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* add optional end

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* restart ci

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* update doc in terms of get_label for lang id model (NVIDIA#5366)

* reflect PR 5278 ion doc

Signed-off-by: fayejf <fayejf07@gmail.com>

* reflect comment

Signed-off-by: fayejf <fayejf07@gmail.com>

Signed-off-by: fayejf <fayejf07@gmail.com>

* Revert workaround for T5 that sets number of workers to 0 & sync_batch_comm=False (NVIDIA#5420) (NVIDIA#5433)

* Revert workers workaround

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix in config

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Fixed bug in notebook (NVIDIA#5382) (NVIDIA#5394)

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>

* Fixing bug in Megatron BERT when loss mask is all zeros (NVIDIA#5424)

* Fixing bug when loss mask is fully zero

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update megatron_bert_model.py

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>

* Update dataset_utils.py

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update dataset_utils.py

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>

* Update dataset_utils.py

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* Use updated API for overlapping grad sync with pipeline parallelism (NVIDIA#5236)

Signed-off-by: Tim Moon <tmoon@nvidia.com>

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* support to disable sequence length + 1 input tokens for each sample in MegatronGPT (NVIDIA#5363)

* support to disable sequence length + 1 input tokens for MegatronGPT

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: Anmol Gupta <anmolg@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* [TTS] Create script for processing TTS training audio (NVIDIA#5262)

* Create script for processing TTS training audio
* Update VAD trimming logic
* Remove unused import

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] remove useless logic for set_tokenizer. (NVIDIA#5430)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* Fix setting up of `ReduceLROnPlateau` learning rate scheduler (NVIDIA#5444)

* Fix tests

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Add accidentally lost changes

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Create codeql.yml (NVIDIA#5445)

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

* Fix for getting tokenizer in character-based ASR models when using tarred dataset (NVIDIA#5442)

Signed-off-by: Jonghwan Hyeon <hyeon0145@gmail.com>

Signed-off-by: Jonghwan Hyeon <hyeon0145@gmail.com>

* Combine 5 commits

adding diar_infer_general.yaml

Signed-off-by: Taejin Park <tango4j@gmail.com>

Update codeql.yml

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

Update codeql.yml

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

fix msdd_model in general yaml file

Signed-off-by: Taejin Park <tango4j@gmail.com>

fixed errors in yaml file

Signed-off-by: Taejin Park <tango4j@gmail.com>

* moved eval_der function and fixed tqdm options

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Changed minor error in docstrings

Signed-off-by: Taejin Park <tango4j@gmail.com>

* removed score_labels and changed leave=True

Signed-off-by: Taejin Park <tango4j@gmail.com>

Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Yu Yao <yuya@nvidia.com>
Signed-off-by: ekmb <ebakhturina@nvidia.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: Virginia Adams <vadams@nvidia.com>
Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: Jonghwan Hyeon <hyeon0145@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Abhinav Khattar <aklife97@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Shane Carroll <50530592+1-800-BAD-CODE@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com>
Co-authored-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Co-authored-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: anmolgupt <14880251+anmolgupt@users.noreply.github.com>
Co-authored-by: Anmol Gupta <anmolg@nvidia.com>
Co-authored-by: Ryan Langman <rlangman@nvidia.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: PeganovAnton <peganoff2@mail.ru>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Jonghwan Hyeon <jonghwanhyeon93@gmail.com>
Signed-off-by: andrusenkoau <andrusenkoau@gmail.com>
treacker added a commit that referenced this pull request Jan 25, 2023
* Disable loss typecheck

* Fix spectrogram lengths

* Remove Precision 16 requirement

* Address lgtm alerts

* clean up unused code

* Address lgtm alerts

* Refactor audio_to_mel_torch method

* Use NeMo FilterBank to get melspec

Todo: set self.fb

* Fix filterbank max frequency to match with original VITS

* Fix filterbank features correct length

* Address lgtm issues

* Remove print statements

* Remove stft_pad_amount

* new structure for tts datasets in script folder

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* remove cmudict downloading

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* rename mixertts dataset, add vocoder dataset

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* add libritts processing

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* update tts dataset and libritts get data

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* fix bugs in vocoder ds

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* add ds

* changed vits yaml

* rm yaml

* fix yaml and model

* Added scaler

* refactored yaml

* managed to run in fp16

* refactoring

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* fix small bugs and add new todos

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* fix optimizers

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* Port Variational Inference with Adversarial Learning (VITS) to NeMo TTS (#6)

* Add vits files

Add vits_losses.py, vits_modules.py and vits.py.

* Move non-vits models to modules

* Add vits.yaml

* Add _loader to vits.py

* Add basic template for vits

* Update vits.yaml with vits parameters

* Remove extra space

* Add top level training script

* Add some variables to vits yaml

* Add forward and training methods

* Fix imports

* Added validation step

* Log training losses

* Update loss calls to use class attributes

* Add VITS to models list

* Fix all imports

* Remove old module calls

* Fix typo in monotonic align import

* Modified validation step

1. reverted to tensorboard
2. validation_step logs audio, mel-spec for batch 0
3. validation_step_alt logs audio, mel-spec for batch 0 and loss_mel

* Fix imports for VITS

* Remove old module calls

* Fix typo in monotonic align import

* Modified validation step

1. reverted to tensorboard
2. validation_step logs audio, mel-spec for batch 0
3. validation_step_alt logs audio, mel-spec for batch 0 and loss_mel

* Add parameters from original VITS config

* Fix config file

* Fix imports and generate spec from audio

* Fix incorrect dimensions

* Progress update

* Fix loss

* Fix cuda thing

* Fix monotonic align import

* Fix typos in vits.py

* Disable loss typecheck

* Fix spectrogram lengths

* Remove Precision 16 requirement

* Address lgtm alerts

* clean up unused code

* Address lgtm alerts

* Refactor audio_to_mel_torch method

* Use NeMo FilterBank to get melspec

Todo: set self.fb

* Fix filterbank max frequency to match with original VITS

* Fix filterbank features correct length

* Address lgtm issues

* Remove print statements

* Remove stft_pad_amount

Co-authored-by: martynwei <martyn.wei@gmail.com>
Co-authored-by: Ryan Hong <66425733+rhong99@users.noreply.github.com>
Co-authored-by: richa.ren@mail.utoronto.ca <richa.ren@mail.utoronto.ca>
Co-authored-by: Jason <jasoli@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* make new commit

Signed-off-by: Jason <jasoli@nvidia.com>

* add copyright headers

Signed-off-by: Jason <jasoli@nvidia.com>

* style

Signed-off-by: Jason <jasoli@nvidia.com>

* rename README

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* fix style without vits_modules

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* add numba code, fix style and add todos

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* small fix

* fix some todos

* added numba mas

* added DDP sampler

* specified versions

* fixed for new librosa version

* added feature loss

* added IPA phonemizer

* refactored IPA g2p

* added vits losses

* some ref

* fix

* added checkpointing

* cp

* cfg

* merged some 1.8.0 fixes

* plt fix

* fix logging

* fix checkpoint loading

* refactored inference

* fp32 run

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* new exp

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* Restored tests previously disabled for 22.03 base (#4109)

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* add augmentation to label models (#4113)

* add augmentation to label models

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* duration fix

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* Call register_bert_model after assigning self.bert_model variable (#4116)

Signed-off-by: Ramanathan Arunachalam <rarunachalam@nvidia.com>

Co-authored-by: Ramanathan Arunachalam <rarunachalam@nvidia.com>

* Tutorial on ITN with Thutmose tagger and small fixes (#4117)

* 1. Add tutorial. 2. Move a function to fix import in tutorial. 3. Merge multiple spaces into one space in the final output

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* fixes for code review

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* Add tutorial to tutorials.rst

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

Co-authored-by: Alexandra Antonova <aleksandraa@nvidia.com>

* cleaned up TN/ ITN doc (#4119)

* cleaned up TN/ ITN doc

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix typo

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix image

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix image

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* Check implicit grad acc in GLUE dataset building (#4123)

* Check implicit grad acc in GLUE dataset building

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix jenkins test for GLUE/XNLI

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* update the default (#4135)

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* Draft: Fix restoring from checkpoint for case when `model.common_dataset_parameters.label_vocab_dir` is provided (#4136)

* Fix restoring from checkpoint with label vocab dir

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Add tests for various ways to pass label ids to model

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix typo

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix typo

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Do not create tmp directory

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix parameter name

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* finish cherry-pick op

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix labels errors

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Remove duplicate stage

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Change target branch

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* fix typo (#4140)

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* Fix/punctuation avoid overwritting tmp files (#4144)

* Add draft of fixing tmp files overwritting

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Remove accidental changes

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Remove accidental changes

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Use built-in tempfile library

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix code style

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* bug_fix_diarization_manifest_creation (#4125)

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>

* fix doc (#4146)

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* Tacotron2 retrain (#4103)

* fix yaml

Signed-off-by: treacker <emshabalin@yandex.ru>

* Fix for new TTSDataset class

Signed-off-by: treacker <emshabalin@yandex.ru>

* added wandb logging

Signed-off-by: treacker <emshabalin@yandex.ru>

* added wandb logging

Signed-off-by: treacker <emshabalin@yandex.ru>

* fix numpy version

Signed-off-by: treacker <emshabalin@yandex.ru>

* fix numpy version

Signed-off-by: treacker <emshabalin@yandex.ru>

* inference fix

Signed-off-by: treacker <emshabalin@yandex.ru>

* removed old code

Signed-off-by: treacker <emshabalin@yandex.ru>

* updated parser logic

Signed-off-by: treacker <emshabalin@yandex.ru>

* reverted version update

Signed-off-by: treacker <emshabalin@yandex.ru>

* refactored parser logic

Signed-off-by: treacker <emshabalin@yandex.ru>

* Updated Jenkinsfile

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactored tutorial for Tacotron2

Signed-off-by: treacker <emshabalin@yandex.ru>

* Made backward compatibility

Signed-off-by: treacker <emshabalin@yandex.ru>

* Made backward compatibility

Signed-off-by: treacker <emshabalin@yandex.ru>

* Update Jenkinsfile

Signed-off-by: treacker <emshabalin@yandex.ru>

* Update tacotron.yaml

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

* cleaned up TN/ ITN doc (#4119)

* cleaned up TN/ ITN doc

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix typo

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix image

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix image

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: treacker <emshabalin@yandex.ru>

* Check implicit grad acc in GLUE dataset building (#4123)

* Check implicit grad acc in GLUE dataset building

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix jenkins test for GLUE/XNLI

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

* Fixed jenkins

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* Multiprocess improvements (#4127)

* initial commit

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* start fix

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* improve multiprocessing speed while creating speaker dataset

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* updated scp to filelist

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* WaveGlow input type fixes (#4151)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* notebooks' link, typo and import  fix  (#4158)

* redo missing pr 4007

Signed-off-by: fayejf <fayejf07@gmail.com>

* remove extremely unreliable links

Signed-off-by: fayejf <fayejf07@gmail.com>

* Thutmose tagger bug fixes (#4162)

* add pretrained ngc model, small fixes

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* fix model location

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* fix model location

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* 1. fix typos. 2. write magic functions without space

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* add example of inference with pretrained model

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* changed model location to nemo

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* style fix

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* fix space

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

Co-authored-by: Alexandra Antonova <aleksandraa@nvidia.com>

* update speaker docs (#4164)

* update speaker docs

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* chunks -> segments

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* Khz -> kHz

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* changed to vits g2p

* refactoring

* added cosineLR

* Updated whitelist path

* added vanilla torch grad scaler

* Fixed lightning version

* added warmup and wd

* switched to cosineLR

* refactored data classes for vits

* some fixes

* fixed import

* changeg train loop

* fixed scheduler bug

* refactoring for exps

* Refactored loss logic

* Ref for exps

* added coqui stuff

* exps

* bugfix

* added side file

* bugfix

* reverted

* fixed sampler behaviour

* updated for ptl 1.7.2

* refactored dataloader func

* some cleaning

* reverted to vanilla loss

* modified for pickling

* added dataset class

* fixed torch version

* added autocast for fp training

* removed coqui files

* 'Fixed tokenizer'

* Fix tokenizer

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* Fix link to inference notebook (#5247)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* Update ASR scores table (#5254)

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>

* Fix links to speaker identification notebook (#5260)

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Minor typo fixes in TTS tutorial (#5266)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* Pcla tutorial fixes (#5271)

* Fixed typos

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fixed cell type and tatoeba reference

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fixed typo

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fixed branch variable

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fix bug into Dialogue tutorial (#5277)

* Typo fix (#5288)

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fix dialogue tutorial bug (#5297)

* set add_pooling_layer=False for huggingface bert model

* remove add_pooling_layer=False and set find_unused_parameters=True

* set num_prompt_tokens to 0 for huggingface

* small bugfix for r1.13.0 (#5310)

* typo fix

Signed-off-by: fayejf <fayejf07@gmail.com>

* udpate transcribe

Signed-off-by: fayejf <fayejf07@gmail.com>

Signed-off-by: fayejf <fayejf07@gmail.com>

* Add italian model checkpoints (#5316)

Signed-off-by: Igor Gitman <igitman@nvidia.com>

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* [STT] Add Ru ASR Conformer-CTC and Conformer-Transducer (#5340)

* [STT] Add stt_ru_conformer_ctc_large

Signed-off-by: Sasha Meister <117230141+ssh-meister@users.noreply.github.com>

* [STT] Add stt_ru_conformer_transducer_large

Add stt_ru_conformer_transducer_large

Signed-off-by: Sasha Meister <117230141+ssh-meister@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Sasha Meister <117230141+ssh-meister@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Pcla tutorial fixes (#5313)

* fixes

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* fixes

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* moved `create_text_and_labels` to token_classification_utils.py

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* a lot of refactoring

* strict ptl version

* strict ptl version

* reverted plt version

* Added base text2audio class

* Fix issue with HF Model upload tutorial (#5359)

* Add Gradio App to ASR Docs (#5270)

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>
(cherry picked from commit e4b6a38)

* Fix issue with normalized config for dataset name

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>

* tutorial fixes (#5354)

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Add SDP documentation (#5274)

* Add details to SDP README.md

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add docstring to WriteManifest processor

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add docstring to CreateInitialManifestMLS

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add ModifyManifestTextProcessor docstring

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add ASRInference docstring

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add base_processor docstrings

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add minimal SDP docs page

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Update tools/speech_dataset_processor/README.md

Co-authored-by: Igor Gitman <igitman@nvidia.com>
Signed-off-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>

* Write simple README for SDP and move complex explanations to docs

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Remove incorrect type hints

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Make config example less confusing

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Fix typo

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Clarify that YAML file is config file in README

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Remove unused imports

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Remove SDP docs for now

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Remove links to docs in SDP README

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Co-authored-by: Igor Gitman <igitman@nvidia.com>

* [Bugfix] Added rm -f / wget- nc command in multispeaker sim notebook to r1.13.0 (#5375)

* Fix minor error in notebook

Signed-off-by: Taejin Park <tango4j@gmail.com>

* changed branch name in tutorial notebook

Signed-off-by: Taejin Park <tango4j@gmail.com>

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Rename Speech Dataset Processor to Speech Data Processor (#5378)

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* fix for num worker 0 causing issues in losses after 1 epoch (#5379)

* Fixed bug in notebook (#5382)

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Force MHA QKV onto fp32 (#5391)

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>

* Added scheduling variety

* ref

* Fix for prompt table restore error (#5393)

* Fix for prompt table restore error

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Added more saftey checks

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Added more condition checks

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Fix args (#5410)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* bugfix

* import tests

* Add temporary fix for CUDA issue in Dockerfile (#5421)

Signed-off-by: Yu Yao <yuya@nvidia.com>

Signed-off-by: Yu Yao <yuya@nvidia.com>

* Megatron Export Update (#5343)

* export update for Megatron + change ORT optimization

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated export_utils to use autocast instead of manually casting >:/

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* removed dtype from LayerNorm

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* added comment

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* reverting changes on FloatCast

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* Cherry-picked changes from megatron-norm

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* updated asr_model import to cast_utils

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* updated del onnx_model place

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* changed ort optimization to basic -> temp fix

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Boris Fomitchev <bfomitchev@nvidia.com>

* disable pc test (#5426)

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* Fix GPT generation when using sentencepiece tokenizer (#5413)

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Disable sync_batch_comm in validation_step for GPT (#5397)

* disable sync_batch_comm in validation_step

Signed-off-by: ericharper <complex451@gmail.com>

* Read sync_batch_comm from config or default to False

Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>

* Update megatron_gpt_config to default sync_batch_comm to False to avoid CUDA error

Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>

* Empty

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Comment out test

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Markel Sanz Ausin <markelsanz14@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>

* Revert "Add temporary fix for CUDA issue in Dockerfile (#5421)" (#5431)

This reverts commit 0718b17.

* Revert workaround for T5 that sets number of workers to 0 & sync_batch_comm=False (#5420)

* Revert workers workaround

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix in config

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Fixed discrepancies

* updated Jenkisfile

* updated Jenkisfile

* Cleaning

* fixed the onnx bug in conformer for non-streaming models. (#5242) (#5446)

Signed-off-by: Vahid <vnoroozi@nvidia.com>

Signed-off-by: Vahid <vnoroozi@nvidia.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

Signed-off-by: Vahid <vnoroozi@nvidia.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>

* Set sync_batch_comm in other places (#5448)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Radtts 1.13 (#5451)

* [TTS] Fixing RADTTS training - removing view buffer and fixing accuracy issue (#5358)
* [TTS] add CI test for RADTTS training recipe.

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Radtts 1.13 plus (#5457)

* [TTS] Fixing RADTTS training - removing view buffer and fixing accuracy issue (#5358)
* Fixing RADTTS training - removing view buffer and fixing accuracy issue
* Fixes for Torchscript/Triton
* Added autocast to radtts UT
* using cuda() for training example

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Add num layers check (#5470)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Change to kwargs (#5475)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Support for finetuning and finetuning inference with .ckpt files & batch size refactoring (#5339) (#5478)

* Initial refactor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Resolve config before passing to load_from_checkpoint

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes for model parallel and nemo restore

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes for eval

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert config changes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Refactor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix typo

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove comments

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Minor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix validation reconfiguration

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove old comment

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fixes for test_ds

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* export_utils bugfix (#5480)

* updated export_utils

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Export fixes for Riva (#5496)

* Export fixes for Riva

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Cleaning up training_utils

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* minor bug fix (#5521)

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* added set_start_method + function param bugfix (#5539)

* added set_start_method + function param bugfix

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* upper bound torchmetrics

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: ericharper <complex451@gmail.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ericharper <complex451@gmail.com>

* remove notebook (#5548)

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: ericharper <complex451@gmail.com>

* Remove broadcast (#5558)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* cleaning

* Fix all gather while writing to a file during T5 finetuning (#5561)

* Gather from data parallel only instead of all ranks

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* update readme

Signed-off-by: ericharper <complex451@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added copyright

* fixed imports

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* cleaning

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed filesize check

* last cleaning

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated cmudict path

* fixed merge bug

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* warnings fix

* fix warnings

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* storing

* updated version

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* update Jenkinsfile versions

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* fixed issues

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* fixed more issues

* more fixes

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* added experimental tag

* Clarification updates

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* remove old cython code

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* remove old cython code

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* docstring fix

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* Enhancements

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* Enhancements

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* imports fix

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix typo

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* excessive comtutations fix

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* typecheck fix

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* Small refactoring

* Small refactoring

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* reversed exp_manager params

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* Fixed call for new function signature

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
Signed-off-by: Jason <jasoli@nvidia.com>
Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: ekmb <ebakhturina@nvidia.com>
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>
Signed-off-by: Igor Gitman <igitman@nvidia.com>
Signed-off-by: Sasha Meister <117230141+ssh-meister@users.noreply.github.com>
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: Virginia Adams <vadams@nvidia.com>
Signed-off-by: Yu Yao <yuya@nvidia.com>
Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Signed-off-by: Vahid <vnoroozi@nvidia.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>
Co-authored-by: jasonjjl1999 <jasonjjl1999@gmail.com>
Co-authored-by: richa.ren@mail.utoronto.ca <richa.ren@mail.utoronto.ca>
Co-authored-by: Oktai Tatanov <oktai.tatanov@gmail.com>
Co-authored-by: jasonjjl1999 <43978361+jasonjjl1999@users.noreply.github.com>
Co-authored-by: martynwei <martyn.wei@gmail.com>
Co-authored-by: Ryan Hong <66425733+rhong99@users.noreply.github.com>
Co-authored-by: Jason <jasoli@nvidia.com>
Co-authored-by: ericharper <complex451@gmail.com>
Co-authored-by: Boris Fomitchev <borisfom@users.noreply.github.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: Ramanathan Arunachalam <ramanathan.arun@rutgers.edu>
Co-authored-by: Ramanathan Arunachalam <rarunachalam@nvidia.com>
Co-authored-by: bene-ges <61418381+bene-ges@users.noreply.github.com>
Co-authored-by: Alexandra Antonova <aleksandraa@nvidia.com>
Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: PeganovAnton <peganoff2@mail.ru>
Co-authored-by: Jocelyn <jocelynh@nvidia.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Sean Naren <snarenthiran@nvidia.com>
Co-authored-by: Matvei Novikov <mattyson.so@gmail.com>
Co-authored-by: Zhilin Wang <wangzhilin12061996@hotmail.com>
Co-authored-by: Igor Gitman <igitman@nvidia.com>
Co-authored-by: Sasha Meister <117230141+ssh-meister@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: Adi Renduchintala <108822655+arendu@users.noreply.github.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com>
Co-authored-by: David <amosalla@asu.edu>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: Boris Fomitchev <bfomitchev@nvidia.com>
Co-authored-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Markel Sanz Ausin <markelsanz14@gmail.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Kipok added a commit to Kipok/NeMo that referenced this pull request Jan 31, 2023
* Disable loss typecheck

* Fix spectrogram lengths

* Remove Precision 16 requirement

* Address lgtm alerts

* clean up unused code

* Address lgtm alerts

* Refactor audio_to_mel_torch method

* Use NeMo FilterBank to get melspec

Todo: set self.fb

* Fix filterbank max frequency to match with original VITS

* Fix filterbank features correct length

* Address lgtm issues

* Remove print statements

* Remove stft_pad_amount

* new structure for tts datasets in script folder

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* remove cmudict downloading

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* rename mixertts dataset, add vocoder dataset

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* add libritts processing

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* update tts dataset and libritts get data

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* fix bugs in vocoder ds

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* add ds

* changed vits yaml

* rm yaml

* fix yaml and model

* Added scaler

* refactored yaml

* managed to run in fp16

* refactoring

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* fix small bugs and add new todos

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* fix optimizers

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* Port Variational Inference with Adversarial Learning (VITS) to NeMo TTS (NVIDIA#6)

* Add vits files

Add vits_losses.py, vits_modules.py and vits.py.

* Move non-vits models to modules

* Add vits.yaml

* Add _loader to vits.py

* Add basic template for vits

* Update vits.yaml with vits parameters

* Remove extra space

* Add top level training script

* Add some variables to vits yaml

* Add forward and training methods

* Fix imports

* Added validation step

* Log training losses

* Update loss calls to use class attributes

* Add VITS to models list

* Fix all imports

* Remove old module calls

* Fix typo in monotonic align import

* Modified validation step

1. reverted to tensorboard
2. validation_step logs audio, mel-spec for batch 0
3. validation_step_alt logs audio, mel-spec for batch 0 and loss_mel

* Fix imports for VITS

* Remove old module calls

* Fix typo in monotonic align import

* Modified validation step

1. reverted to tensorboard
2. validation_step logs audio, mel-spec for batch 0
3. validation_step_alt logs audio, mel-spec for batch 0 and loss_mel

* Add parameters from original VITS config

* Fix config file

* Fix imports and generate spec from audio

* Fix incorrect dimensions

* Progress update

* Fix loss

* Fix cuda thing

* Fix monotonic align import

* Fix typos in vits.py

* Disable loss typecheck

* Fix spectrogram lengths

* Remove Precision 16 requirement

* Address lgtm alerts

* clean up unused code

* Address lgtm alerts

* Refactor audio_to_mel_torch method

* Use NeMo FilterBank to get melspec

Todo: set self.fb

* Fix filterbank max frequency to match with original VITS

* Fix filterbank features correct length

* Address lgtm issues

* Remove print statements

* Remove stft_pad_amount

Co-authored-by: martynwei <martyn.wei@gmail.com>
Co-authored-by: Ryan Hong <66425733+rhong99@users.noreply.github.com>
Co-authored-by: richa.ren@mail.utoronto.ca <richa.ren@mail.utoronto.ca>
Co-authored-by: Jason <jasoli@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* make new commit

Signed-off-by: Jason <jasoli@nvidia.com>

* add copyright headers

Signed-off-by: Jason <jasoli@nvidia.com>

* style

Signed-off-by: Jason <jasoli@nvidia.com>

* rename README

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* fix style without vits_modules

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* add numba code, fix style and add todos

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* small fix

* fix some todos

* added numba mas

* added DDP sampler

* specified versions

* fixed for new librosa version

* added feature loss

* added IPA phonemizer

* refactored IPA g2p

* added vits losses

* some ref

* fix

* added checkpointing

* cp

* cfg

* merged some 1.8.0 fixes

* plt fix

* fix logging

* fix checkpoint loading

* refactored inference

* fp32 run

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* new exp

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* Restored tests previously disabled for 22.03 base (NVIDIA#4109)

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* add augmentation to label models (NVIDIA#4113)

* add augmentation to label models

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* duration fix

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* Call register_bert_model after assigning self.bert_model variable (NVIDIA#4116)

Signed-off-by: Ramanathan Arunachalam <rarunachalam@nvidia.com>

Co-authored-by: Ramanathan Arunachalam <rarunachalam@nvidia.com>

* Tutorial on ITN with Thutmose tagger and small fixes (NVIDIA#4117)

* 1. Add tutorial. 2. Move a function to fix import in tutorial. 3. Merge multiple spaces into one space in the final output

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* fixes for code review

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* Add tutorial to tutorials.rst

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

Co-authored-by: Alexandra Antonova <aleksandraa@nvidia.com>

* cleaned up TN/ ITN doc (NVIDIA#4119)

* cleaned up TN/ ITN doc

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix typo

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix image

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix image

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* Check implicit grad acc in GLUE dataset building (NVIDIA#4123)

* Check implicit grad acc in GLUE dataset building

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix jenkins test for GLUE/XNLI

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* update the default (NVIDIA#4135)

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* Draft: Fix restoring from checkpoint for case when `model.common_dataset_parameters.label_vocab_dir` is provided (NVIDIA#4136)

* Fix restoring from checkpoint with label vocab dir

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Add tests for various ways to pass label ids to model

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix typo

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix typo

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Do not create tmp directory

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix parameter name

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* finish cherry-pick op

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix labels errors

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Remove duplicate stage

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Change target branch

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* fix typo (NVIDIA#4140)

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* Fix/punctuation avoid overwritting tmp files (NVIDIA#4144)

* Add draft of fixing tmp files overwritting

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Remove accidental changes

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Remove accidental changes

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Use built-in tempfile library

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix code style

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* bug_fix_diarization_manifest_creation (NVIDIA#4125)

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>

* fix doc (NVIDIA#4146)

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* Tacotron2 retrain (NVIDIA#4103)

* fix yaml

Signed-off-by: treacker <emshabalin@yandex.ru>

* Fix for new TTSDataset class

Signed-off-by: treacker <emshabalin@yandex.ru>

* added wandb logging

Signed-off-by: treacker <emshabalin@yandex.ru>

* added wandb logging

Signed-off-by: treacker <emshabalin@yandex.ru>

* fix numpy version

Signed-off-by: treacker <emshabalin@yandex.ru>

* fix numpy version

Signed-off-by: treacker <emshabalin@yandex.ru>

* inference fix

Signed-off-by: treacker <emshabalin@yandex.ru>

* removed old code

Signed-off-by: treacker <emshabalin@yandex.ru>

* updated parser logic

Signed-off-by: treacker <emshabalin@yandex.ru>

* reverted version update

Signed-off-by: treacker <emshabalin@yandex.ru>

* refactored parser logic

Signed-off-by: treacker <emshabalin@yandex.ru>

* Updated Jenkinsfile

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactored tutorial for Tacotron2

Signed-off-by: treacker <emshabalin@yandex.ru>

* Made backward compatibility

Signed-off-by: treacker <emshabalin@yandex.ru>

* Made backward compatibility

Signed-off-by: treacker <emshabalin@yandex.ru>

* Update Jenkinsfile

Signed-off-by: treacker <emshabalin@yandex.ru>

* Update tacotron.yaml

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

* cleaned up TN/ ITN doc (NVIDIA#4119)

* cleaned up TN/ ITN doc

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix typo

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix image

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix image

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: treacker <emshabalin@yandex.ru>

* Check implicit grad acc in GLUE dataset building (NVIDIA#4123)

* Check implicit grad acc in GLUE dataset building

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix jenkins test for GLUE/XNLI

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

* Fixed jenkins

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* Multiprocess improvements (NVIDIA#4127)

* initial commit

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* start fix

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* improve multiprocessing speed while creating speaker dataset

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* updated scp to filelist

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* WaveGlow input type fixes (NVIDIA#4151)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* notebooks' link, typo and import  fix  (NVIDIA#4158)

* redo missing pr 4007

Signed-off-by: fayejf <fayejf07@gmail.com>

* remove extremely unreliable links

Signed-off-by: fayejf <fayejf07@gmail.com>

* Thutmose tagger bug fixes (NVIDIA#4162)

* add pretrained ngc model, small fixes

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* fix model location

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* fix model location

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* 1. fix typos. 2. write magic functions without space

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* add example of inference with pretrained model

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* changed model location to nemo

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* style fix

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* fix space

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

Co-authored-by: Alexandra Antonova <aleksandraa@nvidia.com>

* update speaker docs (NVIDIA#4164)

* update speaker docs

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* chunks -> segments

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* Khz -> kHz

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* changed to vits g2p

* refactoring

* added cosineLR

* Updated whitelist path

* added vanilla torch grad scaler

* Fixed lightning version

* added warmup and wd

* switched to cosineLR

* refactored data classes for vits

* some fixes

* fixed import

* changeg train loop

* fixed scheduler bug

* refactoring for exps

* Refactored loss logic

* Ref for exps

* added coqui stuff

* exps

* bugfix

* added side file

* bugfix

* reverted

* fixed sampler behaviour

* updated for ptl 1.7.2

* refactored dataloader func

* some cleaning

* reverted to vanilla loss

* modified for pickling

* added dataset class

* fixed torch version

* added autocast for fp training

* removed coqui files

* 'Fixed tokenizer'

* Fix tokenizer

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* Fix link to inference notebook (NVIDIA#5247)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* Update ASR scores table (NVIDIA#5254)

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>

* Fix links to speaker identification notebook (NVIDIA#5260)

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Minor typo fixes in TTS tutorial (NVIDIA#5266)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* Pcla tutorial fixes (NVIDIA#5271)

* Fixed typos

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fixed cell type and tatoeba reference

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fixed typo

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fixed branch variable

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fix bug into Dialogue tutorial (NVIDIA#5277)

* Typo fix (NVIDIA#5288)

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fix dialogue tutorial bug (NVIDIA#5297)

* set add_pooling_layer=False for huggingface bert model

* remove add_pooling_layer=False and set find_unused_parameters=True

* set num_prompt_tokens to 0 for huggingface

* small bugfix for r1.13.0 (NVIDIA#5310)

* typo fix

Signed-off-by: fayejf <fayejf07@gmail.com>

* udpate transcribe

Signed-off-by: fayejf <fayejf07@gmail.com>

Signed-off-by: fayejf <fayejf07@gmail.com>

* Add italian model checkpoints (NVIDIA#5316)

Signed-off-by: Igor Gitman <igitman@nvidia.com>

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* [STT] Add Ru ASR Conformer-CTC and Conformer-Transducer (NVIDIA#5340)

* [STT] Add stt_ru_conformer_ctc_large

Signed-off-by: Sasha Meister <117230141+ssh-meister@users.noreply.github.com>

* [STT] Add stt_ru_conformer_transducer_large

Add stt_ru_conformer_transducer_large

Signed-off-by: Sasha Meister <117230141+ssh-meister@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Sasha Meister <117230141+ssh-meister@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Pcla tutorial fixes (NVIDIA#5313)

* fixes

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* fixes

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* moved `create_text_and_labels` to token_classification_utils.py

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* a lot of refactoring

* strict ptl version

* strict ptl version

* reverted plt version

* Added base text2audio class

* Fix issue with HF Model upload tutorial (NVIDIA#5359)

* Add Gradio App to ASR Docs (NVIDIA#5270)

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>
(cherry picked from commit e4b6a38)

* Fix issue with normalized config for dataset name

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>

* tutorial fixes (NVIDIA#5354)

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Add SDP documentation (NVIDIA#5274)

* Add details to SDP README.md

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add docstring to WriteManifest processor

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add docstring to CreateInitialManifestMLS

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add ModifyManifestTextProcessor docstring

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add ASRInference docstring

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add base_processor docstrings

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add minimal SDP docs page

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Update tools/speech_dataset_processor/README.md

Co-authored-by: Igor Gitman <igitman@nvidia.com>
Signed-off-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>

* Write simple README for SDP and move complex explanations to docs

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Remove incorrect type hints

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Make config example less confusing

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Fix typo

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Clarify that YAML file is config file in README

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Remove unused imports

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Remove SDP docs for now

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Remove links to docs in SDP README

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Co-authored-by: Igor Gitman <igitman@nvidia.com>

* [Bugfix] Added rm -f / wget- nc command in multispeaker sim notebook to r1.13.0 (NVIDIA#5375)

* Fix minor error in notebook

Signed-off-by: Taejin Park <tango4j@gmail.com>

* changed branch name in tutorial notebook

Signed-off-by: Taejin Park <tango4j@gmail.com>

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Rename Speech Dataset Processor to Speech Data Processor (NVIDIA#5378)

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* fix for num worker 0 causing issues in losses after 1 epoch (NVIDIA#5379)

* Fixed bug in notebook (NVIDIA#5382)

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Force MHA QKV onto fp32 (NVIDIA#5391)

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>

* Added scheduling variety

* ref

* Fix for prompt table restore error (NVIDIA#5393)

* Fix for prompt table restore error

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Added more saftey checks

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Added more condition checks

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Fix args (NVIDIA#5410)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* bugfix

* import tests

* Add temporary fix for CUDA issue in Dockerfile (NVIDIA#5421)

Signed-off-by: Yu Yao <yuya@nvidia.com>

Signed-off-by: Yu Yao <yuya@nvidia.com>

* Megatron Export Update (NVIDIA#5343)

* export update for Megatron + change ORT optimization

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated export_utils to use autocast instead of manually casting >:/

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* removed dtype from LayerNorm

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* added comment

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* reverting changes on FloatCast

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* Cherry-picked changes from megatron-norm

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* updated asr_model import to cast_utils

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* updated del onnx_model place

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* changed ort optimization to basic -> temp fix

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Boris Fomitchev <bfomitchev@nvidia.com>

* disable pc test (NVIDIA#5426)

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* Fix GPT generation when using sentencepiece tokenizer (NVIDIA#5413)

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Disable sync_batch_comm in validation_step for GPT (NVIDIA#5397)

* disable sync_batch_comm in validation_step

Signed-off-by: ericharper <complex451@gmail.com>

* Read sync_batch_comm from config or default to False

Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>

* Update megatron_gpt_config to default sync_batch_comm to False to avoid CUDA error

Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>

* Empty

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Comment out test

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Markel Sanz Ausin <markelsanz14@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>

* Revert "Add temporary fix for CUDA issue in Dockerfile (NVIDIA#5421)" (NVIDIA#5431)

This reverts commit 0718b17.

* Revert workaround for T5 that sets number of workers to 0 & sync_batch_comm=False (NVIDIA#5420)

* Revert workers workaround

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix in config

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Fixed discrepancies

* updated Jenkisfile

* updated Jenkisfile

* Cleaning

* fixed the onnx bug in conformer for non-streaming models. (NVIDIA#5242) (NVIDIA#5446)

Signed-off-by: Vahid <vnoroozi@nvidia.com>

Signed-off-by: Vahid <vnoroozi@nvidia.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

Signed-off-by: Vahid <vnoroozi@nvidia.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>

* Set sync_batch_comm in other places (NVIDIA#5448)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Radtts 1.13 (NVIDIA#5451)

* [TTS] Fixing RADTTS training - removing view buffer and fixing accuracy issue (NVIDIA#5358)
* [TTS] add CI test for RADTTS training recipe.

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Radtts 1.13 plus (NVIDIA#5457)

* [TTS] Fixing RADTTS training - removing view buffer and fixing accuracy issue (NVIDIA#5358)
* Fixing RADTTS training - removing view buffer and fixing accuracy issue
* Fixes for Torchscript/Triton
* Added autocast to radtts UT
* using cuda() for training example

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Add num layers check (NVIDIA#5470)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Change to kwargs (NVIDIA#5475)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Support for finetuning and finetuning inference with .ckpt files & batch size refactoring (NVIDIA#5339) (NVIDIA#5478)

* Initial refactor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Resolve config before passing to load_from_checkpoint

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes for model parallel and nemo restore

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes for eval

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert config changes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Refactor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix typo

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove comments

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Minor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix validation reconfiguration

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove old comment

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fixes for test_ds

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* export_utils bugfix (NVIDIA#5480)

* updated export_utils

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Export fixes for Riva (NVIDIA#5496)

* Export fixes for Riva

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Cleaning up training_utils

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* minor bug fix (NVIDIA#5521)

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* added set_start_method + function param bugfix (NVIDIA#5539)

* added set_start_method + function param bugfix

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* upper bound torchmetrics

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: ericharper <complex451@gmail.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ericharper <complex451@gmail.com>

* remove notebook (NVIDIA#5548)

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: ericharper <complex451@gmail.com>

* Remove broadcast (NVIDIA#5558)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* cleaning

* Fix all gather while writing to a file during T5 finetuning (NVIDIA#5561)

* Gather from data parallel only instead of all ranks

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* update readme

Signed-off-by: ericharper <complex451@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added copyright

* fixed imports

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* cleaning

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed filesize check

* last cleaning

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated cmudict path

* fixed merge bug

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* warnings fix

* fix warnings

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* storing

* updated version

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* update Jenkinsfile versions

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* fixed issues

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* fixed more issues

* more fixes

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* added experimental tag

* Clarification updates

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* remove old cython code

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* remove old cython code

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* docstring fix

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* Enhancements

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* Enhancements

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* imports fix

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix typo

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* excessive comtutations fix

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* typecheck fix

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* Small refactoring

* Small refactoring

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* reversed exp_manager params

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* Fixed call for new function signature

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
Signed-off-by: Jason <jasoli@nvidia.com>
Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: ekmb <ebakhturina@nvidia.com>
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>
Signed-off-by: Igor Gitman <igitman@nvidia.com>
Signed-off-by: Sasha Meister <117230141+ssh-meister@users.noreply.github.com>
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: Virginia Adams <vadams@nvidia.com>
Signed-off-by: Yu Yao <yuya@nvidia.com>
Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Signed-off-by: Vahid <vnoroozi@nvidia.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>
Co-authored-by: jasonjjl1999 <jasonjjl1999@gmail.com>
Co-authored-by: richa.ren@mail.utoronto.ca <richa.ren@mail.utoronto.ca>
Co-authored-by: Oktai Tatanov <oktai.tatanov@gmail.com>
Co-authored-by: jasonjjl1999 <43978361+jasonjjl1999@users.noreply.github.com>
Co-authored-by: martynwei <martyn.wei@gmail.com>
Co-authored-by: Ryan Hong <66425733+rhong99@users.noreply.github.com>
Co-authored-by: Jason <jasoli@nvidia.com>
Co-authored-by: ericharper <complex451@gmail.com>
Co-authored-by: Boris Fomitchev <borisfom@users.noreply.github.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: Ramanathan Arunachalam <ramanathan.arun@rutgers.edu>
Co-authored-by: Ramanathan Arunachalam <rarunachalam@nvidia.com>
Co-authored-by: bene-ges <61418381+bene-ges@users.noreply.github.com>
Co-authored-by: Alexandra Antonova <aleksandraa@nvidia.com>
Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: PeganovAnton <peganoff2@mail.ru>
Co-authored-by: Jocelyn <jocelynh@nvidia.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Sean Naren <snarenthiran@nvidia.com>
Co-authored-by: Matvei Novikov <mattyson.so@gmail.com>
Co-authored-by: Zhilin Wang <wangzhilin12061996@hotmail.com>
Co-authored-by: Igor Gitman <igitman@nvidia.com>
Co-authored-by: Sasha Meister <117230141+ssh-meister@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: Adi Renduchintala <108822655+arendu@users.noreply.github.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com>
Co-authored-by: David <amosalla@asu.edu>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: Boris Fomitchev <bfomitchev@nvidia.com>
Co-authored-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Markel Sanz Ausin <markelsanz14@gmail.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
ericharper added a commit that referenced this pull request Jan 31, 2023
* Disable loss typecheck

* Fix spectrogram lengths

* Remove Precision 16 requirement

* Address lgtm alerts

* clean up unused code

* Address lgtm alerts

* Refactor audio_to_mel_torch method

* Use NeMo FilterBank to get melspec

Todo: set self.fb

* Fix filterbank max frequency to match with original VITS

* Fix filterbank features correct length

* Address lgtm issues

* Remove print statements

* Remove stft_pad_amount

* new structure for tts datasets in script folder

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* remove cmudict downloading

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* rename mixertts dataset, add vocoder dataset

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* add libritts processing

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* update tts dataset and libritts get data

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* fix bugs in vocoder ds

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* add ds

* changed vits yaml

* rm yaml

* fix yaml and model

* Added scaler

* refactored yaml

* managed to run in fp16

* refactoring

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* fix small bugs and add new todos

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* fix optimizers

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* Port Variational Inference with Adversarial Learning (VITS) to NeMo TTS (#6)

* Add vits files

Add vits_losses.py, vits_modules.py and vits.py.

* Move non-vits models to modules

* Add vits.yaml

* Add _loader to vits.py

* Add basic template for vits

* Update vits.yaml with vits parameters

* Remove extra space

* Add top level training script

* Add some variables to vits yaml

* Add forward and training methods

* Fix imports

* Added validation step

* Log training losses

* Update loss calls to use class attributes

* Add VITS to models list

* Fix all imports

* Remove old module calls

* Fix typo in monotonic align import

* Modified validation step

1. reverted to tensorboard
2. validation_step logs audio, mel-spec for batch 0
3. validation_step_alt logs audio, mel-spec for batch 0 and loss_mel

* Fix imports for VITS

* Remove old module calls

* Fix typo in monotonic align import

* Modified validation step

1. reverted to tensorboard
2. validation_step logs audio, mel-spec for batch 0
3. validation_step_alt logs audio, mel-spec for batch 0 and loss_mel

* Add parameters from original VITS config

* Fix config file

* Fix imports and generate spec from audio

* Fix incorrect dimensions

* Progress update

* Fix loss

* Fix cuda thing

* Fix monotonic align import

* Fix typos in vits.py

* Disable loss typecheck

* Fix spectrogram lengths

* Remove Precision 16 requirement

* Address lgtm alerts

* clean up unused code

* Address lgtm alerts

* Refactor audio_to_mel_torch method

* Use NeMo FilterBank to get melspec

Todo: set self.fb

* Fix filterbank max frequency to match with original VITS

* Fix filterbank features correct length

* Address lgtm issues

* Remove print statements

* Remove stft_pad_amount

Co-authored-by: martynwei <martyn.wei@gmail.com>
Co-authored-by: Ryan Hong <66425733+rhong99@users.noreply.github.com>
Co-authored-by: richa.ren@mail.utoronto.ca <richa.ren@mail.utoronto.ca>
Co-authored-by: Jason <jasoli@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* make new commit

Signed-off-by: Jason <jasoli@nvidia.com>

* add copyright headers

Signed-off-by: Jason <jasoli@nvidia.com>

* style

Signed-off-by: Jason <jasoli@nvidia.com>

* rename README

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* fix style without vits_modules

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* add numba code, fix style and add todos

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* small fix

* fix some todos

* added numba mas

* added DDP sampler

* specified versions

* fixed for new librosa version

* added feature loss

* added IPA phonemizer

* refactored IPA g2p

* added vits losses

* some ref

* fix

* added checkpointing

* cp

* cfg

* merged some 1.8.0 fixes

* plt fix

* fix logging

* fix checkpoint loading

* refactored inference

* fp32 run

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* new exp

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* Restored tests previously disabled for 22.03 base (#4109)

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* add augmentation to label models (#4113)

* add augmentation to label models

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* duration fix

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* Call register_bert_model after assigning self.bert_model variable (#4116)

Signed-off-by: Ramanathan Arunachalam <rarunachalam@nvidia.com>

Co-authored-by: Ramanathan Arunachalam <rarunachalam@nvidia.com>

* Tutorial on ITN with Thutmose tagger and small fixes (#4117)

* 1. Add tutorial. 2. Move a function to fix import in tutorial. 3. Merge multiple spaces into one space in the final output

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* fixes for code review

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* Add tutorial to tutorials.rst

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

Co-authored-by: Alexandra Antonova <aleksandraa@nvidia.com>

* cleaned up TN/ ITN doc (#4119)

* cleaned up TN/ ITN doc

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix typo

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix image

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix image

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* Check implicit grad acc in GLUE dataset building (#4123)

* Check implicit grad acc in GLUE dataset building

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix jenkins test for GLUE/XNLI

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* update the default (#4135)

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* Draft: Fix restoring from checkpoint for case when `model.common_dataset_parameters.label_vocab_dir` is provided (#4136)

* Fix restoring from checkpoint with label vocab dir

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Add tests for various ways to pass label ids to model

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix typo

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix typo

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Do not create tmp directory

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix parameter name

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* finish cherry-pick op

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix labels errors

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Remove duplicate stage

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Change target branch

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* fix typo (#4140)

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* Fix/punctuation avoid overwritting tmp files (#4144)

* Add draft of fixing tmp files overwritting

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Remove accidental changes

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Remove accidental changes

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Use built-in tempfile library

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix code style

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* bug_fix_diarization_manifest_creation (#4125)

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>

* fix doc (#4146)

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* Tacotron2 retrain (#4103)

* fix yaml

Signed-off-by: treacker <emshabalin@yandex.ru>

* Fix for new TTSDataset class

Signed-off-by: treacker <emshabalin@yandex.ru>

* added wandb logging

Signed-off-by: treacker <emshabalin@yandex.ru>

* added wandb logging

Signed-off-by: treacker <emshabalin@yandex.ru>

* fix numpy version

Signed-off-by: treacker <emshabalin@yandex.ru>

* fix numpy version

Signed-off-by: treacker <emshabalin@yandex.ru>

* inference fix

Signed-off-by: treacker <emshabalin@yandex.ru>

* removed old code

Signed-off-by: treacker <emshabalin@yandex.ru>

* updated parser logic

Signed-off-by: treacker <emshabalin@yandex.ru>

* reverted version update

Signed-off-by: treacker <emshabalin@yandex.ru>

* refactored parser logic

Signed-off-by: treacker <emshabalin@yandex.ru>

* Updated Jenkinsfile

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactored tutorial for Tacotron2

Signed-off-by: treacker <emshabalin@yandex.ru>

* Made backward compatibility

Signed-off-by: treacker <emshabalin@yandex.ru>

* Made backward compatibility

Signed-off-by: treacker <emshabalin@yandex.ru>

* Update Jenkinsfile

Signed-off-by: treacker <emshabalin@yandex.ru>

* Update tacotron.yaml

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

* cleaned up TN/ ITN doc (#4119)

* cleaned up TN/ ITN doc

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix typo

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix image

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix image

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: treacker <emshabalin@yandex.ru>

* Check implicit grad acc in GLUE dataset building (#4123)

* Check implicit grad acc in GLUE dataset building

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix jenkins test for GLUE/XNLI

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

* Fixed jenkins

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* Multiprocess improvements (#4127)

* initial commit

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* start fix

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* improve multiprocessing speed while creating speaker dataset

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* updated scp to filelist

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* WaveGlow input type fixes (#4151)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* notebooks' link, typo and import  fix  (#4158)

* redo missing pr 4007

Signed-off-by: fayejf <fayejf07@gmail.com>

* remove extremely unreliable links

Signed-off-by: fayejf <fayejf07@gmail.com>

* Thutmose tagger bug fixes (#4162)

* add pretrained ngc model, small fixes

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* fix model location

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* fix model location

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* 1. fix typos. 2. write magic functions without space

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* add example of inference with pretrained model

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* changed model location to nemo

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* style fix

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* fix space

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

Co-authored-by: Alexandra Antonova <aleksandraa@nvidia.com>

* update speaker docs (#4164)

* update speaker docs

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* chunks -> segments

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* Khz -> kHz

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* changed to vits g2p

* refactoring

* added cosineLR

* Updated whitelist path

* added vanilla torch grad scaler

* Fixed lightning version

* added warmup and wd

* switched to cosineLR

* refactored data classes for vits

* some fixes

* fixed import

* changeg train loop

* fixed scheduler bug

* refactoring for exps

* Refactored loss logic

* Ref for exps

* added coqui stuff

* exps

* bugfix

* added side file

* bugfix

* reverted

* fixed sampler behaviour

* updated for ptl 1.7.2

* refactored dataloader func

* some cleaning

* reverted to vanilla loss

* modified for pickling

* added dataset class

* fixed torch version

* added autocast for fp training

* removed coqui files

* 'Fixed tokenizer'

* Fix tokenizer

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* Fix link to inference notebook (#5247)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* Update ASR scores table (#5254)

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>

* Fix links to speaker identification notebook (#5260)

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Minor typo fixes in TTS tutorial (#5266)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* Pcla tutorial fixes (#5271)

* Fixed typos

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fixed cell type and tatoeba reference

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fixed typo

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fixed branch variable

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fix bug into Dialogue tutorial (#5277)

* Typo fix (#5288)

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fix dialogue tutorial bug (#5297)

* set add_pooling_layer=False for huggingface bert model

* remove add_pooling_layer=False and set find_unused_parameters=True

* set num_prompt_tokens to 0 for huggingface

* small bugfix for r1.13.0 (#5310)

* typo fix

Signed-off-by: fayejf <fayejf07@gmail.com>

* udpate transcribe

Signed-off-by: fayejf <fayejf07@gmail.com>

Signed-off-by: fayejf <fayejf07@gmail.com>

* Add italian model checkpoints (#5316)

Signed-off-by: Igor Gitman <igitman@nvidia.com>

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* [STT] Add Ru ASR Conformer-CTC and Conformer-Transducer (#5340)

* [STT] Add stt_ru_conformer_ctc_large

Signed-off-by: Sasha Meister <117230141+ssh-meister@users.noreply.github.com>

* [STT] Add stt_ru_conformer_transducer_large

Add stt_ru_conformer_transducer_large

Signed-off-by: Sasha Meister <117230141+ssh-meister@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Sasha Meister <117230141+ssh-meister@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Pcla tutorial fixes (#5313)

* fixes

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* fixes

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* moved `create_text_and_labels` to token_classification_utils.py

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* a lot of refactoring

* strict ptl version

* strict ptl version

* reverted plt version

* Added base text2audio class

* Fix issue with HF Model upload tutorial (#5359)

* Add Gradio App to ASR Docs (#5270)

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>
(cherry picked from commit e4b6a38)

* Fix issue with normalized config for dataset name

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>

* tutorial fixes (#5354)

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Add SDP documentation (#5274)

* Add details to SDP README.md

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add docstring to WriteManifest processor

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add docstring to CreateInitialManifestMLS

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add ModifyManifestTextProcessor docstring

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add ASRInference docstring

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add base_processor docstrings

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add minimal SDP docs page

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Update tools/speech_dataset_processor/README.md

Co-authored-by: Igor Gitman <igitman@nvidia.com>
Signed-off-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>

* Write simple README for SDP and move complex explanations to docs

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Remove incorrect type hints

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Make config example less confusing

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Fix typo

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Clarify that YAML file is config file in README

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Remove unused imports

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Remove SDP docs for now

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Remove links to docs in SDP README

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Co-authored-by: Igor Gitman <igitman@nvidia.com>

* [Bugfix] Added rm -f / wget- nc command in multispeaker sim notebook to r1.13.0 (#5375)

* Fix minor error in notebook

Signed-off-by: Taejin Park <tango4j@gmail.com>

* changed branch name in tutorial notebook

Signed-off-by: Taejin Park <tango4j@gmail.com>

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Rename Speech Dataset Processor to Speech Data Processor (#5378)

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* fix for num worker 0 causing issues in losses after 1 epoch (#5379)

* Fixed bug in notebook (#5382)

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Force MHA QKV onto fp32 (#5391)

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>

* Added scheduling variety

* ref

* Fix for prompt table restore error (#5393)

* Fix for prompt table restore error

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Added more saftey checks

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Added more condition checks

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Fix args (#5410)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* bugfix

* import tests

* Add temporary fix for CUDA issue in Dockerfile (#5421)

Signed-off-by: Yu Yao <yuya@nvidia.com>

Signed-off-by: Yu Yao <yuya@nvidia.com>

* Megatron Export Update (#5343)

* export update for Megatron + change ORT optimization

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated export_utils to use autocast instead of manually casting >:/

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* removed dtype from LayerNorm

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* added comment

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* reverting changes on FloatCast

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* Cherry-picked changes from megatron-norm

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* updated asr_model import to cast_utils

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* updated del onnx_model place

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* changed ort optimization to basic -> temp fix

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Boris Fomitchev <bfomitchev@nvidia.com>

* disable pc test (#5426)

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* Fix GPT generation when using sentencepiece tokenizer (#5413)

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Disable sync_batch_comm in validation_step for GPT (#5397)

* disable sync_batch_comm in validation_step

Signed-off-by: ericharper <complex451@gmail.com>

* Read sync_batch_comm from config or default to False

Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>

* Update megatron_gpt_config to default sync_batch_comm to False to avoid CUDA error

Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>

* Empty

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Comment out test

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Markel Sanz Ausin <markelsanz14@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>

* Revert "Add temporary fix for CUDA issue in Dockerfile (#5421)" (#5431)

This reverts commit 0718b17.

* Revert workaround for T5 that sets number of workers to 0 & sync_batch_comm=False (#5420)

* Revert workers workaround

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix in config

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Fixed discrepancies

* updated Jenkisfile

* updated Jenkisfile

* Cleaning

* fixed the onnx bug in conformer for non-streaming models. (#5242) (#5446)

Signed-off-by: Vahid <vnoroozi@nvidia.com>

Signed-off-by: Vahid <vnoroozi@nvidia.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

Signed-off-by: Vahid <vnoroozi@nvidia.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>

* Set sync_batch_comm in other places (#5448)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Radtts 1.13 (#5451)

* [TTS] Fixing RADTTS training - removing view buffer and fixing accuracy issue (#5358)
* [TTS] add CI test for RADTTS training recipe.

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Radtts 1.13 plus (#5457)

* [TTS] Fixing RADTTS training - removing view buffer and fixing accuracy issue (#5358)
* Fixing RADTTS training - removing view buffer and fixing accuracy issue
* Fixes for Torchscript/Triton
* Added autocast to radtts UT
* using cuda() for training example

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Add num layers check (#5470)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Change to kwargs (#5475)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Support for finetuning and finetuning inference with .ckpt files & batch size refactoring (#5339) (#5478)

* Initial refactor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Resolve config before passing to load_from_checkpoint

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes for model parallel and nemo restore

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes for eval

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert config changes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Refactor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix typo

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove comments

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Minor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix validation reconfiguration

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove old comment

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fixes for test_ds

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* export_utils bugfix (#5480)

* updated export_utils

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Export fixes for Riva (#5496)

* Export fixes for Riva

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Cleaning up training_utils

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* minor bug fix (#5521)

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* added set_start_method + function param bugfix (#5539)

* added set_start_method + function param bugfix

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* upper bound torchmetrics

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: ericharper <complex451@gmail.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ericharper <complex451@gmail.com>

* remove notebook (#5548)

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: ericharper <complex451@gmail.com>

* Remove broadcast (#5558)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* cleaning

* Fix all gather while writing to a file during T5 finetuning (#5561)

* Gather from data parallel only instead of all ranks

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* update readme

Signed-off-by: ericharper <complex451@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added copyright

* fixed imports

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* cleaning

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed filesize check

* last cleaning

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated cmudict path

* fixed merge bug

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* warnings fix

* fix warnings

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* storing

* updated version

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* update Jenkinsfile versions

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* fixed issues

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* fixed more issues

* more fixes

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* added experimental tag

* Clarification updates

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* remove old cython code

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* remove old cython code

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* docstring fix

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* Enhancements

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* Enhancements

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* imports fix

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix typo

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* excessive comtutations fix

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* typecheck fix

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* Small refactoring

* Small refactoring

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* reversed exp_manager params

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* Fixed call for new function signature

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
Signed-off-by: Jason <jasoli@nvidia.com>
Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: ekmb <ebakhturina@nvidia.com>
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>
Signed-off-by: Igor Gitman <igitman@nvidia.com>
Signed-off-by: Sasha Meister <117230141+ssh-meister@users.noreply.github.com>
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: Virginia Adams <vadams@nvidia.com>
Signed-off-by: Yu Yao <yuya@nvidia.com>
Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Signed-off-by: Vahid <vnoroozi@nvidia.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>
Co-authored-by: jasonjjl1999 <jasonjjl1999@gmail.com>
Co-authored-by: richa.ren@mail.utoronto.ca <richa.ren@mail.utoronto.ca>
Co-authored-by: Oktai Tatanov <oktai.tatanov@gmail.com>
Co-authored-by: jasonjjl1999 <43978361+jasonjjl1999@users.noreply.github.com>
Co-authored-by: martynwei <martyn.wei@gmail.com>
Co-authored-by: Ryan Hong <66425733+rhong99@users.noreply.github.com>
Co-authored-by: Jason <jasoli@nvidia.com>
Co-authored-by: ericharper <complex451@gmail.com>
Co-authored-by: Boris Fomitchev <borisfom@users.noreply.github.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: Ramanathan Arunachalam <ramanathan.arun@rutgers.edu>
Co-authored-by: Ramanathan Arunachalam <rarunachalam@nvidia.com>
Co-authored-by: bene-ges <61418381+bene-ges@users.noreply.github.com>
Co-authored-by: Alexandra Antonova <aleksandraa@nvidia.com>
Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: PeganovAnton <peganoff2@mail.ru>
Co-authored-by: Jocelyn <jocelynh@nvidia.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Sean Naren <snarenthiran@nvidia.com>
Co-authored-by: Matvei Novikov <mattyson.so@gmail.com>
Co-authored-by: Zhilin Wang <wangzhilin12061996@hotmail.com>
Co-authored-by: Igor Gitman <igitman@nvidia.com>
Co-authored-by: Sasha Meister <117230141+ssh-meister@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: Adi Renduchintala <108822655+arendu@users.noreply.github.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com>
Co-authored-by: David <amosalla@asu.edu>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: Boris Fomitchev <bfomitchev@nvidia.com>
Co-authored-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Markel Sanz Ausin <markelsanz14@gmail.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
ericharper added a commit that referenced this pull request Jan 31, 2023
* Disable loss typecheck

* Fix spectrogram lengths

* Remove Precision 16 requirement

* Address lgtm alerts

* clean up unused code

* Address lgtm alerts

* Refactor audio_to_mel_torch method

* Use NeMo FilterBank to get melspec

Todo: set self.fb

* Fix filterbank max frequency to match with original VITS

* Fix filterbank features correct length

* Address lgtm issues

* Remove print statements

* Remove stft_pad_amount

* new structure for tts datasets in script folder

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* remove cmudict downloading

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* rename mixertts dataset, add vocoder dataset

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* add libritts processing

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* update tts dataset and libritts get data

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* fix bugs in vocoder ds

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* add ds

* changed vits yaml

* rm yaml

* fix yaml and model

* Added scaler

* refactored yaml

* managed to run in fp16

* refactoring

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* fix small bugs and add new todos

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* fix optimizers

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* Port Variational Inference with Adversarial Learning (VITS) to NeMo TTS (#6)

* Add vits files

Add vits_losses.py, vits_modules.py and vits.py.

* Move non-vits models to modules

* Add vits.yaml

* Add _loader to vits.py

* Add basic template for vits

* Update vits.yaml with vits parameters

* Remove extra space

* Add top level training script

* Add some variables to vits yaml

* Add forward and training methods

* Fix imports

* Added validation step

* Log training losses

* Update loss calls to use class attributes

* Add VITS to models list

* Fix all imports

* Remove old module calls

* Fix typo in monotonic align import

* Modified validation step

1. reverted to tensorboard
2. validation_step logs audio, mel-spec for batch 0
3. validation_step_alt logs audio, mel-spec for batch 0 and loss_mel

* Fix imports for VITS

* Remove old module calls

* Fix typo in monotonic align import

* Modified validation step

1. reverted to tensorboard
2. validation_step logs audio, mel-spec for batch 0
3. validation_step_alt logs audio, mel-spec for batch 0 and loss_mel

* Add parameters from original VITS config

* Fix config file

* Fix imports and generate spec from audio

* Fix incorrect dimensions

* Progress update

* Fix loss

* Fix cuda thing

* Fix monotonic align import

* Fix typos in vits.py

* Disable loss typecheck

* Fix spectrogram lengths

* Remove Precision 16 requirement

* Address lgtm alerts

* clean up unused code

* Address lgtm alerts

* Refactor audio_to_mel_torch method

* Use NeMo FilterBank to get melspec

Todo: set self.fb

* Fix filterbank max frequency to match with original VITS

* Fix filterbank features correct length

* Address lgtm issues

* Remove print statements

* Remove stft_pad_amount

Co-authored-by: martynwei <martyn.wei@gmail.com>
Co-authored-by: Ryan Hong <66425733+rhong99@users.noreply.github.com>
Co-authored-by: richa.ren@mail.utoronto.ca <richa.ren@mail.utoronto.ca>
Co-authored-by: Jason <jasoli@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* make new commit

Signed-off-by: Jason <jasoli@nvidia.com>

* add copyright headers

Signed-off-by: Jason <jasoli@nvidia.com>

* style

Signed-off-by: Jason <jasoli@nvidia.com>

* rename README

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* fix style without vits_modules

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* add numba code, fix style and add todos

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* small fix

* fix some todos

* added numba mas

* added DDP sampler

* specified versions

* fixed for new librosa version

* added feature loss

* added IPA phonemizer

* refactored IPA g2p

* added vits losses

* some ref

* fix

* added checkpointing

* cp

* cfg

* merged some 1.8.0 fixes

* plt fix

* fix logging

* fix checkpoint loading

* refactored inference

* fp32 run

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* new exp

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* Restored tests previously disabled for 22.03 base (#4109)

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* add augmentation to label models (#4113)

* add augmentation to label models

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* duration fix

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* Call register_bert_model after assigning self.bert_model variable (#4116)

Signed-off-by: Ramanathan Arunachalam <rarunachalam@nvidia.com>

Co-authored-by: Ramanathan Arunachalam <rarunachalam@nvidia.com>

* Tutorial on ITN with Thutmose tagger and small fixes (#4117)

* 1. Add tutorial. 2. Move a function to fix import in tutorial. 3. Merge multiple spaces into one space in the final output

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* fixes for code review

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* Add tutorial to tutorials.rst

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

Co-authored-by: Alexandra Antonova <aleksandraa@nvidia.com>

* cleaned up TN/ ITN doc (#4119)

* cleaned up TN/ ITN doc

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix typo

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix image

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix image

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* Check implicit grad acc in GLUE dataset building (#4123)

* Check implicit grad acc in GLUE dataset building

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix jenkins test for GLUE/XNLI

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* update the default (#4135)

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* Draft: Fix restoring from checkpoint for case when `model.common_dataset_parameters.label_vocab_dir` is provided (#4136)

* Fix restoring from checkpoint with label vocab dir

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Add tests for various ways to pass label ids to model

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix typo

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix typo

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Do not create tmp directory

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix parameter name

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* finish cherry-pick op

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix labels errors

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Remove duplicate stage

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Change target branch

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* fix typo (#4140)

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* Fix/punctuation avoid overwritting tmp files (#4144)

* Add draft of fixing tmp files overwritting

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Remove accidental changes

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Remove accidental changes

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Use built-in tempfile library

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix code style

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* bug_fix_diarization_manifest_creation (#4125)

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>

* fix doc (#4146)

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* Tacotron2 retrain (#4103)

* fix yaml

Signed-off-by: treacker <emshabalin@yandex.ru>

* Fix for new TTSDataset class

Signed-off-by: treacker <emshabalin@yandex.ru>

* added wandb logging

Signed-off-by: treacker <emshabalin@yandex.ru>

* added wandb logging

Signed-off-by: treacker <emshabalin@yandex.ru>

* fix numpy version

Signed-off-by: treacker <emshabalin@yandex.ru>

* fix numpy version

Signed-off-by: treacker <emshabalin@yandex.ru>

* inference fix

Signed-off-by: treacker <emshabalin@yandex.ru>

* removed old code

Signed-off-by: treacker <emshabalin@yandex.ru>

* updated parser logic

Signed-off-by: treacker <emshabalin@yandex.ru>

* reverted version update

Signed-off-by: treacker <emshabalin@yandex.ru>

* refactored parser logic

Signed-off-by: treacker <emshabalin@yandex.ru>

* Updated Jenkinsfile

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactored tutorial for Tacotron2

Signed-off-by: treacker <emshabalin@yandex.ru>

* Made backward compatibility

Signed-off-by: treacker <emshabalin@yandex.ru>

* Made backward compatibility

Signed-off-by: treacker <emshabalin@yandex.ru>

* Update Jenkinsfile

Signed-off-by: treacker <emshabalin@yandex.ru>

* Update tacotron.yaml

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

* cleaned up TN/ ITN doc (#4119)

* cleaned up TN/ ITN doc

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix typo

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix image

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix image

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: treacker <emshabalin@yandex.ru>

* Check implicit grad acc in GLUE dataset building (#4123)

* Check implicit grad acc in GLUE dataset building

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix jenkins test for GLUE/XNLI

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

* Fixed jenkins

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* Multiprocess improvements (#4127)

* initial commit

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* start fix

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* improve multiprocessing speed while creating speaker dataset

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* updated scp to filelist

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* WaveGlow input type fixes (#4151)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* notebooks' link, typo and import  fix  (#4158)

* redo missing pr 4007

Signed-off-by: fayejf <fayejf07@gmail.com>

* remove extremely unreliable links

Signed-off-by: fayejf <fayejf07@gmail.com>

* Thutmose tagger bug fixes (#4162)

* add pretrained ngc model, small fixes

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* fix model location

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* fix model location

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* 1. fix typos. 2. write magic functions without space

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* add example of inference with pretrained model

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* changed model location to nemo

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* style fix

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* fix space

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

Co-authored-by: Alexandra Antonova <aleksandraa@nvidia.com>

* update speaker docs (#4164)

* update speaker docs

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* chunks -> segments

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* Khz -> kHz

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* changed to vits g2p

* refactoring

* added cosineLR

* Updated whitelist path

* added vanilla torch grad scaler

* Fixed lightning version

* added warmup and wd

* switched to cosineLR

* refactored data classes for vits

* some fixes

* fixed import

* changeg train loop

* fixed scheduler bug

* refactoring for exps

* Refactored loss logic

* Ref for exps

* added coqui stuff

* exps

* bugfix

* added side file

* bugfix

* reverted

* fixed sampler behaviour

* updated for ptl 1.7.2

* refactored dataloader func

* some cleaning

* reverted to vanilla loss

* modified for pickling

* added dataset class

* fixed torch version

* added autocast for fp training

* removed coqui files

* 'Fixed tokenizer'

* Fix tokenizer

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* Fix link to inference notebook (#5247)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* Update ASR scores table (#5254)

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>

* Fix links to speaker identification notebook (#5260)

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Minor typo fixes in TTS tutorial (#5266)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* Pcla tutorial fixes (#5271)

* Fixed typos

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fixed cell type and tatoeba reference

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fixed typo

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fixed branch variable

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fix bug into Dialogue tutorial (#5277)

* Typo fix (#5288)

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fix dialogue tutorial bug (#5297)

* set add_pooling_layer=False for huggingface bert model

* remove add_pooling_layer=False and set find_unused_parameters=True

* set num_prompt_tokens to 0 for huggingface

* small bugfix for r1.13.0 (#5310)

* typo fix

Signed-off-by: fayejf <fayejf07@gmail.com>

* udpate transcribe

Signed-off-by: fayejf <fayejf07@gmail.com>

Signed-off-by: fayejf <fayejf07@gmail.com>

* Add italian model checkpoints (#5316)

Signed-off-by: Igor Gitman <igitman@nvidia.com>

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* [STT] Add Ru ASR Conformer-CTC and Conformer-Transducer (#5340)

* [STT] Add stt_ru_conformer_ctc_large

Signed-off-by: Sasha Meister <117230141+ssh-meister@users.noreply.github.com>

* [STT] Add stt_ru_conformer_transducer_large

Add stt_ru_conformer_transducer_large

Signed-off-by: Sasha Meister <117230141+ssh-meister@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Sasha Meister <117230141+ssh-meister@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Pcla tutorial fixes (#5313)

* fixes

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* fixes

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* moved `create_text_and_labels` to token_classification_utils.py

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* a lot of refactoring

* strict ptl version

* strict ptl version

* reverted plt version

* Added base text2audio class

* Fix issue with HF Model upload tutorial (#5359)

* Add Gradio App to ASR Docs (#5270)

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>
(cherry picked from commit e4b6a38)

* Fix issue with normalized config for dataset name

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>

* tutorial fixes (#5354)

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Add SDP documentation (#5274)

* Add details to SDP README.md

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add docstring to WriteManifest processor

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add docstring to CreateInitialManifestMLS

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add ModifyManifestTextProcessor docstring

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add ASRInference docstring

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add base_processor docstrings

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add minimal SDP docs page

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Update tools/speech_dataset_processor/README.md

Co-authored-by: Igor Gitman <igitman@nvidia.com>
Signed-off-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>

* Write simple README for SDP and move complex explanations to docs

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Remove incorrect type hints

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Make config example less confusing

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Fix typo

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Clarify that YAML file is config file in README

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Remove unused imports

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Remove SDP docs for now

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Remove links to docs in SDP README

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Co-authored-by: Igor Gitman <igitman@nvidia.com>

* [Bugfix] Added rm -f / wget- nc command in multispeaker sim notebook to r1.13.0 (#5375)

* Fix minor error in notebook

Signed-off-by: Taejin Park <tango4j@gmail.com>

* changed branch name in tutorial notebook

Signed-off-by: Taejin Park <tango4j@gmail.com>

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Rename Speech Dataset Processor to Speech Data Processor (#5378)

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* fix for num worker 0 causing issues in losses after 1 epoch (#5379)

* Fixed bug in notebook (#5382)

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Force MHA QKV onto fp32 (#5391)

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>

* Added scheduling variety

* ref

* Fix for prompt table restore error (#5393)

* Fix for prompt table restore error

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Added more saftey checks

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Added more condition checks

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Fix args (#5410)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* bugfix

* import tests

* Add temporary fix for CUDA issue in Dockerfile (#5421)

Signed-off-by: Yu Yao <yuya@nvidia.com>

Signed-off-by: Yu Yao <yuya@nvidia.com>

* Megatron Export Update (#5343)

* export update for Megatron + change ORT optimization

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated export_utils to use autocast instead of manually casting >:/

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* removed dtype from LayerNorm

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* added comment

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* reverting changes on FloatCast

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* Cherry-picked changes from megatron-norm

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* updated asr_model import to cast_utils

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* updated del onnx_model place

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* changed ort optimization to basic -> temp fix

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Boris Fomitchev <bfomitchev@nvidia.com>

* disable pc test (#5426)

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* Fix GPT generation when using sentencepiece tokenizer (#5413)

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Disable sync_batch_comm in validation_step for GPT (#5397)

* disable sync_batch_comm in validation_step

Signed-off-by: ericharper <complex451@gmail.com>

* Read sync_batch_comm from config or default to False

Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>

* Update megatron_gpt_config to default sync_batch_comm to False to avoid CUDA error

Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>

* Empty

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Comment out test

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Markel Sanz Ausin <markelsanz14@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>

* Revert "Add temporary fix for CUDA issue in Dockerfile (#5421)" (#5431)

This reverts commit 0718b17.

* Revert workaround for T5 that sets number of workers to 0 & sync_batch_comm=False (#5420)

* Revert workers workaround

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix in config

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Fixed discrepancies

* updated Jenkisfile

* updated Jenkisfile

* Cleaning

* fixed the onnx bug in conformer for non-streaming models. (#5242) (#5446)

Signed-off-by: Vahid <vnoroozi@nvidia.com>

Signed-off-by: Vahid <vnoroozi@nvidia.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

Signed-off-by: Vahid <vnoroozi@nvidia.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>

* Set sync_batch_comm in other places (#5448)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Radtts 1.13 (#5451)

* [TTS] Fixing RADTTS training - removing view buffer and fixing accuracy issue (#5358)
* [TTS] add CI test for RADTTS training recipe.

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Radtts 1.13 plus (#5457)

* [TTS] Fixing RADTTS training - removing view buffer and fixing accuracy issue (#5358)
* Fixing RADTTS training - removing view buffer and fixing accuracy issue
* Fixes for Torchscript/Triton
* Added autocast to radtts UT
* using cuda() for training example

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Add num layers check (#5470)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Change to kwargs (#5475)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Support for finetuning and finetuning inference with .ckpt files & batch size refactoring (#5339) (#5478)

* Initial refactor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Resolve config before passing to load_from_checkpoint

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes for model parallel and nemo restore

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes for eval

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert config changes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Refactor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix typo

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove comments

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Minor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix validation reconfiguration

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove old comment

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fixes for test_ds

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* export_utils bugfix (#5480)

* updated export_utils

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Export fixes for Riva (#5496)

* Export fixes for Riva

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Cleaning up training_utils

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* minor bug fix (#5521)

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* added set_start_method + function param bugfix (#5539)

* added set_start_method + function param bugfix

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* upper bound torchmetrics

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: ericharper <complex451@gmail.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ericharper <complex451@gmail.com>

* remove notebook (#5548)

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: ericharper <complex451@gmail.com>

* Remove broadcast (#5558)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* cleaning

* Fix all gather while writing to a file during T5 finetuning (#5561)

* Gather from data parallel only instead of all ranks

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* update readme

Signed-off-by: ericharper <complex451@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added copyright

* fixed imports

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* cleaning

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed filesize check

* last cleaning

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated cmudict path

* fixed merge bug

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* warnings fix

* fix warnings

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* storing

* updated version

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* update Jenkinsfile versions

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* fixed issues

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* fixed more issues

* more fixes

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* added experimental tag

* Clarification updates

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* remove old cython code

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* remove old cython code

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* docstring fix

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* Enhancements

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* Enhancements

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* imports fix

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix typo

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* excessive comtutations fix

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* typecheck fix

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* Small refactoring

* Small refactoring

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* reversed exp_manager params

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* Fixed call for new function signature

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
Signed-off-by: Jason <jasoli@nvidia.com>
Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: ekmb <ebakhturina@nvidia.com>
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>
Signed-off-by: Igor Gitman <igitman@nvidia.com>
Signed-off-by: Sasha Meister <117230141+ssh-meister@users.noreply.github.com>
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: Virginia Adams <vadams@nvidia.com>
Signed-off-by: Yu Yao <yuya@nvidia.com>
Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Signed-off-by: Vahid <vnoroozi@nvidia.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>
Co-authored-by: jasonjjl1999 <jasonjjl1999@gmail.com>
Co-authored-by: richa.ren@mail.utoronto.ca <richa.ren@mail.utoronto.ca>
Co-authored-by: Oktai Tatanov <oktai.tatanov@gmail.com>
Co-authored-by: jasonjjl1999 <43978361+jasonjjl1999@users.noreply.github.com>
Co-authored-by: martynwei <martyn.wei@gmail.com>
Co-authored-by: Ryan Hong <66425733+rhong99@users.noreply.github.com>
Co-authored-by: Jason <jasoli@nvidia.com>
Co-authored-by: ericharper <complex451@gmail.com>
Co-authored-by: Boris Fomitchev <borisfom@users.noreply.github.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: Ramanathan Arunachalam <ramanathan.arun@rutgers.edu>
Co-authored-by: Ramanathan Arunachalam <rarunachalam@nvidia.com>
Co-authored-by: bene-ges <61418381+bene-ges@users.noreply.github.com>
Co-authored-by: Alexandra Antonova <aleksandraa@nvidia.com>
Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: PeganovAnton <peganoff2@mail.ru>
Co-authored-by: Jocelyn <jocelynh@nvidia.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Sean Naren <snarenthiran@nvidia.com>
Co-authored-by: Matvei Novikov <mattyson.so@gmail.com>
Co-authored-by: Zhilin Wang <wangzhilin12061996@hotmail.com>
Co-authored-by: Igor Gitman <igitman@nvidia.com>
Co-authored-by: Sasha Meister <117230141+ssh-meister@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: Adi Renduchintala <108822655+arendu@users.noreply.github.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com>
Co-authored-by: David <amosalla@asu.edu>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: Boris Fomitchev <bfomitchev@nvidia.com>
Co-authored-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Markel Sanz Ausin <markelsanz14@gmail.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Kipok added a commit to Kipok/NeMo that referenced this pull request Jan 31, 2023
* Disable loss typecheck

* Fix spectrogram lengths

* Remove Precision 16 requirement

* Address lgtm alerts

* clean up unused code

* Address lgtm alerts

* Refactor audio_to_mel_torch method

* Use NeMo FilterBank to get melspec

Todo: set self.fb

* Fix filterbank max frequency to match with original VITS

* Fix filterbank features correct length

* Address lgtm issues

* Remove print statements

* Remove stft_pad_amount

* new structure for tts datasets in script folder

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* remove cmudict downloading

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* rename mixertts dataset, add vocoder dataset

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* add libritts processing

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* update tts dataset and libritts get data

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* fix bugs in vocoder ds

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* add ds

* changed vits yaml

* rm yaml

* fix yaml and model

* Added scaler

* refactored yaml

* managed to run in fp16

* refactoring

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* fix small bugs and add new todos

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* fix optimizers

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* Port Variational Inference with Adversarial Learning (VITS) to NeMo TTS (NVIDIA#6)

* Add vits files

Add vits_losses.py, vits_modules.py and vits.py.

* Move non-vits models to modules

* Add vits.yaml

* Add _loader to vits.py

* Add basic template for vits

* Update vits.yaml with vits parameters

* Remove extra space

* Add top level training script

* Add some variables to vits yaml

* Add forward and training methods

* Fix imports

* Added validation step

* Log training losses

* Update loss calls to use class attributes

* Add VITS to models list

* Fix all imports

* Remove old module calls

* Fix typo in monotonic align import

* Modified validation step

1. reverted to tensorboard
2. validation_step logs audio, mel-spec for batch 0
3. validation_step_alt logs audio, mel-spec for batch 0 and loss_mel

* Fix imports for VITS

* Remove old module calls

* Fix typo in monotonic align import

* Modified validation step

1. reverted to tensorboard
2. validation_step logs audio, mel-spec for batch 0
3. validation_step_alt logs audio, mel-spec for batch 0 and loss_mel

* Add parameters from original VITS config

* Fix config file

* Fix imports and generate spec from audio

* Fix incorrect dimensions

* Progress update

* Fix loss

* Fix cuda thing

* Fix monotonic align import

* Fix typos in vits.py

* Disable loss typecheck

* Fix spectrogram lengths

* Remove Precision 16 requirement

* Address lgtm alerts

* clean up unused code

* Address lgtm alerts

* Refactor audio_to_mel_torch method

* Use NeMo FilterBank to get melspec

Todo: set self.fb

* Fix filterbank max frequency to match with original VITS

* Fix filterbank features correct length

* Address lgtm issues

* Remove print statements

* Remove stft_pad_amount

Co-authored-by: martynwei <martyn.wei@gmail.com>
Co-authored-by: Ryan Hong <66425733+rhong99@users.noreply.github.com>
Co-authored-by: richa.ren@mail.utoronto.ca <richa.ren@mail.utoronto.ca>
Co-authored-by: Jason <jasoli@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* make new commit

Signed-off-by: Jason <jasoli@nvidia.com>

* add copyright headers

Signed-off-by: Jason <jasoli@nvidia.com>

* style

Signed-off-by: Jason <jasoli@nvidia.com>

* rename README

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* fix style without vits_modules

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* add numba code, fix style and add todos

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* small fix

* fix some todos

* added numba mas

* added DDP sampler

* specified versions

* fixed for new librosa version

* added feature loss

* added IPA phonemizer

* refactored IPA g2p

* added vits losses

* some ref

* fix

* added checkpointing

* cp

* cfg

* merged some 1.8.0 fixes

* plt fix

* fix logging

* fix checkpoint loading

* refactored inference

* fp32 run

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* new exp

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* Restored tests previously disabled for 22.03 base (NVIDIA#4109)

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* add augmentation to label models (NVIDIA#4113)

* add augmentation to label models

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* duration fix

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* Call register_bert_model after assigning self.bert_model variable (NVIDIA#4116)

Signed-off-by: Ramanathan Arunachalam <rarunachalam@nvidia.com>

Co-authored-by: Ramanathan Arunachalam <rarunachalam@nvidia.com>

* Tutorial on ITN with Thutmose tagger and small fixes (NVIDIA#4117)

* 1. Add tutorial. 2. Move a function to fix import in tutorial. 3. Merge multiple spaces into one space in the final output

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* fixes for code review

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* Add tutorial to tutorials.rst

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

Co-authored-by: Alexandra Antonova <aleksandraa@nvidia.com>

* cleaned up TN/ ITN doc (NVIDIA#4119)

* cleaned up TN/ ITN doc

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix typo

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix image

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix image

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* Check implicit grad acc in GLUE dataset building (NVIDIA#4123)

* Check implicit grad acc in GLUE dataset building

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix jenkins test for GLUE/XNLI

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* update the default (NVIDIA#4135)

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* Draft: Fix restoring from checkpoint for case when `model.common_dataset_parameters.label_vocab_dir` is provided (NVIDIA#4136)

* Fix restoring from checkpoint with label vocab dir

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Add tests for various ways to pass label ids to model

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix typo

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix typo

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Do not create tmp directory

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix parameter name

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* finish cherry-pick op

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix labels errors

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Remove duplicate stage

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Change target branch

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* fix typo (NVIDIA#4140)

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* Fix/punctuation avoid overwritting tmp files (NVIDIA#4144)

* Add draft of fixing tmp files overwritting

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Remove accidental changes

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Remove accidental changes

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Use built-in tempfile library

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix code style

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* bug_fix_diarization_manifest_creation (NVIDIA#4125)

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>

* fix doc (NVIDIA#4146)

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* Tacotron2 retrain (NVIDIA#4103)

* fix yaml

Signed-off-by: treacker <emshabalin@yandex.ru>

* Fix for new TTSDataset class

Signed-off-by: treacker <emshabalin@yandex.ru>

* added wandb logging

Signed-off-by: treacker <emshabalin@yandex.ru>

* added wandb logging

Signed-off-by: treacker <emshabalin@yandex.ru>

* fix numpy version

Signed-off-by: treacker <emshabalin@yandex.ru>

* fix numpy version

Signed-off-by: treacker <emshabalin@yandex.ru>

* inference fix

Signed-off-by: treacker <emshabalin@yandex.ru>

* removed old code

Signed-off-by: treacker <emshabalin@yandex.ru>

* updated parser logic

Signed-off-by: treacker <emshabalin@yandex.ru>

* reverted version update

Signed-off-by: treacker <emshabalin@yandex.ru>

* refactored parser logic

Signed-off-by: treacker <emshabalin@yandex.ru>

* Updated Jenkinsfile

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactored tutorial for Tacotron2

Signed-off-by: treacker <emshabalin@yandex.ru>

* Made backward compatibility

Signed-off-by: treacker <emshabalin@yandex.ru>

* Made backward compatibility

Signed-off-by: treacker <emshabalin@yandex.ru>

* Update Jenkinsfile

Signed-off-by: treacker <emshabalin@yandex.ru>

* Update tacotron.yaml

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

* cleaned up TN/ ITN doc (NVIDIA#4119)

* cleaned up TN/ ITN doc

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix typo

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix image

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix image

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: treacker <emshabalin@yandex.ru>

* Check implicit grad acc in GLUE dataset building (NVIDIA#4123)

* Check implicit grad acc in GLUE dataset building

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix jenkins test for GLUE/XNLI

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

* Fixed jenkins

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* Multiprocess improvements (NVIDIA#4127)

* initial commit

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* start fix

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* improve multiprocessing speed while creating speaker dataset

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* updated scp to filelist

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* WaveGlow input type fixes (NVIDIA#4151)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* notebooks' link, typo and import  fix  (NVIDIA#4158)

* redo missing pr 4007

Signed-off-by: fayejf <fayejf07@gmail.com>

* remove extremely unreliable links

Signed-off-by: fayejf <fayejf07@gmail.com>

* Thutmose tagger bug fixes (NVIDIA#4162)

* add pretrained ngc model, small fixes

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* fix model location

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* fix model location

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* 1. fix typos. 2. write magic functions without space

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* add example of inference with pretrained model

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* changed model location to nemo

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* style fix

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* fix space

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

Co-authored-by: Alexandra Antonova <aleksandraa@nvidia.com>

* update speaker docs (NVIDIA#4164)

* update speaker docs

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* chunks -> segments

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* Khz -> kHz

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* changed to vits g2p

* refactoring

* added cosineLR

* Updated whitelist path

* added vanilla torch grad scaler

* Fixed lightning version

* added warmup and wd

* switched to cosineLR

* refactored data classes for vits

* some fixes

* fixed import

* changeg train loop

* fixed scheduler bug

* refactoring for exps

* Refactored loss logic

* Ref for exps

* added coqui stuff

* exps

* bugfix

* added side file

* bugfix

* reverted

* fixed sampler behaviour

* updated for ptl 1.7.2

* refactored dataloader func

* some cleaning

* reverted to vanilla loss

* modified for pickling

* added dataset class

* fixed torch version

* added autocast for fp training

* removed coqui files

* 'Fixed tokenizer'

* Fix tokenizer

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* Fix link to inference notebook (NVIDIA#5247)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* Update ASR scores table (NVIDIA#5254)

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>

* Fix links to speaker identification notebook (NVIDIA#5260)

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Minor typo fixes in TTS tutorial (NVIDIA#5266)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* Pcla tutorial fixes (NVIDIA#5271)

* Fixed typos

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fixed cell type and tatoeba reference

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fixed typo

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fixed branch variable

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fix bug into Dialogue tutorial (NVIDIA#5277)

* Typo fix (NVIDIA#5288)

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fix dialogue tutorial bug (NVIDIA#5297)

* set add_pooling_layer=False for huggingface bert model

* remove add_pooling_layer=False and set find_unused_parameters=True

* set num_prompt_tokens to 0 for huggingface

* small bugfix for r1.13.0 (NVIDIA#5310)

* typo fix

Signed-off-by: fayejf <fayejf07@gmail.com>

* udpate transcribe

Signed-off-by: fayejf <fayejf07@gmail.com>

Signed-off-by: fayejf <fayejf07@gmail.com>

* Add italian model checkpoints (NVIDIA#5316)

Signed-off-by: Igor Gitman <igitman@nvidia.com>

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* [STT] Add Ru ASR Conformer-CTC and Conformer-Transducer (NVIDIA#5340)

* [STT] Add stt_ru_conformer_ctc_large

Signed-off-by: Sasha Meister <117230141+ssh-meister@users.noreply.github.com>

* [STT] Add stt_ru_conformer_transducer_large

Add stt_ru_conformer_transducer_large

Signed-off-by: Sasha Meister <117230141+ssh-meister@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Sasha Meister <117230141+ssh-meister@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Pcla tutorial fixes (NVIDIA#5313)

* fixes

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* fixes

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* moved `create_text_and_labels` to token_classification_utils.py

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* a lot of refactoring

* strict ptl version

* strict ptl version

* reverted plt version

* Added base text2audio class

* Fix issue with HF Model upload tutorial (NVIDIA#5359)

* Add Gradio App to ASR Docs (NVIDIA#5270)

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>
(cherry picked from commit e4b6a38)

* Fix issue with normalized config for dataset name

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>

* tutorial fixes (NVIDIA#5354)

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Add SDP documentation (NVIDIA#5274)

* Add details to SDP README.md

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add docstring to WriteManifest processor

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add docstring to CreateInitialManifestMLS

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add ModifyManifestTextProcessor docstring

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add ASRInference docstring

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add base_processor docstrings

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add minimal SDP docs page

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Update tools/speech_dataset_processor/README.md

Co-authored-by: Igor Gitman <igitman@nvidia.com>
Signed-off-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>

* Write simple README for SDP and move complex explanations to docs

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Remove incorrect type hints

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Make config example less confusing

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Fix typo

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Clarify that YAML file is config file in README

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Remove unused imports

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Remove SDP docs for now

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Remove links to docs in SDP README

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Co-authored-by: Igor Gitman <igitman@nvidia.com>

* [Bugfix] Added rm -f / wget- nc command in multispeaker sim notebook to r1.13.0 (NVIDIA#5375)

* Fix minor error in notebook

Signed-off-by: Taejin Park <tango4j@gmail.com>

* changed branch name in tutorial notebook

Signed-off-by: Taejin Park <tango4j@gmail.com>

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Rename Speech Dataset Processor to Speech Data Processor (NVIDIA#5378)

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* fix for num worker 0 causing issues in losses after 1 epoch (NVIDIA#5379)

* Fixed bug in notebook (NVIDIA#5382)

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Force MHA QKV onto fp32 (NVIDIA#5391)

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>

* Added scheduling variety

* ref

* Fix for prompt table restore error (NVIDIA#5393)

* Fix for prompt table restore error

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Added more saftey checks

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Added more condition checks

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Fix args (NVIDIA#5410)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* bugfix

* import tests

* Add temporary fix for CUDA issue in Dockerfile (NVIDIA#5421)

Signed-off-by: Yu Yao <yuya@nvidia.com>

Signed-off-by: Yu Yao <yuya@nvidia.com>

* Megatron Export Update (NVIDIA#5343)

* export update for Megatron + change ORT optimization

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated export_utils to use autocast instead of manually casting >:/

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* removed dtype from LayerNorm

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* added comment

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* reverting changes on FloatCast

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* Cherry-picked changes from megatron-norm

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* updated asr_model import to cast_utils

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* updated del onnx_model place

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* changed ort optimization to basic -> temp fix

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Boris Fomitchev <bfomitchev@nvidia.com>

* disable pc test (NVIDIA#5426)

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* Fix GPT generation when using sentencepiece tokenizer (NVIDIA#5413)

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Disable sync_batch_comm in validation_step for GPT (NVIDIA#5397)

* disable sync_batch_comm in validation_step

Signed-off-by: ericharper <complex451@gmail.com>

* Read sync_batch_comm from config or default to False

Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>

* Update megatron_gpt_config to default sync_batch_comm to False to avoid CUDA error

Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>

* Empty

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Comment out test

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Markel Sanz Ausin <markelsanz14@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>

* Revert "Add temporary fix for CUDA issue in Dockerfile (NVIDIA#5421)" (NVIDIA#5431)

This reverts commit 0718b17.

* Revert workaround for T5 that sets number of workers to 0 & sync_batch_comm=False (NVIDIA#5420)

* Revert workers workaround

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix in config

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Fixed discrepancies

* updated Jenkisfile

* updated Jenkisfile

* Cleaning

* fixed the onnx bug in conformer for non-streaming models. (NVIDIA#5242) (NVIDIA#5446)

Signed-off-by: Vahid <vnoroozi@nvidia.com>

Signed-off-by: Vahid <vnoroozi@nvidia.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

Signed-off-by: Vahid <vnoroozi@nvidia.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>

* Set sync_batch_comm in other places (NVIDIA#5448)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Radtts 1.13 (NVIDIA#5451)

* [TTS] Fixing RADTTS training - removing view buffer and fixing accuracy issue (NVIDIA#5358)
* [TTS] add CI test for RADTTS training recipe.

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Radtts 1.13 plus (NVIDIA#5457)

* [TTS] Fixing RADTTS training - removing view buffer and fixing accuracy issue (NVIDIA#5358)
* Fixing RADTTS training - removing view buffer and fixing accuracy issue
* Fixes for Torchscript/Triton
* Added autocast to radtts UT
* using cuda() for training example

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Add num layers check (NVIDIA#5470)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Change to kwargs (NVIDIA#5475)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Support for finetuning and finetuning inference with .ckpt files & batch size refactoring (NVIDIA#5339) (NVIDIA#5478)

* Initial refactor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Resolve config before passing to load_from_checkpoint

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes for model parallel and nemo restore

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes for eval

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert config changes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Refactor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix typo

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove comments

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Minor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix validation reconfiguration

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove old comment

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fixes for test_ds

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* export_utils bugfix (NVIDIA#5480)

* updated export_utils

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Export fixes for Riva (NVIDIA#5496)

* Export fixes for Riva

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Cleaning up training_utils

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* minor bug fix (NVIDIA#5521)

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* added set_start_method + function param bugfix (NVIDIA#5539)

* added set_start_method + function param bugfix

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* upper bound torchmetrics

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: ericharper <complex451@gmail.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ericharper <complex451@gmail.com>

* remove notebook (NVIDIA#5548)

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: ericharper <complex451@gmail.com>

* Remove broadcast (NVIDIA#5558)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* cleaning

* Fix all gather while writing to a file during T5 finetuning (NVIDIA#5561)

* Gather from data parallel only instead of all ranks

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* update readme

Signed-off-by: ericharper <complex451@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added copyright

* fixed imports

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* cleaning

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed filesize check

* last cleaning

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated cmudict path

* fixed merge bug

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* warnings fix

* fix warnings

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* storing

* updated version

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* update Jenkinsfile versions

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* fixed issues

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* fixed more issues

* more fixes

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* added experimental tag

* Clarification updates

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* remove old cython code

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* remove old cython code

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* docstring fix

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* Enhancements

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* Enhancements

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* imports fix

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix typo

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* excessive comtutations fix

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* typecheck fix

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* Small refactoring

* Small refactoring

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* reversed exp_manager params

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* Fixed call for new function signature

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
Signed-off-by: Jason <jasoli@nvidia.com>
Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: ekmb <ebakhturina@nvidia.com>
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>
Signed-off-by: Igor Gitman <igitman@nvidia.com>
Signed-off-by: Sasha Meister <117230141+ssh-meister@users.noreply.github.com>
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: Virginia Adams <vadams@nvidia.com>
Signed-off-by: Yu Yao <yuya@nvidia.com>
Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Signed-off-by: Vahid <vnoroozi@nvidia.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>
Co-authored-by: jasonjjl1999 <jasonjjl1999@gmail.com>
Co-authored-by: richa.ren@mail.utoronto.ca <richa.ren@mail.utoronto.ca>
Co-authored-by: Oktai Tatanov <oktai.tatanov@gmail.com>
Co-authored-by: jasonjjl1999 <43978361+jasonjjl1999@users.noreply.github.com>
Co-authored-by: martynwei <martyn.wei@gmail.com>
Co-authored-by: Ryan Hong <66425733+rhong99@users.noreply.github.com>
Co-authored-by: Jason <jasoli@nvidia.com>
Co-authored-by: ericharper <complex451@gmail.com>
Co-authored-by: Boris Fomitchev <borisfom@users.noreply.github.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: Ramanathan Arunachalam <ramanathan.arun@rutgers.edu>
Co-authored-by: Ramanathan Arunachalam <rarunachalam@nvidia.com>
Co-authored-by: bene-ges <61418381+bene-ges@users.noreply.github.com>
Co-authored-by: Alexandra Antonova <aleksandraa@nvidia.com>
Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: PeganovAnton <peganoff2@mail.ru>
Co-authored-by: Jocelyn <jocelynh@nvidia.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Sean Naren <snarenthiran@nvidia.com>
Co-authored-by: Matvei Novikov <mattyson.so@gmail.com>
Co-authored-by: Zhilin Wang <wangzhilin12061996@hotmail.com>
Co-authored-by: Igor Gitman <igitman@nvidia.com>
Co-authored-by: Sasha Meister <117230141+ssh-meister@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: Adi Renduchintala <108822655+arendu@users.noreply.github.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com>
Co-authored-by: David <amosalla@asu.edu>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: Boris Fomitchev <bfomitchev@nvidia.com>
Co-authored-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Markel Sanz Ausin <markelsanz14@gmail.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
@titu1994 titu1994 deleted the ptuning_notebook_fix branch February 23, 2023 01:57
titu1994 added a commit to titu1994/NeMo that referenced this pull request Mar 24, 2023
* Disable loss typecheck

* Fix spectrogram lengths

* Remove Precision 16 requirement

* Address lgtm alerts

* clean up unused code

* Address lgtm alerts

* Refactor audio_to_mel_torch method

* Use NeMo FilterBank to get melspec

Todo: set self.fb

* Fix filterbank max frequency to match with original VITS

* Fix filterbank features correct length

* Address lgtm issues

* Remove print statements

* Remove stft_pad_amount

* new structure for tts datasets in script folder

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* remove cmudict downloading

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* rename mixertts dataset, add vocoder dataset

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* add libritts processing

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* update tts dataset and libritts get data

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* fix bugs in vocoder ds

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* add ds

* changed vits yaml

* rm yaml

* fix yaml and model

* Added scaler

* refactored yaml

* managed to run in fp16

* refactoring

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* fix small bugs and add new todos

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* fix optimizers

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* Port Variational Inference with Adversarial Learning (VITS) to NeMo TTS (NVIDIA#6)

* Add vits files

Add vits_losses.py, vits_modules.py and vits.py.

* Move non-vits models to modules

* Add vits.yaml

* Add _loader to vits.py

* Add basic template for vits

* Update vits.yaml with vits parameters

* Remove extra space

* Add top level training script

* Add some variables to vits yaml

* Add forward and training methods

* Fix imports

* Added validation step

* Log training losses

* Update loss calls to use class attributes

* Add VITS to models list

* Fix all imports

* Remove old module calls

* Fix typo in monotonic align import

* Modified validation step

1. reverted to tensorboard
2. validation_step logs audio, mel-spec for batch 0
3. validation_step_alt logs audio, mel-spec for batch 0 and loss_mel

* Fix imports for VITS

* Remove old module calls

* Fix typo in monotonic align import

* Modified validation step

1. reverted to tensorboard
2. validation_step logs audio, mel-spec for batch 0
3. validation_step_alt logs audio, mel-spec for batch 0 and loss_mel

* Add parameters from original VITS config

* Fix config file

* Fix imports and generate spec from audio

* Fix incorrect dimensions

* Progress update

* Fix loss

* Fix cuda thing

* Fix monotonic align import

* Fix typos in vits.py

* Disable loss typecheck

* Fix spectrogram lengths

* Remove Precision 16 requirement

* Address lgtm alerts

* clean up unused code

* Address lgtm alerts

* Refactor audio_to_mel_torch method

* Use NeMo FilterBank to get melspec

Todo: set self.fb

* Fix filterbank max frequency to match with original VITS

* Fix filterbank features correct length

* Address lgtm issues

* Remove print statements

* Remove stft_pad_amount

Co-authored-by: martynwei <martyn.wei@gmail.com>
Co-authored-by: Ryan Hong <66425733+rhong99@users.noreply.github.com>
Co-authored-by: richa.ren@mail.utoronto.ca <richa.ren@mail.utoronto.ca>
Co-authored-by: Jason <jasoli@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* make new commit

Signed-off-by: Jason <jasoli@nvidia.com>

* add copyright headers

Signed-off-by: Jason <jasoli@nvidia.com>

* style

Signed-off-by: Jason <jasoli@nvidia.com>

* rename README

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* fix style without vits_modules

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* add numba code, fix style and add todos

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>

* small fix

* fix some todos

* added numba mas

* added DDP sampler

* specified versions

* fixed for new librosa version

* added feature loss

* added IPA phonemizer

* refactored IPA g2p

* added vits losses

* some ref

* fix

* added checkpointing

* cp

* cfg

* merged some 1.8.0 fixes

* plt fix

* fix logging

* fix checkpoint loading

* refactored inference

* fp32 run

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* new exp

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* Restored tests previously disabled for 22.03 base (NVIDIA#4109)

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* add augmentation to label models (NVIDIA#4113)

* add augmentation to label models

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* duration fix

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* Call register_bert_model after assigning self.bert_model variable (NVIDIA#4116)

Signed-off-by: Ramanathan Arunachalam <rarunachalam@nvidia.com>

Co-authored-by: Ramanathan Arunachalam <rarunachalam@nvidia.com>

* Tutorial on ITN with Thutmose tagger and small fixes (NVIDIA#4117)

* 1. Add tutorial. 2. Move a function to fix import in tutorial. 3. Merge multiple spaces into one space in the final output

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* fixes for code review

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* Add tutorial to tutorials.rst

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

Co-authored-by: Alexandra Antonova <aleksandraa@nvidia.com>

* cleaned up TN/ ITN doc (NVIDIA#4119)

* cleaned up TN/ ITN doc

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix typo

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix image

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix image

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* Check implicit grad acc in GLUE dataset building (NVIDIA#4123)

* Check implicit grad acc in GLUE dataset building

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix jenkins test for GLUE/XNLI

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* update the default (NVIDIA#4135)

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* Draft: Fix restoring from checkpoint for case when `model.common_dataset_parameters.label_vocab_dir` is provided (NVIDIA#4136)

* Fix restoring from checkpoint with label vocab dir

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Add tests for various ways to pass label ids to model

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix typo

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix typo

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Do not create tmp directory

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix parameter name

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* finish cherry-pick op

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix labels errors

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Remove duplicate stage

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Change target branch

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* fix typo (NVIDIA#4140)

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* Fix/punctuation avoid overwritting tmp files (NVIDIA#4144)

* Add draft of fixing tmp files overwritting

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Remove accidental changes

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Remove accidental changes

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Use built-in tempfile library

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix code style

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* bug_fix_diarization_manifest_creation (NVIDIA#4125)

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>

* fix doc (NVIDIA#4146)

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* Tacotron2 retrain (NVIDIA#4103)

* fix yaml

Signed-off-by: treacker <emshabalin@yandex.ru>

* Fix for new TTSDataset class

Signed-off-by: treacker <emshabalin@yandex.ru>

* added wandb logging

Signed-off-by: treacker <emshabalin@yandex.ru>

* added wandb logging

Signed-off-by: treacker <emshabalin@yandex.ru>

* fix numpy version

Signed-off-by: treacker <emshabalin@yandex.ru>

* fix numpy version

Signed-off-by: treacker <emshabalin@yandex.ru>

* inference fix

Signed-off-by: treacker <emshabalin@yandex.ru>

* removed old code

Signed-off-by: treacker <emshabalin@yandex.ru>

* updated parser logic

Signed-off-by: treacker <emshabalin@yandex.ru>

* reverted version update

Signed-off-by: treacker <emshabalin@yandex.ru>

* refactored parser logic

Signed-off-by: treacker <emshabalin@yandex.ru>

* Updated Jenkinsfile

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactored tutorial for Tacotron2

Signed-off-by: treacker <emshabalin@yandex.ru>

* Made backward compatibility

Signed-off-by: treacker <emshabalin@yandex.ru>

* Made backward compatibility

Signed-off-by: treacker <emshabalin@yandex.ru>

* Update Jenkinsfile

Signed-off-by: treacker <emshabalin@yandex.ru>

* Update tacotron.yaml

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

* cleaned up TN/ ITN doc (NVIDIA#4119)

* cleaned up TN/ ITN doc

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix typo

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix image

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix image

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: treacker <emshabalin@yandex.ru>

* Check implicit grad acc in GLUE dataset building (NVIDIA#4123)

* Check implicit grad acc in GLUE dataset building

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix jenkins test for GLUE/XNLI

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

* Fixed jenkins

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

* Refactoring

Signed-off-by: treacker <emshabalin@yandex.ru>

Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* Multiprocess improvements (NVIDIA#4127)

* initial commit

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* start fix

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* improve multiprocessing speed while creating speaker dataset

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* updated scp to filelist

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* WaveGlow input type fixes (NVIDIA#4151)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* notebooks' link, typo and import  fix  (NVIDIA#4158)

* redo missing pr 4007

Signed-off-by: fayejf <fayejf07@gmail.com>

* remove extremely unreliable links

Signed-off-by: fayejf <fayejf07@gmail.com>

* Thutmose tagger bug fixes (NVIDIA#4162)

* add pretrained ngc model, small fixes

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* fix model location

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* fix model location

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* 1. fix typos. 2. write magic functions without space

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* add example of inference with pretrained model

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* changed model location to nemo

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* style fix

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

* fix space

Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>

Co-authored-by: Alexandra Antonova <aleksandraa@nvidia.com>

* update speaker docs (NVIDIA#4164)

* update speaker docs

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* chunks -> segments

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* Khz -> kHz

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* changed to vits g2p

* refactoring

* added cosineLR

* Updated whitelist path

* added vanilla torch grad scaler

* Fixed lightning version

* added warmup and wd

* switched to cosineLR

* refactored data classes for vits

* some fixes

* fixed import

* changeg train loop

* fixed scheduler bug

* refactoring for exps

* Refactored loss logic

* Ref for exps

* added coqui stuff

* exps

* bugfix

* added side file

* bugfix

* reverted

* fixed sampler behaviour

* updated for ptl 1.7.2

* refactored dataloader func

* some cleaning

* reverted to vanilla loss

* modified for pickling

* added dataset class

* fixed torch version

* added autocast for fp training

* removed coqui files

* 'Fixed tokenizer'

* Fix tokenizer

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* Fix link to inference notebook (NVIDIA#5247)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* Update ASR scores table (NVIDIA#5254)

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>

* Fix links to speaker identification notebook (NVIDIA#5260)

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Minor typo fixes in TTS tutorial (NVIDIA#5266)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* Pcla tutorial fixes (NVIDIA#5271)

* Fixed typos

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fixed cell type and tatoeba reference

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fixed typo

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fixed branch variable

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fix bug into Dialogue tutorial (NVIDIA#5277)

* Typo fix (NVIDIA#5288)

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fix dialogue tutorial bug (NVIDIA#5297)

* set add_pooling_layer=False for huggingface bert model

* remove add_pooling_layer=False and set find_unused_parameters=True

* set num_prompt_tokens to 0 for huggingface

* small bugfix for r1.13.0 (NVIDIA#5310)

* typo fix

Signed-off-by: fayejf <fayejf07@gmail.com>

* udpate transcribe

Signed-off-by: fayejf <fayejf07@gmail.com>

Signed-off-by: fayejf <fayejf07@gmail.com>

* Add italian model checkpoints (NVIDIA#5316)

Signed-off-by: Igor Gitman <igitman@nvidia.com>

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* [STT] Add Ru ASR Conformer-CTC and Conformer-Transducer (NVIDIA#5340)

* [STT] Add stt_ru_conformer_ctc_large

Signed-off-by: Sasha Meister <117230141+ssh-meister@users.noreply.github.com>

* [STT] Add stt_ru_conformer_transducer_large

Add stt_ru_conformer_transducer_large

Signed-off-by: Sasha Meister <117230141+ssh-meister@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Sasha Meister <117230141+ssh-meister@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Pcla tutorial fixes (NVIDIA#5313)

* fixes

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* fixes

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* moved `create_text_and_labels` to token_classification_utils.py

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* a lot of refactoring

* strict ptl version

* strict ptl version

* reverted plt version

* Added base text2audio class

* Fix issue with HF Model upload tutorial (NVIDIA#5359)

* Add Gradio App to ASR Docs (NVIDIA#5270)

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>
(cherry picked from commit e4b6a38)

* Fix issue with normalized config for dataset name

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>

* tutorial fixes (NVIDIA#5354)

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Add SDP documentation (NVIDIA#5274)

* Add details to SDP README.md

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add docstring to WriteManifest processor

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add docstring to CreateInitialManifestMLS

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add ModifyManifestTextProcessor docstring

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add ASRInference docstring

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add base_processor docstrings

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add minimal SDP docs page

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Update tools/speech_dataset_processor/README.md

Co-authored-by: Igor Gitman <igitman@nvidia.com>
Signed-off-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>

* Write simple README for SDP and move complex explanations to docs

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Remove incorrect type hints

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Make config example less confusing

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Fix typo

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Clarify that YAML file is config file in README

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Remove unused imports

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Remove SDP docs for now

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Remove links to docs in SDP README

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Co-authored-by: Igor Gitman <igitman@nvidia.com>

* [Bugfix] Added rm -f / wget- nc command in multispeaker sim notebook to r1.13.0 (NVIDIA#5375)

* Fix minor error in notebook

Signed-off-by: Taejin Park <tango4j@gmail.com>

* changed branch name in tutorial notebook

Signed-off-by: Taejin Park <tango4j@gmail.com>

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Rename Speech Dataset Processor to Speech Data Processor (NVIDIA#5378)

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* fix for num worker 0 causing issues in losses after 1 epoch (NVIDIA#5379)

* Fixed bug in notebook (NVIDIA#5382)

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Force MHA QKV onto fp32 (NVIDIA#5391)

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>

* Added scheduling variety

* ref

* Fix for prompt table restore error (NVIDIA#5393)

* Fix for prompt table restore error

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Added more saftey checks

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Added more condition checks

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Fix args (NVIDIA#5410)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* bugfix

* import tests

* Add temporary fix for CUDA issue in Dockerfile (NVIDIA#5421)

Signed-off-by: Yu Yao <yuya@nvidia.com>

Signed-off-by: Yu Yao <yuya@nvidia.com>

* Megatron Export Update (NVIDIA#5343)

* export update for Megatron + change ORT optimization

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated export_utils to use autocast instead of manually casting >:/

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* removed dtype from LayerNorm

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* added comment

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* reverting changes on FloatCast

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* Cherry-picked changes from megatron-norm

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* updated asr_model import to cast_utils

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* updated del onnx_model place

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* changed ort optimization to basic -> temp fix

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Boris Fomitchev <bfomitchev@nvidia.com>

* disable pc test (NVIDIA#5426)

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* Fix GPT generation when using sentencepiece tokenizer (NVIDIA#5413)

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Disable sync_batch_comm in validation_step for GPT (NVIDIA#5397)

* disable sync_batch_comm in validation_step

Signed-off-by: ericharper <complex451@gmail.com>

* Read sync_batch_comm from config or default to False

Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>

* Update megatron_gpt_config to default sync_batch_comm to False to avoid CUDA error

Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>

* Empty

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Comment out test

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Markel Sanz Ausin <markelsanz14@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>

* Revert "Add temporary fix for CUDA issue in Dockerfile (NVIDIA#5421)" (NVIDIA#5431)

This reverts commit 0718b17.

* Revert workaround for T5 that sets number of workers to 0 & sync_batch_comm=False (NVIDIA#5420)

* Revert workers workaround

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix in config

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Fixed discrepancies

* updated Jenkisfile

* updated Jenkisfile

* Cleaning

* fixed the onnx bug in conformer for non-streaming models. (NVIDIA#5242) (NVIDIA#5446)

Signed-off-by: Vahid <vnoroozi@nvidia.com>

Signed-off-by: Vahid <vnoroozi@nvidia.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

Signed-off-by: Vahid <vnoroozi@nvidia.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>

* Set sync_batch_comm in other places (NVIDIA#5448)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Radtts 1.13 (NVIDIA#5451)

* [TTS] Fixing RADTTS training - removing view buffer and fixing accuracy issue (NVIDIA#5358)
* [TTS] add CI test for RADTTS training recipe.

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Radtts 1.13 plus (NVIDIA#5457)

* [TTS] Fixing RADTTS training - removing view buffer and fixing accuracy issue (NVIDIA#5358)
* Fixing RADTTS training - removing view buffer and fixing accuracy issue
* Fixes for Torchscript/Triton
* Added autocast to radtts UT
* using cuda() for training example

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Add num layers check (NVIDIA#5470)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Change to kwargs (NVIDIA#5475)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Support for finetuning and finetuning inference with .ckpt files & batch size refactoring (NVIDIA#5339) (NVIDIA#5478)

* Initial refactor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Resolve config before passing to load_from_checkpoint

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes for model parallel and nemo restore

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes for eval

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert config changes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Refactor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix typo

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove comments

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Minor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix validation reconfiguration

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove old comment

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fixes for test_ds

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* export_utils bugfix (NVIDIA#5480)

* updated export_utils

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Export fixes for Riva (NVIDIA#5496)

* Export fixes for Riva

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Cleaning up training_utils

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* minor bug fix (NVIDIA#5521)

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* added set_start_method + function param bugfix (NVIDIA#5539)

* added set_start_method + function param bugfix

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* upper bound torchmetrics

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: ericharper <complex451@gmail.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ericharper <complex451@gmail.com>

* remove notebook (NVIDIA#5548)

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: ericharper <complex451@gmail.com>

* Remove broadcast (NVIDIA#5558)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* cleaning

* Fix all gather while writing to a file during T5 finetuning (NVIDIA#5561)

* Gather from data parallel only instead of all ranks

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* update readme

Signed-off-by: ericharper <complex451@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added copyright

* fixed imports

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* cleaning

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed filesize check

* last cleaning

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated cmudict path

* fixed merge bug

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* warnings fix

* fix warnings

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* storing

* updated version

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* update Jenkinsfile versions

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* fixed issues

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* fixed more issues

* more fixes

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* added experimental tag

* Clarification updates

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* remove old cython code

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* remove old cython code

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* docstring fix

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* Enhancements

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* Enhancements

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* imports fix

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix typo

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* excessive comtutations fix

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* typecheck fix

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* Small refactoring

* Small refactoring

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* reversed exp_manager params

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* Fixed call for new function signature

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
Signed-off-by: Jason <jasoli@nvidia.com>
Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: ekmb <ebakhturina@nvidia.com>
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>
Signed-off-by: Igor Gitman <igitman@nvidia.com>
Signed-off-by: Sasha Meister <117230141+ssh-meister@users.noreply.github.com>
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: Virginia Adams <vadams@nvidia.com>
Signed-off-by: Yu Yao <yuya@nvidia.com>
Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Signed-off-by: Vahid <vnoroozi@nvidia.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>
Co-authored-by: jasonjjl1999 <jasonjjl1999@gmail.com>
Co-authored-by: richa.ren@mail.utoronto.ca <richa.ren@mail.utoronto.ca>
Co-authored-by: Oktai Tatanov <oktai.tatanov@gmail.com>
Co-authored-by: jasonjjl1999 <43978361+jasonjjl1999@users.noreply.github.com>
Co-authored-by: martynwei <martyn.wei@gmail.com>
Co-authored-by: Ryan Hong <66425733+rhong99@users.noreply.github.com>
Co-authored-by: Jason <jasoli@nvidia.com>
Co-authored-by: ericharper <complex451@gmail.com>
Co-authored-by: Boris Fomitchev <borisfom@users.noreply.github.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: Ramanathan Arunachalam <ramanathan.arun@rutgers.edu>
Co-authored-by: Ramanathan Arunachalam <rarunachalam@nvidia.com>
Co-authored-by: bene-ges <61418381+bene-ges@users.noreply.github.com>
Co-authored-by: Alexandra Antonova <aleksandraa@nvidia.com>
Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: PeganovAnton <peganoff2@mail.ru>
Co-authored-by: Jocelyn <jocelynh@nvidia.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Sean Naren <snarenthiran@nvidia.com>
Co-authored-by: Matvei Novikov <mattyson.so@gmail.com>
Co-authored-by: Zhilin Wang <wangzhilin12061996@hotmail.com>
Co-authored-by: Igor Gitman <igitman@nvidia.com>
Co-authored-by: Sasha Meister <117230141+ssh-meister@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: Adi Renduchintala <108822655+arendu@users.noreply.github.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com>
Co-authored-by: David <amosalla@asu.edu>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: Boris Fomitchev <bfomitchev@nvidia.com>
Co-authored-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Markel Sanz Ausin <markelsanz14@gmail.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants