sync after 6915 (#14)
* Fixed small bug with NoisePerturbationWithNormalization (#7118)

Signed-off-by: Daniel Egert <degert@nvidia.com>

* Fix import guard checks (#7124)

Signed-off-by: smajumdar <titu1994@gmail.com>

* Revert "Fix import guard checks (#7124)" (#7125)

This reverts commit a46e3251944642f9102aa16ce2d2f9d3a804ff8a.

* Fix import guard checks (#7126)

* Fix import guard checks

Signed-off-by: smajumdar <titu1994@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Add updated fc ctc and rnnt xxl models (#7128) (#7130)

* [TTS] Create EnCodec training recipe (#6852)

* [TTS] Create EnCodec training recipe

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Update encodec recipe

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Rename EnCodec to AudioCodec

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add EnCodec unit tests

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add copyright header to distributed.py

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>

* Fix rank where torch.distributed may not be initialized yet and would not wait for tokenizer file caching (#7061)

Signed-off-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Co-authored-by: David <amosalla@asu.edu>

* fix default attention size (#7141) (#7143)

* fix evaluator.py for various exceptions by ast (#7150)

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* [TTS][ZH] add Chinese TTS recipes based on IPA symbol sets. (#6893)

* [TTS] add Chinese TTS recipe based on IPA.
* add new pinyin and ipa dictionaries with 36 finals.
* add yaml configs for 24-final pinyin and ipa.
* add copyright header
* add a directory level 24finals to discriminate from 36 finals.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* unify configs into a single one and add detailed comments providing supported candidates.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* choose 36-final IPA as default phoneme dict

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

---------

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* [TTS] Add output audio format to preprocessing (#6889)

* [TTS] Add output audio format to preprocessing

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add format validation

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Fix data tutorial

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>

* freeze (#7152)

Signed-off-by: arendu <adithya.r@gmail.com>

* make sure any empty segments are removed (#7155)

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Update RIR generation scripts (#6547)

- fix: reduce room size if evaluation of params fails
- added randomized mic placement
- added diffuse noise generation
- added an option to specify the format and subtype for saved audio

Signed-off-by: Ante Jukić <ajukic@nvidia.com>

* A quickstart speech enhancement tutorial (#6492)

A simple example of training a model for a speech enhancement task

Signed-off-by: Ante Jukić <ajukic@nvidia.com>

* NFA subtitle file config - specify colors and vertical alignment (#7160)

* allow specifying colors of text in ASS subtitle file

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* specify vertical_alignment instead of marginv in ass_file_config

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* add documentation of CTMFileConfig and ASSFileConfig to NFA README

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

---------

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Eagerly accumulate embedding grads into fp32 buffer (#6958) (#7153)

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>

* TE bug fix (#7027) (#7036)

Signed-off-by: Dmytro Pykhtar <dpykhtar@nvidia.com>
Co-authored-by: Dmytro Pykhtar <37850217+dimapihtar@users.noreply.github.com>

* [TTS] Remove nested TTS configs (#7154)

* [TTS] Remove nested TTS configs

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Modify tutorial to support multiple sampling rates

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Clarify min_duration unit

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Default 22.05kHz highfreq to null

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>

* Merge release r1.20.0 to main (#7167)

* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* Add ASR with TTS Tutorial. Fix enhancer usage. (#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* install_bs (#7019)

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* Fix typo and branch in tutorial (#7048)

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* fix syntax error introduced in PR-7079 (#7102)

* fix syntax error introduced in PR-7079

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* fixes for pr review

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

---------

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* fix links for TN (#7117)

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* update branch (#7135)

Signed-off-by: ericharper <complex451@gmail.com>

* Fixed main and merging this to r1.20 (#7127)

* Fixed main and merging this to r1.20

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Update vad_utils.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

---------

Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* fix version

Signed-off-by: ericharper <complex451@gmail.com>

* resolve conflict the other way

Signed-off-by: ericharper <complex451@gmail.com>

* keep both

Signed-off-by: ericharper <complex451@gmail.com>

* revert keep both

Signed-off-by: ericharper <complex451@gmail.com>

---------

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>
Signed-off-by: Evelina <ebakhturina@nvidia.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: bene-ges <antonova_sasha@list.ru>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* Upgrade to pytorch lightning 2.0 (#6433)

* Upgrade pytorch lightning version in requirements

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Initial fixes for PTL2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add further fixes to support lightning 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add replacements for replace_sampler_ddp, resume_from_checkpoint_fit_path and a few occurrences of validation_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace all occurrences of validation_epoch_end with on_validation_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace training_epoch_end, test_epoch_end with on_train_epoch_end and on_test_epoch_end respectively

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Change logger=None to logger=False in Trainer object

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove PTL2.0 deprecated Trainer args from TrainerConfig dataclass

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify trainer.precision check and other small edits

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace logger=None with logger=False in test_ptl_stateless_timer.py Trainer

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add default values for args to fix Attribute Error

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add the following modifications

1) Remove outputs arg from on_validation_epoch_end, on_test_epoch_end and make it an arg of the class
2) Replace resume_from_checkpoint with ckpt_path as needed
3) Explicitly add accelerator as 'CPU' in UTs being run on CPU

Signed-off-by: Abhishree <abhishreetm@gmail.com>
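
The change described above (dropping the `outputs` argument from the epoch-end hooks and keeping step results on the instance) can be illustrated with a minimal sketch. `ToyModel` is a hypothetical plain-Python stand-in rather than a real LightningModule, and the averaged "loss" is invented for illustration; only the hook names and the `validation_step_outputs` attribute come from the commit messages.

```python
class ToyModel:
    """Plain-Python stand-in for a LightningModule; no Lightning import needed."""

    def __init__(self):
        # Per-step results live on the instance, since the hook no longer receives them.
        self.validation_step_outputs = []

    def validation_step(self, batch, batch_idx):
        loss = sum(batch) / len(batch)  # hypothetical "loss", for illustration only
        self.validation_step_outputs.append(loss)
        return loss

    def on_validation_epoch_end(self):
        # No `outputs` argument: read from the instance attribute instead.
        outputs = self.validation_step_outputs
        mean_loss = sum(outputs) / len(outputs) if outputs else None
        outputs.clear()  # free memory so results do not carry over to the next epoch
        return mean_loss


model = ToyModel()
for idx, batch in enumerate([[1.0, 3.0], [2.0, 4.0]]):
    model.validation_step(batch, idx)
epoch_mean = model.on_validation_epoch_end()
```

Clearing the list at the end of the hook mirrors the later commits in this series that add `self.validation_step_outputs.clear()` wherever it was missing.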

* Remove outputs arg from on_validation_epoch_end, on_test_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg in on_validation_epoch_end in MultiBinaryAccuracy docstrings

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val, test outputs as instance vars in PunctuationCapitalizationModel and TokenClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace trainer.fit_loop.max_steps with trainer.fit_loop.epoch_loop.max_steps in test_optimizers_schedulers.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Revert an extra space that was mistakenly added

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Use self.validation_step_outputs and self.test_step_outputs in test_ema.py for uniformity

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Use self.validation_step_outputs and self.test_step_outputs in test_ptl_stateless_timer.py and check_for_ranks.py for uniformity

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add self.validation_step_outputs.clear() and self.test_step_outputs.clear() wherever missing

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg from on_train_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs from on_validation_epoch_end in multi_binary_acc.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove output args from on_validation_epoch_end in the docstrings of some ASR files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove output args from on_validation_epoch_end and clear memory from validation_step_outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add on_validation_epoch_end and remove outputs args for nlp models

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Append output of validation_step to validation_step_outputs in EncDecClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add the following changes

1) Index self.validation_step_outputs and self.test_step_outputs with dataloader_idx wherever needed
2) Initialize self.validation_step_outputs and self.test_step_outputs as empty lists and add support for multi dataloaders if they exist
3) Remove self.pre_configure_ddp from NLPDDPStrategy class as it is removed in PTL 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>
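
A rough sketch of the per-dataloader bookkeeping listed above, assuming one flat list for a single dataloader and one nested list per dataloader otherwise. `StepOutputs` is a hypothetical helper written for this illustration; the actual NeMo code stores these lists directly on the model and may differ in detail.

```python
class StepOutputs:
    """Hypothetical container: flat list for one dataloader, list of lists for several."""

    def __init__(self, num_dataloaders):
        self.multi = num_dataloaders > 1
        self.data = [[] for _ in range(num_dataloaders)] if self.multi else []

    def _bucket(self, dataloader_idx):
        # Pick the list that belongs to this dataloader.
        return self.data[dataloader_idx] if self.multi else self.data

    def append(self, output, dataloader_idx=0):
        self._bucket(dataloader_idx).append(output)

    def clear(self, dataloader_idx=0):
        # Clear only the bucket of the dataloader that just finished.
        self._bucket(dataloader_idx).clear()


single = StepOutputs(num_dataloaders=1)
single.append(0.5)

multi = StepOutputs(num_dataloaders=2)
multi.append(0.1, dataloader_idx=0)
multi.append(0.2, dataloader_idx=1)
```

Defaulting `dataloader_idx` to 0 keeps the single-dataloader call sites unchanged, which matches the default-value fixes mentioned elsewhere in this series.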

* Add default value dataloader_idx=0 for on_validation_batch_end() in megatron_base_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* TypeCast precision to str in attention.py and utils_funcs.py to avoid TypeError

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add if condition check for multiple dataloaders when appending to validation outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Separate validation pass to be used with both validation_step and test_step

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add if condition check for multiple dataloader while appending to test_step_outputs in punctuation_capitalization_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add condition check for multiple dataloaders based on type of trainer.val/test_dataloaders or self._validation/test_dl instead of len

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment Megatron T5 IA3 PP=2 in CI pipeline due to dataloader_iter issue with PTL 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision checks to account for 16-mixed and bf16-mixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>
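
As an illustration of the precision checks mentioned above, here is a minimal sketch that assumes the precision value may arrive as an int (`16`), a plain string (`'16'`, `'bf16'`), or one of the PTL 2.0 style strings (`'16-mixed'`, `'bf16-mixed'`). The helper names are hypothetical and not taken from NeMo.

```python
def is_fp16(precision):
    """True for fp16-style settings given as 16, '16', or '16-mixed'."""
    # Cast to str first so int and str inputs are compared the same way.
    return str(precision) in ("16", "16-mixed")


def is_bf16(precision):
    """True for bfloat16-style settings given as 'bf16' or 'bf16-mixed'."""
    return str(precision) in ("bf16", "bf16-mixed")


def needs_autocast(precision):
    """Hypothetical convenience check combining both reduced-precision modes."""
    return is_fp16(precision) or is_bf16(precision)
```

Comparing against a tuple of whole strings avoids indexing into the value, so an int such as `32` does not raise a TypeError.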

* Append output of validation/test_step to self.validation/test_step_outputs in CTCG2PModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify find_unused_parameters=True in g2p_heteronym model

1) Add find_unused_parameters=True for DDP strategy in g2p_heteronym_classification_train_and_evaluate.py
2) Remove args output in validation/test_step and add instance variables instead for heteronym_classification.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs from on_test_epoch_end in DialogueGPTClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add validation/test outputs in sgdqa_model and modify dialogue_config.yaml

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add split arg self.test_step_outputs to TextClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add test_step_outputs to dialogue and text classification models

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Change condition check for multiple dataloaders:

1) Replace ds_item as list in dialogue_config.yaml
2) Check for len of val/test_dataloaders or validation/test_dl along with type check of list in sgdqa_model.py while appending outputs of validation/test_step
3) Check for len of _validation/test_dl for creating self.validation/test_step_outputs in ModelPT and punctuation_capitalization_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add additional condition for multi dataloaders

Check len(self.trainer.val/test_dataloaders) > 1 along with type(self.trainer.val/test_dataloaders) == list for multi dataloaders in validation/test_step

Signed-off-by: Abhishree <abhishreetm@gmail.com>
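
The combined type and length condition described above might look like the following sketch. The function name is hypothetical, and `isinstance` is used here in place of the `type(...) == list` comparison quoted in the message.

```python
def has_multiple_dataloaders(dataloaders):
    """True only when `dataloaders` is a list holding more than one entry."""
    # Checking the type first avoids calling len() on None or on a single loader object.
    return isinstance(dataloaders, list) and len(dataloaders) > 1
```

A one-element list counts as a single dataloader under this check, so its step outputs would go into a flat list rather than a nested one.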

* Add val step outputs and default val for dataloader_idx

1) Append validation_step output to self.validation_step_outputs in MultiLabelIntentSlotClassificationModel
2) Add default val for dataloader_idx for on_test_batch_start/end in TimingCallback
3) Add self.validation/test_step_outputs in BERTQAModel and remove outputs arg

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val/test_step_outputs to S2SQAModel and GPTQAModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Edit JenkinsFile for bert_pretraining.py

Edit Jenkinsfile for this test to disable validation as a workaround for trainer.val_dataloader None error

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision to support 16-mixed, bf16-mixed in megatron_gpt_pretraining.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add ddp_find_unused_parameters_true and remove output args

1) Add ddp_find_unused_parameters_true for trainer.strategy in self_alignment_pretraining.py as it has unused parameters
2) Remove output args and add self.validation/test_step_outputs to validation/test_step in mt_enc_dec_model.py
3) Comment tests in JenkinsFile that need to be fixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix in megatron_nmt_training.py for 16-mixed, bf16-mixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix for megatron_bert_pretraining.py and megatron_bert_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and validation/test_step_outputs

1) Add fix to account for 16-mixed and bf16-mixed in megatron_retro_mutransfer_pretrain.py, megatron_retro_pretraining.py
2) Reset ckpt_path for test in enc_dec_nmt.py
3) Remove outputs args and add validation/test_step_outputs in megatron_retrieval_model.py
4) Comment Megatron Bert Pretraining and Resume Training with Pipeline Parallelism and add back NMT Training Post-LN

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and skip few failing tests

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add missing comment lines in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment jenkin tests and super().on_validation_epoch_end() in megatron_gpt_sft_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit in jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Edit in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment missed lines in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix precision and validation/test outputs

1) Add precision fix to account for 16-mixed and bf16-mixed in megatron_t5_pretraining.py
2) Remove outputs args and append loss to self.validation/test_step_outputs in megatron_lm_encoder_decoder_model.py
3) Add back resume_from_checkpoint in the megatron_t5_config.yaml
4) Comment out certain tests in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix precision and validation/test/predict errors in megatron_t5_prompt_learning.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and edit precision typo in all files

1) Account for 16-mixed and bf16-mixed in megatron_bart_pretraining.py and megatron_t5_seq2seq_finetune.py
2) Fix precision typo in all files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix all CI TTS tests and comment few Jenkins tests

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Combine xx_epoch_end and on_xx_epoch_end

Add on_inference_epoch_end to inference_epoch_end function and have a single on_validation/test_epoch_end in megatron_finetune_model.py and megatron_gpt_sft_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add a missing comment in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add try except StopIteration in validation_step for models with dataloader_iter

Signed-off-by: Abhishree <abhishreetm@gmail.com>
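
A minimal sketch of guarding a step that pulls its own batch from `dataloader_iter`, assuming the step should simply return nothing once the iterator is exhausted. The function signature and the summed "loss" are hypothetical; the real models do considerably more work per step.

```python
def validation_step(dataloader_iter, step_outputs):
    """Fetch one batch from the iterator; stop quietly when it is exhausted."""
    try:
        batch = next(dataloader_iter)
    except StopIteration:
        # Nothing left to validate: skip instead of propagating the exception.
        return None
    loss = float(sum(batch))  # hypothetical "loss", for illustration only
    step_outputs.append(loss)
    return loss


batches = iter([[1, 2], [3]])
collected = []
results = [validation_step(batches, collected) for _ in range(3)]
```

The third call finds the iterator empty and returns `None` without appending anything, so the collected outputs only contain results from real batches.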

* Remove pyyaml from requirements

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add try except for inference_step in megatron_finetune_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove limit_val_batches for mockGPTDataset test

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add new self.validation_step_outputs for MegatronGPTSFTModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit Jenkinsfile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Initialize self.validation/test_step_outputs in megatron_gpt_sft_model.py

Initialize self.validation/test_step_outputs in setup of MegatronGPTSFTModel to take care of cases when dataloaders are not set up in ModelPT, for example while restoring the model.

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint as trainer arg in conf yaml files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint as trainer arg in GPT, T5 configs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint in duplex_tn_config.yaml

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix typos, unused imports and refactor code to remove redundant funcs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove commented code in megatron_nmt_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix overridden functions to match parent class functions

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Prefetch dataloader_iter to prevent hang for PP>1

Signed-off-by: Abhishree <abhishreetm@gmail.com>
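
One generic way to prefetch from an iterator is to pull the next element early and then chain it back in front of the rest. The sketch below shows that idea only; it is an assumption about the general technique and not the `_prefetch` implementation in NeMo, and the function name and return shape are hypothetical.

```python
import itertools


def prefetch(dataloader_iter):
    """Peek at the next batch; return (iterator with the batch restored, exhausted flag)."""
    try:
        first = next(dataloader_iter)
    except StopIteration:
        # Report exhaustion up front so the caller can skip the step entirely.
        return iter(()), True
    # Put the peeked batch back in front of the remaining ones.
    return itertools.chain([first], dataloader_iter), False
```

Knowing whether data remains before the step starts lets a caller make the same decision up front rather than discovering exhaustion partway through the step.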

* Override setup() in NLPDDPStrategy to avoid hang during predict with PP>1

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Uncomment tests in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add '16' to precision checks and other minor fixes

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Clear validation/test_step_outputs with dataloader_idx for multi dataloaders

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edits

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision checks to avoid indexing

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove self.validation_step_outputs_sft and add dataloader_idx to clear outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Reference checkpoint with trainer.ckpt_path

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add _prefetch to NLPModel and minor fixes

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add limit_val_batches in JenkinsFile for NMT

1) Add trainer.limit_val_batches in Megatron NMT Training TP=2
2) Remove unused import in ModelPT

Signed-off-by: Abhishree <abhishreetm@gmail.com>

---------

Signed-off-by: Abhishree <abhishreetm@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Include the scripts for preprocessing OAST and unit tests for chat sft datasets (#7112)

* scripts for sft

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix style

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added special token only for huggingface model

Signed-off-by: Yi Dong <yidong@nvidia.com>

* change default name

Signed-off-by: Yi Dong <yidong@nvidia.com>

* print out error datapoint content

Signed-off-by: Yi Dong <yidong@nvidia.com>

* show error id

Signed-off-by: Yi Dong <yidong@nvidia.com>

* annotation script working

Signed-off-by: Yi Dong <yidong@nvidia.com>

* try to be compatible with huggingface tokenizer

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added examples

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added lang

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added lang

Signed-off-by: Yi Dong <yidong@nvidia.com>

* text to value special case

Signed-off-by: Yi Dong <yidong@nvidia.com>

* configure the slider

Signed-off-by: Yi Dong <yidong@nvidia.com>

* annotation handles lang

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added the unit test for chat sft dataset

Signed-off-by: Yi Dong <yidong@nvidia.com>

* used the file in the test dir

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix json error

Signed-off-by: Yi Dong <yidong@nvidia.com>

* load local tokenizer

Signed-off-by: Yi Dong <yidong@nvidia.com>

* remove mask count check

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added HF dataset backend

Signed-off-by: Yi Dong <yidong@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* add paths to labeler. (#7087)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* T5 metrics fix (#7037)

* Fix race condition when executing with multi-node where some ranks do not wait for setup (#7016)

Signed-off-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Added bool types to neural_types export (#7032)

Signed-off-by: tbartley94 <tbartley@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* rnnt and char utils (#6971)

* rnnt_ngram_merge

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* char level bug

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* fix tab text gen (#7022) (#7031)

Signed-off-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Yi Dong <43824965+yidong72@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fixed kwargs for metric instance init

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fixed kwargs for metric instance init

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* removed kwargs

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Updated config desc

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* ASR Confidence update and tutorial (#6810)

* small fixes and tests

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* various fixes for the tutorial

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* tutorial added

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix for a little oops after rebase

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix tests

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* unused import removed

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix review comments

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* deprecated parameters for greedy configs

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* move re-assigning to configs

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix comments 2

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix config tests

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix ece test (my env was bugged apparently)

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* renamings for confidence ensemble

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comments 3

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* return dropped tutorial

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* CI flips back and forth, increasing tolerance

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

---------

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* install_bs (#7019) (#7028)

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: Nikolay Karpov <karpnv@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* fixes for spellmapper (#6994) (#7000)

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>
Co-authored-by: bene-ges <antonova_sasha@list.ru>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* added back the retro documents (#7033)

Signed-off-by: Yi Dong <yidong@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Remove pyyaml (#7052) (#7054)

Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* st standalone model (#6969)

* st standalone model

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* style fix

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* sacrebleu import fix, unused imports removed

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* import guard for nlp inside asr transformer bpe model

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeql fixes

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* comments answered

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* import ordering fix

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* yttm for asr removed

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* logging added

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* added inference and translate method

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* remove pos emb from state dict for old models (#7068)

* remove pos emb from state dict

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move to nlp_model

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update comment

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* fix nmt test

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix nmt test

Signed-off-by: Evelina <ebakhturina@nvidia.com>

---------

Signed-off-by: Evelina <ebakhturina@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix typo in ASR-TTS tutorial (#7049)

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fixed tutorial's name (#7047)

Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix documentation for Numba (#7065) (#7077)

* Fix documentation for Numba



* Update force float32 flag dynamically



* Update force float32 flag dynamically



* Fix nemo version



---------

Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Update Frame-VAD doc and fix onnx export (#7076)

* update fvad doc

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix typo

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update fvad example

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix onnx export

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update test

Signed-off-by: stevehuang52 <heh@nvidia.com>

* refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update doc

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

---------

Signed-off-by: stevehuang52 <heh@nvidia.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* memmap worker arg (#7062)

* memmap worker arg

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <adithya.r@gmail.com>

* update

Signed-off-by: arendu <adithya.r@gmail.com>

---------

Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix caching bug in causal convolutions for cache-aware ASR models (#7034) (#7082)

Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fast Conformer global token fix (#7085)

* old way

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* remove extra

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* clean

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* clean

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* clean

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: sam1373 <samuelkriman@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Refined export_config (#7053) (#7066)

* Refined export_config
* Rolling back hierarchy change
---------

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Co-authored-by: Boris Fomitchev <borisfom@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* small Bugfix (#7081)

* small Bugfix (#7079)

* fix branch

Signed-off-by: fayejf <fayejf07@gmail.com>

* fix typo

Signed-off-by: fayejf <fayejf07@gmail.com>

* fix link

Signed-off-by: fayejf <fayejf07@gmail.com>

---------

Signed-off-by: fayejf <fayejf07@gmail.com>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

---------

Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Added script to extract ASR CTC and RNNT models from ASR hybrid models (#7092)

* Added script to extract ctc and rnnt models from hybrid models

Signed-off-by: Daniel Egert <degert@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid extraction script for review request 1

Signed-off-by: Daniel Egert <degert@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid convert script to remove --cuda flag

Signed-off-by: Daniel Egert <degert@nvidia.com>

---------

Signed-off-by: Daniel Egert <degert@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Adding docs and models for multiple lookahead cache-aware ASR (#7067) (#7094)

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* update TTS readme (#7088)

* update TTS readme

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

---------

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix absolute path in path join call (#7099)

Signed-off-by: Jan Beckmann <king-jan1999@hotmail.de>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Disable distopt contiguous param buffer by default (#7095)

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* microphone demo (#7110)

Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [Fix] load_state_dict in nlp_model.py (#7086)

* Fix load_state_dict in nlp_model.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix plot function in vad_utils.py (#7113)

Fix plot function in vad_utils.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fixed small bug with NoisePerturbationWithNormalization (#7118)

Signed-off-by: Daniel Egert <degert@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix import guard checks (#7124)

Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Revert "Fix import guard checks (#7124)" (#7125)

This reverts commit a46e3251944642f9102aa16ce2d2f9d3a804ff8a.

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix import guard checks (#7126)

* Fix import guard checks

Signed-off-by: smajumdar <titu1994@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Add updated fc ctc and rnnt xxl models (#7128) (#7130)

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [TTS] Create EnCodec training recipe (#6852)

* [TTS] Create EnCodec training recipe

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Update encodec recipe

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Rename EnCodec to AudioCodec

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add EnCodec unit tests

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add copyright header to distributed.py

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix rank where torch.distributed may not be initialized yet and would not wait for tokenizer file caching (#7061)

Signed-off-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Co-authored-by: David <amosalla@asu.edu>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* fix default attention size (#7141) (#7143)

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* fix evaluator.py for various exceptions by ast (#7150)

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [TTS][ZH] add Chinese TTS recipes based on IPA symbol sets. (#6893)

* [TTS] add Chinese TTS recipe based on IPA.
* add new pinyin and ipa dictionaries with 36 finals.
* add yaml configs for 24-final pinyin and ipa.
* add copyright header
* add a directory level 24finals to discriminate from 36 finals.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* unify configs into a single one and add detailed comments providing supported candidates.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* choose 36-final IPA as default phoneme dict

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

---------

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [TTS] Add output audio format to preprocessing (#6889)

* [TTS] Add output audio format to preprocessing

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add format validation

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Fix data tutorial

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* freeze (#7152)

Signed-off-by: arendu <adithya.r@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* make sure any empty segments are removed (#7155)

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Update RIR generation scripts (#6547)

- fix: reduce room size if evaluation of params fails
- added randomized mic placement
- added diffuse noise generation
- added an option to specify the format and subtype for saved audio

Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* A quickstart speech enhancement tutorial (#6492)

A simple example of training a model for a speech enhancement task

Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* NFA subtitle file config - specify colors and vertical alignment (#7160)

* allow specifying colors of text in ASS subtitle file

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* specify vertical_alignment instead of marginv in ass_file_config

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* add documentation of CTMFileConfig and ASSFileConfig to NFA README

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

---------

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Eagerly accumulate embedding grads into fp32 buffer (#6958) (#7153)

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* TE bug fix (#7027) (#7036)

Signed-off-by: Dmytro Pykhtar <dpykhtar@nvidia.com>
Co-authored-by: Dmytro Pykhtar <37850217+dimapihtar@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [TTS] Remove nested TTS configs (#7154)

* [TTS] Remove nested TTS configs

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Modify tutorial to support multiple sampling rates

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Clarify min_duration unit

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Default 22.05kHz highfreq to null

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Merge release r1.20.0 to main (#7167)

* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* Add ASR with TTS Tutorial. Fix enhancer usage. (#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* install_bs (#7019)

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* Fix typo and branch in tutorial (#7048)

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* fix syntax error introduced in PR-7079 (#7102)

* fix syntax error introduced in PR-7079

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* fixes for pr review

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

---------

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* fix links for TN (#7117)

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* update branch (#7135)

Signed-off-by: ericharper <complex451@gmail.com>

* Fixed main and merging this to r1.20 (#7127)

* Fixed main and merging this to r1.20

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Update vad_utils.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

---------

Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* fix version

Signed-off-by: ericharper <complex451@gmail.com>

* resolve conflict the other way

Signed-off-by: ericharper <complex451@gmail.com>

* keep both

Signed-off-by: ericharper <complex451@gmail.com>

* revert keep both

Signed-off-by: ericharper <complex451@gmail.com>

---------

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>
Signed-off-by: Evelina <ebakhturina@nvidia.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: bene-ges <antonova_sasha@list.ru>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Upgrade to pytorch lightning 2.0 (#6433)

* Upgrade pytorch lightning version in requirements

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Initial fixes for PTL2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add further fixes to support lightning 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add replacements for replace_sampler_ddp, resume_from_checkpoint_fit_path and a few occurrences of validation_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace all occurrences of validation_epoch_end with on_validation_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace training_epoch_end, test_epoch_end with on_train_epoch_end and on_test_epoch_end respectively

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Change logger=None to logger=False in Trainer object

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove PTL2.0 deprecated Trainer args from TrainerConfig dataclass

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify trainer.precision check and other small edits

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace logger=None with logger=False in test_ptl_stateless_timer.py Trainer

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add default values for args to fix Attribute Error

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add the following modifications

1) Remove outputs arg from on_validation_epoch_end, on_test_epoch_end and make it an arg of the class
2) Replace resume_from_checkpoint with ckpt_path as needed
3) Explicitly add accelerator as 'CPU' in UTs being run on CPU

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg from on_validation_epoch_end, on_test_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg in on_validation_epoch_end in MultiBinaryAccuracy docstrings

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val, test outputs as instance vars in PunctuationCapitalizationModel and TokenClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace trainer.fit_loop.max_steps with trainer.fit_loop.epoch_loop.max_steps in test_optimizers_schedulers.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Revert an extra space that was mistakenly added

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Use self.validation_step_outputs and self.test_step_outputs in test_ema.py for uniformity

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Use self.validation_step_outputs and self.test_step_outputs in test_ptl_stateless_timer.py and check_for_ranks.py for uniformity

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add self.validation_step_outputs.clear() and self.test_step_outputs.clear() wherever missing

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg from on_train_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs from on_validation_epoch_end in multi_binary_acc.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove output args from on_validation_epoch_end in the docstrings of some ASR files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove output args from on_validation_epoch_end and clear memory from validation_step_outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add on_validation_epoch_end and remove outputs args for nlp models

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Append output of validation_step to validation_step_outputs in EncDecClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add the following changes

1) Index self.validation_step_outputs and self.test_step_outputs with dataloader_idx wherever needed
2) Initialize self.validation_step_outputs and self.test_step_outputs as empty lists and add support for multi dataloaders if they exist
3) Remove self.pre_configure_ddp from NLPDDPStrategy class as it is removed in PTL 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add default value dataloader_idx=0 for on_validation_batch_end() in megatron_base_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* TypeCast precision to str in attention.py and utils_funcs.py to avoid TypeError

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add if condition check for multiple dataloaders when appending to validation outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Separate validation pass to be used with both validation_step and test_step

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add if condition check for multiple dataloader while appending to test_step_outputs in punctuation_capitalization_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add condition check for multiple dataloaders based on type of trainer.val/test_dataloaders or self._validation/test_dl instead of len

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment Megatron T5 IA3 PP=2 in CI pipeline due to dataloader_iter issue with PTL 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision checks to account for 16-mixed and bf16-mixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Append output of validation/test_step to self.validation/test_step_outputs in CTCG2PModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify find_unused_parameters=True in g2p_heteronym model

1) Add find_unused_parameters=True for DDP strategy in g2p_heteronym_classification_train_and_evaluate.py
2) Remove args output in validation/test_step and add instance variables instead for heteronym_classification.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs from on_test_epoch_end in DialogueGPTClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add validation/test outputs in sgdqa_model and modify dialogue_config.yaml

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add split arg self.test_step_outputs to TextClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add test_step_outputs to dialogue and text classification models

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Change condition check for multiple dataloaders:

1) Replace ds_item as list in dialogue_config.yaml
2) Check for len of val/test_dataloaders or validation/test_dl along with type check of list in sgdqa_model.py while appending outputs of validation/test_step
3) Check for len of _validation/test_dl for creating self.validation/test_step_outputs in ModelPT and punctuation_capitalization_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add additional condition for multi dataloaders

Check len(self.trainer.val/test_dataloaders) > 1 along with type(self.trainer.val/test_dataloaders) == list for multi dataloaders in validation/test_step

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val step outputs and default val for dataloader_idx

1) Append validation_step output to self.validation_step_outputs in MultiLabelIntentSlotClassificationModel
2) Add default val for dataloader_idx for on_test_batch_start/end in TimingCallback
3) Add self.validation/test_step_outputs in BERTQAModel and remove outputs arg

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val/test_step_outputs to S2SQAModel and GPTQAModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Edit JenkinsFile for bert_pretraining.py

Edit Jenkinsfile for this test to disable validation as a workaround for trainer.val_dataloader None error

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision to support 16-mixed, bf16-mixed in megatron_gpt_pretraining.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add ddp_find_unused_parameters_true and remove output args

1) Add ddp_find_unused_parameters_true for trainer.strategy in self_alignment_pretraining.py as it has unused parameters
2) Remove output args and add self.validation/test_step_outputs to validation/test_step in mt_enc_dec_model.py
3) Comment tests in JenkinsFile that need to be fixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix in megatron_nmt_training.py for 16-mixed, bf16-mixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix for megatron_bert_pretraining.py and megatron_bert_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and validation/test_step_outputs

1) Add fix to account for 16-mixed and bf16-mixed in megatron_retro_mutransfer_pretrain.py, megatron_retro_pretraining.py
2) Reset ckpt_path for test in enc_dec_nmt.py
3) Remove outputs args and add validation/test_step_outputs in megatron_retrieval_model.py
4) Comment Megatron Bert Pretraining and Resume Training with Pipeline Parallelism and add back NMT Training Post-LN

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and skip few failing tests

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add missing comment lines in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment jenkin tests and super().on_validation_epoch_end() in megatron_gpt_sft_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit in jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Edit in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment missed lines in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix precision and validation/test outputs

1) Add precision fix to account for 16-mixed and bf16-mixed in megatron_t5_pretraining.py
2) Remove outputs args and append loss to self.validation/test_step_outputs in megatron_lm_encoder_decoder_model.py
3) Add back resume_from_checkpoint in the megatron_t5_config.yaml
4) Comment out certain tests in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix precision and validation/test/predict errors in megatron_t5_prompt_learning.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and edit precision typo in all files

1) Account for 16-mixed and bf16-mixed in megatron_bart_pretraining.py and megatron_t5_seq2seq_finetune.py
2) Fix precision typo in all files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix all CI TTS tests and comment few Jenkins tests

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Combine xx_epoch_end and on_xx_epoch_end

Add on_inference_epoch_end to inference_epoch_end function and have a single on_validation/test_epoch_end in megatron_finetune_model.py and megatron_gpt_sft_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add a missing comment in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add try except StopIteration in validation_step for models with dataloader_iter

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove pyyaml from requirements

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add try except for inference_step in megatron_finetune_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove limit_val_batches for mockGPTDataset test

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add new self.validation_step_outputs for MegatronGPTSFTModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit Jenkinsfile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Initialize self.validation/test_step_outputs in megatron_gpt_sft_model.py

Initialize self.validation/test_step_outputs in setup of MegatronGPTSFTModel to take care of cases when dataloaders are not set up in ModelPT, for example while restoring the model.

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint as trainer arg in conf yaml files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint as trainer arg in GPT, T5 configs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint in duplex_tn_config.yaml

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix typos, unused imports and refactor code to remove redundant funcs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove commented code in megatron_nmt_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix overridden functions to match parent class functions

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Prefetch dataloader_iter to prevent hang for PP>1

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Override setup() in NLPDDPStrategy to avoid hang during predict with PP>1

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Uncomment tests in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add '16' to precision checks and other minor fixes

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Clear validation/test_step_outputs with dataloader_idx for multi dataloaders

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edits

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision checks to avoid indexing

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove self.validation_step_outputs_sft and add dataloader_idx to clear outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Reference checkpoint with trainer.ckpt_path

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add _prefetch to NLPModel and minor fixes

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add limit_val_batches in JenkinsFile for NMT

1) Add trainer.limit_val_batches in Megatron NMT Training TP=2
2) Remove unused import in ModelPT

Signed-off-by: Abhishree <abhishreetm@gmail.com>

---------

Signed-off-by: Abhishree <abhishreetm@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>
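
The PTL 2.0 entry above repeatedly describes one pattern: the `outputs` argument is dropped from the `on_validation_epoch_end` / `on_test_epoch_end` hooks, step results are appended to `self.validation_step_outputs` (indexed by `dataloader_idx` when there are several dataloaders), and the list is cleared at the end of the epoch. A minimal, framework-free sketch of that pattern follows. `ToyModel`, its placeholder loss, and the `logged` dict are hypothetical names used only for illustration; this is not the NeMo or Lightning implementation.

```python
from typing import Dict, List


class ToyModel:
    """Hypothetical stand-in for a LightningModule; no framework import needed."""

    def __init__(self, num_val_dataloaders: int = 1) -> None:
        self.num_val_dataloaders = num_val_dataloaders
        # A flat list for one dataloader, one list per dataloader otherwise.
        if num_val_dataloaders > 1:
            self.validation_step_outputs = [[] for _ in range(num_val_dataloaders)]
        else:
            self.validation_step_outputs = []
        self.logged: Dict[str, float] = {}

    def validation_step(self, batch: List[float], batch_idx: int, dataloader_idx: int = 0):
        # Placeholder "loss": the mean of the batch values.
        output = {"val_loss": sum(batch) / len(batch)}
        # Store the result on the instance instead of relying on the hook
        # receiving an `outputs` argument later.
        if self.num_val_dataloaders > 1:
            self.validation_step_outputs[dataloader_idx].append(output)
        else:
            self.validation_step_outputs.append(output)
        return output

    def on_validation_epoch_end(self) -> None:
        # The hook takes no `outputs` argument; it reads the instance attribute.
        if self.num_val_dataloaders > 1:
            groups = self.validation_step_outputs
        else:
            groups = [self.validation_step_outputs]
        for idx, outputs in enumerate(groups):
            if outputs:
                mean_loss = sum(o["val_loss"] for o in outputs) / len(outputs)
                self.logged[f"val_loss_dl{idx}"] = mean_loss
            # Clear in place so results do not accumulate across epochs.
            outputs.clear()


single = ToyModel()
single.validation_step([1.0, 3.0], batch_idx=0)
single.validation_step([2.0, 4.0], batch_idx=1)
single.on_validation_epoch_end()

multi = ToyModel(num_val_dataloaders=2)
multi.validation_step([1.0], batch_idx=0, dataloader_idx=0)
multi.validation_step([3.0], batch_idx=0, dataloader_idx=1)
multi.on_validation_epoch_end()
```

Clearing the lists in place keeps memory bounded across epochs, which is the concern behind the "Add self.validation_step_outputs.clear()" items in the list above.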

* Include the scripts for preprocessing OAST and unit tests for chat sft datasets (#7112)

* scripts for sft

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix style

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added special token only for huggingface model

Signed-off-by: Yi Dong <yidong@nvidia.com>

* change default name

Signed-off-by: Yi Dong <yidong@nvidia.com>

* print out error datapoint content

Signed-off-by: Yi Dong <yidong@nvidia.com>

* show error id

Signed-off-by: Yi Dong <yidong@nvidia.com>

* annotation script working

Signed-off-by: Yi Dong <yidong@nvidia.com>

* try to be compatible with huggingface tokenizer

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added examples

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added lang

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added lang

Signed-off-by: Yi Dong <yidong@nvidia.com>

* text to value special case

Signed-off-by: Yi Dong <yidong@nvidia.com>

* configure the slider

Signed-off-by: Yi Dong <yidong@nvidia.com>

* annotation handles lang

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added the unit test for chat sft dataset

Signed-off-by: Yi Dong <yidong@nvidia.com>

* used the file in the test dir

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix json error

Signed-off-by: Yi Dong <yidong@nvidia.com>

* load local tokenizer

Signed-off-by: Yi Dong <yidong@nvidia.com>

* remove mask count check

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added HF dataset backend

Signed-off-by: Yi Dong <yidong@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* add paths to labeler. (#7087)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>
Signed-off-by: tbartley94 <tbartley@nvidia.com>
Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Signed-off-by: Yi Dong <yidong@nvidia.com>
Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>
Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>
Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>
Signed-off-by: Evelina <ebakhturina@nvidia.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: arendu <adithya.r@gmail.com>
Signed-off-by: sam1373 <samuelkriman@gmail.com>
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: Daniel Egert <degert@nvidia.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Jan Beckmann <king-jan1999@hotmail.de>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: Dmytro Pykhtar <dpykhtar@nvidia.com>
Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: Abhishree <abhishreetm@gmail.com>
Co-authored-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com>
Co-authored-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Yi Dong <43824965+yidong72@users.noreply.github.com>
Co-authored-by: Aleksandr Laptev <alaptev@nvidia.com>
Co-authored-by: bene-ges <antonova_sasha@list.ru>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <grinchuk.alexey@gmail.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Adi Renduchintala <adithyar…
Showing 446 changed files with 27,299 additions and 4,477 deletions.
8 changes: 8 additions & 0 deletions .github/labeler.yml
@@ -3,25 +3,33 @@ ASR:
- examples/asr/**/*
- tutorials/asr/**/*
- docs/source/asr/**/*
- tests/collections/asr/**

NLP:
- nemo/collections/nlp/**/*
- examples/nlp/**/*
- tutorials/nlp/**/*
- docs/source/nlp/**/*
- tests/collections/nlp/**

Speaker Tasks:
- examples/speaker_tasks/**/*
- tutorials/speaker_tasks/**/*

TTS:
- nemo/collections/tts/**/*
- nemo/collections/common/tokenizers/text_to_speech/**
- examples/tts/**/*
- tutorials/tts/**/*
- docs/source/tts/**/*
- scripts/dataset_processing/tts/**
- scripts/tts_dataset_files/**
- tests/collections/tts/**
- tests/collections/common/tokenizers/text_to_speech/**

core:
- nemo/core/**/*
- tests/core/**

common:
- nemo/collections/common/**/*
20 changes: 14 additions & 6 deletions Dockerfile
@@ -14,7 +14,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.

ARG BASE_IMAGE=nvcr.io/nvidia/pytorch:23.06-py3
ARG BASE_IMAGE=nvcr.io/nvidia/pytorch:23.08-py3

# build an image that includes only the nemo dependencies, ensures that dependencies
# are included first for optimal caching, and useful for building a development
@@ -45,12 +45,18 @@ RUN apt-get update && \
WORKDIR /workspace/

WORKDIR /tmp/
# TODO: Remove once this Apex commit (5/12/23) is included in PyTorch
# container

# Distributed Adam support for multiple dtypes
RUN git clone https://github.com/NVIDIA/apex.git && \
cd apex && \
git checkout 8b7a1ff183741dd8f9b87e7bafd04cfde99cea28 && \
pip3 install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" --global-option="--fast_layer_norm" --global-option="--distributed_adam" --global-option="--deprecated_fused_adam" ./
git checkout 52e18c894223800cb611682dce27d88050edf1de && \
pip3 install -v --no-build-isolation --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" --global-option="--fast_layer_norm" --global-option="--distributed_adam" --global-option="--deprecated_fused_adam" ./

# install megatron core, this can be removed once 0.3 pip package is released
RUN git clone https://github.com/NVIDIA/Megatron-LM.git && \
cd Megatron-LM && \
git checkout ab0336a5c8eab77aa74ae604ba1e73decbf6d560 && \
pip install -e .

# uninstall stuff from base container
RUN pip3 uninstall -y sacrebleu torchtext
@@ -76,6 +82,8 @@ RUN for f in $(ls requirements*.txt); do pip3 install --disable-pip-version-chec
RUN pip install flash-attn
# pinned triton version for flash-attention https://github.com/HazyResearch/flash-attention/blob/main/flash_attn/flash_attn_triton.py#L3
RUN pip install triton==2.0.0.dev20221202
# install numba for latest containers
RUN pip install "numba>=0.57.1"

# install k2, skip if installation fails
COPY scripts /tmp/nemo/scripts/
@@ -94,7 +102,7 @@ COPY . .

# start building the final container
FROM nemo-deps as nemo
ARG NEMO_VERSION=1.20.0
ARG NEMO_VERSION=1.21.0

# Check that NEMO_VERSION is set. Build will fail without this. Expose NEMO and base container
# version information as runtime environment variable for introspection purposes

0 comments on commit 74603f0
