ONNX export for RadTTS (#5880)
* Megatron positional encoding alibi fix (#5808) (#5863)

* 1. Debugging.

* 1. Debugging.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* 1. Debugging.

* 1. Debugging.

* 1. Fixed initialization.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Debugging.

* 1. Debugging.

* 1. Debugging.

* 1. Debugging.

* 1. Debugging.

* 1. Debugging.

* 1. Debugging.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* 1. Debugging.

* 1. Removed scale from ALiBi.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Updated yaml and added support to control number of alibi heads.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* 1. Removed num_attention_heads_alibi from configs.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

Signed-off-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>

Signed-off-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Micha Livne <michalivne@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Fix segmenting for pcla inference (#5849)

* Fix segmenting for pcla inference

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* Fix segmenting for pcla inference

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* indentation fix (#5861) (#5862)

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* add ambernet to readme (#5872) (#5873)

Signed-off-by: fayejf <36722593+fayejf@users.noreply.github.com>

Signed-off-by: fayejf <36722593+fayejf@users.noreply.github.com>

Signed-off-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Fix wrong label mapping in batch_inference for label_model (#5767) (#5870)

* fix batch inference

* add test for batch

* fix device

Signed-off-by: fayejf <fayejf07@gmail.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* WAR for https://github.com/pytorch/pytorch/pull/91526

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Fix memory allocation of NeMo Multi-speaker Data Simulator (#5864)

* fix data simulator

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* Adding noise_manifest handling for faster speed

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Added multi-gpu feature

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Added a parameter for noise source file number

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Fixed noise_manifest error bug

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* RETRO model finetuning (#5800)

* add save and load dynamic index

Signed-off-by: Yi Dong <yidong@nvidia.com>

* add chunk stride feature

Signed-off-by: Yi Dong <yidong@nvidia.com>

* add chunk stride feature

Signed-off-by: Yi Dong <yidong@nvidia.com>

* add no pq index

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added megatron lm compatible mode

Signed-off-by: Yi Dong <yidong@nvidia.com>

* add config

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix position embedding

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added index factory

Signed-off-by: Yi Dong <yidong@nvidia.com>

* share neighbors and weights among strategies

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix bug

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added metric to faiss index

Signed-off-by: Yi Dong <yidong@nvidia.com>

* set default to inner product

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added qa fine tune dataset

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added fine tuning code

Signed-off-by: Yi Dong <yidong@nvidia.com>

* trim it

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix data issue

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix style

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added version

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix key error

Signed-off-by: Yi Dong <yidong@nvidia.com>

* make sure to overwrite the cfg

Signed-off-by: Yi Dong <yidong@nvidia.com>

* make multiple sentence bert available

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix the document

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix the table

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix transformer

Signed-off-by: Yi Dong <yidong@nvidia.com>

* make sure to turn off the rope in chunked cross attention layer

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix the security issue

Signed-off-by: Yi Dong <yidong@nvidia.com>

* style fix

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix codeql issues

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix

Signed-off-by: Yi Dong <yidong@nvidia.com>

* use -1

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix empty index

Signed-off-by: Yi Dong <yidong@nvidia.com>

* clean up

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix the lower bound for repetition penalty

Signed-off-by: Yi Dong <yidong@nvidia.com>

* add retro qa inference strategy

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added new inference logic

Signed-off-by: Yi Dong <yidong@nvidia.com>

* working inference

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix TP inference

Signed-off-by: Yi Dong <yidong@nvidia.com>

* revert requirement

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added file inference

Signed-off-by: Yi Dong <yidong@nvidia.com>

* use string to prevent collision

Signed-off-by: Yi Dong <yidong@nvidia.com>

* use NQ test

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix prompt

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix inference

Signed-off-by: Yi Dong <yidong@nvidia.com>

* set good defaults for demo

Signed-off-by: Yi Dong <yidong@nvidia.com>

* replicate adlr

Signed-off-by: Yi Dong <yidong@nvidia.com>

* make sure to turn off attention reset for megatron lm compatible model

Signed-off-by: Yi Dong <yidong@nvidia.com>

* style fix

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix typo

Signed-off-by: Yi Dong <yidong@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix inference error

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix logging

Signed-off-by: Yi Dong <yidong@nvidia.com>

* address comments

Signed-off-by: Yi Dong <yidong@nvidia.com>

---------

Signed-off-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* [TTS] GAN-based spectrogram enhancer (#5565)

* [TTS] add SpectrogramEnhancer based on StyleGAN 2

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] some tests for spectrogram enhancer

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: a tiny clean up

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: log images during training

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* exp_manager: pass save_on_train_epoch_end to checkpointing callback

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: add training script and config examples

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: fix comments

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: don't assume FastPitch

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: better input shapes handling

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: fix porting error

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: fix logging and .nemo saving

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: clean up scaling

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: formatting

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: update examples

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: shape handling

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: remove LoggerCollection handling

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: copyright notice for tests

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: use process_batch helper

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: return empty list of available models

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: some docs

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: style --fix

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: chan_last -> channel_last

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: remove unused imports

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: remove unused return value

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: losses are nn.Modules now

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: init optimizers from config

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: formatting

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: unused imports

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: typechecking

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: more tests

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: fix logging images

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: unclutter prepare_batch

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: init generator and discriminator from the config for consistency with other NeMo models

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: update spectrogram range in the example config

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: comment on loss weights in the example config

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: rename Conv2DMod to Conv2DModulated

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: remove unused imports

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: fix CodeQL import warnings

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: type_as_recursive -> to_device_recursive

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: move to_device_recursive to helpers

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: move losses to a separate module, add comments

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: add optimizers' entries to config

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: fix test configs

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: support length masking for 3-dim tensors

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: add masking to spectrogram normalization

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: fix tests

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: add spectrogram normalization tests

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: fix imports and formatting in tests

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: fix docstring typo

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: rename G and D to generator and discriminator

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: better argument naming in interfaces (condition -> input_spectograms, target -> target_spectrograms)

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: formatting

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [TTS] SpectrogramEnhancer: fix import warnings in modules

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] add resynthesize_dataset.py script

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] add PairedRealFakeSpectrogramsDataset

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: update example config to reflect new data setup

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] resynthesize_dataset.py: remove unused imports

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] resynthesize_dataset.py: use nemo manifest handling

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] resynthesize_dataset.py: remove unused import

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] resynthesize_dataset.py: underscores for .npy names

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: remove return value from a test

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] add length masking helper

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: use common tts length mask function

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] unused imports in tts helpers

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: fix an import

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: introduce computed upsample_factor to generator

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: clean up and clarify validation data setup

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: remove a hardcoded path in the example config

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] SpectrogramEnhancer: configurize max_spectrogram_length in generator

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] resynthesize_dataset.py: consistent dashes and underscores in CLI args

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>
Signed-off-by: Roman Korostik <racoiaws@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Optimizing distributed Adam when running with one work queue (#5560)

* Dist Adam constructs a single param bucket for each GPT layer

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Synchronize dist Adam reduce-scatters before launching model-parallel all-reduces

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Configure per-layer dist Adam buckets for BERT and T5

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Remove unused variables

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Configure GPT with one dist Adam bucket per virtual pipeline stage

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Configure BERT with one dist Adam bucket per virtual pipeline stage

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Update Apex commit in Dockerfile

Need recent updates to Apex distributed Adam optimizer.

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Remove logic for per-virtual-pipeline distopt buckets from T5

Signed-off-by: Tim Moon <tmoon@nvidia.com>

---------

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* fix(readme): fix typo (#5883)

Signed-off-by: Jean-Louis Queguiner <jean-louis.queguiner@gadz.org>
Signed-off-by: Jason <jasoli@nvidia.com>

* TTS inference with Heteronym classification model, hc model inference refactoring (#5768)

* refactor inference, fix span detection

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix merge conflicts

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix merge conflicts

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* remove unused var

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* clean up, test update

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* arg name update

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* merge wip

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* revert changes

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* update docs, move heteronym to baseg2p

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* change wordid file defaults to none

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* add manifest check

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* replace homograph with heteronym, upper case wordid for riva, review feedback

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* add log message, update comment

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* rename test manifest field

Signed-off-by: ekmb <ebakhturina@nvidia.com>

---------

Signed-off-by: ekmb <ebakhturina@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* take out retro doc (#5885) (#5886)

Signed-off-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Yi Dong <43824965+yidong72@users.noreply.github.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Add option to disable distributed parameters in distributed Adam optimizer (#5685)

* Add option to run dist Adam without distributed params

Similar to DDP, but leverages dist Adam's support for overlapping communication with backward compute

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Fix bug in grad clipping when dist Adam has redundant params

Signed-off-by: Tim Moon <tmoon@nvidia.com>

---------

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* [ASR] Separate Audio-to-Text (BPE, Char) dataset construction (#5774)

* Separate full BPE dataset construction

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* Fix the case when the dataset is None

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* Fix comment

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* Fix typos

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* Separate char dataset construction. Fix DALI dataset usage.

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

---------

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* transformer duration added and IPA config files added

Signed-off-by: Jason <jasoli@nvidia.com>

* inference issue for pace resolved

Signed-off-by: Jason <jasoli@nvidia.com>

* Latest ONNX developments

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Remove MCD_DTW tarball (#5889)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Block large files from being merged into NeMo main (#5898)

* Attempt to use large-file pre-commit ci hook

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Set defaults and enforce

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Set to 1000

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Remove enforcement

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

---------

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Reduce memory usage in getMultiScaleCosAffinityMatrix function (#5876)

* Updated offline_clustering.py, the getMultiScaleCosAffinityMatrix function, reduced memory usage

Signed-off-by: gabitza-tech <gabriel.pirlogeanu@gmail.com>

* torch.cuda.empty_cache() outside forward_infer()

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Removed unnecessary lines

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Speed up for non torch.jit.script

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* parallelism is default off

Signed-off-by: Taejin Park <tango4j@gmail.com>

* nme_mat_size is unified as 512, removing redundant docstring

Signed-off-by: Taejin Park <tango4j@gmail.com>

---------

Signed-off-by: gabitza-tech <gabriel.pirlogeanu@gmail.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* set max_steps for lr decay through config (#5780)

* set max_steps for lr decay through config

* added warning for optim sched max_steps config option

* reverted changes to modelPT and updated megatron_base_model

* added the experimental cosine annealing scheduler class

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update decay_steps for cosine annealing exp class

* added copyright

---------

Co-authored-by: ANMOL GUPTA <anmolg@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: Jason <jasoli@nvidia.com>
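
The idea of driving learning-rate decay from a configured `max_steps` can be illustrated with a minimal sketch. This is a generic cosine annealing schedule, not the scheduler class added in this change; the function name and all parameter names (`max_lr`, `min_lr`, `warmup_steps`, `max_steps`) are assumptions made for illustration.

```python
import math


def cosine_annealing_lr(step, max_lr, min_lr, warmup_steps, max_steps):
    """Hypothetical helper: learning rate at `step` for a cosine schedule.

    `max_steps` is assumed to come from the optimizer/scheduler config
    rather than from the trainer, which is the idea sketched here.
    """
    if step < warmup_steps:
        # Linear warmup towards max_lr.
        return max_lr * (step + 1) / (warmup_steps + 1)
    # Number of steps over which the rate decays from max_lr to min_lr.
    decay_steps = max(1, max_steps - warmup_steps)
    # Clamp so that steps beyond max_steps stay at min_lr.
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

With this shape, a run that trains for more steps than the configured `max_steps` simply holds the rate at `min_lr` once the decay window has passed.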

* Fix transducer and question answering tutorial bugs (#5809) (#5810)

Co-authored-by: Zhilin Wang <wangzhilin12061996@hotmail.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* update apex install instructions (#5901) (#5902)

Signed-off-by: ericharper <complex451@gmail.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Hybrid ASR-TTS models (#5659)

Add hybrid ASR-TTS models and text-to-text dataset

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Set providers for ORT inference session (#5903)

Signed-off-by: athitten <abhishreetm@gmail.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* [ASR] Configurable metrics for audio-to-audio + removed experimental decorators (#5827)

* Added an option to configure metrics for audio-to-audio models
Removed experimental decorators

Signed-off-by: Ante Jukić <ajukic@nvidia.com>

* Addressed review comments

Signed-off-by: Ante Jukić <ajukic@nvidia.com>

---------

Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Correct doc for RNNT transcribe() function (#5904)

Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Add segmentation export to Audacity label file (#5857)

* Save the segmentation as a label file for Audacity

Audacity is a free, open-source audio editor that can import a label file to quickly assess segmentation quality. This commit adds export to the [Audacity label format](https://manual.audacityteam.org/man/importing_and_exporting_labels.html), so that the segmentation quality can be assessed, or the segmentation shared, directly after running the segmentation tool.

Signed-off-by: CaraDuf <91517923+Ca-ressemble-a-du-fake@users.noreply.github.com>
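
For reference, an Audacity label file is plain text with one label per line: start time and end time in seconds, then the label text, separated by tabs. The sketch below shows one way to produce it; the function name and the `(start, end, text)` tuple layout are assumptions for illustration, not the segmentation tool's actual interface.

```python
def format_audacity_labels(segments):
    """Hypothetical helper: render (start_sec, end_sec, text) tuples as the
    contents of an Audacity label file."""
    # One tab-separated line per segment: start, end, label text.
    return "".join(
        f"{start:.6f}\t{end:.6f}\t{text}\n" for start, end, text in segments
    )


if __name__ == "__main__":
    labels = format_audacity_labels([(0.0, 1.5, "hello"), (1.5, 3.25, "world")])
    # The output file name is an assumption for this example.
    with open("segments_audacity.txt", "w", encoding="utf-8") as f:
        f.write(labels)
```

The resulting file can be loaded in Audacity through its label import menu entry to inspect segment boundaries against the waveform.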

* Fix styling

Signed-off-by: CaraDuf <91517923+Ca-ressemble-a-du-fake@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused score in audacity export

The score is not written to the Audacity label file, so it does not need to be loaded from the segment.

Signed-off-by: CaraDuf <91517923+Ca-ressemble-a-du-fake@users.noreply.github.com>

---------

Signed-off-by: CaraDuf <91517923+Ca-ressemble-a-du-fake@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Cross-Lingual objectives (XLM) and multilingual (many-many) support for Megatron-NMT (#5026)

* Update blendable dataset, and refactor seq2seq data

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Blendable dataset with binarized mmap working

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Pass seed from cfg to dataset

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix multilingual setup

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Add on epoch start reconfiguration

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update tokenizer creation for multilingual

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Tmp

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update NMT script

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove unused import

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update training script

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Log consumed samples

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Logging on val epoch end

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove redundant print

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Ckpt averaging for non model parallel megatron models

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Empty

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update error message

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove check

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Restore fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove ipdb

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Move to classmethods

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Initial

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* 1. Debugging.

Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>

* Refactor masking to add skip_masking_id and working xlm bert and t5 datasets

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* 1. Debugging.

Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>

* 1. Testing a simple solution

Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>

* 1. Fixed. Seems to work. Need to validate.

Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>

* 1. Added support in CSV and text memmap to Megatron encoder-decoder

Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>

* 1. Added support in CSV.

Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>

* 1. Fixed style.

Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>

* 1. Fixed style.
2. Fixed bugs.

Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>

* 1. Debugging.

Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>

* 1. Fixed bugs.

Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>

* 1. Fixed style.

Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>

* 1. Updated yaml.

Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>

* Minor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* 1. Fixed warnings.

Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>

* 1. Fixed style.

Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>

* 1. Fixed style.

Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>

* 1. Fixed a bug.

Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>

* Tmp

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Updates

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix minor data things

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Lang ids for validation datasets

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* More fixes for lang id code at inference

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove pdb

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix prepend ID and bleu logging

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Refactor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes for many-many NMT

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Reset o2 default

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Restore dataset utils

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Allreduce bleu scores

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* 1. Loading index file into memmap object.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* 1. Fixed style.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Fixed extension when loading files.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix redundant building

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* PP > 2 for NMT

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Merge and fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Refactor multilingual again

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Refactor and verify data formats

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* cleanup

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* more fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix passing langs

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* More fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes for bart

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>
Signed-off-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Micha Livne <michalivne@users.noreply.github.com>
Co-authored-by: Micha Livne <mlivne@cs.toronto.edu>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* ONNX export working

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Fixing unit test

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Update isort to the latest version (#5895)

Update isort to the latest version

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

---------

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Pin isort version (#5914)

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Moved eval notebook data to aws (#5911)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* FilterbankFeaturesTA to match FilterbankFeatures (#5913)

Signed-off-by: Mohamed Saad Ibn Seddik <ms.ibnseddik@gmail.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* fixed missing long_description_content_type (#5909)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* added TPMLP for T5-based models (#5840) (#5841)

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: David <amosalla@asu.edu>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Fixing 0-size issue and ONNX BS>1 trace

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Fixing code scan alert

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* update container (#5917)

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* remove conda pynini install (#5921)

Signed-off-by: ekmb <ebakhturina@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Merge release main (#5916)

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* added TPMLP for T5-based models (#5840)

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* remove notebook (#5859)

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: ericharper <complex451@gmail.com>

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

---------

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: David <amosalla@asu.edu>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Dynamic freezing in Nemo (#5879)

* Initial commit for dynamic freezing logic

Signed-off-by: Daniel Egert <degert@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated logic to handle lists and updated docs

Signed-off-by: Daniel Egert <degert@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Transferred dynamic freezing logic to core from asr

Signed-off-by: Daniel Egert <degert@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert asr config to original

Signed-off-by: Daniel Egert <degert@nvidia.com>

* Fixed tab indent in core.rst

Signed-off-by: Daniel Egert <degert@nvidia.com>

* Updated modelPT for latest from master

Signed-off-by: Daniel Egert <degert@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fixed indents in docs

Signed-off-by: Daniel Egert <degert@nvidia.com>

---------

Signed-off-by: Daniel Egert <degert@nvidia.com>
Co-authored-by: Daniel Egert <degert@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Fix Windows bug with save_restore_connector (#5919)

* Initial commit for Windows bug with save_to

Signed-off-by: Daniel Egert <degert@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Daniel Egert <degert@nvidia.com>
Co-authored-by: Daniel Egert <degert@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* add new languages to doc (#5939)

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Workarounds for ONNX export with autocast

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* fix val loss computation in megatron (#5871)

* fix val loss computation in megatron

* Fix NaN handling during validation

---------

Co-authored-by: ANMOL GUPTA <anmolg@nvidia.com>
Co-authored-by: Mikołaj Błaż <mblaz@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Restoring sigmas

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Add core classes and functions for online clustering diarizer part 2 (#5609)

* Add core classes and functions for online clustering diarizer

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add audio to labels code

Signed-off-by: Taejin Park <tango4j@gmail.com>

* resolve type errors

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added unit-tests for very short audio

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Filled all missing docstrings

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* resolved conflict and added missing docstrings

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fixed unit-test errors

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix the wrongly added file - megatron_gpt_model.py

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Fix wrongly included file - megatron_gpt_model.py

Signed-off-by: Taejin Park <tango4j@gmail.com>

* resolve code quality issue

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Fixed unit-test errors and bugs

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* changed total_sec for offline_clustering toy_data in unit-tests

Signed-off-by: Taejin Park <tango4j@gmail.com>

* fixed merging index offset bug

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* only including part 1 files

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused function

Signed-off-by: Taejin Park <tango4j@gmail.com>

* fixed unused imports

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* divided nmesc_clustering.py into two and reflected first-pass comments

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* adding offline/online_clustering.py

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix code QL autocomment

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Removed unused imports

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Update nemo/collections/asr/parts/utils/online_clustering.py

Co-authored-by: Sean Naren <snarenthiran@nvidia.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>

* Reflected comments

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* resolved code scanning issue

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Adding online_diarizer.py

Signed-off-by: Taejin Park <tango4j@gmail.com>

* updated tests and speaker_utils

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fixed the wrong test eval

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updating online diarizer for variable name change

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Reflected comments and some typo fixes in speaker_utils

Signed-off-by: Taejin Park <tango4j@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: Sean Naren <snarenthiran@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Distributed Adam optimizer overlaps param all-gather with forward compute (#5684)

* Add distopt support for overlapping param all-gather with forward compute

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Update Apex commit

Signed-off-by: Tim Moon <tmoon@nvidia.com>

---------

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* [TTS][ZH] added new NGC model cards with polyphone disambiguation. (#5940)

* [TTS][ZH] added new NGC model cards with polyphone disambiguation.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Moved truncation of context higher up

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* [TN] bugfix file handler is not closed. (#5955)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Added unit test for regulate_len. Unscripted sort_tensor for TRT

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Fixed slice

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* [TTS] deprecate AudioToCharWithPriorAndPitchDataset. (#5959)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* bugfix: file handlers are not closed. (#5956)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* [TTS][G2P] deprecate add_symbols (#5961)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* fix broken link (#5968)

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Fix hybridasr bug (#5950) (#5957)

Signed-off-by: Jason <jasoli@nvidia.com>

* Added list_available_models (#5967)

* Added list_available_models

Signed-off-by: Evgeniy Shabalin <36159472+treacker@users.noreply.github.com>

* Added to readme

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* added vits to docs

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

* added vits to docs

Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>

---------

Signed-off-by: Evgeniy Shabalin <36159472+treacker@users.noreply.github.com>
Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>
Signed-off-by: Jason <jasoli@nvidia.com>

* Move settings to `pyproject.toml`. Remove deprecated `pytest-runner` (#5947)

* Move project settings to pyproject.toml

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* Remove setup.cfg

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* Remove deprecated pytest-runner

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* Add comments

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* Allow only registered markers for pytest

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

---------

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Fix torchaudio installation (#5850)

* Fail if torchaudio not installed

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* Fix torchaudio matching version

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* Warn if Pytorch major version changed

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

---------

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Update fastpitch.py (#5969)

Signed-off-by: Jason <jasoli@nvidia.com>

* Review comments

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* per-micro-batch input loader (#5635)

* per-micro-batch input loader

* per-micro-batch input loader

set arg default val

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* minor fix

* apply per-microbatch-loader to only GPT

* update docstring on micro-batch input loader

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed the default arg val

* fix batch size to 1 at log stat registration

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update container for CI

Signed-off-by: ericharper <complex451@gmail.com>

* update container in jenkinsfile

Signed-off-by: ericharper <complex451@gmail.com>

* update container for CI

Signed-off-by: ericharper <complex451@gmail.com>

fix merge conflict

* revert Jenkinsfile

* Revert "revert Jenkinsfile"

This reverts commit d23b7757e0f935dacde2840f234193c632a2b3be.

* Update nemo/collections/nlp/models/language_modeling/megatron_gpt_model.py

Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>

* add GradScaler

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ericharper <complex451@gmail.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* update container in readme (#5981)

Signed-off-by: fayejf <36722593+fayejf@users.noreply.github.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Support Alignment Extraction for all RNNT Beam decoding methods (#5925)

* Partial impl of ALSD alignment extraction

Signed-off-by: smajumdar <titu1994@gmail.com>

* Partial impl of ALSD alignment extraction

Signed-off-by: smajumdar <titu1994@gmail.com>

* Remove everything else

Signed-off-by: smajumdar <titu1994@gmail.com>

* Support dataclass in AbstractRNNTDecoding

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add first draft unittest

Signed-off-by: smajumdar <titu1994@gmail.com>

* Correct the logic to move to the next timestep in the alignment

Signed-off-by: smajumdar <titu1994@gmail.com>

* Finalize ALSD alignment generation

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add support for TSD greedy alignment extraction

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add support for mAES greedy alignment extraction

Signed-off-by: smajumdar <titu1994@gmail.com>

* Finalize extraction of alignments from all beam algorithms for RNNT

Signed-off-by: smajumdar <titu1994@gmail.com>

* Style fixes

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add copyright

Signed-off-by: smajumdar <titu1994@gmail.com>

* Address comments

Signed-off-by: smajumdar <titu1994@gmail.com>

---------

Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Add AWS SageMaker ASR Examples (#5638)

* Base code for AWS SageMaker example

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Remove format

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* wrap

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add a notebook with the code

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Setup

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Update notebook

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Remove space

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Fix spelling mistake

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add message to explain usage

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add CommonVoice esperanto example

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Fix path

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Fixes

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Import sox locally, add documentation

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Address reviews

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Address reviews

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Address reviews

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add cell to download the SSL model

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Set max epochs to 300

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Fixes, introduce HF dataset instructions

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Upstream updates from other branch

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Fix warning

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add README, add image

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Fix warning

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Address feedback

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Feedback

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

---------

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* Update PUBLICATIONS.md (#5963)

* Add papers from 2022/2022 to PUBLICATIONS.md

Signed-off-by: smajumdar <titu1994@gmail.com>

* Remove ipynb from being tracked as for nemo code library

Signed-off-by: smajumdar <titu1994@gmail.com>

* Remove ipynb from being tracked as for nemo code library

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add additional papers

Signed-off-by: smajumdar <titu1994@gmail.com>

---------

Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* [G2P] fixed typos and broken import library. (#5978) (#5979)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Jason <jasoli@nvidia.com>

* [G2P] added backward compatibility for english tokenizer and fixed unit tests (#5980) (#5984)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Jason <jasoli@nvidia.com>

---------

Signed-off-by: Micha Livne <mlivne@nvidia.com>
Signed-off-by: Jason <jasoli@nvidia.com>
Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
Signed-off-by: fayejf <36722593+fayejf@users.noreply.github.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: Yi Dong <yidong@nvidia.com>
Signed-off-by: Roman Korostik <rkorostik@nvidia.com>
Signed-off-by: Roman Korostik <racoiaws@users.noreply.github.com>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Jean-Louis Queguiner <jean-louis.queguiner@gadz.org>
Signed-off-by: ekmb <ebakhturina@nvidia.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Signed-off-by: gabitza-tech <gabriel.pirlogeanu@gmail.com>
Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: athitten <abhishreetm@gmail.com>
Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: CaraDuf <91517923+Ca-ressemble-a-du-fake@users.noreply.github.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>
Signed-off-by: Mohamed Saad Ibn Seddik <ms.ibnseddik@gmail.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: Daniel Egert <degert@nvidia.com>
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: Evgeniy Shabalin <36159472+treacker@users.noreply.github.com>
Signed-off-by: Evgeniy Shabalin <baah1999@yandex.ru>
Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Micha Livne <michalivne@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Matvei Novikov <mattyson.so@gmail.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: Yi Dong <43824965+yidong72@users.noreply.github.com>
Co-authored-by: Roman Korostik <racoiaws@users.noreply.github.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: Jean-Louis Queguiner <jean-louis.queguiner@gadz.org>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Mikyas Desta <miktekabi@gmail.com>
Co-authored-by: Jocelyn <jocelynh@nvidia.com>
Co-authored-by: Sean Naren <snarenthiran@nvidia.com>
Co-authored-by: Gabriel Pirlogeanu <53811655+gabitza-tech@users.noreply.github.com>
Co-authored-by: anmolgupt <14880251+anmolgupt@users.noreply.github.com>
Co-authored-by: ANMOL GUPTA <anmolg@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: Zhilin Wang <wangzhilin12061996@hotmail.com>
Co-authored-by: athitten <47577437+athitten@users.noreply.github.com>
Co-authored-by: anteju <108555623+anteju@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: CaraDuf <91517923+Ca-ressemble-a-du-fake@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Micha Livne <mlivne@cs.toronto.edu>
Co-authored-by: Mohamed Saad Ibn Seddik <ms.ibnseddik@gmail.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: David <amosalla@asu.edu>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: trias702 <25867060+trias702@users.noreply.github.com>
Co-authored-by: Daniel Egert <degert@nvidia.com>
Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Mikołaj Błaż <mblaz@nvidia.com>
Co-authored-by: Evgeniy Shabalin <36159472+treacker@users.noreply.github.com>
Co-authored-by: Jason <jasoli@nvidia.com>
Co-authored-by: Sangkug Lym <slym@nvidia.com>
Showing 11 changed files with 816 additions and 140 deletions.
275 changes: 275 additions & 0 deletions examples/tts/conf/rad-tts_dec_ipa.yaml
@@ -0,0 +1,275 @@
name: RadTTS
sample_rate: 22050

train_dataset: ???
validation_datasets: ???
ckpt_path: None
export_dir: ???
sup_data_path: ???
sup_data_types: ["log_mel", "align_prior_matrix", "pitch", "voiced_mask", "p_voiced", "energy"]



# these frame-wise values depend on pitch_fmin and pitch_fmax, you can get values
# by running `scripts/dataset_processing/tts/extract_sup_data.py`
pitch_mean: ??? # e.g. 212.35873413085938 for LJSpeech
pitch_std: ??? # e.g. 68.52806091308594 for LJSpeech

# default values from librosa.pyin
pitch_fmin: 65.40639132514966
pitch_fmax: 2093.004522404789

# default values for sample_rate=22050
n_mels: 80
n_window_size: 1024
n_window_stride: 256
n_fft: 1024
lowfreq: 0
highfreq: 8000
window: "hann"


phoneme_dict_path: "scripts/tts_dataset_files/ipa_cmudict-0.7b_nv22.10.txt"
heteronyms_path: "scripts/tts_dataset_files/heteronyms-052722"
whitelist_path: "nemo_text_processing/text_normalization/en/data/whitelist/lj_speech.tsv"
mapping_file_path: ""

model:
  target: nemo.collections.tts.models.RadTTSModel
  bin_loss_start_ratio: 0.2
  bin_loss_warmup_epochs: 100

  symbols_embedding_dim: 384
  n_mel_channels: ${n_mels}

  pitch_mean: ${pitch_mean}
  pitch_std: ${pitch_std}

  text_normalizer:
    _target_: nemo_text_processing.text_normalization.normalize.Normalizer
    lang: en
    input_case: cased
    whitelist: ${whitelist_path}

  text_normalizer_call_kwargs:
    verbose: false
    punct_pre_process: true
    punct_post_process: true

  text_tokenizer:
    _target_: nemo.collections.common.tokenizers.text_to_speech.tts_tokenizers.IPATokenizer
    punct: true
    apostrophe: true
    pad_with_space: true
    g2p:
      _target_: nemo_text_processing.g2p.modules.IPAG2P
      phoneme_dict: ${phoneme_dict_path}
      heteronyms: ${heteronyms_path}
      phoneme_probability: 0.5
      # Relies on the heteronyms list for anything that needs to be disambiguated
      ignore_ambiguous_words: true
      use_chars: true
      use_stresses: true

  train_ds:
    dataset:
      _target_: "nemo.collections.tts.torch.data.TTSDataset"
      manifest_filepath: ${train_dataset}
      sample_rate: ${sample_rate}
      sup_data_path: ${sup_data_path}
      sup_data_types: ${sup_data_types}
      n_fft: ${n_fft}
      win_length: ${n_window_size}
      hop_length: ${n_window_stride}
      window: ${window}
      n_mels: ${n_mels}
      lowfreq: ${lowfreq}
      highfreq: ${highfreq}
      max_duration: null
      min_duration: 0.1
      ignore_file: null
      trim: False
      pitch_fmin: ${pitch_fmin}
      pitch_fmax: ${pitch_fmax}



      text_tokenizer:
        _target_: "nemo.collections.common.tokenizers.text_to_speech.tts_tokenizers.EnglishPhonemesTokenizer"
        punct: True
        stresses: True
        chars: True
        space: ' '
        silence: null
        apostrophe: True
        sep: '|'
        add_blank_at: null
        pad_with_space: True
        g2p:
          _target_: "nemo_text_processing.g2p.modules.EnglishG2p"
          phoneme_dict: ${phoneme_dict_path}
          heteronyms: ${heteronyms_path}
          phoneme_probability: 0.5
    dataloader_params:
      drop_last: false
      shuffle: true
      batch_size: 8
      num_workers: 8
      pin_memory: false

  validation_ds:
    dataset:
      _target_: "nemo.collections.tts.torch.data.TTSDataset"
      manifest_filepath: ${validation_datasets}
      sample_rate: ${sample_rate}
      sup_data_path: ${sup_data_path}
      sup_data_types: ${sup_data_types}
      n_fft: ${n_fft}
      win_length: ${n_window_size}
      hop_length: ${n_window_stride}
      window: ${window}
      n_mels: ${n_mels}
      lowfreq: ${lowfreq}
      highfreq: ${highfreq}
      max_duration: null
      min_duration: 0.1
      ignore_file: null
      trim: False
      pitch_fmin: ${pitch_fmin}
      pitch_fmax: ${pitch_fmax}

      text_tokenizer:
        _target_: "nemo.collections.common.tokenizers.text_to_speech.tts_tokenizers.EnglishPhonemesTokenizer"
        punct: True
        stresses: True
        chars: True
        space: ' '
        silence: null
        apostrophe: True
        sep: '|'
        add_blank_at: null
        pad_with_space: True
        g2p:
          _target_: "nemo_text_processing.g2p.modules.EnglishG2p"
          phoneme_dict: ${phoneme_dict_path}
          heteronyms: ${heteronyms_path}
          phoneme_probability: 0.5
    dataloader_params:
      drop_last: false
      shuffle: false
      batch_size: 8
      num_workers: 8
      pin_memory: false

  optim:
    name: RAdam
    lr: 0.0001
    betas: [0.9, 0.98]
    weight_decay: 0.000001

    sched:
      name: exp_decay
      warmup_steps: 40000
      last_epoch: -1
      d_model: 1 # Disable scaling based on model dim
  trainerConfig:
    sigma: 1
    iters_per_checkpoint: 3000
    seed: null
    ignore_layers: []
    finetune_layers: []
    include_layers: []
    with_tensorboard: true
    dur_loss_weight: 1
    ctc_loss_weight: 1
    mask_unvoiced_f0: false
    log_step: 1
    binarization_start_iter: 6000
    kl_loss_start_iter: 18000
    loss_weights:
      ctc_loss_weight: 0.1
      dur_loss_weight: 1.0
      f0_loss_weight: 1.0
      energy_loss_weight: 1.0
      vpred_loss_weight: 1.0
    unfreeze_modules: "all"

  load_from_checkpoint: False
  init_from_ptl_ckpt: ${ckpt_path}
  modelConfig:
    _target_: "nemo.collections.tts.modules.radtts.RadTTSModule"
    n_speakers: 1
    n_speaker_dim: 16
    n_text: 384 #185
    n_text_dim: 512
    n_flows: 8
    n_conv_layers_per_step: 4
    n_mel_channels: 80
    n_hidden: 1024
    mel_encoder_n_hidden: 512
    dummy_speaker_embedding: false
    n_early_size: 2
    n_early_every: 2
    n_group_size: 2
    affine_model: wavenet
    include_modules: "decatnvpred"
    scaling_fn: tanh
    matrix_decomposition: LUS
    learn_alignments: true
    use_context_lstm: true
    context_lstm_norm: spectral
    context_lstm_w_f0_and_energy: true
    text_encoder_lstm_norm: spectral
    n_f0_dims: 1
    n_energy_avg_dims: 1
    use_first_order_features: false
    unvoiced_bias_activation: "relu"
    decoder_use_partial_padding: false
    decoder_use_unvoiced_bias: true
    ap_pred_log_f0: true
    ap_use_unvoiced_bias: true
    ap_use_voiced_embeddings: true
    dur_model_config: null
    f0_model_config: null
    energy_model_config: null
    v_model_config :
      name : dap
      hparams :
        n_speaker_dim : 16
        take_log_of_input: false
        bottleneck_hparams:
          in_dim: 512
          reduction_factor: 16
          norm: weightnorm
          non_linearity: relu
        arch_hparams:
          out_dim: 1
          n_layers: 2
          n_channels: 256
          kernel_size: 3
          p_dropout: 0.5

trainer:
  devices: 8
  precision: 16
  max_epochs: 1000
  num_nodes: 1
  accelerator: gpu
  strategy: ddp
  accumulate_grad_batches: 1
  enable_checkpointing: False
  logger: False
  gradient_clip_val: 1
  log_every_n_steps: 100
  check_val_every_n_epoch: 5

exp_manager:
  exp_dir: ${export_dir}
  name: ${name}
  create_tensorboard_logger: True
  create_checkpoint_callback: True
  checkpoint_callback_params:
    monitor: val/loss_ctc
    mode: min
    filepath: ${export_dir}
    filename: model_checkpoint
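
The audio constants in the config above can be sanity-checked with a little arithmetic. The sketch below is not part of the commit; it only uses the standard library, and the helper names (`samples_to_ms`, `frames_per_second`, `midi_to_hz`) are hypothetical. It shows that a stride of 256 samples at 22050 Hz is roughly an 11.6 ms hop (about 86 frames per second), that the 1024-sample window spans roughly 46.4 ms, and that `pitch_fmin` and `pitch_fmax` match the equal-temperament frequencies of the notes C2 and C7, assuming A4 is tuned to 440 Hz.

```python
# Values copied from examples/tts/conf/rad-tts_dec_ipa.yaml.
SAMPLE_RATE = 22050
N_WINDOW_SIZE = 1024
N_WINDOW_STRIDE = 256
PITCH_FMIN = 65.40639132514966
PITCH_FMAX = 2093.004522404789


def samples_to_ms(n_samples: int, sample_rate: int = SAMPLE_RATE) -> float:
    """Duration in milliseconds of n_samples at the given sample rate."""
    return 1000.0 * n_samples / sample_rate


def frames_per_second(hop: int = N_WINDOW_STRIDE, sample_rate: int = SAMPLE_RATE) -> float:
    """Number of spectrogram frames produced per second of audio."""
    return sample_rate / hop


def midi_to_hz(midi_note: int) -> float:
    """Equal-temperament frequency, assuming A4 (MIDI note 69) is 440 Hz."""
    return 440.0 * 2.0 ** ((midi_note - 69) / 12.0)


if __name__ == "__main__":
    print(f"hop length:    {samples_to_ms(N_WINDOW_STRIDE):.2f} ms")
    print(f"window length: {samples_to_ms(N_WINDOW_SIZE):.2f} ms")
    print(f"frame rate:    {frames_per_second():.2f} frames/s")
    # MIDI note 36 is C2 and MIDI note 96 is C7.
    print(f"C2: {midi_to_hz(36):.4f} Hz (pitch_fmin = {PITCH_FMIN:.4f})")
    print(f"C7: {midi_to_hz(96):.4f} Hz (pitch_fmax = {PITCH_FMAX:.4f})")
```

Because `pitch_mean` and `pitch_std` are frame-wise statistics, they should be recomputed with `extract_sup_data.py` whenever the stride or the pitch bounds are changed.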
0 comments on commit 4827060