-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
update speechllm #8486
Merged
stevehuang52
merged 562 commits into
modular_speechllm
from
heh/modular_speechlm_nightly
Feb 22, 2024
Merged
update speechllm #8486
stevehuang52
merged 562 commits into
modular_speechllm
from
heh/modular_speechlm_nightly
Feb 22, 2024
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Signed-off-by: Jean-Louis Queguiner <jean-louis.queguiner@gadz.org>
Signed-off-by: Jean-Louis Queguiner <jean-louis.queguiner@gadz.org>
* Fix bug wrt change decoding strategy for bpe models * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: smajumdar <titu1994@gmail.com> Co-authored-by: Somshubra Majumdar <titu1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Robin Dong <robin.k.dong@gmail.com> Co-authored-by: Eric Harper <complex451@gmail.com>
* add conversion script Signed-off-by: Chen Cui <chcui@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove references to 'ckpt' Signed-off-by: Chen Cui <chcui@nvidia.com> * add one more sanity check to make sure there is no unexpected keys in state dict Signed-off-by: Chen Cui <chcui@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * make cpu loading work Signed-off-by: Chen Cui <chcui@nvidia.com> * make script work for llama2 models Signed-off-by: Chen Cui <chcui@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * address code check Signed-off-by: Chen Cui <chcui@nvidia.com> * remove trainer precision (was for old sanity check) Signed-off-by: Chen Cui <chcui@nvidia.com> * fix script for llama2 model Signed-off-by: Chen Cui <chcui@nvidia.com> * remove commented code Signed-off-by: Chen Cui <chcui@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Chen Cui <chcui@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <complex451@gmail.com>
… dim (#7785) Signed-off-by: anferico <f.cariaggi4@gmail.com>
* Add some docs and update scripts Signed-off-by: smajumdar <titu1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: smajumdar <titu1994@gmail.com> Signed-off-by: Somshubra Majumdar <titu1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* set context for text memmap to fork Signed-off-by: arendu <adithyare@nvidia.com> * typo Signed-off-by: arendu <adithyare@nvidia.com> --------- Signed-off-by: arendu <adithyare@nvidia.com>
Signed-off-by: stevehuang52 <heh@nvidia.com>
* Add flash-decoding Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * Fix Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> --------- Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
* Change accelerator to 'auto' in nlp_checkpoint_port.py (#7747) * Change accelerator to auto Signed-off-by: Abhishree <abhishreetm@gmail.com> * Pass omegaconf object to trainer in nlp_checkpoint_port.py Signed-off-by: Abhishree <abhishreetm@gmail.com> * Pass omegaconf object to trainer in export.py Signed-off-by: Abhishree <abhishreetm@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Abhishree <abhishreetm@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <complex451@gmail.com> Signed-off-by: Abhishree <abhishreetm@gmail.com> * docs: fix typos (#7758) Signed-off-by: shuoer86 <129674997+shuoer86@users.noreply.github.com> Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> Signed-off-by: Abhishree <abhishreetm@gmail.com> * Snake act (#7736) Signed-off-by: Abhishree <abhishreetm@gmail.com> * Update gpt_dataset.py (#6963) Signed-off-by: Xin Yao <xiny@nvidia.com> Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca> Signed-off-by: Abhishree <abhishreetm@gmail.com> --------- Signed-off-by: Abhishree <abhishreetm@gmail.com> Signed-off-by: shuoer86 <129674997+shuoer86@users.noreply.github.com> Signed-off-by: Xin Yao <xiny@nvidia.com> Co-authored-by: Abhishree Thittenamane <47577437+athitten@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <complex451@gmail.com> Co-authored-by: shuoer86 <129674997+shuoer86@users.noreply.github.com> Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com> Co-authored-by: Xin Yao <yaox12@outlook.com> Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
… submodule (#7788) * add selection criteria for reference audios Signed-off-by: anferico <f.cariaggi4@gmail.com> * Update configuration files Signed-off-by: anferico <f.cariaggi4@gmail.com> * add informative comment in config files Signed-off-by: anferico <f.cariaggi4@gmail.com> * sample random index for reference audio selection Signed-off-by: anferico <f.cariaggi4@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: anferico <f.cariaggi4@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* update text server to support compute logprobs * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix typo --------- Signed-off-by: Zhilin Wang <zhilinw@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: Mikołaj Błaż <mblaz@nvidia.com>
Signed-off-by: Evelina <ebakhturina@nvidia.com>
* [TTS] Support audio offsets in TTS data loaders Signed-off-by: Ryan <rlangman@nvidia.com> * [TTS] Change docstring mentions of .pt to .npy Signed-off-by: Ryan <rlangman@nvidia.com> --------- Signed-off-by: Ryan <rlangman@nvidia.com>
* move core install to /workspace (#7706) * update apex install in dockerfile * use fetch head --------- Signed-off-by: Abhinav Khattar <aklife97@gmail.com> Signed-off-by: eharper <eharper@nvidia.com> Co-authored-by: Eric Harper <complex451@gmail.com> Co-authored-by: Abhinav Khattar <aklife97@gmail.com>
* Create config_llama_truncate.yaml Signed-off-by: Utkarsh <49331882+uppalutkarsh@users.noreply.github.com> * Add files via upload Signed-off-by: Utkarsh <49331882+uppalutkarsh@users.noreply.github.com> * Update convert_nemo_llama_to_hf.py Signed-off-by: Utkarsh <49331882+uppalutkarsh@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update config_llama_truncate.yaml Signed-off-by: Utkarsh <49331882+uppalutkarsh@users.noreply.github.com> * Update convert_nemo_llama_to_hf.py Signed-off-by: Utkarsh <49331882+uppalutkarsh@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update convert_nemo_llama_to_hf.py Signed-off-by: Utkarsh <49331882+uppalutkarsh@users.noreply.github.com> * clean up trainer * remove dependency on yaml config. load config from nemo file instead. * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * enable ckpt saving into other precision formats * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * support 70b + cleanup qkv slice logic * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix bug * move hf model folder code from comment to function and add instruction to run * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Utkarsh <49331882+uppalutkarsh@users.noreply.github.com> Signed-off-by: Chen Cui <chcui@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <complex451@gmail.com> Co-authored-by: Chen Cui <chcui@nvidia.com>
Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: Gerald Shen <geshen@nvidia.com>
* fix duplex tn infer Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix typo Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix TN docs Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* update transformers cache Signed-off-by: eharper <eharper@nvidia.com> * update Signed-off-by: eharper <eharper@nvidia.com> * add cd Signed-off-by: eharper <eharper@nvidia.com> --------- Signed-off-by: eharper <eharper@nvidia.com>
Signed-off-by: fayejf <36722593+fayejf@users.noreply.github.com>
* add finetune with huggingface dataset Signed-off-by: stevehuang52 <heh@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update yaml Signed-off-by: stevehuang52 <heh@nvidia.com> * update Signed-off-by: stevehuang52 <heh@nvidia.com> * update and refactor Signed-off-by: stevehuang52 <heh@nvidia.com> * add extrac hf text and update Signed-off-by: stevehuang52 <heh@nvidia.com> * update and refactor Signed-off-by: stevehuang52 <heh@nvidia.com> * move dataset dependency to common Signed-off-by: stevehuang52 <heh@nvidia.com> * add docstring Signed-off-by: stevehuang52 <heh@nvidia.com> * Add to Dics Signed-off-by: Nithin Rao Koluguri <nithinraok> * add ci test Signed-off-by: Nithin Rao Koluguri <nithinraok> * add max steps in jenkins Signed-off-by: Nithin Rao Koluguri <nithinraok> * reduce max steps Signed-off-by: Nithin Rao Koluguri <nithinraok> * jenkins test Signed-off-by: Nithin Rao Koluguri <nithinraok> * add bs=2 Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: stevehuang52 <heh@nvidia.com> Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
* ControlNet TRT export * Final MR before release * SD2 update * Fixed export issue * Fix for instruct p2p and reformat * Fix SD export issue * Add nemo clip export for DB * Fix ins pix2pix * fix sd2 config * [Mingyuan Ma] BF16 and SD conversion script * [Imagen] NHWC Feature * Fix .nemo loading issue for NeMo CLIP in SD * NeMo r1.20.0 Multimodal Merge * fix the inductor issue in inference * Fix inductor loading .nemo issue * Add Neva Model Support * Imagen Optimizations * Neva inference code * NeMo TOT 1.21 to Internal/main * Update neva_inference.yaml * REBASING for latest code changes * Update internal/main to main tot * Parallel DDIM implementation * 1. Fixing indentation bug. (#7352) Signed-off-by: Micha Livne <mlivne@nvidia.com> * NeMo MCore llama2 support + MCore PEFT adapters (#7299) * start adding gpt from megatron core path Signed-off-by: ericharper <complex451@gmail.com> * set model parallel config Signed-off-by: ericharper <complex451@gmail.com> * use model parallel config object Signed-off-by: ericharper <complex451@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update args Signed-off-by: ericharper <complex451@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * set vp size to none if it is 1 Signed-off-by: ericharper <complex451@gmail.com> * set vp size to none if it is 1 Signed-off-by: ericharper <complex451@gmail.com> * add TransformerConfig Signed-off-by: ericharper <complex451@gmail.com> * start updating to TransformerConfig Signed-off-by: ericharper <complex451@gmail.com> * add todo Signed-off-by: ericharper <complex451@gmail.com> * revert to model parallel config Signed-off-by: ericharper <complex451@gmail.com> * add hidden_size to model_parallel_config Signed-off-by: ericharper <complex451@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove imports Signed-off-by: ericharper <complex451@gmail.com> * revert Signed-off-by: ericharper <complex451@gmail.com> * remove import Signed-off-by: ericharper <complex451@gmail.com> * small clean up Signed-off-by: ericharper <complex451@gmail.com> * update hidden size in peft base model, add mcore commit to jenkins Signed-off-by: ericharper <complex451@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update module args Signed-off-by: ericharper <complex451@gmail.com> * add config obj to flash attention tests Signed-off-by: ericharper <complex451@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove args Signed-off-by: ericharper <complex451@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove sequence parallel arg Signed-off-by: ericharper <complex451@gmail.com> * update args Signed-off-by: ericharper <complex451@gmail.com> * add config to self Signed-off-by: ericharper <complex451@gmail.com> * update args Signed-off-by: ericharper <complex451@gmail.com> * update args Signed-off-by: ericharper <complex451@gmail.com> * update args Signed-off-by: ericharper <complex451@gmail.com> * add config to test Signed-off-by: ericharper <complex451@gmail.com> * get hidden_size from config Signed-off-by: ericharper <complex451@gmail.com> * add try except Signed-off-by: ericharper <complex451@gmail.com> * use default Signed-off-by: ericharper <complex451@gmail.com> * update config with hidden size Signed-off-by: ericharper <complex451@gmail.com> * remove arg Signed-off-by: ericharper <complex451@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * comment out jenkins test Signed-off-by: ericharper <complex451@gmail.com> * revert import Signed-off-by: ericharper <complex451@gmail.com> * build transformer config Signed-off-by: ericharper <complex451@gmail.com> * add model to provider func Signed-off-by: ericharper <complex451@gmail.com> * update forward and float16 wrapper Signed-off-by: ericharper <complex451@gmail.com> * instantiate model parallel config after init model parallel Signed-off-by: ericharper <complex451@gmail.com> * set virtual rank Signed-off-by: ericharper <complex451@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add GQA config to megatron gpt model (#7096) * Add GQA config in gpt config file Signed-off-by: jasonwan <jasonwan@nvidia.com> * Verify mcore is enabled when using GQA Signed-off-by: jasonwan <jasonwan@nvidia.com> --------- Signed-off-by: jasonwan <jasonwan@nvidia.com> * revert Signed-off-by: ericharper <complex451@gmail.com> * mcore llama2 ckpt conversion & small fix Signed-off-by: jasonwan <jasonwan@nvidia.com> * Add inference & sft config by Hongbin Co-authored-by: Hongbin Liu <hongbinl@nvidia.com> Signed-off-by: jasonwan <jasonwan@nvidia.com> * fix config Signed-off-by: jasonwan <jasonwan@nvidia.com> * add inference param. update TP/PP script to support mcore gpt Signed-off-by: jasonwan <jasonwan@nvidia.com> * p-tuning Signed-off-by: jasonwan <jasonwan@nvidia.com> * modify ckpt conversion script (adding model cast) Signed-off-by: jasonwan <jasonwan@nvidia.com> * ckpt conversion use relative path for config Signed-off-by: jasonwan <jasonwan@nvidia.com> * start adding gpt from megatron core path Signed-off-by: ericharper <complex451@gmail.com> * set model parallel config Signed-off-by: ericharper <complex451@gmail.com> * use model parallel config object Signed-off-by: ericharper <complex451@gmail.com> * update args Signed-off-by: ericharper <complex451@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * set vp size to none if it is 1 Signed-off-by: ericharper <complex451@gmail.com> * set vp size to none if it is 1 Signed-off-by: ericharper <complex451@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add TransformerConfig Signed-off-by: ericharper <complex451@gmail.com> * start updating to TransformerConfig Signed-off-by: ericharper <complex451@gmail.com> * add todo Signed-off-by: ericharper <complex451@gmail.com> * revert to model parallel config Signed-off-by: ericharper <complex451@gmail.com> * add hidden_size to model_parallel_config Signed-off-by: ericharper <complex451@gmail.com> * remove imports Signed-off-by: ericharper <complex451@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove import Signed-off-by: ericharper <complex451@gmail.com> * small clean up Signed-off-by: ericharper <complex451@gmail.com> * update hidden size in peft base model, add mcore commit to jenkins Signed-off-by: ericharper <complex451@gmail.com> * update module args Signed-off-by: ericharper <complex451@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add config obj to flash attention tests Signed-off-by: ericharper <complex451@gmail.com> * remove args Signed-off-by: ericharper <complex451@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove sequence parallel arg Signed-off-by: ericharper <complex451@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update args Signed-off-by: ericharper <complex451@gmail.com> * add config to self Signed-off-by: ericharper <complex451@gmail.com> * update args Signed-off-by: ericharper <complex451@gmail.com> * update args Signed-off-by: ericharper <complex451@gmail.com> * update args Signed-off-by: ericharper <complex451@gmail.com> * add config to test Signed-off-by: ericharper <complex451@gmail.com> * get hidden_size from config Signed-off-by: ericharper <complex451@gmail.com> * add try except Signed-off-by: ericharper <complex451@gmail.com> * use default Signed-off-by: ericharper <complex451@gmail.com> * update config with hidden size Signed-off-by: ericharper <complex451@gmail.com> * remove arg Signed-off-by: ericharper <complex451@gmail.com> * comment out jenkins test Signed-off-by: ericharper <complex451@gmail.com> * revert import Signed-off-by: ericharper <complex451@gmail.com> * remove optimizer_idx Signed-off-by: eharper <eharper@nvidia.com> * prefetch num microbatches Signed-off-by: eharper <eharper@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * start adding gpt from megatron core path Signed-off-by: ericharper <complex451@gmail.com> * set model parallel config Signed-off-by: ericharper <complex451@gmail.com> * use model parallel config object Signed-off-by: ericharper <complex451@gmail.com> * update args Signed-off-by: ericharper <complex451@gmail.com> * fix for p-tuning sequence parallel Signed-off-by: jasonwan <jasonwan@nvidia.com> * support SFT/distOpt mcore (#7207) * add inference param. update TP/PP script to support mcore gpt * p-tuning Signed-off-by: jasonwan <jasonwan@nvidia.com> * change layer names for SFT Signed-off-by: Hongbin Liu <hongbinl@nvidia.com> * fix bug in SFT Signed-off-by: Hongbin Liu <hongbinl@nvidia.com> --------- Signed-off-by: jasonwan <jasonwan@nvidia.com> Signed-off-by: Hongbin Liu <hongbinl@nvidia.com> Co-authored-by: Hongbin Liu <hongbinl@nvidia.com> Co-authored-by: jasonwan <jasonwan@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * start updating to TransformerConfig Signed-off-by: ericharper <complex451@gmail.com> * revert to model parallel config Signed-off-by: ericharper <complex451@gmail.com> * add hidden_size to model_parallel_config Signed-off-by: ericharper <complex451@gmail.com> * remove imports Signed-off-by: ericharper <complex451@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update module args Signed-off-by: ericharper <complex451@gmail.com> * add config to self Signed-off-by: ericharper <complex451@gmail.com> * build transformer config Signed-off-by: ericharper <complex451@gmail.com> * add model to provider func Signed-off-by: ericharper <complex451@gmail.com> * update forward and float16 wrapper Signed-off-by: ericharper <complex451@gmail.com> * instantiate model parallel config after init model parallel Signed-off-by: ericharper <complex451@gmail.com> * set virtual rank Signed-off-by: ericharper <complex451@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add GQA config to megatron gpt model (#7096) * Add GQA config in gpt config file Signed-off-by: jasonwan <jasonwan@nvidia.com> * Verify mcore is enabled when using GQA Signed-off-by: jasonwan <jasonwan@nvidia.com> --------- Signed-off-by: jasonwan <jasonwan@nvidia.com> * revert Signed-off-by: ericharper <complex451@gmail.com> * remove import Signed-off-by: eharper <eharper@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rollback model cast for p-tuning Signed-off-by: jasonwan <jasonwan@nvidia.com> * update for dist adam Signed-off-by: eharper <eharper@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * use get_gpt_module_list Signed-off-by: eharper <eharper@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update ckpt conversion script Signed-off-by: jasonwan <jasonwan@nvidia.com> * ptl2.0 patch for llama config Signed-off-by: jasonwan <jasonwan@nvidia.com> * add plugins to trainer in scripts Signed-off-by: jasonwan <jasonwan@nvidia.com> * fix activation checkpointing mcore Signed-off-by: jasonwan <jasonwan@nvidia.com> * fix variable names Signed-off-by: jasonwan <jasonwan@nvidia.com> * overwrite normalization type for mcore/te Signed-off-by: jasonwan <jasonwan@nvidia.com> * Update megatron_llama_sft.yaml Signed-off-by: Jason Wang <jasonwan@nvidia.com> * add PEFT adapter support for mcore gpt path (#7276) * implementation for mcore adapter/mxins Signed-off-by: jasonwan <jasonwan@nvidia.com> * small fix for lora and ptuning Signed-off-by: jasonwan <jasonwan@nvidia.com> * support layerwise peft Signed-off-by: jasonwan <jasonwan@nvidia.com> * support multiple target layers Signed-off-by: jasonwan <jasonwan@nvidia.com> * support lora GQA Signed-off-by: jasonwan <jasonwan@nvidia.com> * support amp O2 Signed-off-by: jasonwan <jasonwan@nvidia.com> * revert & more O2 fix Signed-off-by: jasonwan <jasonwan@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * lora inject to attention Signed-off-by: jasonwan <jasonwan@nvidia.com> * support lora weight tying Signed-off-by: jasonwan <jasonwan@nvidia.com> * add copyright header Signed-off-by: jasonwan <jasonwan@nvidia.com> * rollback ptuning name change. full string match mcore target Signed-off-by: jasonwan <jasonwan@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove comment Signed-off-by: jasonwan <jasonwan@nvidia.com> --------- Signed-off-by: jasonwan <jasonwan@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * clean up config Signed-off-by: jasonwan <jasonwan@nvidia.com> * Sync llama branch (#7297) * add inference param. update TP/PP script to support mcore gpt * p-tuning Signed-off-by: jasonwan <jasonwan@nvidia.com> * change layer names for SFT Signed-off-by: Hongbin Liu <hongbinl@nvidia.com> * fix bug in SFT Signed-off-by: Hongbin Liu <hongbinl@nvidia.com> * fix bug: cpu initialization is not really enabled Signed-off-by: Hongbin Liu <hongbinl@nvidia.com> * add use_cpu_initialization to TransformerConfig Signed-off-by: Hongbin Liu <hongbinl@nvidia.com> * fix bug: wrong config path when using relative cjpt path Signed-off-by: Hongbin Liu <hongbinl@nvidia.com> * revert mcore config change Signed-off-by: Jason Wang <jasonwan@nvidia.com> --------- Signed-off-by: jasonwan <jasonwan@nvidia.com> Signed-off-by: Hongbin Liu <hongbinl@nvidia.com> Signed-off-by: Jason Wang <jasonwan@nvidia.com> Co-authored-by: Hongbin Liu <hongbinl@nvidia.com> * clean up ckpt conversion script Signed-off-by: jasonwan <jasonwan@nvidia.com> * rollback git merge errors Signed-off-by: jasonwan <jasonwan@nvidia.com> * update mcore, add check for mcore+te Signed-off-by: jasonwan <jasonwan@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * formatting Signed-off-by: jasonwan <jasonwan@nvidia.com> * make sft test dataset optional. fix indentation in config Signed-off-by: jasonwan <jasonwan@nvidia.com> * one more fix for optional test set Signed-off-by: jasonwan <jasonwan@nvidia.com> * support merging lora weights in mcore Signed-off-by: jasonwan <jasonwan@nvidia.com> * update mcore for cpu init Signed-off-by: jasonwan <jasonwan@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update ckpt conversion for code llama Signed-off-by: jasonwan <jasonwan@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add seq_len_interpolation_factor support for long-context llama ckpts (#7312) * add inference param. update TP/PP script to support mcore gpt * p-tuning Signed-off-by: jasonwan <jasonwan@nvidia.com> * add seq_len_interpolation_factor Signed-off-by: Hongbin Liu <hongbinl@nvidia.com> --------- Signed-off-by: jasonwan <jasonwan@nvidia.com> Signed-off-by: Hongbin Liu <hongbinl@nvidia.com> Co-authored-by: jasonwan <jasonwan@nvidia.com> Co-authored-by: Hongbin Liu <hongbinl@nvidia.com> * fix old ptuning model, update mcore to support seq_len_interpolation_factor Signed-off-by: jasonwan <jasonwan@nvidia.com> * support fused layernorm linear, fix ptuning O2 Signed-off-by: jasonwan <jasonwan@nvidia.com> * drop loss mask for mcore for now Signed-off-by: jasonwan <jasonwan@nvidia.com> * disable dist ckpt in peft Signed-off-by: jasonwan <jasonwan@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix loading non dist ckpt Signed-off-by: jasonwan <jasonwan@nvidia.com> * add ckpt conversion to CI Signed-off-by: jasonwan <jasonwan@nvidia.com> * update CI Signed-off-by: jasonwan <jasonwan@nvidia.com> * mcore_mixin docstring Signed-off-by: jasonwan <jasonwan@nvidia.com> * minor change in mcore peft error message Signed-off-by: jasonwan <jasonwan@nvidia.com> * fix amp o2 in lora weight tying Signed-off-by: jasonwan <jasonwan@nvidia.com> * correct mcore fp8 config Signed-off-by: jasonwan <jasonwan@nvidia.com> * add TE installation Signed-off-by: jasonwan <jasonwan@nvidia.com> * support mcore adapter tuning Signed-off-by: jasonwan <jasonwan@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * comment out new CI test. rollback docker image Signed-off-by: jasonwan <jasonwan@nvidia.com> * ignore FA tests, try new CI on 23.08 Signed-off-by: jasonwan <jasonwan@nvidia.com> * mark new CI as L2, put to beginning to test Signed-off-by: jasonwan <jasonwan@nvidia.com> * minor fix for prompt learning Signed-off-by: jasonwan <jasonwan@nvidia.com> * rollback to 23.06. comment out CI Signed-off-by: jasonwan <jasonwan@nvidia.com> * minor fix ckpt conversion script Signed-off-by: jasonwan <jasonwan@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * minor rollback gpt model change Signed-off-by: jasonwan <jasonwan@nvidia.com> --------- Signed-off-by: ericharper <complex451@gmail.com> Signed-off-by: jasonwan <jasonwan@nvidia.com> Signed-off-by: eharper <eharper@nvidia.com> Signed-off-by: Hongbin Liu <hongbinl@nvidia.com> Signed-off-by: Jason Wang <jasonwan@nvidia.com> Co-authored-by: ericharper <complex451@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: eharper <eharper@nvidia.com> Co-authored-by: Hongbin Liu <hongbinl@nvidia.com> Co-authored-by: Kelvin Liu <lhb8125@users.noreply.github.com> * Hiddens modules documentation (#7303) * 1. Changed hiddens transformations module from `transformations` to `hiddens`. Signed-off-by: Micha Livne <mlivne@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * 1. Debugging. Signed-off-by: Micha Livne <mlivne@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * 1. Finished doc. Signed-off-by: Micha Livne <mlivne@nvidia.com> * 1. Debugging. Signed-off-by: Micha Livne <mlivne@nvidia.com> * 1. Debugging. Signed-off-by: Micha Livne <mlivne@nvidia.com> * 1. Debugging. Signed-off-by: Micha Livne <mlivne@nvidia.com> * 1. Debugging. Signed-off-by: Micha Livne <mlivne@nvidia.com> * 1. Debugging. Signed-off-by: Micha Livne <mlivne@nvidia.com> * 1. Debugging. Signed-off-by: Micha Livne <mlivne@nvidia.com> * 1. Debugging. Signed-off-by: Micha Livne <mlivne@nvidia.com> --------- Signed-off-by: Micha Livne <mlivne@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <complex451@gmail.com> * Support for flash attention 2.0 (#7063) * Add flash attn 2 Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add FA2 feature Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * Remove debugging Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> Signed-off-by: Cheng-Ping Hsieh <37269846+hsiehjackson@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com> Co-authored-by: Cheng-Ping Hsieh <37269846+hsiehjackson@users.noreply.github.com> Co-authored-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * lora merge fix for O2 names (#7325) * wip Signed-off-by: arendu <adithyare@nvidia.com> * adjust key names based on O2 Signed-off-by: arendu <adithyare@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update Signed-off-by: arendu <adithyare@nvidia.com> * minor Signed-off-by: arendu <adithyare@nvidia.com> --------- Signed-off-by: arendu <adithyare@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * multiple fields can form a context (#7147) * list of context fields and flexible prompt template Signed-off-by: arendu <adithya.r@gmail.com> * list of fields for context Signed-off-by: arendu <adithya.r@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix bug Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * Fix bug Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * Add multiple truncation fields and middle truncation Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Compatible to old ckpt Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix tokenize detokenize issue Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove detokenization, add truncation augmentation Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Resolve comments Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * Remove unused import Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert eos Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * Add tokenizer space_sensitive attribute Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix error Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * Fix erorr and use re Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix bug Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * Change assert logic Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Follow adi suggestion Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove merge function Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add example and comment Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * Remove context_key and add comment Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * Remove random truncation Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix bug Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix template none Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix bug Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> --------- Signed-off-by: arendu <adithya.r@gmail.com> Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> Signed-off-by: Cheng-Ping Hsieh <37269846+hsiehjackson@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Cheng-Ping Hsieh <chsieh@nvidia.com> Co-authored-by: Cheng-Ping Hsieh <37269846+hsiehjackson@users.noreply.github.com> * Load buffers in checkpoint (#7357) Signed-off-by: Jason Wang <jasonwan@nvidia.com> * Add migration guide for lightning 2.0 upgrade (#7360) * Add lightning 2.0 migration guide in NeMo docs Signed-off-by: Abhishree <abhishreetm@gmail.com> * Add remaining guide for lightning 2.0 upgrade Signed-off-by: Abhishree <abhishreetm@gmail.com> * Remove line spill over and continue in next line Signed-off-by: Abhishree <abhishreetm@gmail.com> * Add missing dataloader_iter in the guide Signed-off-by: Abhishree <abhishreetm@gmail.com> * Fix minor typo Signed-off-by: Abhishree <abhishreetm@gmail.com> --------- Signed-off-by: Abhishree <abhishreetm@gmail.com> * adding bias_dropout_add_fusion option for BERT (#7332) Signed-off-by: Alexander Jipa <azzhipa@amazon.com> Co-authored-by: Alexander Jipa <azzhipa@amazon.com> * [TTS] Change audio codec token type to TokenIndex (#7356) Signed-off-by: Ryan <rlangman@nvidia.com> * enable selective unfreeze (#7326) * wip Signed-off-by: arendu <adithyare@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * wip Signed-off-by: arendu <adithyare@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * avoid PTL method conflicts Signed-off-by: arendu <adithyare@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update Signed-off-by: arendu <adithyare@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update Signed-off-by: arendu <adithyare@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: arendu <adithyare@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fix typos (#7361) * fix typos Signed-off-by: omahs <73983677+omahs@users.noreply.github.com> * fix typo Signed-off-by: omahs <73983677+omahs@users.noreply.github.com> * fix typos Signed-off-by: omahs <73983677+omahs@users.noreply.github.com> * fix typos Signed-off-by: omahs <73983677+omahs@users.noreply.github.com> * fix typo Signed-off-by: omahs <73983677+omahs@users.noreply.github.com> * fix typos Signed-off-by: omahs <73983677+omahs@users.noreply.github.com> * fix typo Signed-off-by: omahs <73983677+omahs@users.noreply.github.com> * fix typo Signed-off-by: omahs <73983677+omahs@users.noreply.github.com> * fix typo Signed-off-by: omahs <73983677+omahs@users.noreply.github.com> --------- Signed-off-by: omahs <73983677+omahs@users.noreply.github.com> * pin numba=0.57.1 to fix reinstall.sh error (#7366) Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> * Update new conversion script for converting safetensors. * Upgrade pytorch container to 23.08 (#7353) * upgrade pytorch container Signed-off-by: eharper <eharper@nvidia.com> * use mcore Signed-off-by: eharper <eharper@nvidia.com> * revert test change Signed-off-by: eharper <eharper@nvidia.com> * pleasefixme Signed-off-by: eharper <eharper@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * check for ampere Signed-off-by: eharper <eharper@nvidia.com> * comment test temporarily Signed-off-by: eharper <eharper@nvidia.com> --------- Signed-off-by: eharper <eharper@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * enable fp32 optimizer for output_layer in mcore (#7355) Signed-off-by: lhb8125 <lhb8125@gmail.com> * revert comment (#7368) Signed-off-by: eharper <eharper@nvidia.com> * Update to core 23.08 branch ToT (#7371) Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * upper bounding ptl (#7370) Signed-off-by: eharper <eharper@nvidia.com> * fix pipeline parallel inference (#7367) * fix pp inference Signed-off-by: jasonwan <jasonwan@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: jasonwan <jasonwan@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * fix for peft tied weights (#7372) Signed-off-by: arendu <adithyare@nvidia.com> * fixed trainer.strategy=auto from None. (#7369) Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> * add O2 option in gpt eval (#7358) * add O2 option in eval Signed-off-by: jasonwan <jasonwan@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add doc for O2 config Signed-off-by: jasonwan <jasonwan@nvidia.com> * add to llama inference config Signed-off-by: jasonwan <jasonwan@nvidia.com> --------- Signed-off-by: jasonwan <jasonwan@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <complex451@gmail.com> * Move model precision copy (#7336) * move cfg precision set to megatron base model Signed-off-by: Maanu Grover <maanug@nvidia.com> * remove copy from other models Signed-off-by: Maanu Grover <maanug@nvidia.com> * modify attribute not arg Signed-off-by: Maanu Grover <maanug@nvidia.com> * fix gpt model test for ptl 2.0 Signed-off-by: Maanu Grover <maanug@nvidia.com> * rename function and add docstring Signed-off-by: Maanu Grover <maanug@nvidia.com> * replace precision to dtype conditionals with func call Signed-off-by: Maanu Grover <maanug@nvidia.com> * unnecessary function and cfg reset Signed-off-by: Maanu Grover <maanug@nvidia.com> * set default value Signed-off-by: Maanu Grover <maanug@nvidia.com> * fix precision lookup in a few more places Signed-off-by: Maanu Grover <maanug@nvidia.com> * rename mapping function Signed-off-by: Maanu Grover <maanug@nvidia.com> * ununsed import Signed-off-by: Maanu Grover <maanug@nvidia.com> * save torch datatype to model Signed-off-by: Maanu Grover <maanug@nvidia.com> * set weights precision wrt amp o2 Signed-off-by: Maanu Grover <maanug@nvidia.com> * Revert "set weights precision wrt amp o2" This reverts commit 313a4bfe5eb69d771a6d2433898c0685836aef5c. Signed-off-by: Maanu Grover <maanug@nvidia.com> * revert half precision at inference attempt Signed-off-by: Maanu Grover <maanug@nvidia.com> * move autocast dtype to base model Signed-off-by: Maanu Grover <maanug@nvidia.com> * move params dtype to base model, enable fp16 O2 inf Signed-off-by: Maanu Grover <maanug@nvidia.com> * unused imports Signed-off-by: Maanu Grover <maanug@nvidia.com> --------- Signed-off-by: Maanu Grover <maanug@nvidia.com> * Fix PEFT checkpoint loading (#7388) * Fix PEFT checkpoint loading Signed-off-by: Jason Wang <jasonwan@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Jason Wang <jasonwan@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Use distributed optimizer support for multiple dtypes (#7359) * Update distopt wrapper with multiple dtype support Remove manual handling of separate FP32 optimizer. Signed-off-by: Tim Moon <tmoon@nvidia.com> * Use distopt support for contiguous buffers with multiple dtypes Signed-off-by: Tim Moon <tmoon@nvidia.com> * Fix typo Signed-off-by: Tim Moon <tmoon@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Separate distopt buckets for first GPT layer and non-overlapped params Signed-off-by: Tim Moon <tmoon@nvidia.com> * Add distopt logic for int dtypes Signed-off-by: Tim Moon <tmoon@nvidia.com> * Update Apex commit Signed-off-by: Tim Moon <tmoon@nvidia.com> * Remove unused variables Signed-off-by: Tim Moon <tmoon@nvidia.com> * Update Apex commit in README and Jenkensfile Signed-off-by: Tim Moon <tmoon@nvidia.com> * Debug Dockerfile and Jenkinsfile Signed-off-by: Tim Moon <tmoon@nvidia.com> --------- Signed-off-by: Tim Moon <tmoon@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <complex451@gmail.com> * minor fix for llama ckpt conversion script (#7387) * minor fix for llama ckpt conversion script Signed-off-by: Jason Wang <jasonwan@nvidia.com> * Update Jenkinsfile Signed-off-by: Jason Wang <jasonwan@nvidia.com> * remove fast_swiglu configuration Signed-off-by: Jason Wang <jasonwan@nvidia.com> --------- Signed-off-by: Jason Wang <jasonwan@nvidia.com> Co-authored-by: Eric Harper <complex451@gmail.com> * Fix wrong calling of librosa.get_duration() in notebook (#7376) Signed-off-by: Robin Dong <robin.k.dong@gmail.com> Co-authored-by: Somshubra Majumdar <titu1994@gmail.com> * [PATCH] PEFT import mcore (#7393) * [PATCH] PEFT import mcore Signed-off-by: Jason Wang <jasonwan@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Jason Wang <jasonwan@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * [TTS] Added a callback for logging initial data (#7384) Signed-off-by: Ante Jukić <ajukic@nvidia.com> * Update Core Commit (#7402) * Update Core Commit Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * update commit Signed-off-by: Abhinav Khattar <aklife97@gmail.com> --------- Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * Use cfg attribute in bert (#7394) * use cfg attribute instead of arg Signed-off-by: Maanu Grover <maanug@nvidia.com> * use torch_dtype in place of cfg.precision Signed-off-by: Maanu Grover <maanug@nvidia.com> * move precision copy before super constructor Signed-off-by: Maanu Grover <maanug@nvidia.com> * use trainer arg Signed-off-by: Maanu Grover <maanug@nvidia.com> --------- Signed-off-by: Maanu Grover <maanug@nvidia.com> * Add support for bias conversion in Swiglu models (#7386) * Add support for bias conversion in Swiglu models Signed-off-by: smajumdar <titu1994@gmail.com> * Add support for auto extracting tokenizer model Signed-off-by: smajumdar <titu1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add support for auto extracting tokenizer model Signed-off-by: smajumdar <titu1994@gmail.com> * Fix issue with missing tokenizer Signed-off-by: smajumdar <titu1994@gmail.com> * Refactor Signed-off-by: smajumdar <titu1994@gmail.com> * Refactor Signed-off-by: smajumdar <titu1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: smajumdar <titu1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update save_to and restore_from for dist checkpointing (#7343) * add dist ckpt to save to, in progress Signed-off-by: eharper <eharper@nvidia.com> * move dist ckpt Signed-off-by: eharper <eharper@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * clean up Signed-off-by: eharper <eharper@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update restore from, need to figure out how to initialize distributed Signed-off-by: eharper <eharper@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * launch distrib if needed when restoring dist ckpt Signed-off-by: eharper <eharper@nvidia.com> * when using mcore we can change tp pp on the fly Signed-off-by: eharper <eharper@nvidia.com> * add load_from_checkpoint support for dist ckpt Signed-off-by: eharper <eharper@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update llama convert script to save dist .nemo Signed-off-by: eharper <eharper@nvidia.com> * fix load dist ckpt Signed-off-by: jasonwan <jasonwan@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * setup TE TP groups if needed Signed-off-by: eharper <eharper@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * setup te tp groups if needed Signed-off-by: eharper <eharper@nvidia.com> * remove import Signed-off-by: eharper <eharper@nvidia.com> --------- Signed-off-by: eharper <eharper@nvidia.com> Signed-off-by: jasonwan <jasonwan@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: jasonwan <jasonwan@nvidia.com> * fix forward for with mcore=false (#7403) Signed-off-by: Jimmy Zhang <jiemingz@nvidia.com> Co-authored-by: Jimmy Zhang <jiemingz@nvidia.com> * Fix logging to remove 's/it' from progress bar in Megatron models and add train_step_timing (#7374) * Add CustomProgressBar class to exp_manager and trainer callbacks Signed-off-by: Abhishree <abhishreetm@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix the progress bar to reflect total microbatch cnt Signed-off-by: Abhishree <abhishreetm@gmail.com> * Modify CustomProgressBar class 1) Modify CustomProgressBar class to update progress bar per global_step instead of per microbatch 2) Add the callback to other megatron training/finetuning files that are not using MegatronTrainerBuilder Signed-off-by: Abhishree <abhishreetm@gmail.com> * Add CustomProgressBar callback to tuning files Signed-off-by: Abhishree <abhishreetm@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Abhishree <abhishreetm@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Set Activation Checkpointing Defaults (#7404) * Set Activation Checkpointing Defaults Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * check for None Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Abhinav Khattar <aklife97@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * make loss mask default to false (#7407) Signed-off-by: eharper <eharper@nvidia.com> * Add dummy userbuffer config files (#7408) Signed-off-by: Sangkug Lym <slym@nvidia.com> * add missing ubconf files (#7412) Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * New tutorial on Speech Data Explorer (#7405) * Added Google Colab based tutorial on Speech Data Explorer Signed-off-by: George Zelenfroynd <gzelenfroind@nvidia.com> * Update ptl training ckpt conversion script to work with dist ckpt (#7416) * update ptl convert script Signed-off-by: eharper <eharper@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * don't break legacy Signed-off-by: eharper <eharper@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: eharper <eharper@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Allow disabling sanity checking when num_sanity_val_steps=0 (#7413) * Allow disabling sanity checking when num_sanity_val_steps=0 Signed-off-by: Abhishree <abhishreetm@gmail.com> * Update num_sanity_val_steps to be a multiple of num_microbatches Signed-off-by: Abhishree Thittenamane <47577437+athitten@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Abhishree <abhishreetm@gmail.com> Signed-off-by: Abhishree Thittenamane <47577437+athitten@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Add comprehensive error messages (#7261) Signed-off-by: Anton Peganov <apeganov@nvidia.com> * check NEMO_PATH (#7418) Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * layer selection for ia3 (#7417) * layer selection for ia3 Signed-off-by: arendu <adithyare@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: arendu <adithyare@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fix missing pip package 'einops' (#7397) Signed-off-by: Robin Dong <robin.k.dong@gmail.com> * Fix failure of pyaudio in Google Colab (#7396) Signed-off-by: Robin Dong <robin.k.dong@gmail.com> * Update README.md: output_path --> output_manifest_filepath (#7442) Signed-off-by: Samuele Cornell <cornellsamuele@gmail.com> * Updating FlashAttention API to match FlashAttentionV2 * Multiple fixes for mm * Fix CI inductor issue and update to torch compile * Remove suppress error * Fix when conversion config uses fp16 and it complains about precision plugin * Fixing FAv2 API usage * Initial release of content filtering model * Added synthetic dataloader for precached and online mode * Mingyuanm/dreambooth opt * Add llama2 support in neva training * Fix sampler length * Fix all precision issues in nemo multimodal * Add rope dynamic linear scaling (#7437) * Add dynamic linear scaling Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix bug Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> --------- Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com> * Fix None dataloader issue in PTL2.0 (#7455) * Fix None dataloader issue in PTL2.0 Signed-off-by: KunalDhawan <kunaldhawan97@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updating values of self._validation_dl and self._test_dl as well Signed-off-by: KunalDhawan <kunaldhawan97@gmail.com> * updating values of self._validation_dl and self._test_dl as well Signed-off-by: KunalDhawan <kunaldhawan97@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: KunalDhawan <kunaldhawan97@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * [ASR] Confidence measure -> method renames (#7434) * measure -> method Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Add steps for document of getting dataset 'SF Bilingual Speech' (#7378) * Add steps for document of getting dataset 'SF Bilingual Speech' Signed-off-by: Robin Dong <robin.k.dong@gmail.com> * Update datasets.rst added a link from a tutorial demonstrating detailed data prep steps. Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> --------- Signed-off-by: Robin Dong <robin.k.dong@gmail.com> Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> * RNN-T confidence and alignment bugfix (#7381) * new frame_confidence and alignments lists are now always created after the while loop Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com> * tests added Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com> --------- Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com> * Fix resume from checkpoint in exp_manager (#7424) (#7426) Signed-off-by: Abhishree <abhishreetm@gmail.com> Co-authored-by: Abhishree Thittenamane <47577437+athitten@users.noreply.github.com> Co-authored-by: Eric Harper <complex451@gmail.com> * Fix checking of cuda/cpu device for inputs of Decoder (#7444) * Fix checking of cuda/cpu device for inputs of Decoder Signed-off-by: Robin Dong <robin.k.dong@gmail.com> * Update tacotron2.py Signed-off-by: Jason <jasoli@nvidia.com> --------- Signed-off-by: Robin Dong <robin.k.dong@gmail.com> Signed-off-by: Jason <jasoli@nvidia.com> Co-authored-by: Jason <jasoli@nvidia.com> * Fix failure of ljspeech's get_data.py (#7430) * Fix failure of ljspeech's get_data.py Signed-off-by: Robin Dong <robin.k.dong@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Robin Dong <robin.k.dong@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * [TTS] Fix audio codec type checks (#7373) * [TTS] Fix audio codec type checks Signed-off-by: Ryan <rlangman@nvidia.com> * [TTS] Fix audio codec tests Signed-off-by: Ryan <rlangman@nvidia.com> --------- Signed-off-by: Ryan <rlangman@nvidia.com> * [TTS] Add dataset to path of logged artifacts (#7462) * [TTS] Add dataset to path of logged artifacts Signed-off-by: Ryan <rlangman@nvidia.com> * [TTS] Revert axis name back to Audio Frames Signed-off-by: Ryan <rlangman@nvidia.com> --------- Signed-off-by: Ryan <rlangman@nvidia.com> * Fix sft dataset truncation (#7464) * Add fix Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> --------- Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Automatic Lip Reading Recognition (ALR) - ASR/CV (Visual ASR) (#7330) * striding_conv1d_k5 and dw_striding_conv1d_k5 subsampling Signed-off-by: mburchi <maxime.burchi@gmail.com> * transpose conv1d inputs Signed-off-by: mburchi <maxime.burchi@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: mburchi <maxime.burchi@gmail.com> * Update subsampling.py change striding_conv1d_k5 to striding_conv1d Signed-off-by: Maxime Burchi <60737204+burchim@users.noreply.github.com> * cv branch Signed-off-by: mburchi <maxime.burchi@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * video manifest Signed-off-by: mburchi <maxime.burchi@gmail.com> * add collection classes Signed-off-by: mburchi <maxime.burchi@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add test_step_outputs Signed-off-by: mburchi <maxime.burchi@gmail.com> * correct manifest bug when having only audio or only videos Signed-off-by: mburchi <maxime.burchi@gmail.com> * correct manifest bug when having only audio or only videos Signed-off-by: mburchi <maxime.burchi@gmail.com> * clean references Signed-off-by: mburchi <maxime.burchi@gmail.com> * freeze unfreeze transcribe cv models Signed-off-by: mburchi <maxime.burchi@gmail.com> * correct manifest get_full_path bug Signed-off-by: mburchi <maxime.burchi@gmail.com> * update for PR Signed-off-by: mburchi <maxime.burchi@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * guard torchvision Signed-off-by: mburchi <maxime.burchi@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update nemo/collections/cv/data/video_to_text_dataset.py Co-authored-by: Igor Gitman <igor.a.gitman@gmail.com> Signed-off-by: Maxime Burchi <60737204+burchim@users.noreply.github.com> * _video_speech_collate_fn in cv/data/video_to_text.py Signed-off-by: mburchi <maxime.burchi@gmail.com> * add self.out = None to asr subsampling Signed-off-by: mburchi <maxime.burchi@gmail.com> * Update nemo/collections/cv/data/video_to_text_dataset.py Co-authored-by: Igor Gitman <igor.a.gitman@gmail.com> Signed-off-by: Maxime Burchi <60737204+burchim@users.noreply.github.com> * cv -> multimodal/speech_cv branch Signed-off-by: mburchi <maxime.burchi@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: mburchi <maxime.burchi@gmail.com> Signed-off-by: Maxime Burchi <60737204+burchim@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Igor Gitman <igor.a.gitman@gmail.com> * HF StarCoder to NeMo conversion script (#7421) * Script to convert HF StarCoder checkpoint to NeMo Signed-off-by: Jan Lasek <janek.lasek@gmail.com> * StarCoder conversion test Signed-off-by: Jan Lasek <janek.lasek@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Jan Lasek <janek.lasek@gmail.com> * Fix test Signed-off-by: Jan Lasek <janek.lasek@gmail.com> * Catch up with save_to changes Signed-off-by: Jan Lasek <janek.lasek@gmail.com> * Don't abbreviate args for clarity Signed-off-by: Jan Lasek <janek.lasek@gmail.com> * Configurable precision: BF16 vs FP32 Signed-off-by: Jan Lasek <janek.lasek@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Jan Lasek <janek.lasek@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * fix bug when loading dist ckpt in peft (#7452) Signed-off-by: Hongbin Liu <hongbinl@nvidia.com> Co-authored-by: Hongbin Liu <hongbinl@nvidia.com> * Fix adding positional embeddings in-place in transformer module (#7440) Signed-off-by: Tamerlan Tabolov <tktabolov@gmail.com> Co-authored-by: Cheng-Ping Hsieh <37269846+hsiehjackson@users.noreply.github.com> * Fix (#7478) Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * add sleep (#7498) (#7499) * add sleep * add sleep onto config instead * add comment --------- Signed-off-by: Gerald Shen <geshen@nvidia.com> Co-authored-by: Gerald Shen <119401249+gshennvm@users.noreply.github.com> * Fix exp manager check for sleep (#7503) (#7504) Signed-off-by: smajumdar <titu1994@gmail.com> Co-authored-by: Somshubra Majumdar <titu1994@gmail.com> * bugfix: trainer.accelerator=auto from None. (#7492) (#7493) Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> * [doc] fix broken link (#7481) Signed-off-by: Stas Bekman <stas00@users.noreply.github.com> * [TTS] Read audio as int32 to avoid flac read errors (#7477) * [TTS] Read audio as int32 to avoid flac read errors Signed-off-by: Ryan <rlangman@nvidia.com> * [TTS] Add comment about read failures Signed-off-by: Ryan <rlangman@nvidia.com> --------- Signed-off-by: Ryan <rlangman@nvidia.com> * Add dataset 'AISHELL-3' from OpenSLR for training mandarin TTS (#7409) * Add dataset 'AISHELL-3' from OpenSLR for training mandarin TTS * Train 'AISHELL-3' dataset with multi-speakers Signed-off-by: Robin Dong <robin.k.dong@gmail.com> * Update get_data.py update copyright header Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> * Update get_data.py added a disclaimer Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add new configuration file for AISHELL3 with multispeaker of fastpitch Signed-off-by: Robin Dong <robin.k.dong@gmail.com> --------- Signed-off-by: Robin Dong <robin.k.dong@gmail.com> Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> * dllogger - log on rank 0 only (#7513) Signed-off-by: Stas Bekman <stas00@users.noreply.github.com> * Fix TTS FastPitch tutorial (#7494) (#7516) * Fix --------- Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> Co-authored-by: Cheng-Ping Hsieh <37269846+hsiehjackson@users.noreply.github.com> * Fix get_dist() tensor dimension (#7506) (#7515) Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com> Co-authored-by: Jocelyn <jocelynh@nvidia.com> * bugfix: specify trainer.strategy=auto when devices=1 (#7509) (#7512) Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> * fix (#7511) Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * [TTS] Fix FastPitch data prep tutorial (#7524) Signed-off-by: Ryan <rlangman@nvidia.com> * add italian tokenization (#7486) * add italian tokenization Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add more ipa lexicon it Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix error deletion Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add test Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Replace None strategy with auto in tutorial notebooks (#7521) (#7527) Signed-off-by: Abhishree <abhishreetm@gmail.com> Co-authored-by: Abhishree Thittenamane <47577437+athitten@users.noreply.github.com> * unpin setuptools (#7534) (#7535) Signed-off-by: fayejf <36722593+fayejf@users.noreply.github.com> Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com> * remove auto generated examples (#7510) * explicitly remove autogenerated examples for data parallel evaluation Signed-off-by: arendu <adithyare@nvidia.com> * mark autogenrated and remove it for test Signed-off-by: arendu <adithyare@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: arendu <adithyare@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Add the `strategy` argument to `MegatronGPTModel.generate()` (#7264) It is passed as an explicit argument rather than through `**strategy_args` so as to ensure someone cannot accidentally pass other arguments that would end up being ignored. It is a keyword-only argument to ensure that if in the future we want to update the signature to `**strategy_args`, we can do it without breaking code. Signed-off-by: Olivier Delalleau <507137+odelalleau@users.noreply.github.com> * Fix PTL2.0 related ASR bugs in r1.21.0: Val metrics logging, None dataloader issue (#7531) (#7533) * fix none dataloader issue ptl2 * ptl2.0 logging fixes for rnnt_models --------- Signed-off-by: KunalDhawan <kunaldhawan97@gmail.com> Co-authored-by: Kunal Dhawan <kunaldhawan97@gmail.com> Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com> * gpus -> devices (#7542) (#7545) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com> * Update FFMPEG version to fix issue with torchaudio (#7551) (#7553) Signed-off-by: smajumdar <titu1994@gmail.com> Co-authored-by: Somshubra Majumdar <titu1994@gmail.com> * PEFT GPT & T5 Refactor (#7308) * initial implementation of add_adapters API * correct type hint * Add config in add_adapters for save and load (@author bobchen) * Remove AdapterConfig to avoid import error * Add AdaterConfig back and move adaptermixin to sft model * Add NLPSaveRestoreConnector as default in NLPModel.restore_from * Add restore_from_nemo_with_adapter and test script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rename t5 file and classes to be consistent with GPT * add t5 sft dataset * add support for single-file format with T5SFTDataset * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Various small changes to make T5 SFT work like GPT SFT * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add adapter evaluation test script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add MultiAdaterConfig for ia3 and fix builder issue * Make ptuning for T5SFTModel work using mixin * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add IA3_Adapter for AdapterName * Add adapter name for ptuning and attention adapter * Make test script GPT/T5 agnostic * Add layer selection feature * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Integrate adapter name and config * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt peft tuning script to new API * add t5 peft tuning script with new API * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix IA3 layer selection issue * Override state_dict on SFT model instead of mixin * Add load adapter by adapter config * move peft config map away from example script * auto get config from nemo adapter * Move PEFTConfig to new file * fix ckpt save/load for t5 * name change: add_adapters -> add_adapter * variable name change * update t5 script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix t5 issues * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add weight tying * update gpt tuning script * PEFT-API proposal * Fix according to comments * update tuning scripts * move merge_cfg_with to mixin class since it applies to both gpt and t5 and requires the model class for restore * Add mcore_gpt support for NLPAdapterMixin * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix typo * variable name change to distinguish "peft" and "adapter" * override `load_adapters` to support `add_adapter` name change * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update tuning and eval script for adapter save/load * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add Ptuning on first stage only * add lora tutorial for review * Fix layer selection for mcore * add landing page * fix resume training Signed-off-by: jasonwan <jasonwan@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add mcore condition in sharded_state_dict to make sft work * Update lora_tutorial.md First edit of this file for PEFT documentation for NeMO Signed-off-by: hkelly33 <58792115+hkelly33@users.noreply.github.com> * rename Adapter to AttentionAdapter to avoid confusion in doc * Change load_adapters to load .nemo * add quick start guide * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add load_adapters with .ckpt * Remove setup_complete changes in load_adapters * update landing page * remove typo * Updated quick_start.md per Chen Cui Signed-off-by: hkelly33 <58792115+hkelly33@users.noreply.github.com> * Add inference config merger and tutorial * Add doc string for NLPAdapterModelMixin and deprecated warning on MegatronGPTPEFTModel * add suppor…
* add random samlpes (num_segments) with segment_duration for get_label(filename, segment_duration = 60*6, num_segments) --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
* Fix flash decoding precision Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* [tutorial] fixed missing RIR scripts file. (#8257) * add values to en tts dict (#7879) * Add Bert HF checkpoint converter (#8088) * Add Bert HF checkpoint converter * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names * Update build_transformer_config in Bert --------- * initial placeholder * add to intro/index.rst * initial content update * add diff images size * minor fixes * minor language change * clean changes --------- Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com> Signed-off-by: Huiying Li <huiyingl@nvidia.com> Signed-off-by: Huiying Li <willwin.lee@gmail.com> Signed-off-by: Chen Cui <chcui@nvidia.com> Co-authored-by: Huiying <willwin.lee@gmail.com> Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <bobchen@nvidia.com> Co-authored-by: Huiying Li <huiyingl@nvidia.com> Co-authored-by: Chen Cui <chcui@nvidia.com>
* Script for estimating Lhotse dynamic duration buckets Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Improve documentation Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Apply suggestions from code review Signed-off-by: Piotr Żelasko <petezor@gmail.com> --------- Signed-off-by: Piotr Żelasko <petezor@gmail.com>
* change default decoding to beam=1 and length up to decoder max len Signed-off-by: stevehuang52 <heh@nvidia.com> * add Canary support for return_hypotheses=True Signed-off-by: stevehuang52 <heh@nvidia.com> * change len_pen default to 0 Signed-off-by: stevehuang52 <heh@nvidia.com> * fix for nbest hypotheses Signed-off-by: stevehuang52 <heh@nvidia.com> * fix for return best hypo Signed-off-by: stevehuang52 <heh@nvidia.com> --------- Signed-off-by: stevehuang52 <heh@nvidia.com>
* [TTS] Add modules for mel spectrogram codec Signed-off-by: Ryan <rlangman@nvidia.com> * [TTS] Add mel band validation Signed-off-by: Ryan <rlangman@nvidia.com> * [TTS] Add fullband mel encoder and more documentation Signed-off-by: Ryan <rlangman@nvidia.com> --------- Signed-off-by: Ryan <rlangman@nvidia.com>
…nightly Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: stevehuang52 <heh@nvidia.com>
…DIA/NeMo into heh/modular_speechlm_nightly
github-actions
bot
added
core
Changes to NeMo Core
TTS
ASR
NLP
Speaker Tasks
CI
common
labels
Feb 22, 2024
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
clean up for specchllm