Load lang id any2any training mengruw #8961
Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: smajumdar <titu1994@gmail.com>
* [Temp] VP Fixes Signed-off-by: smajumdar <titu1994@gmail.com> * Revert logging Signed-off-by: smajumdar <titu1994@gmail.com> --------- Signed-off-by: smajumdar <titu1994@gmail.com> (cherry picked from commit b6f46a0)
Signed-off-by: hsiehjackson <c2hsieh@ucsd.edu>
* check for first or last stage Signed-off-by: ericharper <complex451@gmail.com> * remove redundant check Signed-off-by: ericharper <complex451@gmail.com> * fix typo Signed-off-by: ericharper <complex451@gmail.com> * add map_location Signed-off-by: ericharper <complex451@gmail.com> --------- Signed-off-by: ericharper <complex451@gmail.com>
* Bug fix to restore act ckpt Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Bug fix to reset sequence parallelism Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com> * Update seq par reset/restore Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com> * Add nested loop Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com> --------- Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>
…ng (NVIDIA#6744) * fix checkpointed forward and add test for full activation checkpointing Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * add method Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * add method Signed-off-by: Abhinav Khattar <aklife97@gmail.com> --------- Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
Signed-off-by: smajumdar <titu1994@gmail.com>
* add call to p2p overlap Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * update Jenkins for test Signed-off-by: Abhinav Khattar <aklife97@gmail.com> --------- Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
* fix get param Signed-off-by: ericharper <complex451@gmail.com> * change name Signed-off-by: ericharper <complex451@gmail.com> --------- Signed-off-by: ericharper <complex451@gmail.com>
* initial POC for LDDL Bert * Finish LDDL POC * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * address comments * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix merge head * resolving merge * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for val/test loaders * change to new LDDL class + add winding * fix logging level * fix winding * test fix * fixes to winding * add file system * add prepemption optimizations * more logging * more prints * better logging * asfsf * add barrier * removing prints * working with mb lddl loader * final changes * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update requirements file with LDDL Signed-off-by: wdykas <wdykas@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert adding to requirements --------- Signed-off-by: wdykas <wdykas@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <complex451@gmail.com>
NVIDIA#6740) * Construct FP8 amax reduction group Signed-off-by: Tim Moon <tmoon@nvidia.com> * update core for CI Signed-off-by: Abhinav Khattar <aklife97@gmail.com> --------- Signed-off-by: Tim Moon <tmoon@nvidia.com> Signed-off-by: Abhinav Khattar <aklife97@gmail.com> Co-authored-by: Abhinav Khattar <aklife97@gmail.com>
…#6780) * add interfaces for tp_communication overlap [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Interface to provide custom userbuffer communicator settings by yaml file [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Construct MPI process group for userbuffers support Signed-off-by: Tim Moon <tmoon@nvidia.com> --------- Signed-off-by: Tim Moon <tmoon@nvidia.com> Co-authored-by: Tim Moon <tmoon@nvidia.com> Co-authored-by: Abhinav Khattar <aklife97@gmail.com>
* Fix TTS adapter tutorial Signed-off-by: hsiehjackson <c2hsieh@ucsd.edu> * Fix version Signed-off-by: hsiehjackson <c2hsieh@ucsd.edu> --------- Signed-off-by: hsiehjackson <c2hsieh@ucsd.edu>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>
Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
* add trainer.validate example Signed-off-by: ericharper <complex451@gmail.com> * clean up white space Signed-off-by: ericharper <complex451@gmail.com> * add mbs and gbs to the config Signed-off-by: ericharper <complex451@gmail.com> --------- Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Yi Dong <yidong@nvidia.com>
Signed-off-by: Yi Dong <yidong@nvidia.com>
* add model pretraining and customization classes Signed-off-by: ericharper <complex451@gmail.com> * fix Signed-off-by: ericharper <complex451@gmail.com> * test width Signed-off-by: ericharper <complex451@gmail.com> * increase middle pane width Signed-off-by: ericharper <complex451@gmail.com> * add modules and datasets Signed-off-by: ericharper <complex451@gmail.com> * remove global in t5 dataset s and fix formatting in megatron base model Signed-off-by: ericharper <complex451@gmail.com> --------- Signed-off-by: ericharper <complex451@gmail.com>
* Apply garbage collection inverval to validation steps Signed-off-by: Sangkug Lym <slym@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Sangkug Lym <slym@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: ericharper <complex451@gmail.com>
…ds from .yaml file
Hi @Davood-M, I have proposed a fix for the language ID mismatch issue that occurs when training an any2any NMT model. Could you please review it and merge it into the appropriate branch? We need this code for the April release. Thank you in advance!
This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.
This PR was closed because it has been inactive for 7 days since being marked as stale.
What does this PR do?
Add a one line overview of what this PR aims to accomplish.
Collection: [Note which collection this PR will affect]
Changelog
Usage
# Add a code snippet demonstrating how to use this
Jenkins CI
To run Jenkins, a NeMo User with write access must comment "jenkins" on the PR.
Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items you can still open "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.
Additional Information