Fix missing parallelisms #9725

Merged: ericharper merged 2 commits into r2.0.0rc1 from maanug/fix-missing-parallelisms on Jul 17, 2024

Conversation

maanug-nv
Collaborator

@maanug-nv maanug-nv commented Jul 13, 2024

What does this PR do?

The context parallel (CP) and expert parallel (EP) config values were not being passed to the model parallel initialization methods. This PR passes them through.
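
To illustrate the kind of bug described above, here is a minimal sketch, not the actual NeMo code. `ParallelConfig`, `init_model_parallel`, `init_before_fix`, and `init_after_fix` are hypothetical stand-ins, and the field names are assumptions chosen for illustration. The point is that an initializer with default arguments silently falls back to a size of 1 when the caller forgets to forward a value.

```python
from dataclasses import dataclass


@dataclass
class ParallelConfig:
    # Hypothetical config; field names are assumptions for illustration.
    tensor_model_parallel_size: int = 1
    pipeline_model_parallel_size: int = 1
    context_parallel_size: int = 1
    expert_model_parallel_size: int = 1


def init_model_parallel(
    tensor_model_parallel_size=1,
    pipeline_model_parallel_size=1,
    context_parallel_size=1,
    expert_model_parallel_size=1,
):
    """Stand-in for a model parallel initializer; returns what it received."""
    return {
        "tensor_model_parallel_size": tensor_model_parallel_size,
        "pipeline_model_parallel_size": pipeline_model_parallel_size,
        "context_parallel_size": context_parallel_size,
        "expert_model_parallel_size": expert_model_parallel_size,
    }


def init_before_fix(cfg):
    # CP and EP are not forwarded, so the initializer uses its defaults.
    return init_model_parallel(
        tensor_model_parallel_size=cfg.tensor_model_parallel_size,
        pipeline_model_parallel_size=cfg.pipeline_model_parallel_size,
    )


def init_after_fix(cfg):
    # All four sizes are forwarded from the config.
    return init_model_parallel(
        tensor_model_parallel_size=cfg.tensor_model_parallel_size,
        pipeline_model_parallel_size=cfg.pipeline_model_parallel_size,
        context_parallel_size=cfg.context_parallel_size,
        expert_model_parallel_size=cfg.expert_model_parallel_size,
    )


cfg = ParallelConfig(context_parallel_size=2, expert_model_parallel_size=4)
before = init_before_fix(cfg)
after = init_after_fix(cfg)
```

In this sketch, `before` reports a context parallel size of 1 even though the config asks for 2, while `after` reflects the configured values.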

Collection: [llm]

Changelog

  • Add specific line by line info of high level changes in this PR.

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 

GitHub Actions CI

The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.

The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI remove and add the label again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items you can still open "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)

marcromeyn
marcromeyn previously approved these changes Jul 13, 2024
@maanug-nv maanug-nv changed the base branch from main to r2.0.0rc1 July 14, 2024 23:13
@maanug-nv maanug-nv dismissed marcromeyn’s stale review July 14, 2024 23:13

The base branch was changed.

@maanug-nv maanug-nv force-pushed the maanug/fix-missing-parallelisms branch from b3528ea to fc57e53 Compare July 14, 2024 23:15
@github-actions github-actions bot added the core, NLP, CI, and Multi Modal labels Jul 14, 2024
@maanug-nv maanug-nv force-pushed the maanug/fix-missing-parallelisms branch from fc57e53 to 47927b5 Compare July 14, 2024 23:18
Signed-off-by: Maanu Grover <maanug@nvidia.com>
@maanug-nv maanug-nv force-pushed the maanug/fix-missing-parallelisms branch from 47927b5 to 999efcc Compare July 14, 2024 23:22
@github-actions github-actions bot removed the core, NLP, CI, and Multi Modal labels Jul 14, 2024
@maanug-nv
Collaborator Author

fixed base branch and target branch

ericharper
ericharper previously approved these changes Jul 15, 2024
Collaborator

@ericharper ericharper left a comment

LGTM. Thanks!

Signed-off-by: Maanu Grover <maanug@nvidia.com>
@maanug-nv maanug-nv force-pushed the maanug/fix-missing-parallelisms branch from 48c5346 to 047421f Compare July 15, 2024 19:42
Collaborator

@ericharper ericharper left a comment

Re-approving.

@ericharper ericharper merged commit 5ff4f0b into r2.0.0rc1 Jul 17, 2024
118 checks passed
@ericharper ericharper deleted the maanug/fix-missing-parallelisms branch July 17, 2024 00:45
github-actions bot pushed a commit that referenced this pull request Jul 17, 2024
* pass cp and ep cfg to init mp

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* update test

Signed-off-by: Maanu Grover <maanug@nvidia.com>

---------

Signed-off-by: Maanu Grover <maanug@nvidia.com>
@ko3n1g ko3n1g mentioned this pull request Jul 18, 2024
2 tasks
@akoumpa
Member

akoumpa commented Jul 23, 2024

I'd like to add this to main, but instead of copying each value manually, use a function that copies all config attributes ending in _parallel_size. I made main...akoumparouli/copy_parallel_size for this purpose.
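
A suffix-based copy along these lines could look like the sketch below. This is not the code on the referenced branch: `ModelConfig`, `copy_parallel_sizes`, and the field names are hypothetical and only assume that the config exposes its parallel sizes as plain attributes.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class ModelConfig:
    # Hypothetical config; the field names are assumptions for illustration.
    tensor_model_parallel_size: int = 1
    pipeline_model_parallel_size: int = 1
    context_parallel_size: int = 1
    expert_model_parallel_size: int = 1
    seq_length: int = 2048  # unrelated field that must not be copied


def copy_parallel_sizes(src, dst, suffix="_parallel_size"):
    """Copy every attribute of src whose name ends with suffix onto dst."""
    copied = {}
    for name, value in vars(src).items():
        if name.endswith(suffix):
            setattr(dst, name, value)
            copied[name] = value
    return copied


src = ModelConfig(context_parallel_size=2, expert_model_parallel_size=4)
dst = SimpleNamespace()
copied = copy_parallel_sizes(src, dst)
print(sorted(copied))
```

The appeal of this design is that a parallelism added to the config later is picked up without touching the copy code, as long as its attribute follows the same naming pattern. The trade-off is that a misnamed attribute is skipped silently.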

akoumpa pushed a commit that referenced this pull request Jul 25, 2024
* pass cp and ep cfg to init mp

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* update test

Signed-off-by: Maanu Grover <maanug@nvidia.com>

---------

Signed-off-by: Maanu Grover <maanug@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
akoumpa pushed a commit that referenced this pull request Jul 25, 2024
* pass cp and ep cfg to init mp

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* update test

Signed-off-by: Maanu Grover <maanug@nvidia.com>

---------

Signed-off-by: Maanu Grover <maanug@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
akoumpa pushed a commit that referenced this pull request Jul 26, 2024
* pass cp and ep cfg to init mp

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* update test

Signed-off-by: Maanu Grover <maanug@nvidia.com>

---------

Signed-off-by: Maanu Grover <maanug@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
akoumpa pushed a commit that referenced this pull request Jul 26, 2024
* pass cp and ep cfg to init mp



* update test



---------

Signed-off-by: Maanu Grover <maanug@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Co-authored-by: Maanu Grover <109391026+maanug-nv@users.noreply.github.com>
BoxiangW pushed a commit to BoxiangW/NeMo that referenced this pull request Jul 30, 2024
* pass cp and ep cfg to init mp

* update test

---------

Signed-off-by: Maanu Grover <maanug@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Co-authored-by: Maanu Grover <109391026+maanug-nv@users.noreply.github.com>
Signed-off-by: Boxiang Wang <boxiangw@nvidia.com>
xuanzic pushed a commit to xuanzic/NeMo that referenced this pull request Aug 1, 2024
* pass cp and ep cfg to init mp

* update test

---------

Signed-off-by: Maanu Grover <maanug@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Co-authored-by: Maanu Grover <109391026+maanug-nv@users.noreply.github.com>
Signed-off-by: Vivian Chen <xuanzic@example.com>
monica-sekoyan pushed a commit that referenced this pull request Oct 14, 2024
* pass cp and ep cfg to init mp



* update test



---------

Signed-off-by: Maanu Grover <maanug@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Co-authored-by: Maanu Grover <109391026+maanug-nv@users.noreply.github.com>