Allow users to specify custom master nodes for networking utils#339

Merged
kmontemayor2-sc merged 3 commits into main from kmonte/custom-main-pg-rank
Oct 1, 2025

@kmontemayor2-sc
Collaborator

Scope of work done

We do this so that, in server/client mode, we can have a "training" process group (pg) for both server and client, while each server and each client still communicates with its respective leader.

For example, this lets us determine the IP/port of the "training cluster leader" (global rank = 2).

image
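
To illustrate the idea of resolving an endpoint on a leader whose rank is not 0, here is a minimal sketch. It is not the code in `networking.py`: the names `find_free_port`, `resolve_leader_endpoint`, and the injected `broadcast` callable are hypothetical, and the `127.0.0.1` default host is a placeholder. In a real job the `broadcast` argument would wrap a collective such as `torch.distributed.broadcast_object_list` with `src` set to the leader's rank.

```python
import socket
from typing import Callable, Optional, Tuple

Endpoint = Tuple[str, int]  # (ip, port)


def find_free_port() -> int:
    """Ask the OS for a currently unused TCP port on this host."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("", 0))  # port 0 lets the OS pick a free port
        return sock.getsockname()[1]


def resolve_leader_endpoint(
    my_rank: int,
    leader_rank: int,
    broadcast: Callable[[Optional[Endpoint], int], Endpoint],
    host: str = "127.0.0.1",  # placeholder; a real leader would report its own IP
) -> Endpoint:
    """Return the leader's (ip, port) on every rank.

    Only the rank equal to `leader_rank` resolves an endpoint locally.
    Every other rank passes None and receives the leader's value from
    `broadcast(value, src_rank)`, a hypothetical stand-in for a collective.
    """
    local = (host, find_free_port()) if my_rank == leader_rank else None
    return broadcast(local, leader_rank)
```

The only point of the sketch is that the source rank is a parameter rather than a hard-coded 0, so a "training cluster leader" at global rank 2 can publish its own endpoint to its peers.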

Where is the documentation for this feature?: N/A

Did you add automated tests or write a test plan?

Updated Changelog.md? NO

Ready for code review?: NO

@kmontemayor2-sc
Collaborator Author

/unit_test

@github-actions
Contributor

github-actions Bot commented Sep 26, 2025

GiGL Automation

@ 18:35:57UTC : 🔄 Unit Test started.

@ 19:17:50UTC : ✅ Workflow completed successfully.

Collaborator

@svij-sc svij-sc left a comment

btw, how are global ranks determined?

Comment thread python/gigl/distributed/utils/networking.py
@kmontemayor2-sc
Collaborator Author

btw, how are global ranks determined?

For server/client mode, I was going to compute them before cluster launch and set them as env vars.

In practice, though, the layout will be [{storage cluster}, {compute cluster}], so storage master = 0 and compute master = storage size.
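
The rank layout described above can be sketched as follows. This is an illustrative assumption, not code from the PR: `ClusterLayout` and its members are hypothetical names, and the sizes in the usage note are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterLayout:
    """Global ranks laid out as [storage ranks..., compute ranks...]."""

    storage_size: int
    compute_size: int

    @property
    def world_size(self) -> int:
        return self.storage_size + self.compute_size

    @property
    def storage_leader_rank(self) -> int:
        # Storage ranks come first, so the storage leader is global rank 0.
        return 0

    @property
    def compute_leader_rank(self) -> int:
        # Compute ranks start right after the storage ranks.
        return self.storage_size

    def leader_for(self, global_rank: int) -> int:
        """Return the leader rank of the cluster that owns `global_rank`."""
        if not 0 <= global_rank < self.world_size:
            raise ValueError(f"rank {global_rank} outside [0, {self.world_size})")
        if global_rank < self.storage_size:
            return self.storage_leader_rank
        return self.compute_leader_rank
```

With a hypothetical storage cluster of 2 nodes and a compute cluster of 3, `ClusterLayout(2, 3).compute_leader_rank` is 2, and ranks 2 through 4 all map to that leader. Values like these could be computed before launch and exported as env vars, as mentioned above.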

Collaborator

@svij-sc svij-sc left a comment

LGTM given comment is addressed.

Comment thread python/gigl/distributed/utils/networking.py Outdated
Comment thread python/gigl/distributed/utils/networking.py Outdated
@kmontemayor2-sc kmontemayor2-sc added this pull request to the merge queue Oct 1, 2025
Merged via the queue into main with commit 57aa093 Oct 1, 2025
4 checks passed
@kmontemayor2-sc kmontemayor2-sc deleted the kmonte/custom-main-pg-rank branch October 1, 2025 01:21

4 participants