launcher/multinode_runner.py: mapping env variables #3372

Merged: 8 commits, May 5, 2023

YizhouZ (Contributor) commented Apr 25, 2023

launcher/multinode_runner.py: map env variables in the run command for the MPICH runner.

Previously, launching DeepSpeed with MPICH did not set the env variables that DeepSpeed reads, such as "RANK", "LOCAL_RANK", "WORLD_SIZE" and "LOCAL_SIZE". Under MPICH they appear under different names, such as "PMI_RANK".

This change therefore sets them through the -genv / -env arguments of mpirun. "-genv" sets global env variables shared by all ranks, such as "WORLD_SIZE", while "-env" sets rank-specific env variables, such as "RANK" and "LOCAL_RANK".

To demonstrate the change, below is an example of the resulting run command, using only 2 ranks:
[INFO] [runner.py:540:main] cmd = mpirun -genv PYTHONSTARTUP=/.../pythonstart -genv PYTHONPATH=/../ -genv MASTER_ADDR xxx -genv MASTER_PORT xxx -genv WORLD_SIZE 2 -genv LOCAL_SIZE 2 -n 1 -host xxx -env RANK 0 -env LOCAL_RANK 0 /../bin/python -u pretrain_gpt.py ... : -n 1 -host xx -env RANK 1 -env LOCAL_RANK 1 /../bin/python -u pretrain_gpt.py ...
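
For readers who want to see how a command of this shape can be assembled, here is a minimal Python sketch. It is not the code from this PR: the function name `build_mpich_cmd` and its parameters (`hosts`, `slots_per_host`, `user_cmd`, `master_addr`, `master_port`) are hypothetical, and the host name and port in the usage lines are placeholders. It only assumes what the description above states: global variables go through `-genv`, per-rank variables go through `-env`, and each rank gets its own `-n 1 -host ...` segment separated by `:`.

```python
import shlex


def build_mpich_cmd(hosts, slots_per_host, user_cmd, master_addr, master_port):
    """Hypothetical helper: assemble an mpirun command with one segment per rank."""
    world_size = len(hosts) * slots_per_host

    # Variables shared by every rank are passed once with -genv.
    cmd = [
        "mpirun",
        "-genv", "MASTER_ADDR", str(master_addr),
        "-genv", "MASTER_PORT", str(master_port),
        "-genv", "WORLD_SIZE", str(world_size),
        "-genv", "LOCAL_SIZE", str(slots_per_host),
    ]

    rank = 0
    for host in hosts:
        for local_rank in range(slots_per_host):
            # Segments after the first are separated by ':'.
            if rank > 0:
                cmd.append(":")
            # Rank-specific variables are passed with -env inside the segment.
            cmd += [
                "-n", "1",
                "-host", host,
                "-env", "RANK", str(rank),
                "-env", "LOCAL_RANK", str(local_rank),
            ]
            cmd += list(user_cmd)
            rank += 1
    return cmd


# Usage: two ranks on a single placeholder host, mirroring the example above.
example = build_mpich_cmd(
    hosts=["node0"],
    slots_per_host=2,
    user_cmd=["python", "-u", "pretrain_gpt.py"],
    master_addr="node0",
    master_port=29500,
)
print(shlex.join(example))
```

Building the command as a list of arguments rather than one string avoids quoting problems when values contain spaces; `shlex.join` is used only to print it in a readable form.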

YizhouZ changed the title from "launcher/multinode_runner.py: mapping env variables in running cmd fo…" to "launcher/multinode_runner.py: mapping env variables" on Apr 25, 2023
YizhouZ (Contributor, Author) commented Apr 26, 2023

@loadams Hi, could you please help me trigger the CI? My CLA was reviewed and passed today. Thank you!

tjruwase merged commit 4e886f0 into microsoft:master on May 5, 2023
18 checks passed
YizhouZ deleted the yizhou/mpich_runner branch on November 8, 2023 05:46

3 participants