-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug - Fix issue that the root mpi rank may not be the first in the hostfile #270
Conversation
Codecov Report
@@ Coverage Diff @@
## release/0.4 #270 +/- ##
===============================================
+ Coverage 87.80% 87.82% +0.01%
===============================================
Files 75 75
Lines 4364 4368 +4
===============================================
+ Hits 3832 3836 +4
Misses 532 532
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
Co-authored-by: Yifan Xiong <yifan.xiong@microsoft.com>
…stfile (#270) **Description** Launch mpi on the sorted first host in the hostfile.
__Description__ Cherry-pick bug fixes from v0.4.0 to main. __Major Revisions__ * Bug - Fix issues for Ansible and benchmarks (#267) * Tests - Refine test cases for microbenchmark (#268) * Bug - Build openmpi with ucx support in rocm dockerfiles (#269) * Benchmarks: Fix Bug - Fix fio build issue (#272) * Docs - Unify metric and add doc for cublas and cudnn functions (#271) * Monitor: Revision - Add 'monitor/' prefix to monitor metrics in result summary (#274) * Bug - Fix bug of detecting if gpu_index is none (#275) * Bug - Fix bugs in data diagnosis (#273) * Bug - Fix issue that the root mpi rank may not be the first in the hostfile (#270) * Benchmarks: Configuration - Update inference and network benchmarks in configs (#276) * Docs - Upgrade version and release note (#277) Co-authored-by: Yuting Jiang <v-yutjiang@microsoft.com>
Description
Launch mpi on the sorted first host in the hostfile.