2507 Add Horovod unit tests #2519
Signed-off-by: Nic Ma <nma@nvidia.com>
Hi @IsaacYangSLA and @wyli, I am trying to add distributed tests based on the Horovod environment, but I have a question here: Thanks in advance.
/black
Signed-off-by: monai-bot <monai.miccai2019@gmail.com>
Signed-off-by: Nic Ma <nma@nvidia.com>
Signed-off-by: Nic Ma <nma@nvidia.com>
Hi @wyli @IsaacYangSLA, alternatively, let's just add this Horovod test as one to run manually and locally for now, then integrate it into CI later when we have more Horovod tests, as we did for the PyTorch distributed tests before? Thanks.
To run this new Horovod test locally, just follow:
Thanks.

@IsaacYangSLA could you please help review?
Fixes #2507.
Description
This PR adds Horovod unit tests for distributed data parallel.
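For readers unfamiliar with this kind of test, here is a minimal sketch of what a Horovod-based distributed unit test could look like. It is not the test added in this PR: the class name `TestHorovodAllreduce` and the helpers `expected_rank_sum` and `run_tests` are hypothetical, and the sketch assumes both `torch` and `horovod.torch` are importable, skipping the test otherwise.

```python
import unittest

# Horovod and PyTorch are optional here; the test is skipped if either is missing.
try:
    import torch
    import horovod.torch as hvd

    HAS_HOROVOD = True
except ImportError:
    HAS_HOROVOD = False


def expected_rank_sum(world_size):
    """Sum of ranks 0..world_size-1, i.e. the expected result of a sum-allreduce."""
    return world_size * (world_size - 1) // 2


@unittest.skipUnless(HAS_HOROVOD, "requires torch and horovod")
class TestHorovodAllreduce(unittest.TestCase):  # hypothetical test case
    def test_sum_of_ranks(self):
        hvd.init()
        # Each process contributes its own rank; allreduce sums across processes.
        local = torch.tensor([float(hvd.rank())])
        total = hvd.allreduce(local, op=hvd.Sum)
        self.assertEqual(total.item(), float(expected_rank_sum(hvd.size())))


def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestHorovodAllreduce)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Run as a single process, the check reduces to a world size of 1. To exercise several workers, a test like this would typically be launched through Horovod's `horovodrun -np <N>` launcher; the exact command used for the test in this PR is not shown here.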
Status
Ready
Types of changes
./runtests.sh -f -u --net --coverage.
./runtests.sh --quick --unittests.
make html command in the docs/ folder.