[core][gpu-objects] Fix test_gpu_objects_nccl.py
#53874
Pull Request Overview
This PR addresses an issue with the NCCL GPU objects test by modifying resource allocation in the actor decorator and adjusting test parameters.
- Updated the remote decorator to explicitly set CPU usage to zero.
- Added parameterization to ensure the test runs with two GPUs.
Comments suppressed due to low confidence (2)
python/ray/tests/test_gpu_objects_nccl.py:8
- Consider adding a brief comment explaining the rationale for setting 'num_cpus' to 0, to clarify the resource usage for this GPU-specific actor.
@ray.remote(num_gpus=1, num_cpus=0)
python/ray/tests/test_gpu_objects_nccl.py:18
- Ensure that the CI environment or local setups are configured to support at least 2 GPUs to avoid false negatives in this test.
@pytest.mark.parametrize("ray_start_regular", [{"num_gpus": 2}], indirect=True)
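The two changes above can be read together as a resource-accounting question: how many of these actors can be alive at once on the test cluster? The sketch below is a toy model of that arithmetic, not Ray's scheduler. The `Resources` class and `max_concurrent_actors` helper are hypothetical names introduced for illustration, and the single-CPU cluster size is an assumption; the PR itself does not state how many CPUs the fixture provides or why `num_cpus=0` was needed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Logical resources, either offered by a cluster or requested by one actor."""

    cpus: float = 0.0
    gpus: float = 0.0


def max_concurrent_actors(cluster: Resources, per_actor: Resources) -> int:
    """Toy model: how many identical actors fit in the cluster at the same time.

    Hypothetical helper for illustration only; it does not call Ray.
    """
    limits = []
    if per_actor.cpus > 0:
        limits.append(int(cluster.cpus // per_actor.cpus))
    if per_actor.gpus > 0:
        limits.append(int(cluster.gpus // per_actor.gpus))
    if not limits:
        raise ValueError("per_actor must request at least one resource")
    return min(limits)


# Assumed cluster: 1 CPU (an assumption) and 2 GPUs (from the fixture parameters).
cluster = Resources(cpus=1, gpus=2)

# An actor that holds one CPU in addition to its GPU: the CPU is the bottleneck.
with_cpu = max_concurrent_actors(cluster, Resources(cpus=1, gpus=1))

# An actor declared like the one in the test (one GPU, zero CPUs): GPUs are the only limit.
gpu_only = max_concurrent_actors(cluster, Resources(cpus=0, gpus=1))

print(with_cpu, gpu_only)
```

Under these assumptions, a two-actor NCCL test could only place both actors when each actor requests zero CPUs, which is consistent with pairing `num_cpus=0` with a two-GPU fixture.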
Thanks!
#53871 * CI error:  Closes #53871 Signed-off-by: Kai-Hsun Chen <kaihsun@anyscale.com> Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
ray-project#53871 * CI error:  Closes ray-project#53871 Signed-off-by: Kai-Hsun Chen <kaihsun@anyscale.com> Signed-off-by: Scott Lee <scott.lee@rebellions.ai>
ray-project#53871 * CI error:  Closes ray-project#53871 Signed-off-by: Kai-Hsun Chen <kaihsun@anyscale.com>
Why are these changes needed?
#53871
Related issue number
Closes #53871
Checks
- I've signed off every commit (`git commit -s`) in this PR.
- I've run `scripts/format.sh` to lint the changes in this PR.
- If I've added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.