Skip to content

[core][cgraph] Export classes related to NCCL communicator #54117

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

ruisearch42
Copy link
Contributor

Why are these changes needed?

These classes need to exported to implement external communicators.
See vllm-project/vllm#19816

Related issue number

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Rui Qiao <ruisearch42@gmail.com>
@Copilot Copilot AI review requested due to automatic review settings June 26, 2025 00:27
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR exports classes related to the NCCL communicator to support external communicators. The changes include:

  • Updating the recv function documentation in nccl_group.py with new parameters (shape, dtype, allocator).
  • Exporting AcceleratorContext and TorchTensorAllocator in init.py as part of the public API.

Reviewed Changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated no comments.

File Description
python/ray/experimental/channel/nccl_group.py Updated the function documentation to include new parameters for tensor reception.
python/ray/experimental/channel/init.py Modified imports and export list to include new accelerator-related classes.
Comments suppressed due to low confidence (2)

python/ray/experimental/channel/nccl_group.py:201

  • Ensure that the function signature has been updated to include the new parameters 'shape', 'dtype', and 'allocator', and that these parameters are correctly handled in the function logic.
            shape: The shape of the tensor to receive.

python/ray/experimental/channel/init.py:14

  • Verify that exposing 'AcceleratorContext' and 'TorchTensorAllocator' aligns with the overall module design and that they are implemented appropriately to maintain a consistent public API.
from ray.experimental.channel.accelerator_context import AcceleratorContext

Signed-off-by: Rui Qiao <ruisearch42@gmail.com>
@stephanie-wang
Copy link
Contributor

Sorry for the delay in review.

Can you add a test to show example usage? E.g., create a torch.distributed group manually and then wrap it with the Ray interfaces?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants