[Feature]: Remove cupy dependency for multi-node Ray deployment #19758

Open
@stephanie-wang

Description

🚀 The feature, motivation and pitch

Ray uses cupy under the hood for inter-GPU communication in compiled graphs. We should remove this dependency by creating the collective group with existing vLLM APIs and passing Ray a handle to that group.
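
A rough sketch of the proposed split, to make the idea concrete. This is not actual Ray or vLLM code: every class and method name here is hypothetical, and an in-process mailbox stands in for the real collective backend so the example runs without GPUs.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import Deque, Dict, List, Tuple


class Communicator(ABC):
    """Hypothetical interface Ray could accept instead of building its own cupy-backed group."""

    @abstractmethod
    def get_rank(self) -> int: ...

    @abstractmethod
    def get_world_size(self) -> int: ...

    @abstractmethod
    def send(self, buf: List[float], peer_rank: int) -> None: ...

    @abstractmethod
    def recv(self, peer_rank: int) -> List[float]: ...


class InProcessGroup:
    """Stand-in for a collective group that vLLM has already created (hypothetical).

    A real group would wrap NCCL or another vendor backend; here each
    (src, dst) pair simply gets a FIFO mailbox.
    """

    def __init__(self, world_size: int) -> None:
        self.world_size = world_size
        self._mailboxes: Dict[Tuple[int, int], Deque[List[float]]] = {}

    def send(self, src: int, dst: int, buf: List[float]) -> None:
        self._mailboxes.setdefault((src, dst), deque()).append(list(buf))

    def recv(self, src: int, dst: int) -> List[float]:
        return self._mailboxes[(src, dst)].popleft()


class VllmGroupHandle(Communicator):
    """Handle given to Ray; it delegates to the existing group rather than creating a new one."""

    def __init__(self, group: InProcessGroup, rank: int) -> None:
        self._group = group
        self._rank = rank

    def get_rank(self) -> int:
        return self._rank

    def get_world_size(self) -> int:
        return self._group.world_size

    def send(self, buf: List[float], peer_rank: int) -> None:
        self._group.send(self._rank, peer_rank, buf)

    def recv(self, peer_rank: int) -> List[float]:
        return self._group.recv(peer_rank, self._rank)


# One group, created once on the vLLM side; Ray only ever sees the handles.
group = InProcessGroup(world_size=2)
h0 = VllmGroupHandle(group, rank=0)
h1 = VllmGroupHandle(group, rank=1)

h0.send([1.0, 2.0], peer_rank=1)
received = h1.recv(peer_rank=0)
```

The point of the design is that the backend choice stays entirely behind the handle, so Ray would not need to import cupy or any NCCL bindings itself.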

Alternatives

  • Ray could provide its own NCCL bindings. Note that this would not support non-NVIDIA GPUs.

Additional context

No response

Labels

  • feature request: New feature or request
  • ray: anything related with ray

Projects

Status

Backlog
