Open
🚀 The feature, motivation and pitch
Ray uses CuPy under the hood for inter-GPU communication in compiled graphs. We should remove this dependency by creating the collective group with existing vLLM APIs and passing Ray a handle to that group.
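To make the proposal concrete, here is a rough sketch of the shape this could take. It is not Ray's or vLLM's actual API: `CollectiveGroupHandle`, `InMemoryGroup`, and `transfer` are hypothetical names, and the in-memory mailbox only stands in for a real GPU collective group so the example runs without any GPU. The point is that the framework side would depend only on a small send/recv interface, while the application creates the group and hands it over.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple


class CollectiveGroupHandle(ABC):
    """Hypothetical interface the framework would accept from the application."""

    @abstractmethod
    def send(self, value: List[float], peer_rank: int) -> None:
        ...

    @abstractmethod
    def recv(self, peer_rank: int) -> List[float]:
        ...


class InMemoryGroup(CollectiveGroupHandle):
    """Stand-in for an application-created group (assumption: no GPU, no NCCL)."""

    def __init__(self, rank: int, mailboxes: Dict[Tuple[int, int], List[List[float]]]):
        self.rank = rank
        # Shared dict keyed by (source rank, destination rank).
        self._mailboxes = mailboxes

    def send(self, value: List[float], peer_rank: int) -> None:
        self._mailboxes.setdefault((self.rank, peer_rank), []).append(list(value))

    def recv(self, peer_rank: int) -> List[float]:
        # Pop the oldest message sent from peer_rank to this rank.
        return self._mailboxes[(peer_rank, self.rank)].pop(0)


def transfer(src: CollectiveGroupHandle, dst: CollectiveGroupHandle,
             src_rank: int, dst_rank: int, value: List[float]) -> List[float]:
    """Framework-side code: it only sees the handle, never the backend."""
    src.send(value, peer_rank=dst_rank)
    return dst.recv(peer_rank=src_rank)


# The application builds the group and passes handles to the framework.
mailboxes: Dict[Tuple[int, int], List[List[float]]] = {}
rank0 = InMemoryGroup(0, mailboxes)
rank1 = InMemoryGroup(1, mailboxes)
received = transfer(rank0, rank1, src_rank=0, dst_rank=1, value=[1.0, 2.0])
```

With this split, the backend behind the handle could be swapped without touching the framework-side code, which is what would let Ray drop the CuPy dependency.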
Alternatives
- Ray could provide its own NCCL bindings. Note that this would not support non-NVIDIA GPUs.
Status
Backlog