
[Feature]: automatically select distributed inference backend #4955

Closed
youkaichao opened this issue May 21, 2024 · 3 comments · Fixed by #5230

@youkaichao (Member)

🚀 The feature, motivation and pitch

Ray is overkill for the single-GPU case, but it is currently the only choice for multi-node inference.

We can add an "auto" backend that checks the world size against the number of GPUs available on the node: if the world size fits within the node, we can use multiprocessing; otherwise we can use Ray. A sketch of the check is below.
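
A minimal sketch of the proposed check (the helper name is hypothetical; the real decision would live in vLLM's engine configuration):

```python
import torch

def choose_distributed_backend(world_size: int) -> str:
    """Hypothetical helper: pick "mp" when all workers fit on this
    node's visible GPUs, otherwise fall back to "ray"."""
    local_gpus = torch.cuda.device_count()
    return "mp" if world_size <= local_gpus else "ray"
```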

This will help performance a lot.

@njhill do you have any bandwidth for this?

Alternatives

No response

Additional context

No response

@Yard1 (Collaborator) commented May 21, 2024

> This will help performance a lot.

Given that we have almost completely replaced the Ray communication layer for data (and soon for control), I doubt there will be a huge performance difference between multiprocessing and Ray at this point. But it would be good to benchmark.
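
For example, a rough timing harness along these lines (a sketch assuming a vLLM build that exposes the `distributed_executor_backend` engine argument; the model and batch size are arbitrary, and each backend should be timed in a fresh process):

```python
import sys
import time
from vllm import LLM, SamplingParams

# Usage: python bench_backend.py mp|ray  (run once per backend)
backend = sys.argv[1] if len(sys.argv) > 1 else "mp"

llm = LLM(model="facebook/opt-125m",       # arbitrary small model
          tensor_parallel_size=2,          # arbitrary multi-GPU setting
          distributed_executor_backend=backend)

prompts = ["Hello, my name is"] * 64
params = SamplingParams(max_tokens=128)

start = time.perf_counter()
llm.generate(prompts, params)
print(f"{backend}: {time.perf_counter() - start:.2f}s")
```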

@njhill (Member) commented May 22, 2024

@youkaichao yep, I had been intending to do this as soon as I had a chance; should be able to tomorrow.

njhill added a commit to njhill/vllm that referenced this issue Jun 3, 2024
Also update docs to reflect support for the multiprocessing distributed executor.

Resolves vllm-project#4955
Resolves vllm-project#5221
@njhill (Member) commented Jun 3, 2024

@youkaichao I have opened #5230, PTAL.
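
For reference, a usage sketch of the behavior this issue asks for (hedged: the override argument name follows the PR-era engine args; the model and parallel size are arbitrary):

```python
from vllm import LLM

# Backend left unset: vLLM can pick multiprocessing when the
# tensor-parallel workers fit on this node, and Ray for multi-node.
llm = LLM(model="facebook/opt-125m", tensor_parallel_size=2)

# To force a specific executor instead:
#   LLM(..., distributed_executor_backend="ray")  # or "mp"
```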

njhill self-assigned this Jun 3, 2024