Why use tensor parallelism when the model can easily fit on a single GPU? #294
-
If the model can fit on a single GPU, wouldn't it be better to use something like DDP instead? What are the advantages of using tensor parallelism if the model is small enough to fit on a single GPU?
-
You are right: for small models, you should just use one GPU. You can start multiple vLLM replicas to achieve "data parallelism" for serving, which is why it is not shown in our code. Tensor parallelism is mainly for large models that cannot fit on a single GPU.
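As a rough illustration of the trade-off, here is a minimal sketch using vLLM's offline `LLM` API. The model names are placeholders and exact argument names may vary across vLLM versions:

```python
from vllm import LLM, SamplingParams

# Small model that fits on one GPU: no tensor parallelism needed
# (tensor_parallel_size defaults to 1).
llm = LLM(model="facebook/opt-125m")

# Large model that does not fit on one GPU: shard its weights across
# 4 GPUs with tensor parallelism (illustrative only).
# llm = LLM(model="meta-llama/Llama-2-70b-hf", tensor_parallel_size=4)

prompts = ["Hello, my name is"]
outputs = llm.generate(prompts, SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)
```

For "data parallelism" with a small model, the idea is simply to launch one independent vLLM server process per GPU (for example by restricting each process to one device via `CUDA_VISIBLE_DEVICES`) and load-balance requests across them; vLLM itself does not need any special configuration for that.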
-
Model parallelism in serving scenarios can indeed be beneficial when serving multiple models at the same time (https://arxiv.org/abs/2302.11665), but if you are serving a single model that fits on one GPU, you should stick to one GPU.