File Limit Request: vllm - 400 MiB #3792

@youkaichao

Description

Project URL

https://pypi.org/project/vllm/

Does this project already exist?

  • Yes

New Limit

400 MiB

Update issue title

  • I have updated the title.

Which indexes

PyPI

About the project

vLLM is a fast and easy-to-use library for LLM inference and serving.

The project plans to ship nvidia-nccl-cu12==2.18.3 within the package.

Reasons for the request

We identified a bug in nccl>=2.19 that significantly increases GPU memory overhead, so we have to pin and ship an nccl version ourselves.

We cannot simply pip install nvidia-nccl-cu12==2.18.3 because we depend on torch, which has a binary dependency on nvidia-nccl-cu12==2.19.5. So we are in dependency hell, and we have to bundle an nccl library ourselves.
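
To illustrate what bundling means in practice, here is a minimal sketch of how a package could load a vendored libnccl instead of whatever nvidia-nccl-cu12 version pip resolved for torch. The file layout and the VLLM_NCCL_SO_PATH override are assumptions for illustration, not necessarily how vllm actually wires this up.

```python
import ctypes
import os

def load_bundled_nccl() -> ctypes.CDLL:
    # Hypothetical override knob; falls back to the .so shipped inside the wheel.
    so_path = os.environ.get(
        "VLLM_NCCL_SO_PATH",
        os.path.join(os.path.dirname(__file__), "nccl", "libnccl.so.2.18.3"),
    )
    # Loading the bundled library directly sidesteps the nvidia-nccl-cu12==2.19.5
    # copy that pip installed to satisfy torch's binary dependency.
    return ctypes.CDLL(so_path)
```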

vLLM is a popular library for LLM inference and is used by many tech companies. Shipping nccl with vllm improves its throughput and the quality of LLM serving. The downside is that the package wheel becomes much larger, which is why we are asking for a larger file size limit.

Code of Conduct

  • I agree to follow the PSF Code of Conduct
