Upgrade to CUDA 12 #1527

Merged 5 commits into main from upgrade-to-cuda-12 on Nov 8, 2023

Conversation

zhuohan123 (Collaborator)

I can successfully build vllm with:

CUDA 12.2
PyTorch 2.1.0 (built with CUDA 12.1)
xformers 0.0.22.post7

Note that vllm cannot be successfully built on CUDA 12.1 because of this issue. This can be solved after PyTorch upgrades its pybind dependency.
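
For anyone reproducing this setup, a quick toolchain sanity check (a minimal sketch for illustration, assuming a CUDA build of PyTorch is installed and nvcc is on PATH; not part of the PR):

import subprocess

import torch

# CUDA version this torch wheel was built with (e.g. "12.1").
print("torch", torch.__version__, "built with CUDA", torch.version.cuda)

# Local CUDA toolkit that will compile vLLM's extensions.
print(subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout)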

Tostino (Contributor) commented Nov 1, 2023

I will double-check, but I'm pretty sure I built with 12.1 last week.

Will edit when I have a chance to look at my changes.

Edit:

(venv) user@pop-os:/media/user/Data/IdeaProjects/vllm$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:16:06_PDT_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0
(venv) user@pop-os:/media/user/Data/IdeaProjects/vllm$ pip freeze
aiosignal==1.3.1
anyio==3.7.1
astroid==2.6.6
attrs==23.1.0
certifi==2023.7.22
charset-normalizer==3.3.1
click==8.1.7
cmake==3.27.7
dill==0.3.7
exceptiongroup==1.1.3
fastapi==0.104.0
filelock==3.12.4
frozenlist==1.4.0
fsspec==2023.10.0
h11==0.14.0
httptools==0.6.1
huggingface-hub==0.17.3
idna==3.4
importlib-metadata==6.8.0
iniconfig==2.0.0
isort==5.12.0
Jinja2==3.1.2
jsonschema==4.19.1
jsonschema-specifications==2023.7.1
lazy-object-proxy==1.9.0
lit==17.0.3
MarkupSafe==2.1.3
mccabe==0.6.1
mpmath==1.3.0
msgpack==1.0.7
mypy==0.991
mypy-extensions==1.0.0
networkx==3.2
ninja==1.11.1.1
numpy==1.26.1
nvidia-cublas-cu11==11.10.3.66
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu11==11.7.101
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu11==11.7.99
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu11==11.7.99
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu11==8.5.0.96
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu11==10.9.0.58
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu11==10.2.10.91
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu11==11.4.0.1
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu11==11.7.4.91
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu11==2.14.3
nvidia-nccl-cu12==2.18.1
nvidia-nvjitlink-cu12==12.3.52
nvidia-nvtx-cu11==11.7.91
nvidia-nvtx-cu12==12.1.105
packaging==23.2
pandas==2.1.2
platformdirs==3.11.0
pluggy==1.3.0
protobuf==4.24.4
psutil==5.9.6
py==1.11.0
pyarrow==13.0.0
pydantic==1.10.13
pylint==2.8.2
pytest==7.4.3
pytest-asyncio==0.21.1
pytest-forked==1.6.0
python-dateutil==2.8.2
python-dotenv==1.0.0
pytz==2023.3.post1
PyYAML==6.0.1
ray==2.7.1
referencing==0.30.2
regex==2023.10.3
requests==2.31.0
rpds-py==0.10.6
safetensors==0.4.0
sentencepiece==0.1.99
six==1.16.0
sniffio==1.3.0
starlette==0.27.0
sympy==1.12
tokenizers==0.14.1
toml==0.10.2
tomli==2.0.1
tomlkit==0.12.1
torch==2.1.0
tqdm==4.66.1
transformers==4.34.1
triton==2.1.0
types-PyYAML==6.0.12.12
types-requests==2.31.0.10
types-setuptools==68.2.0.0
typing_extensions==4.8.0
tzdata==2023.3
urllib3==2.0.7
uvicorn==0.23.2
uvloop==0.19.0
-e git+https://github.com/Tostino/vllm.git@11409d16190f420a7cb6254b4cea8b4594d6ec4e#egg=vllm
watchfiles==0.21.0
websockets==12.0
wrapt==1.12.1
xformers==0.0.22.post7
yapf==0.32.0
zipp==3.17.0
(venv) user@pop-os:/media/user/Data/IdeaProjects/vllm$ 


WoosukKwon mentioned this pull request on Nov 2, 2023
WoosukKwon (Collaborator)

I think we have to check the following items before the release:

  1. Does our GitHub workflow work for CUDA 12.2?
  2. If we compile our package with CUDA 12.2, does it work when the user's NVIDIA driver only supports CUDA < 12.2 (e.g., on Colab)? A quick check is sketched below.
  3. Bump up the NVIDIA PyTorch Docker image version.
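
A minimal sketch of check (2), assuming the nvidia-ml-py (pynvml) package is available and PyTorch is a CUDA build; this is illustrative only, not part of this PR:

import pynvml
import torch

# CUDA version the installed wheel was compiled against, e.g. "12.1".
major, minor = (int(x) for x in torch.version.cuda.split(".")[:2])

# Highest CUDA version the local driver supports, encoded as 1000*major + 10*minor.
pynvml.nvmlInit()
driver = pynvml.nvmlSystemGetCudaDriverVersion()
pynvml.nvmlShutdown()

if driver < major * 1000 + minor * 10:
    print("Driver is older than the build toolkit; relying on CUDA forward compatibility.")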

casper-hansen (Contributor) commented Nov 7, 2023

Hi vLLM maintainers. I suggest maintaining compatibility with torch 2.0.1 and CUDA 11.8.0 for a few more releases. This would work by creating two versions of the wheel:

  • PyPI: torch 2.1.0 and CUDA 12.1.1 wheel
  • GitHub release: an additional torch 2.0.1 and CUDA 11.8.0 wheel

The idea is that people who do not yet have the latest CUDA version can still install vLLM from the GitHub release. This can be achieved with the following:

import os

import torch

VLLM_VERSION = "0.2.2"
PYPI_BUILD = os.getenv("PYPI_BUILD", "0") == "1"

if not PYPI_BUILD:
    try:
        # Tag local builds with the CUDA version, e.g. "0.2.2+cu118" or "0.2.2+cu121",
        # taken from $CUDA_VERSION or from the CUDA version torch was built with.
        CUDA_VERSION = "".join(os.environ.get("CUDA_VERSION", torch.version.cuda).split("."))[:3]
        VLLM_VERSION += f"+cu{CUDA_VERSION}"
    except Exception as ex:
        raise RuntimeError("Your system must have an NVIDIA GPU to install vLLM") from ex
In the GitHub workflow, add a conditional that checks whether the CUDA version currently being used to build matches the one you want to release on PyPI:

if ($env:CUDA_VERSION -eq $env:PYPI_CUDA_VERSION) {
    $env:PYPI_BUILD = 1
}

For reference, you can look at setup.py and build.yaml in AutoAWQ.
https://github.com/casper-hansen/AutoAWQ/blob/main/.github/workflows/build.yaml

WoosukKwon (Collaborator) commented Nov 8, 2023

@zhuohan123 Let's merge this PR for ease of development?

TODOs (in the next PR):

zhuohan123 merged commit 06458a0 into main on Nov 8, 2023 (2 checks passed)
WoosukKwon deleted the upgrade-to-cuda-12 branch on November 17, 2023