Upgrade to CUDA 12 #1527
Conversation
I will double check, but pretty sure I built with 12.1 last week. Will edit when I have a chance to look at my changes. Edit:
I think we have to check the following items before the release:
Hi vLLM maintainers. I suggest maintaining compatibility with torch 2.0.1 and CUDA 11.8.0 for a few more versions. The way this would work is that you create two versions of the wheel:
The idea is that people can still install vLLM from the GitHub release if they do not have the latest CUDA version yet. This can be achieved by the following:

```python
import os

import torch

VLLM_VERSION = "0.2.2"
PYPI_BUILD = os.getenv("PYPI_BUILD", "0") == "1"

if not PYPI_BUILD:
    try:
        CUDA_VERSION = "".join(os.environ.get("CUDA_VERSION", torch.version.cuda).split("."))[:3]
        VLLM_VERSION += f"+cu{CUDA_VERSION}"
    except Exception:
        raise RuntimeError("Your system must have an Nvidia GPU to install vLLM")
```

In the GitHub workflow, add a conditional that checks whether the CUDA version used for the build matches the one you want to release on PyPI.
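As a minimal illustration of the version-suffix scheme described above (the helper names here are hypothetical, not part of vLLM's setup.py):

```python
from typing import Optional

def cuda_suffix(cuda_version: str) -> str:
    """Collapse a CUDA version like '11.8.0' into a wheel tag like 'cu118'."""
    return "cu" + "".join(cuda_version.split("."))[:3]

def full_version(base: str, cuda_version: Optional[str], pypi_build: bool) -> str:
    """Append the CUDA tag as a PEP 440 local version label, except for PyPI builds."""
    if pypi_build or cuda_version is None:
        return base
    return f"{base}+{cuda_suffix(cuda_version)}"
```

With this scheme, the PyPI wheel keeps the plain version (PyPI rejects local version labels), while the GitHub-release wheels carry `+cu118`, `+cu121`, and so on.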
For reference, you can look at
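A hedged sketch of how such a workflow conditional might work, in plain Python (the environment variable names and target version are assumptions for illustration, not vLLM's actual CI setup):

```python
import os
import sys

def cuda_matches(build_cuda: str, release_cuda: str) -> bool:
    """Compare only major.minor, so '12.1.105' matches '12.1'."""
    return build_cuda.split(".")[:2] == release_cuda.split(".")[:2]

if __name__ == "__main__":
    # Hypothetical release gate: fail the job if the CUDA version the wheel
    # was built with differs from the version targeted for the PyPI release.
    build = os.environ.get("CUDA_VERSION", "")
    target = os.environ.get("RELEASE_CUDA_VERSION", "12.1")  # assumed target
    if not cuda_matches(build, target):
        sys.exit(f"CUDA mismatch: built with {build!r}, releasing for {target!r}")
```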
@zhuohan123 Let's merge this PR for ease of development? TODOs (in the next PR):
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
I can successfully build vLLM with:
CUDA 12.2
PyTorch 2.1.0 (built with CUDA 12.1)
xformers 0.0.22.post7
Note that vLLM cannot be successfully built with CUDA 12.1 because of this issue. This can be solved after PyTorch upgrades its pybind11 dependency.