Support CUDA 12.4 #104417
PyTorch should already support CUDA 12.2 and you should be able to build from source with it.
Does the current (-git?) pytorch code allow for taking advantage of HMM?
But I can't pip install pytorch with CUDA 12.2, right?
I have CUDA 12.2 and I can't use PyTorch =/
Any updates?
Also, some of the tests are failing for 12.2.1, in case anyone is running them for their own builds. Could be informative.
@ptrblck same
@ptrblck Sure. PyTorch will build with CUDA 12.2. The request was more about integrating with Heterogeneous Memory Management. HMM enables the consumption of system memory from within the GPU. The last time I looked, NVIDIA's implementation of HMM left room for improvement: it caused stalls because tremendous amounts of work were done in top-half interrupt handlers. PyTorch would need some modification to treat system memory as GPU memory directly.
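To make the distinction concrete, here is a minimal Python sketch of the status quo this comment refers to. The HMM behavior described above has no PyTorch API today, so the second half is purely illustrative:

```python
import torch

# Status quo: host memory must be explicitly copied to the device.
host = torch.randn(1 << 20)   # ordinary pageable system memory
dev = host.to("cuda")         # explicit host-to-device copy

# With HMM as described above, the GPU could fault in pages of `host`
# directly, without this copy. PyTorch exposes no such mode today; the
# closest existing knob is pinned memory, which only speeds up the copy:
pinned = host.pin_memory()
dev2 = pinned.to("cuda", non_blocking=True)
```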
You can try this build, which supports CUDA 12.2: #91122
@sanjibnarzary Thank you for the suggestion. I think there may be two issues, although I have not tried.
For our community, Artificial Wisdom, I have spent a nontrivial amount of time building the following: I built PyTorch because Kineto is not enabled by default. The PyTorch build is incomplete. However, faiss works like a champ. If you can share the PR that activated HMM, I'll look at building with it. Thank you for your contributions,
Hi @sdake, can you check whether Colossal-AI is of use to you? Per their documentation, "they implemented a dynamic heterogeneous memory management system named Gemini, unlike traditional implementations, which adopt static memory partition".
This will install both the colossalai package and the ColossalAIStrategy for the Lightning Trainer:
You can tune several settings by instantiating the strategy object and passing options in:
See a full example of a benchmark with a GPT-2 model of up to 24 billion parameters.
@sanjibnarzary Huge fan of their work. We are not currently using it. Thank you,
I did the tutorial for the Jetson AGX Orin:
Again, forgive the break in protocol. Thanks @johnnynunez. Super work!! My next step in my pytorch PR was to build on a theme from your work, where you are exporting the environment variables and building wheels with python setup.py bdist_wheel in your excellent guide. I have been a systems engineer forever, although I only have a little experience integrating
Please note that the T_(env variable) names are typed variables. I have searched and searched, but you have given me some new things to search for (praise be the day when we can ask, not search). I use cmake to configure, ninja as a build generator, and then I attempt to use python setup.py to build a wheel. I prefer this workflow, although if it doesn't work, it doesn't work. Are you aware of how I could make this type of workflow function? (As in, build a wheel based upon a preexisting cmake configuration and ninja build output?) Thank you so much for your time. I hope, even if the answer is no, you may find some interest in the PR or the problem in general, given the detailed nature of your blog on the topic! Thank you,
I had an RTX 3090 but I sold it... I can't help you right now, but I found this: https://medium.com/@zhanwenchen/build-pytorch-from-source-with-cuda-12-2-1-with-ubuntu-22-04-b5b384b47ac I think that magma122 is out; you can check it and skip it.
Hi @johnnynunez! I'm the author of the Medium article you referenced. I have encountered a few problems with my magma-cuda122 build on a system that I can no longer access. I'm retracing my own steps on a new server I just finished building for myself with a 3090 Ti. This work should be done this week. Whenever it's ready, I'll open a PR on the pytorch/builder repo and record the steps in an update to my Medium article. I don't think magma-cuda122 is out yet: https://anaconda.org/search?q=magma-cuda12
The fact is, I guess PyTorch will need extra code to implement HMM. I guess we will be informed, maybe in a coming release. @ptrblck
I bought an RTX 4090. How is this going? I will build.
Is this compatible with HMM? https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/rel-23-09.html @ptrblck
Any news?
Is there any schedule for officially supporting CUDA 12.2?
PyTorch 2.2 should go directly to 12.4.
When will the version of PyTorch that supports CUDA 12.4 be released?
CUDA 12.4 is coming out this month.
You don't have a date, then?
It depends on the PyTorch maintainers, but they are working on it; you can see PRs where they are upgrading cuDNN.
CUDA 12.4 is out.
For whatever it's worth, I installed CUDA 12.4 and tested against the pre-compiled torchvision wheels from CUDA 12.1 (
I have a 4080 Super and PyTorch still doesn't support CUDA 12.4. I can't downgrade drivers because it doesn't support them; same error with the nightly version.
Did you compile it manually?

```bash
git clone --recursive --branch v2.2.1 http://github.com/pytorch/pytorch
cd pytorch
export USE_NCCL=1
export USE_QNNPACK=0
export USE_PYTORCH_QNNPACK=0
export USE_NATIVE_ARCH=1
export USE_DISTRIBUTED=1
export USE_TENSORRT=0
export TORCH_CUDA_ARCH_LIST="8.9"
export PYTORCH_BUILD_VERSION=2.2.1
export PYTORCH_BUILD_NUMBER=1
export MAKEFLAGS="-j$(nproc)"
pip install -U -r requirements.txt
pip install -U scikit-build
pip install -U ninja
pip install -U cmake
# libstdcxx-ng is a conda-forge package, not a pip package:
conda install -y -c conda-forge libstdcxx-ng=12
python setup.py bdist_wheel
```
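After installing the wheel from dist/, a quick sanity check can confirm the flags above took effect (a minimal sketch; the exact version strings depend on your build):

```python
import torch

print(torch.__version__)                     # should match PYTORCH_BUILD_VERSION, e.g. 2.2.1
print(torch.version.cuda)                    # CUDA version the wheel was compiled against
print(torch.cuda.is_available())             # True if the installed driver can run this build
print(torch.cuda.get_device_capability(0))   # (8, 9) for the sm_89 arch targeted above
```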
This thread has quite diverged into random discussions misunderstanding how pre-built PyTorch binaries work, instead of being a CUDA 12.4 tracking issue. In any case:
Your locally installed CUDA toolkit won't be used unless you build PyTorch from source or build a custom CUDA extension. You only need an NVIDIA driver to execute workloads via the PyTorch binaries.
PyTorch does support CUDA 12.4, and you can build it from source while we are updating the binary build process.
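A quick way to see both values side by side (the CUDA runtime the wheel ships versus whatever toolkit and driver are installed locally) is PyTorch's own environment collector:

```python
# Prints the CUDA version PyTorch was built with alongside the locally
# installed toolkit and driver, making the distinction above visible:
from torch.utils import collect_env

collect_env.main()
```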
So finally, PyTorch binaries will ship with 12.4?
I simply installed PyTorch using the command on the home page, using pip.
PyTorch comes with precompiled CUDA and everything needed to run on GPUs. That's why PyTorch binaries come with CUDA 11.8 or CUDA 12.1. To use the latest version of CUDA, you need to compile PyTorch from source.
```bash
git clone --recursive --branch v2.2.1 http://github.com/pytorch/pytorch
export PYTORCH_BUILD_VERSION=2.2.1 && cd pytorch
```
Using this method? @johnnynunez
If you want to use pre-compiled binaries: pip3 install -U torch torchvision torchaudio
But will using this method enable me to use the GPU with CUDA 12.4?
Yes, of course.
I have tried both methods but nothing works; I will wait for a direct pip release. @johnnynunez
It finally worked; the steps are to create a new virtual environment and copy the pip command from the site's home page.
@Vins33 I would need your help to elaborate the step-by-step configuration. I am suffering from the same pain.
@smsaqlain First you create a new env with python -m venv venv, then you install using pip with CUDA 12.1. I'm using Python 3.12, but 3.11 shouldn't cause any problems either.
Hello all, I had the same problem myself. I am posting this to hopefully help anyone with a similar issue. For context, I'm running an Nvidia 4070 Ti Super GPU on my Windows workstation PC, which has CUDA 12.4. This is supposed to be the latest installation. I'm using Ubuntu 22.04 as well, so I am running in WSL2. Now, the problem was that I've tried pip uninstalling and reinstalling PyTorch to no avail. Every time I try running PyTorch in Python, I would get this error:

```
>>> import torch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/user/.local/lib/python3.10/site-packages/torch/__init__.py", line 237, in <module>
    from torch._C import * # noqa: F403
ImportError: /home/user/.local/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so: undefined symbol: ncclCommRegister
```

I am aware that at the moment, PyTorch was built for CUDA 12.1, but I got it to work after some hours of troubleshooting. Here is what ultimately worked for me:
```bash
sudo pip3 uninstall -y torch torchvision torchaudio
pip3 uninstall -y torch torchvision torchaudio
pip3 cache purge
pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu121
```

At the time of writing, I am running on CUDA 12.4 with PyTorch working now. Here's what it might look like:

```python
import torch
import torchvision
import torchaudio
print(torch.__version__)
print(torchvision.__version__)
print(torchaudio.__version__)
print(torch.cuda.is_available())
```

Output:
Wishing everyone the best! And hopefully PyTorch will provide a stable version for CUDA 12.4 users. Happy coding.
Any ETA for cu124 wheels?
Thank you so much! I'm using CUDA 12.4 and this worked for me perfectly :)
I was seeing
OSError: [WinError 126] The specified module could not be found............torch\lib\shm.dll" or one of its dependencies. Getting this error after your steps, @dominicklee.
For PyTorch 2.2.2, you can use CUDA 12.4 but not cuDNN 9 (8 is fine). There are a few modifications you need to make to the sources (I'm not on my Ubuntu machine, so I will update this post later). By the way, I never quite finished the magma-cuda123 build. I tried to do 2.7.1, but it turns out I don't know enough about MAGMA to know what to monkey-patch.
This is really the behavior I see; with cuDNN 9 I have not been able to make it work.
All is working with torch 2.3.0.
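For anyone checking which cuDNN their build actually links (relevant to the cuDNN 8-vs-9 point above), a minimal check:

```python
import torch

# Integer-encoded cuDNN version the current torch build links against,
# e.g. 8902 for cuDNN 8.9.2 or 90100 for cuDNN 9.1.0:
print(torch.backends.cudnn.version())
```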
I get:
It knows I have CUDA 12.4, so why can't it find CUDA?
Hi folks, I am installing apex for mixed-precision training. My machine has CUDA 12.4; I have installed torch==2.3.0, but it is pre-compiled with CUDA 12.1. While installing apex, I get the following error:

```
Using pip 24.0 from /homes/hayatu/miniconda3/envs/focal/lib/python3.8/site-packages/pip (python 3.8)
torch.__version__ = 2.3.0+cu121
running dist_info
torch.__version__ = 2.3.0+cu121
Compiling cuda extensions with
Traceback (most recent call last):
× Building wheel for apex (pyproject.toml) did not run successfully.
```

note: This error originates from a subprocess, and is likely not a problem with pip. Will truly appreciate your help!
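apex's build aborts when the toolkit it finds differs from the CUDA version torch was compiled with, which appears to be what is happening here (a 12.4 toolkit against a cu121 wheel). A small sketch to inspect both values before building:

```python
import torch
from torch.utils.cpp_extension import CUDA_HOME

# CUDA version the installed torch wheel was compiled against:
print(torch.version.cuda)   # e.g. '12.1' for a +cu121 wheel

# Toolkit that extensions like apex will be compiled with; apex compares
# its version against torch.version.cuda at build time and errors on mismatch:
print(CUDA_HOME)            # e.g. '/usr/local/cuda-12.4', or None if no toolkit is found
```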
🚀 The feature, motivation and pitch
Interesting feature:
This release introduces Heterogeneous Memory Management (HMM), allowing seamless sharing of data between host memory and accelerator devices. HMM is supported on Linux only and requires a recent kernel (6.1.24+ or 6.2.11+).
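Since the release notes gate HMM on a recent kernel, a trivial check of whether the running kernel meets the stated floor (assuming Linux, per the note above):

```python
import platform

# HMM needs Linux kernel 6.1.24+ or 6.2.11+ per the release notes above;
# print the running kernel to compare:
print(platform.release())   # e.g. '6.5.0-35-generic'
```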
Alternatives
No response
Additional context
No response
cc @ptrblck