Build Custom CUDA Kernel with PyTorch 2.x #106

realCrush · 2023-05-30T03:16:37Z

Hi, this is an awesome repo. However, I tried to build the custom CUDA kernel and it failed. Here are the details.

My GPU is an NVIDIA GeForce RTX 4090. I first tried PyTorch 1.13.1 with CUDA 11.7, as you suggested. The custom CUDA kernel built successfully and I could import SSMKernelDPLR, but running forward() raised CUFFT_INTERNAL_ERROR. I checked a PyTorch GitHub issue and found that PyTorch FFT on the 4090 needs CUDA 11.8, and the PyTorch stable channel only provides CUDA 11.8 builds for PyTorch 2.x. (The nightly channel does provide PyTorch 1.13.1 + CUDA 11.8, but when I installed it, it caused many more package conflicts.) So I upgraded PyTorch to 2.0.1.

With PyTorch 2.0.1 + CUDA 11.8, the FFT error is fixed. However, when I tried to build your custom CUDA kernel, I got this error:

$ python setup.py install
running install
/home/ycsong/anaconda3/envs/test/lib/python3.9/site-packages/setuptools/_distutils/cmd.py:66: SetuptoolsDeprecationWarning: setup.py install is deprecated.
!!

        ********************************************************************************
        Please avoid running ``setup.py`` directly.
        Instead, use pypa/build, pypa/installer, pypa/build or
        other standards-based tools.

        See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.
        ********************************************************************************

!!
  self.initialize_options()
/home/ycsong/anaconda3/envs/test/lib/python3.9/site-packages/setuptools/_distutils/cmd.py:66: EasyInstallDeprecationWarning: easy_install command is deprecated.
!!

        ********************************************************************************
        Please avoid running ``setup.py`` and ``easy_install``.
        Instead, use pypa/build, pypa/installer, pypa/build or
        other standards-based tools.

        See https://github.com/pypa/setuptools/issues/917 for details.
        ********************************************************************************

!!
  self.initialize_options()
running bdist_egg
running egg_info
creating structured_kernels.egg-info
writing structured_kernels.egg-info/PKG-INFO
writing dependency_links to structured_kernels.egg-info/dependency_links.txt
writing top-level names to structured_kernels.egg-info/top_level.txt
writing manifest file 'structured_kernels.egg-info/SOURCES.txt'
/home/ycsong/anaconda3/envs/test/lib/python3.9/site-packages/torch/utils/cpp_extension.py:476: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
  warnings.warn(msg.format('we could not find ninja.'))
reading manifest file 'structured_kernels.egg-info/SOURCES.txt'
writing manifest file 'structured_kernels.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_ext
/home/ycsong/anaconda3/envs/test/lib/python3.9/site-packages/torch/utils/cpp_extension.py:398: UserWarning: There are no g++ version bounds defined for CUDA version 11.8
  warnings.warn(f'There are no {compiler_name} version bounds defined for CUDA version {cuda_str_version}')
building 'structured_kernels' extension
creating build
creating build/temp.linux-x86_64-cpython-39
gcc -pthread -B /home/ycsong/anaconda3/envs/test/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -O2 -Wall -fPIC -O2 -isystem /home/ycsong/anaconda3/envs/test/include -I/home/ycsong/anaconda3/envs/test/include -fPIC -O2 -isystem /home/ycsong/anaconda3/envs/test/include -fPIC -I/home/ycsong/anaconda3/envs/test/lib/python3.9/site-packages/torch/include -I/home/ycsong/anaconda3/envs/test/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/home/ycsong/anaconda3/envs/test/lib/python3.9/site-packages/torch/include/TH -I/home/ycsong/anaconda3/envs/test/lib/python3.9/site-packages/torch/include/THC -I/home/ycsong/anaconda3/envs/test/include -I/home/ycsong/anaconda3/envs/test/include/python3.9 -c cauchy.cpp -o build/temp.linux-x86_64-cpython-39/cauchy.o -g -march=native -funroll-loops -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -DTORCH_EXTENSION_NAME=structured_kernels -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++17
In file included from /home/ycsong/anaconda3/envs/test/lib/python3.9/site-packages/torch/include/c10/cuda/CUDAGraphsC10Utils.h:3,
                 from /home/ycsong/anaconda3/envs/test/lib/python3.9/site-packages/torch/include/c10/cuda/CUDACachingAllocator.h:4,
                 from /home/ycsong/anaconda3/envs/test/lib/python3.9/site-packages/torch/include/c10/cuda/impl/CUDAGuardImpl.h:9,
                 from /home/ycsong/anaconda3/envs/test/lib/python3.9/site-packages/torch/include/c10/cuda/CUDAGuard.h:7,
                 from cauchy.cpp:5:
/home/ycsong/anaconda3/envs/test/lib/python3.9/site-packages/torch/include/c10/cuda/CUDAStream.h:6:10: fatal error: cuda_runtime_api.h: No such file or directory
    6 | #include <cuda_runtime_api.h>
      |          ^~~~~~~~~~~~~~~~~~~~
compilation terminated.
error: command '/usr/bin/gcc' failed with exit code 1

I've checked my g++ version: it is gcc 9.4.0. I also checked CUDA_Compilers, which says CUDA 11.8 is compatible with g++ versions 6.0-11.2.1, so the warning "UserWarning: There are no g++ version bounds defined for CUDA version 11.8" is weird.
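The fatal error in the log is separate from the g++ warning: gcc did not find cuda_runtime_api.h in any directory passed with -I or -isystem. Below is a minimal sketch for checking that, assuming the gcc command from the build log is pasted in as a string. The helper names include_dirs and dirs_with_header are made up for illustration and are not part of PyTorch or this repo.

```python
import os
import shlex


def include_dirs(compile_cmd):
    """Return the directories passed via -I or -isystem, in order, without duplicates."""
    tokens = shlex.split(compile_cmd)
    dirs = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in ("-I", "-isystem") and i + 1 < len(tokens):
            # Separated form: "-isystem /path" or "-I /path"
            dirs.append(tokens[i + 1])
            i += 2
            continue
        if tok.startswith("-isystem") and len(tok) > len("-isystem"):
            # Glued form: "-isystem/path"
            dirs.append(tok[len("-isystem"):])
        elif tok.startswith("-I") and len(tok) > 2:
            # Glued form: "-I/path"
            dirs.append(tok[2:])
        i += 1
    # dict.fromkeys keeps the first occurrence of each directory
    return list(dict.fromkeys(dirs))


def dirs_with_header(compile_cmd, header="cuda_runtime_api.h"):
    """Return the include directories that actually contain the header."""
    return [
        d for d in include_dirs(compile_cmd)
        if os.path.isfile(os.path.join(d, header))
    ]
```

If dirs_with_header(cmd) returns an empty list for the gcc line above, the header is missing from the environment, and the compiler version is not the cause.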

My scripts:

conda create -n test python=3.9
conda activate test
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
conda install -c "nvidia/label/cuda-11.8.0" cuda-nvcc
cd extensions/kernels
python setup.py install

Could you please help me fix this problem? Or is it not possible to build the custom CUDA kernel with PyTorch 2.x for now?


realCrush · 2023-05-30T05:35:49Z

I fixed the problem by installing the full CUDA toolkit: conda install cuda-toolkit -c "nvidia/label/cuda-11.8.0"
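This likely works because the cuda-nvcc package provides the compiler but not the CUDA runtime development headers, while cuda-toolkit also installs them into the environment. Below is a rough sketch for confirming that the header is now visible. It assumes the header lives under an include/ directory of one of the usual roots (CUDA_HOME, CUDA_PATH, the active conda prefix, or /usr/local/cuda). The function names are hypothetical, and the list of roots is an assumption rather than PyTorch's exact lookup logic.

```python
import os


def candidate_roots(env):
    """Collect likely CUDA install roots from the environment (assumed list)."""
    roots = []
    for var in ("CUDA_HOME", "CUDA_PATH", "CONDA_PREFIX"):
        value = env.get(var)
        if value and value not in roots:
            roots.append(value)
    # Conventional system-wide location, checked last
    if "/usr/local/cuda" not in roots:
        roots.append("/usr/local/cuda")
    return roots


def find_cuda_header(env=None, header="cuda_runtime_api.h"):
    """Return the first <root>/include/<header> that exists, or None."""
    env = os.environ if env is None else env
    for root in candidate_roots(env):
        path = os.path.join(root, "include", header)
        if os.path.isfile(path):
            return path
    return None
```

Running find_cuda_header() inside the activated conda environment before and after installing cuda-toolkit should show whether the header became available.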

realCrush closed this as completed May 30, 2023
