Run fails (CUDA error) - Linux/RTX3070 #411

Open
darrellenns opened this issue Jun 9, 2021 · 10 comments
Labels
hardware support Hardware not supported or incompatible linux

Comments

@darrellenns

Describe the bug
Running via run.sh fails with the following CUDA error:

RuntimeError: CUDA error: no kernel image is available for execution on the device

This is on an up-to-date Arch Linux system with an RTX 3070. The system has CUDA 11.3.0 installed, which works fine for other applications.

To Reproduce

  • Do a conda-based install as per the instructions (download weights, run install.sh)
  • Execute run.sh

Info (please complete the following information):

  • OS (e.g., Linux): Arch Linux
  • GPU model: RTX 3070
  • nvidia-smi -L: GPU 0: NVIDIA GeForce RTX 3070 (UUID: GPU-cdeba0c4-12c1-aeb9-d03f-eec971e8944b)
$conda info
     active environment : base
    active env location : /home/user/miniconda3
            shell level : 1
       user config file : /home/user/.condarc
 populated config files : 
          conda version : 4.10.1
    conda-build version : not installed
         python version : 3.9.1.final.0
       virtual packages : __cuda=11.3=0
                          __linux=5.12.9=0
                          __glibc=2.33=0
                          __unix=0=0
                          __archspec=1=x86_64
       base environment : /home/user/miniconda3  (writable)
      conda av data dir : /home/user/miniconda3/etc/conda
  conda av metadata url : https://repo.anaconda.com/pkgs/main
           channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/r/linux-64
                          https://repo.anaconda.com/pkgs/r/noarch
          package cache : /home/user/miniconda3/pkgs
                          /home/user/.conda/pkgs
       envs directories : /home/user/miniconda3/envs
                          /home/user/.conda/envs
               platform : linux-64
             user-agent : conda/4.10.1 requests/2.25.1 CPython/3.9.1 Linux/5.12.9-arch1-1 arch/rolling glibc/2.33
                UID:GID : 1000:1000
             netrc file : None
           offline mode : False
$conda list
# packages in environment at /home/user/miniconda3:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main  
brotlipy                  0.7.0           py39h27cfd23_1003  
ca-certificates           2021.5.25            h06a4308_1  
certifi                   2021.5.30        py39h06a4308_0  
cffi                      1.14.5           py39h261ae71_0  
chardet                   4.0.0           py39h06a4308_1003  
conda                     4.10.1           py39h06a4308_1  
conda-package-handling    1.7.3            py39h27cfd23_1  
cryptography              3.4.7            py39hd23ed53_0  
idna                      2.10               pyhd3eb1b0_0  
ld_impl_linux-64          2.33.1               h53a641e_7  
libffi                    3.3                  he6710b0_2  
libgcc-ng                 9.1.0                hdf63c60_0  
libstdcxx-ng              9.1.0                hdf63c60_0  
ncurses                   6.2                  he6710b0_1  
openssl                   1.1.1k               h27cfd23_0  
pycosat                   0.6.3            py39h27cfd23_0  
pycparser                 2.20                       py_2  
pyopenssl                 20.0.1             pyhd3eb1b0_1  
pysocks                   1.7.1            py39h06a4308_0  
python                    3.9.1                hdb3f193_2  
readline                  8.1                  h27cfd23_0  
requests                  2.25.1             pyhd3eb1b0_0  
ruamel_yaml               0.15.100         py39h27cfd23_0  
setuptools                52.0.0           py39h06a4308_0  
six                       1.15.0           py39h06a4308_0  
sqlite                    3.35.4               hdfb4753_0  
tk                        8.6.10               hbc83047_0  
tqdm                      4.59.0             pyhd3eb1b0_1  
tzdata                    2020f                h52ac0ba_0  
urllib3                   1.26.4             pyhd3eb1b0_0  
xz                        5.2.5                h7b6447c_0  
yaml                      0.2.5                h7b6447c_0  
zlib                      1.2.11               h7b6447c_3  
echo $PYTHONPATH
$

(note - $PYTHONPATH is not set)

$echo $PATH
/home/user/miniconda3/bin:/home/user/miniconda3/condabin:/usr/local/bin:/usr/bin:/usr/local/sbin:/opt/cuda/bin:/opt/cuda/nsight_compute:/opt/cuda/nsight_systems/bin:/usr/lib/jvm/default/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/opt/cuda/bin:/opt/cuda/nsight_compute:/opt/cuda/nsight_systems/bin:/usr/lib/jvm/default/bin

Logs
Full run.sh error output

[1623256558.538442] Loading Predictor
/home/user/miniconda3/envs/avatarify/lib/python3.7/site-packages/torch/cuda/__init__.py:104: UserWarning: 
NVIDIA GeForce RTX 3070 with CUDA capability sm_86 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
If you want to use the NVIDIA GeForce RTX 3070 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

  warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))
[1623256559.794477] Trying camera with id 0
[1623256561.014920] Trying camera with id 1
[ WARN:0] global /io/opencv/modules/videoio/src/cap_v4l.cpp (887) open VIDEOIO(V4L2:/dev/video1): can't open camera by index
[1623256561.015168] Camera with id 1 is not available
[1623256561.015206] Trying camera with id 2
[ WARN:0] global /io/opencv/modules/videoio/src/cap_v4l.cpp (887) open VIDEOIO(V4L2:/dev/video2): can't open camera by index
[1623256561.015271] Camera with id 2 is not available
[1623256561.015292] Trying camera with id 3
[ WARN:0] global /io/opencv/modules/videoio/src/cap_v4l.cpp (887) open VIDEOIO(V4L2:/dev/video3): can't open camera by index
[1623256561.015345] Camera with id 3 is not available
[1623256561.015459] Selected camera 0
Traceback (most recent call last):
  File "afy/cam_fomm.py", line 254, in <module>
    change_avatar(predictor, avatars[cur_ava])
  File "afy/cam_fomm.py", line 91, in change_avatar
    avatar_kp = predictor.get_frame_kp(new_avatar)
  File "/home/user/tmp/b/avatarify-python/afy/predictor_local.py", line 114, in get_frame_kp
    kp_landmarks = self.fa.get_landmarks(image)
  File "/home/user/miniconda3/envs/avatarify/lib/python3.7/site-packages/face_alignment/api.py", line 106, in get_landmarks
    return self.get_landmarks_from_image(image_or_path, detected_faces)
  File "/home/user/miniconda3/envs/avatarify/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/user/miniconda3/envs/avatarify/lib/python3.7/site-packages/face_alignment/api.py", line 125, in get_landmarks_from_image
    detected_faces = self.face_detector.detect_from_image(image.copy())
  File "/home/user/miniconda3/envs/avatarify/lib/python3.7/site-packages/face_alignment/detection/sfd/sfd_detector.py", line 44, in detect_from_image
    bboxlist = detect(self.face_detector, image, device=self.device)[0]
  File "/home/user/miniconda3/envs/avatarify/lib/python3.7/site-packages/face_alignment/detection/sfd/detect.py", line 15, in detect
    img = torch.from_numpy(img).to(device, dtype=torch.float32)
RuntimeError: CUDA error: no kernel image is available for execution on the device
FATAL: exception not rethrown
./run.sh: line 151: 142896 Aborted                 (core dumped) python afy/cam_fomm.py --config $FOMM_CONFIG --checkpoint $FOMM_CKPT --virt-cam $CAMID_VIRT --relative --adapt_scale $@
@JohanAR
Collaborator

JohanAR commented Jun 10, 2021

Can't really investigate this myself, but I think this warning message could give you a place to look if you want to give it a try:

NVIDIA GeForce RTX 3070 with CUDA capability sm_86 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
If you want to use the NVIDIA GeForce RTX 3070 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

Pull requests are welcome, in case you find something that ought to be added to avatarify-python :)
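
The mismatch reported in that warning can be reproduced by comparing the GPU's compute capability with the architectures the installed PyTorch build was compiled for. The following is a minimal sketch, not PyTorch's own check: the helper name `build_supports_device` is hypothetical, and the rule it applies (a compiled `sm_XY` kernel runs on a device with the same major version and an equal or higher minor version) is a simplification that ignores any PTX (`compute_XY`) entries in the list. It relies on `torch.cuda.get_device_capability()` and `torch.cuda.get_arch_list()`, and skips the live check if torch is not installed.

```python
from typing import Iterable, Tuple


def build_supports_device(capability: Tuple[int, int], arch_list: Iterable[str]) -> bool:
    """Return True if any compiled sm_XY entry can run on the given device.

    Simplified rule (an assumption of this sketch): same major version,
    and the build's minor version is not newer than the device's.
    """
    major, minor = capability
    for arch in arch_list:
        kind, _, digits = arch.partition("_")
        if kind != "sm" or not digits.isdigit():
            continue  # skip PTX ("compute_XY") or unusual entries
        sm_major, sm_minor = divmod(int(digits), 10)
        if sm_major == major and sm_minor <= minor:
            return True
    return False


if __name__ == "__main__":
    # Values taken from the warning in this issue: an sm_86 device and a
    # build that only ships sm_37, sm_50, sm_60 and sm_70 kernels.
    print(build_supports_device((8, 6), ["sm_37", "sm_50", "sm_60", "sm_70"]))  # False

    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        cap = torch.cuda.get_device_capability(0)
        archs = torch.cuda.get_arch_list()
        print(torch.cuda.get_device_name(0), cap, archs, build_supports_device(cap, archs))
```

Run it inside the avatarify conda environment rather than the base one, since that is where the affected PyTorch install lives.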

@JohanAR JohanAR added hardware support Hardware not supported or incompatible linux labels Jun 10, 2021
@darrellenns
Author

I got it to work by replacing the install.sh conda commands with the following:

conda install pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch -c nvidia
conda install scikit-image python-blosc -c conda-forge

The first command is taken directly from the PyTorch "get started" page. In the second, I removed the pinned versions to avoid conflicts with the PyTorch packages. I also removed numpy from the install command, since it is already installed along with PyTorch.
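
After a reinstall like this, a quick way to confirm the result is to repeat the kind of operation that failed in the traceback (moving a tensor to the GPU) in isolation. This is only a sketch: the function name `cuda_smoke_test` is made up for illustration, and it assumes it is run inside the avatarify conda environment.

```python
def cuda_smoke_test() -> str:
    """Try a tiny GPU operation and describe the outcome as a string."""
    try:
        import torch
    except ImportError:
        return "torch is not installed in this environment"

    if not torch.cuda.is_available():
        return f"torch {torch.__version__}: CUDA is not available"

    try:
        # Same pattern as the failing line in detect.py: move data to the GPU.
        x = torch.zeros(4).to("cuda", dtype=torch.float32)
        # .item() copies the result back, forcing the kernel to actually run.
        total = (x + 1).sum().item()
    except RuntimeError as exc:
        return f"torch {torch.__version__} (CUDA {torch.version.cuda}): GPU op failed: {exc}"

    return f"torch {torch.__version__} (CUDA {torch.version.cuda}): GPU op OK, sum={total}"


if __name__ == "__main__":
    print(cuda_smoke_test())
```

If the build still lacks kernels for the card, the "no kernel image is available" message should show up here without having to start run.sh.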

@JohanAR
Copy link
Collaborator

JohanAR commented Jun 11, 2021

Nice that you fixed it! Hopefully alievk will have time to look at it and decide whether the same change can be made here, or whether it would break something for people with older GPUs. Unfortunately, I know nothing about conda.

@zarklon

zarklon commented Jul 27, 2021

Thanks, but all this stuff is Greek to me. I was surprised I got it to work the first time with step-by-step instructions. I have no friggin clue what to do to "adapt it to my OS".

@JohanAR
Collaborator

JohanAR commented Jul 28, 2021

darrellenns' fix modifies install.sh which is for Linux, so if you're using Windows you'd have to make the corresponding changes to install_windows.bat

I.e. you find the following two lines

call conda install -y numpy==1.19.0 scikit-image python-blosc==1.7.0 -c conda-forge
call conda install -y pytorch==1.7.1 torchvision cudatoolkit=11.0 -c pytorch

replace them with

call conda install -y pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch -c nvidia
call conda install -y scikit-image python-blosc -c conda-forge

and then run install_windows.bat from a Command Prompt (so that you see any possible error messages)

I can't verify this myself, since I don't have an RTX30xx card
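
For anyone who would rather not edit the file by hand, the two-line replacement described above can be sketched as a small Python script. This is untested against the real install_windows.bat: the function name `patch_install_script` is hypothetical, and the sketch assumes the two original lines appear next to each other, in the order quoted in this comment.

```python
OLD_LINES = (
    "call conda install -y numpy==1.19.0 scikit-image python-blosc==1.7.0 -c conda-forge",
    "call conda install -y pytorch==1.7.1 torchvision cudatoolkit=11.0 -c pytorch",
)
NEW_LINES = (
    "call conda install -y pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch -c nvidia",
    "call conda install -y scikit-image python-blosc -c conda-forge",
)


def patch_install_script(text: str) -> str:
    """Replace the two pinned conda install lines with the unpinned ones.

    Raises ValueError if the expected pair of lines is not found, so a
    script that has already changed upstream is left untouched.
    """
    newline = "\r\n" if "\r\n" in text else "\n"  # keep the file's line endings
    lines = text.splitlines()
    stripped = [line.strip() for line in lines]
    for i in range(len(lines) - 1):
        if tuple(stripped[i:i + 2]) == OLD_LINES:
            lines[i:i + 2] = NEW_LINES
            return newline.join(lines) + newline
    raise ValueError("expected conda install lines not found; the script may differ")


# Usage, assuming the script sits in the current directory:
#   from pathlib import Path
#   path = Path("install_windows.bat")
#   path.write_text(patch_install_script(path.read_text()))
```

Keeping a backup copy of the original file before overwriting it is a good idea, since the replacement is not verified on RTX 30xx hardware.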

@zarklon
Copy link

zarklon commented Jul 29, 2021 via email

@robson-credpago

I had the same problem, but when I updated torch with CUDA 11.x, torchvision raised errors on startup.

My solution was to download the wheel file from this list: https://download.pytorch.org/whl/cu116/torch/ and install it with pip.
For avatarify specifically, select PyTorch 1.12.0 with cp37 (Python 3.7); in that list, cu116 is the only CUDA build offered for 1.12.0.
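
To make the wheel selection concrete, here is a small sketch that builds the file name to look for in that list. It assumes the files follow the standard wheel naming scheme (name, version, Python tag, ABI tag, platform tag); the helper names `cpython_tags` and `torch_wheel_name` are hypothetical, and the result should be checked against the actual listing. Note that the avatarify environment uses Python 3.7 (visible in the traceback paths), even though the base conda environment is on 3.9.

```python
from typing import Tuple


def cpython_tags(version: Tuple[int, int]) -> Tuple[str, str]:
    """Return (python_tag, abi_tag) for a CPython version, e.g. ('cp37', 'cp37m')."""
    major, minor = version
    python_tag = f"cp{major}{minor}"
    # CPython dropped the "m" ABI suffix starting with 3.8.
    abi_tag = python_tag + ("m" if (major, minor) < (3, 8) else "")
    return python_tag, abi_tag


def torch_wheel_name(torch_version: str, cuda_tag: str,
                     python_version: Tuple[int, int],
                     platform_tag: str = "linux_x86_64") -> str:
    """Build the wheel file name expected for the given combination."""
    python_tag, abi_tag = cpython_tags(python_version)
    return f"torch-{torch_version}+{cuda_tag}-{python_tag}-{abi_tag}-{platform_tag}.whl"


if __name__ == "__main__":
    # The combination described above: PyTorch 1.12.0, CUDA 11.6, Python 3.7.
    print(torch_wheel_name("1.12.0", "cu116", (3, 7)))
    # On Windows the platform tag would be "win_amd64" instead.
    print(torch_wheel_name("1.12.0", "cu116", (3, 7), platform_tag="win_amd64"))
```

In the directory listing the `+` appears URL-encoded as `%2B`. Once the matching file is downloaded, it can be installed from within the activated avatarify environment with `python -m pip install` followed by the path to the downloaded `.whl` file.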

@daddyiel

daddyiel commented Apr 6, 2023

> I had the same problem, but when I updated torch with CUDA 11.x, torchvision raised errors on startup.
>
> My solution was to download the wheel file from this list: https://download.pytorch.org/whl/cu116/torch/ and install it with pip. For avatarify specifically, select PyTorch 1.12.0 with cp37 (Python 3.7); in that list, cu116 is the only CUDA build offered for 1.12.0.

How did you install it with pip?

@Big-sly

Big-sly commented May 28, 2023

Is there a working build of avatarify desktop or Python for RTX 30-series graphics cards yet?

@JohanAR
Collaborator

JohanAR commented May 28, 2023

@Big-sly I think people have successfully run it on RTX 30xx cards. However, there are currently installation issues regardless of GPU, so expect to do some troubleshooting yourself.


6 participants