RuntimeError: CUDA error: invalid device function #6

Closed
sbelharbi opened this issue Sep 2, 2021 · 3 comments
sbelharbi commented Sep 2, 2021

hi,
thanks for this code.

i have 2 related questions.

q1. when running this example on my machine, i got this error:

Traceback (most recent call last):
  File "x/permut.py", line 29, in <module>
    torch.from_numpy(rgb / 0.125).cuda().float())
  File "x/PAM_cuda/pl.py", line 20, in forward
    rank, barycentric, blur_neighbours1, blur_neighbours2, indices = PermutohedralLattice.prepare(feat)
  File "x/PAM_cuda/pl.py", line 116, in prepare
    _ = HT_opp.insert(table, n_entries, loc[scit].type(torch.cuda.IntTensor), loc_hash[scit].type(torch.cuda.IntTensor))
RuntimeError: CUDA error: invalid device function
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)

it is a segfault error.
the installation is done using

python setup.py build
python setup.py install

when running with $ CUDA_LAUNCH_BLOCKING=1 python permut.py , i got this:

Traceback (most recent call last):
  File "permut.py", line 29, in <module>
    torch.from_numpy(rgb / 0.125).cuda().float())
  File "x/PAM_cuda/pl.py", line 20, in forward
    rank, barycentric, blur_neighbours1, blur_neighbours2, indices = PermutohedralLattice.prepare(feat)
  File "x/PAM_cuda/pl.py", line 116, in prepare
    _ = HT_opp.insert(table, n_entries, loc[scit].type(torch.cuda.IntTensor), loc_hash[scit].type(torch.cuda.IntTensor))
RuntimeError: CUDA error: invalid device function
Segmentation fault (core dumped)

the used code is:

import sys
from os.path import dirname, abspath

import re
import torch.nn as nn
import torch
import torch.nn.functional as F

# path stuff
# path stuff

from PAM_cuda.pl import PermutohedralLattice

if __name__ == '__main__':
    import numpy as np
    import cv2
    import torch
    import matplotlib.pyplot as plt

    im = cv2.imread("dog.png")
    indices = np.reshape(np.indices(im.shape[:2]), (2, -1))[None, :]
    im = np.transpose(im, (2, 0, 1))
    rgb = np.reshape(im, (3, -1))[None, :]

    pl = PermutohedralLattice.apply

    out = pl(torch.from_numpy(indices / 5.0).cuda().float(),
             torch.from_numpy(rgb / 0.125).cuda().float())

    output = out.squeeze().cpu().numpy()
    output = np.transpose(output, (1, 0))
    output = np.reshape(output, (im.shape[1], im.shape[2], 3))

    plt.imshow(output / output.max())
    plt.imshow(np.transpose(im, (1, 2, 0)))

any idea how to fix this?

i will post the other question in a separate issue.
thanks for your help

info:
conda virtual env: conda create -n env_test python=3.7
python 3.7.9
pytorch 1.9.0 installed with conda install pytorch==1.9.0 torchvision==0.10.0 cudatoolkit=11.1 -c pytorch -c nvidia
cv2 4.1.2

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130

CUDA Version with nvidia-smi: 11.1
gpu: p100

nvidia-smi:
NVIDIA-SMI 455.32.00
Driver Version: 455.32.00

so far, i have tested only on one server, where i expected the example to work.
let me know if you need more info.
the virtual env is within conda.

thanks

ptrblck commented Sep 6, 2021

The issue is most likely caused by mixing different CUDA releases as described here.
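
For anyone landing here with the same error, a minimal sketch of how such a mismatch could be checked from Python. It compares the CUDA release PyTorch was built with (`torch.version.cuda`) against the release printed by whichever `nvcc` is first on `PATH`. The helper names `parse_nvcc_release` and `same_cuda_release` are made up for this illustration, and the sketch assumes `nvcc --version` prints a "release X.Y" line as in the report above.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Return the 'major.minor' release from `nvcc --version` output, or None."""
    match = re.search(r"release\s+(\d+)\.(\d+)", nvcc_output)
    return f"{match.group(1)}.{match.group(2)}" if match else None


def same_cuda_release(build_version, nvcc_release):
    """Compare only major.minor, so '11.1' and '11.1.105' count as equal."""
    if not build_version or not nvcc_release:
        return False
    return build_version.split(".")[:2] == nvcc_release.split(".")[:2]


def main():
    try:
        import torch  # only needed for the live check
    except ImportError:
        print("torch is not installed; skipping the live check")
        return
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        print("nvcc not found on PATH")
        return
    out = subprocess.run([nvcc, "--version"], capture_output=True, text=True).stdout
    release = parse_nvcc_release(out)
    print("PyTorch built with CUDA:", torch.version.cuda)
    print("nvcc on PATH:", nvcc, "release", release)
    if not same_cuda_release(torch.version.cuda, release):
        print("Mismatch: extensions built with this nvcc may fail at runtime")


if __name__ == "__main__":
    main()
```

With the setup reported above (PyTorch built for CUDA 11.1, `nvcc` release 10.0), this check would report a mismatch.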

SamuelJoutard (Owner) commented

Hi,

Thanks a lot for your help @ptrblck. I am not sure I can fully explain your error, Soufiane, but @ptrblck's answer makes sense to me! I will close this issue as it seems to be related to your specific setup. Do not hesitate to re-open it if I am wrong.

sbelharbi (Author) commented

hi all,
not really sure why, but nvidia-smi was reporting the cuda 11.1 runtime that i need, while the nvcc actually found on the path was 10.0, which was indeed the problem.
so, i fixed the paths, and successfully ran the example!!
thank you very much @ptrblck and Samuel!!!

best
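
The exact path changes are not given in the thread. As a rough sketch of how one might find the stale compiler, the snippet below lists every `nvcc` on `PATH` in lookup order, so an older toolkit shadowing the intended one is easy to spot. The function name `find_nvcc_candidates` is hypothetical, and the sketch assumes a Linux-style executable named `nvcc`. It also prints `CUDA_HOME`, which PyTorch's extension build consults when locating the toolkit.

```python
import os


def find_nvcc_candidates(path_value):
    """List every nvcc executable found on a PATH-style string, in lookup order."""
    found = []
    for directory in path_value.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, "nvcc")
        is_executable = os.path.isfile(candidate) and os.access(candidate, os.X_OK)
        if is_executable and candidate not in found:
            found.append(candidate)
    return found


if __name__ == "__main__":
    candidates = find_nvcc_candidates(os.environ.get("PATH", ""))
    for index, candidate in enumerate(candidates):
        # The first hit is the one a plain `nvcc` call resolves to.
        marker = "(used)" if index == 0 else "(shadowed)"
        print(candidate, marker)
    print("CUDA_HOME =", os.environ.get("CUDA_HOME"))
```

If the first entry belongs to an older toolkit, putting the intended toolkit's `bin` directory earlier on `PATH` (and pointing `CUDA_HOME` at it) before rebuilding the extension is one way to align the versions.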
