RuntimeError: CUDA error: invalid device function #6

sbelharbi · 2021-09-02T22:10:56Z

hi,
thanks for this code.

i have 2 related questions.

q1. when running this example on my machine, i got this error:

Traceback (most recent call last):
  File "x/permut.py", line 29, in <module>
    torch.from_numpy(rgb / 0.125).cuda().float())
  File "x/PAM_cuda/pl.py", line 20, in forward
    rank, barycentric, blur_neighbours1, blur_neighbours2, indices = PermutohedralLattice.prepare(feat)
  File "x/PAM_cuda/pl.py", line 116, in prepare
    _ = HT_opp.insert(table, n_entries, loc[scit].type(torch.cuda.IntTensor), loc_hash[scit].type(torch.cuda.IntTensor))
RuntimeError: CUDA error: invalid device function
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)

it is a segfault error.
the installation is done using

python setup.py build
python setup.py install

when running with $ CUDA_LAUNCH_BLOCKING=1 python permut.py , i got this:

Traceback (most recent call last):
  File "permut.py", line 29, in <module>
    torch.from_numpy(rgb / 0.125).cuda().float())
  File "x/PAM_cuda/pl.py", line 20, in forward
    rank, barycentric, blur_neighbours1, blur_neighbours2, indices = PermutohedralLattice.prepare(feat)
  File "x/PAM_cuda/pl.py", line 116, in prepare
    _ = HT_opp.insert(table, n_entries, loc[scit].type(torch.cuda.IntTensor), loc_hash[scit].type(torch.cuda.IntTensor))
RuntimeError: CUDA error: invalid device function
Segmentation fault (core dumped)

the used code is:

import sys
from os.path import dirname, abspath

import re
import torch.nn as nn
import torch
import torch.nn.functional as F

# path stuff
# path stuff

from PAM_cuda.pl import PermutohedralLattice

if __name__ == '__main__':
    import numpy as np
    import cv2
    import torch
    import matplotlib.pyplot as plt

    im = cv2.imread("dog.png")
    indices = np.reshape(np.indices(im.shape[:2]), (2, -1))[None, :]
    im = np.transpose(im, (2, 0, 1))
    rgb = np.reshape(im, (3, -1))[None, :]

    pl = PermutohedralLattice.apply

    out = pl(torch.from_numpy(indices / 5.0).cuda().float(),
             torch.from_numpy(rgb / 0.125).cuda().float())

    output = out.squeeze().cpu().numpy()
    output = np.transpose(output, (1, 0))
    output = np.reshape(output, (im.shape[1], im.shape[2], 3))

    plt.imshow(output / output.max())
    plt.imshow(np.transpose(im, (1, 2, 0)))

any idea how to fix this?

i will post the other question in a separate issue.
thanks for your help

info:
conda virtual env: conda create -n env_test python=3.7
python 3.7.9
pytorch 1.9.0 installed with conda install pytorch==1.9.0 torchvision==0.10.0 cudatoolkit=11.1 -c pytorch -c nvidia
cv2 4.1.2

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130

CUDA Version with nvidia-smi: 11.1
gpu: p100

nvidia-smi:
NVIDIA-SMI 455.32.00
Driver Version: 455.32.00

so far, i tested only on one server, where i expected the example to work.
let me know if you need more info.
the virtual env is within conda.

thanks

ptrblck · 2021-09-06T08:52:55Z

The issue is most likely caused by mixing different CUDA releases as described here.

SamuelJoutard · 2021-09-06T12:26:47Z

Hi,

Thanks a lot for your help @ptrblck. I am not sure I can fully explain your error, Soufiane, but @ptrblck's answer makes sense to me! I will close this issue as it seems to be related to your specific setup. Do not hesitate to re-open it if I am wrong.

sbelharbi · 2021-09-06T13:57:52Z

hi all,
not really sure why, but nvidia-smi was reporting cuda 11.1, which is the version i need, while the nvcc actually found on my path was 10.0. that mismatch was indeed the problem.
so, i fixed the paths, and successfully ran the example!!
thank you very much @ptrblck and Samuel!!!

best
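
For anyone hitting the same error, here is a minimal sketch of a check for the mismatch described in this thread: it compares the release reported by the first nvcc on PATH with the CUDA version PyTorch was built against (torch.version.cuda). The helper names (parse_nvcc_release, parse_version, toolkits_match, local_nvcc_release, report) are hypothetical and not part of this repository or of PyTorch. Treating any major.minor difference as a mismatch is an assumption made here for simplicity and is probably stricter than strictly required.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Return the (major, minor) CUDA release found in `nvcc --version` text, or None."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def parse_version(version):
    """Turn a string such as '11.1' into (11, 1); None (CPU-only build) stays None."""
    if version is None:
        return None
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def toolkits_match(nvcc_release, torch_cuda):
    """True only when both versions are known and share the same major.minor (assumed rule)."""
    return nvcc_release is not None and nvcc_release == torch_cuda


def local_nvcc_release():
    """Query the first nvcc on PATH; return None if no nvcc is found."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    return parse_nvcc_release(result.stdout)


def report():
    """Print both versions and flag a mismatch. Requires PyTorch to be installed."""
    import torch  # imported lazily so the helpers above work without PyTorch

    nvcc_release = local_nvcc_release()
    torch_cuda = parse_version(torch.version.cuda)
    print("nvcc on PATH:", nvcc_release)
    print("PyTorch built with CUDA:", torch_cuda)
    if not toolkits_match(nvcc_release, torch_cuda):
        print("Mismatch: rebuild the extension with an nvcc that matches PyTorch.")
```

Calling report() inside the conda environment before running python setup.py build should show whether the extension is about to be compiled with a different toolkit than the one PyTorch uses. If it flags a mismatch, pointing PATH (and CUDA_HOME, if it is set) at the matching toolkit and rebuilding is the fix that worked in this thread.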

This was referenced Sep 2, 2021

PermutohedralLattice.apply #7

Closed

PermutohedralLattice module. KCL-BMEIS/ScribbleDA#3

Closed

build_hash CUDA kernel failure: invalid device function. pytorch 1.9.0 HapeMask/crfrnn_layer#7

Closed

SamuelJoutard closed this as completed Sep 6, 2021
