RuntimeError: CUDA error: invalid device function ROIAlign_forward_cuda #62

BrianPugh · 2019-10-14T20:18:51Z

Running forward inference with the Panoptic FPN model results in a CUDA error.

To Reproduce

Attempting to run a predictor using the model panoptic_fpn_R_101_dconv_cascade_gn_3x.yaml.

The following error is produced:

error in deformable_im2col: invalid device function
... < repeated ~30 times> ...
File "/opt/conda/lib/python3.6/site-packages/torch/autograd/grad_mode.py", line 49, in decorate_no_grad
    return func(*args, **kwargs)
  File "/app/detectron2/detectron2/engine/defaults.py", line 176, in __call__
    predictions = self.model([inputs])[0]
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/app/detectron2/detectron2/modeling/meta_arch/panoptic_fpn.py", line 98, in forward
    images, features, proposals, gt_instances
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/app/detectron2/detectron2/modeling/roi_heads/cascade_rcnn.py", line 97, in forward
    pred_instances = self._forward_box(features_list, proposals)
  File "/app/detectron2/detectron2/modeling/roi_heads/cascade_rcnn.py", line 112, in _forward_box
    head_outputs.append(self._run_stage(features, proposals, k))
  File "/app/detectron2/detectron2/modeling/roi_heads/cascade_rcnn.py", line 203, in _run_stage
    box_features = self.box_pooler(features, [x.proposal_boxes for x in proposals])
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/app/detectron2/detectron2/modeling/poolers.py", line 192, in forward
    output[inds] = pooler(x_level, pooler_fmt_boxes_level)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/app/detectron2/detectron2/layers/roi_align.py", line 95, in forward
    input, rois, self.output_size, self.spatial_scale, self.sampling_ratio, self.aligned
  File "/app/detectron2/detectron2/layers/roi_align.py", line 20, in forward
    input, roi, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned
RuntimeError: CUDA error: invalid device function (ROIAlign_forward_cuda at /app/detectron2/detectron2/layers/csrc/ROIAlign/ROIAlign_cuda.cu:359)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x47 (0x7ffa5c402687 in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: ROIAlign_forward_cuda(at::Tensor const&, at::Tensor const&, float, int, int, int, bool) + 0xa37 (0x7ffa0653b6f5 in /app/detectron2/detectron2/_C.cpython-36m-x86_64-linux-gnu.so)
frame #2: ROIAlign_forward(at::Tensor const&, at::Tensor const&, float, int, int, int, bool) + 0xbc (0x7ffa064c9fdc in /app/detectron2/detectron2/_C.cpython-36m-x86_64-linux-gnu.so)
frame #3: <unknown function> + 0x5961a (0x7ffa064db61a in /app/detectron2/detectron2/_C.cpython-36m-x86_64-linux-gnu.so)
frame #4: <unknown function> + 0x5971e (0x7ffa064db71e in /app/detectron2/detectron2/_C.cpython-36m-x86_64-linux-gnu.so)
frame #5: <unknown function> + 0x53ca0 (0x7ffa064d5ca0 in /app/detectron2/detectron2/_C.cpython-36m-x86_64-linux-gnu.so)
<omitting python frames>
frame #12: THPFunction_apply(_object*, _object*) + 0x9ff (0x7ffa5d63dacf in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so)

Environment

---------------------  -------------------------------------------------------------------
Python                 3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:22:34) [GCC 7.3.0]
Detectron2 Compiler    GCC 5.4
DETECTRON2_ENV_MODULE  <not set>
PyTorch                1.3.0
PyTorch Debug Build    False
CUDA available         True
GPU 0,1                GeForce GTX 1080 Ti
Pillow                 6.2.0
cv2                    3.4.4
---------------------  -------------------------------------------------------------------
PyTorch built with:
  - GCC 7.3
  - Intel(R) Math Kernel Library Version 2019.0.4 Product Build 20190411 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v0.20.5 (Git Hash 0125f28c61c1f822fd48570b4c1066f96fcb9b2e)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CUDA Runtime 10.1
  - NVCC architecture flags: -gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_50,code=compute_50
  - CuDNN 7.6.3
  - Magma 2.5.1
  - Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow, DISABLE_NUMA=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=True, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF

ppwwyyxx · 2019-10-14T20:25:20Z

It seems like you did not build detectron2 correctly. You may have wrong values in the TORCH_CUDA_ARCH_LIST environment variable when you build it. Could you check this environment variable at the time you build it?

BrianPugh · 2019-10-14T20:42:40Z

I deleted the build folder and the detectron2/_C.cpython-36m-x86_64-linux-gnu.so file, then rebuilt by running the following command in the repository root:

TORCH_CUDA_ARCH_LIST="6.1;7.5" pip install -e .

I'm running on a GTX 1080 Ti, which should be covered by "6.1". The rebuild produces the same errors as above.
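
One way to sanity-check whether an arch list covers a given GPU is to compare it against the device's compute capability. The sketch below is only an illustration and not part of detectron2 or PyTorch: the helper names parse_arch_list and arch_list_covers are hypothetical, only numeric "major.minor" entries with an optional "+PTX" suffix are handled (named architectures are not), and the coverage rule encodes the usual CUDA compatibility assumptions (binary code runs on the same major version with an equal or higher minor version; PTX can be JIT-compiled for equal or newer devices).

```python
def parse_arch_list(arch_list):
    """Parse a TORCH_CUDA_ARCH_LIST-style string such as "6.1;7.5" or "7.0+PTX"."""
    entries = []
    for token in arch_list.replace(";", " ").split():
        has_ptx = token.endswith("+PTX")
        version = token[: -len("+PTX")] if has_ptx else token
        major, minor = version.split(".")
        entries.append(((int(major), int(minor)), has_ptx))
    return entries


def arch_list_covers(arch_list, capability):
    """Return True if some entry can run on a device with this (major, minor)."""
    capability = tuple(capability)
    for (major, minor), has_ptx in parse_arch_list(arch_list):
        # Assumption: binary code runs on the same major version
        # with an equal or higher minor version.
        if major == capability[0] and minor <= capability[1]:
            return True
        # Assumption: PTX can be JIT-compiled for equal or newer devices.
        if has_ptx and (major, minor) <= capability:
            return True
    return False


# A GTX 1080 Ti reports compute capability (6, 1).
print(arch_list_covers("6.1;7.5", (6, 1)))
print(arch_list_covers("7.5", (6, 1)))
```

On a machine with a GPU, the capability tuple can be read with torch.cuda.get_device_capability(0) instead of being hard-coded. If this check passes and the error persists, the arch list is probably not the cause.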

ppwwyyxx · 2019-10-14T21:10:45Z

Is there a way either of you can let others reproduce this issue in docker or colab?

weston100 · 2019-10-15T02:21:39Z

I'm actually just trying to get object detection on LVIS running, and I'm able to run the model successfully when I swap the Mask R-CNN backbone for a RetinaNet (which doesn't use ROIAlign). Unfortunately, I don't have time right now to set up Docker or Colab to replicate this.

ppwwyyxx · 2019-10-15T23:36:31Z

I was able to reproduce the same error when I used the wrong version of CUDA.

What I did:
I installed PyTorch with conda install pytorch torchvision cudatoolkit=10.1 -c pytorch, but my local CUDA runtime and nvcc are version 10.0.
In this case, I observe the same error.
Please check whether your CUDA version is correct.
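
The check described above can be scripted. The following is a minimal sketch under stated assumptions, not an official detectron2 tool: the helper names nvcc_release and cuda_versions_match are hypothetical, and the comparison only looks at the major.minor part of each version.

```python
import re


def nvcc_release(nvcc_output):
    """Extract the "major.minor" release from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return f"{match.group(1)}.{match.group(2)}" if match else None


def cuda_versions_match(torch_cuda, nvcc_output):
    """Compare PyTorch's CUDA version string with the release reported by nvcc."""
    release = nvcc_release(nvcc_output)
    if torch_cuda is None or release is None:
        # CPU-only PyTorch build, or nvcc output that could not be parsed.
        return False
    # Compare only major.minor; the PyTorch string may carry a patch component.
    return torch_cuda.split(".")[:2] == release.split(".")


# Sample nvcc output line; "10.1" stands in for torch.version.cuda.
sample = "Cuda compilation tools, release 10.0, V10.0.130"
print(nvcc_release(sample))
print(cuda_versions_match("10.1", sample))
```

In a real environment, the first argument would come from torch.version.cuda and the second from the output of running nvcc --version (for example via subprocess.run). A False result corresponds to the mismatch described in this comment.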

ppwwyyxx · 2019-10-16T04:09:43Z

The updated collect_env in e85114c can now show the type of error I met.

batrlatom · 2019-10-16T07:39:51Z

As a follow-up from #78: I installed a new environment with CUDA 9.2 and this solved my issue. Could the problem be that, as stated at https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md, all models are trained with CUDA 9.2?

ppwwyyxx · 2019-10-16T08:23:31Z

No, it's unrelated to the model zoo.
It's likely because CUDA 9.2 is simply what your machine is using.

ishann · 2019-10-17T20:10:53Z

I ran into this error as well. I re-installed a PyTorch build for a lower CUDA version (one that matches my system CUDA), which resolved the issue.

ppwwyyxx · 2019-10-29T13:53:50Z

It seems that a mismatch between the NVCC and CUDA runtime versions is the root cause. Closing but feel free to reopen if this does not solve your issue.

gunshi · 2019-11-04T15:13:01Z

@ppwwyyxx what should TORCH_CUDA_ARCH_LIST ideally be set to if one is using cuda/10.0 or cuda/10.1 with PyTorch 1.3? nvcc --version shows CUDA 10.0 for me as well. I'm not sure what you mean by a mismatch between the nvcc and CUDA runtime versions, since they're always the same for me.
The build succeeds, but I get this error when running demo.py:

RuntimeError: CUDA error: no kernel image is available for execution on the device (ROIAlign_forward_cuda at /network/home/guptagun/od/detectron2_repo/detectron2/layers/csrc/ROIAlign/ROIAlign_cuda.cu:361)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x47 (0x7f010803e687 in /network/home/guptagun/anaconda3/envs/detectron/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: detectron2::ROIAlign_forward_cuda(at::Tensor const&, at::Tensor const&, float, int, int, int, bool) + 0xa24 (0x7f01065ac89c in /network/home/guptagun/od/detectron2_repo/detectron2/_C.cpython-37m-x86_64-linux-gnu.so)
frame #2: detectron2::ROIAlign_forward(at::Tensor const&, at::Tensor const&, float, int, int, int, bool) + 0xb6 (0x7f010654df66 in /network/home/guptagun/od/detectron2_repo/detectron2/_C.cpython-37m-x86_64-linux-gnu.so)
frame #3: <unknown function> + 0x4ec8f (0x7f010655fc8f in /network/home/guptagun/od/detectron2_repo/detectron2/_C.cpython-37m-x86_64-linux-gnu.so)
frame #4: <unknown function> + 0x49750 (0x7f010655a750 in /network/home/guptagun/od/detectron2_repo/detectron2/_C.cpython-37m-x86_64-linux-gnu.so)
<omitting python frames>
frame #9: THPFunction_apply(_object*, _object*) + 0x8d6 (0x7f010a180e96 in /network/home/guptagun/anaconda3/envs/detectron/lib/python3.7/site-packages/torch/lib/libtorch_python.so)

Posting the error here because it seems related; I can open a new issue if you recommend.
Thanks!

ppwwyyxx · 2019-11-04T16:24:38Z

> what should TORCH_CUDA_ARCH_LIST ideally be set

The best option is to unset it (i.e., no such env variable).

If you cannot solve the issue with existing information, please open a new one following the template.
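
For anyone scripting the rebuild, the advice above (build with no TORCH_CUDA_ARCH_LIST variable at all) can be sketched as follows. This is an illustration only: the helper names build_env_without_arch_list and rebuild are hypothetical, and the command is assumed to be the editable pip install used earlier in this thread.

```python
import os
import subprocess
import sys


def build_env_without_arch_list(env=None):
    """Return a copy of the environment with TORCH_CUDA_ARCH_LIST removed."""
    clean = dict(os.environ if env is None else env)
    clean.pop("TORCH_CUDA_ARCH_LIST", None)
    return clean


def rebuild(repo_dir):
    """Run an editable install in repo_dir with the variable unset (assumed command)."""
    command = [sys.executable, "-m", "pip", "install", "-e", "."]
    subprocess.run(command, cwd=repo_dir, env=build_env_without_arch_list(), check=True)


# Demonstrate on a plain dict; the caller's mapping is left untouched.
example = {"TORCH_CUDA_ARCH_LIST": "6.1;7.5", "PATH": "/usr/bin"}
print(build_env_without_arch_list(example))
```

Passing a cleaned copy to the build subprocess avoids modifying the current shell or process environment, so a stale value exported elsewhere cannot leak into the build.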

wangerxiao001 · 2020-03-31T06:47:15Z

> I was able to reproduce the same error when I used the wrong version of CUDA.
>
> What I did:
> I installed PyTorch with conda install pytorch torchvision cudatoolkit=10.1 -c pytorch, but my local CUDA runtime and nvcc are version 10.0.
> In this case, I observe the same error.
> Please check whether your CUDA version is correct.

Hello, I want to use detectron2, but something went wrong when I prepared the conda environment. First I installed PyTorch with conda install pytorch torchvision cudatoolkit=10.1 -c pytorch, but, as you mentioned, I got the error and found that my local CUDA runtime and nvcc are version 10.0 (I build my conda environment in an LXD container, and I have no permission to change the local CUDA runtime and nvcc versions). So I used conda install -c pytorch pytorch=1.3.0 cudatoolkit=10.0 to install PyTorch for CUDA 10.0. However, I then got the error from issue #459. I can only choose CUDA 9.0 or CUDA 10.0, and I see that detectron2 can only run with CUDA 9.2 and CUDA 10.1. Could you please tell me how I can solve this?

ppwwyyxx · 2020-03-31T07:40:22Z

Detectron2 can run with cuda 10.0.

#459 is caused by incorrect installation of torchvision as explained there.

wangerxiao001 · 2020-03-31T10:08:23Z

> Detectron2 can run with cuda 10.0.
>
> #459 is caused by incorrect installation of torchvision as explained there.

Thanks for your reply. I deleted the build files in detectron2 and rebuilt it, and it works well for me now.
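
The cleanup step mentioned in this thread (deleting the build folder and the compiled detectron2/_C*.so file before rebuilding) can be sketched as a small script. This is a hedged illustration, not a detectron2 utility: the function names are hypothetical, and the layout (a build directory at the repository root, with the compiled extension inside the detectron2 package directory) is assumed from the paths shown in the first comment.

```python
import shutil
from pathlib import Path


def stale_build_artifacts(repo_root):
    """List the build directory and compiled _C extension files in a checkout."""
    root = Path(repo_root)
    artifacts = []
    build_dir = root / "build"
    if build_dir.is_dir():
        artifacts.append(build_dir)
    # Compiled extension, e.g. detectron2/_C.cpython-36m-x86_64-linux-gnu.so
    artifacts.extend(sorted((root / "detectron2").glob("_C*.so")))
    return artifacts


def remove_stale_build_artifacts(repo_root):
    """Delete the artifacts listed above and return the paths that were removed."""
    removed = []
    for path in stale_build_artifacts(repo_root):
        if path.is_dir():
            shutil.rmtree(path)
        else:
            path.unlink()
        removed.append(path)
    return removed
```

Calling remove_stale_build_artifacts on the repository root and then re-running the install forces the extension to be recompiled against the currently installed PyTorch and CUDA toolkit, instead of reusing objects built for an earlier combination.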

yohanshin · 2020-04-01T00:53:50Z

Hi, I am trying to run detectron2 for panoptic segmentation with PyTorch 1.4.0 and CUDA 10.2, and I encountered the same CUDA error for ROIAlign_forward_cuda. I tried installing detectron2 both 1) from local source code and 2) via pip install. I also double-checked the command python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu102/index.html, and it seems CUDA 10.2 is also compatible with detectron2. What further steps can I take?

ppwwyyxx · 2020-04-01T01:11:00Z

Most likely the solution to your problem is already in https://detectron2.readthedocs.io/tutorials/install.html#common-installation-issues.
If you need help to solve an unexpected issue you observed, please include details following the issue template.

yohanshin · 2020-04-01T02:35:44Z

@ppwwyyxx

Great, thanks! I found that the CUDA versions for detectron2 and torch were mismatched. I re-installed detectron2 for CUDA 10.1 and matched PyTorch as well. Now it works! Thanks again.

ignitemylife · 2020-06-01T05:00:31Z

> I was able to reproduce the same error when I used the wrong version of CUDA.
>
> What I did:
> I installed PyTorch with conda install pytorch torchvision cudatoolkit=10.1 -c pytorch, but my local CUDA runtime and nvcc are version 10.0.
> In this case, I observe the same error.
> Please check whether your CUDA version is correct.

So, how did you solve this?

zjZSTU · 2020-06-14T12:52:19Z

> It seems that a mismatch between the NVCC and CUDA runtime versions is the root cause. Closing but feel free to reopen if this does not solve your issue.

Yes, that was the key to solving my problem.

Problem

First, a brief description of my problem: I'm new to Detectron2 and have only one GPU (GeForce GTX 1080 Ti). I chose to build Detectron2 from source:

# Or, to install it from a local clone:
git clone https://github.com/facebookresearch/detectron2.git
python -m pip install -e detectron2

Everything went fine and detectron2 was installed successfully:

$ python -m pip install -e detectron2
Obtaining file:///home/lab305/ZhuJian/detectron2
Requirement already satisfied: termcolor>=1.1 in /home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages (from detectron2==0.1.3) (1.1.0)
Requirement already satisfied: Pillow in /home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages (from detectron2==0.1.3) (7.0.0)
Requirement already satisfied: yacs>=0.1.6 in /home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages (from detectron2==0.1.3) (0.1.7)
Requirement already satisfied: tabulate in /home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages (from detectron2==0.1.3) (0.8.7)
Requirement already satisfied: cloudpickle in /home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages (from detectron2==0.1.3) (1.4.1)
Requirement already satisfied: matplotlib in /home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages (from detectron2==0.1.3) (3.1.2)
Requirement already satisfied: mock in /home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages (from detectron2==0.1.3) (4.0.2)
Requirement already satisfied: tqdm>4.29.0 in /home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages (from detectron2==0.1.3) (4.46.0)
Requirement already satisfied: tensorboard in /home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages (from detectron2==0.1.3) (2.0.0)
Requirement already satisfied: fvcore>=0.1.1 in /home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages (from detectron2==0.1.3) (0.1.1.post200513)
Requirement already satisfied: future in /home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages (from detectron2==0.1.3) (0.18.2)
Requirement already satisfied: pydot in /home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages (from detectron2==0.1.3) (1.4.1)
Requirement already satisfied: PyYAML in /home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages (from yacs>=0.1.6->detectron2==0.1.3) (5.3.1)
Requirement already satisfied: cycler>=0.10 in /home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages (from matplotlib->detectron2==0.1.3) (0.10.0)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages (from matplotlib->detectron2==0.1.3) (2.4.6)
Requirement already satisfied: numpy>=1.11 in /home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages (from matplotlib->detectron2==0.1.3) (1.18.1)
Requirement already satisfied: kiwisolver>=1.0.1 in /home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages (from matplotlib->detectron2==0.1.3) (1.1.0)
Requirement already satisfied: python-dateutil>=2.1 in /home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages (from matplotlib->detectron2==0.1.3) (2.8.1)
Requirement already satisfied: wheel>=0.26; python_version >= "3" in /home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages (from tensorboard->detectron2==0.1.3) (0.33.6)
Requirement already satisfied: protobuf>=3.6.0 in /home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages (from tensorboard->detectron2==0.1.3) (3.11.2)
Requirement already satisfied: werkzeug>=0.11.15 in /home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages (from tensorboard->detectron2==0.1.3) (0.16.0)
Requirement already satisfied: six>=1.10.0 in /home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages (from tensorboard->detectron2==0.1.3) (1.13.0)
Requirement already satisfied: absl-py>=0.4 in /home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages (from tensorboard->detectron2==0.1.3) (0.8.1)
Requirement already satisfied: setuptools>=41.0.0 in /home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages (from tensorboard->detectron2==0.1.3) (44.0.0.post20200106)
Requirement already satisfied: markdown>=2.6.8 in /home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages (from tensorboard->detectron2==0.1.3) (3.1.1)
Requirement already satisfied: grpcio>=1.6.3 in /home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages (from tensorboard->detectron2==0.1.3) (1.16.1)
Requirement already satisfied: portalocker in /home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages (from fvcore>=0.1.1->detectron2==0.1.3) (1.7.0)
Installing collected packages: detectron2
  Found existing installation: detectron2 0.1.3
    Uninstalling detectron2-0.1.3:
      Successfully uninstalled detectron2-0.1.3
  Running setup.py develop for detectron2
Successfully installed detectron2

But when I try to train, I get an error:

$ ./train_net.py   --config-file ../configs/PascalVOC-Detection/faster_rcnn_R_50_C4.yaml   --num-gpus 1 SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR 0.0025
...
...
[06/14 17:31:31 d2.engine.train_loop]: Starting training from iteration 0
ERROR [06/14 17:31:32 d2.engine.train_loop]: Exception during training:
Traceback (most recent call last):
  File "/home/lab305/ZhuJian/detectron2/detectron2/engine/train_loop.py", line 132, in train
    self.run_step()
  File "/home/lab305/ZhuJian/detectron2/detectron2/engine/train_loop.py", line 215, in run_step
    loss_dict = self.model(data)
  File "/home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/lab305/ZhuJian/detectron2/detectron2/modeling/meta_arch/rcnn.py", line 123, in forward
    _, detector_losses = self.roi_heads(images, features, proposals, gt_instances)
  File "/home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/lab305/ZhuJian/detectron2/detectron2/modeling/roi_heads/roi_heads.py", line 426, in forward
    [features[f] for f in self.in_features], proposal_boxes
  File "/home/lab305/ZhuJian/detectron2/detectron2/modeling/roi_heads/roi_heads.py", line 410, in _shared_roi_transform
    x = self.pooler(features, boxes)
  File "/home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/lab305/ZhuJian/detectron2/detectron2/modeling/poolers.py", line 214, in forward
    return self.level_poolers[0](x[0], pooler_fmt_boxes)
  File "/home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/lab305/ZhuJian/detectron2/detectron2/layers/roi_align.py", line 95, in forward
    input, rois, self.output_size, self.spatial_scale, self.sampling_ratio, self.aligned
  File "/home/lab305/ZhuJian/detectron2/detectron2/layers/roi_align.py", line 20, in forward
    input, roi, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned
RuntimeError: CUDA error: invalid device function
[06/14 17:31:32 d2.engine.hooks]: Total training time: 0:00:00 (0:00:00 on hooks)
Traceback (most recent call last):
  File "./train_net.py", line 169, in <module>
    args=(args,),
  File "/home/lab305/ZhuJian/detectron2/detectron2/engine/launch.py", line 57, in launch
    main_func(*args)
  File "./train_net.py", line 157, in main
    return trainer.train()
  File "/home/lab305/ZhuJian/detectron2/detectron2/engine/defaults.py", line 402, in train
    super().train(self.start_iter, self.max_iter)
  File "/home/lab305/ZhuJian/detectron2/detectron2/engine/train_loop.py", line 132, in train
    self.run_step()
  File "/home/lab305/ZhuJian/detectron2/detectron2/engine/train_loop.py", line 215, in run_step
    loss_dict = self.model(data)
  File "/home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/lab305/ZhuJian/detectron2/detectron2/modeling/meta_arch/rcnn.py", line 123, in forward
    _, detector_losses = self.roi_heads(images, features, proposals, gt_instances)
  File "/home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/lab305/ZhuJian/detectron2/detectron2/modeling/roi_heads/roi_heads.py", line 426, in forward
    [features[f] for f in self.in_features], proposal_boxes
  File "/home/lab305/ZhuJian/detectron2/detectron2/modeling/roi_heads/roi_heads.py", line 410, in _shared_roi_transform
    x = self.pooler(features, boxes)
  File "/home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/lab305/ZhuJian/detectron2/detectron2/modeling/poolers.py", line 214, in forward
    return self.level_poolers[0](x[0], pooler_fmt_boxes)
  File "/home/lab305/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/lab305/ZhuJian/detectron2/detectron2/layers/roi_align.py", line 95, in forward
    input, rois, self.output_size, self.spatial_scale, self.sampling_ratio, self.aligned
  File "/home/lab305/ZhuJian/detectron2/detectron2/layers/roi_align.py", line 20, in forward
    input, roi, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned
RuntimeError: CUDA error: invalid device function
Segmentation fault (core dumped)

Solution

I checked the CUDA versions:

# nvidia-smi
CUDA Version: 10.2
# nvcc --version
Cuda compilation tools, release 10.0, V10.0.130

Before this I had installed cudatoolkit=10.2, but now I chose the earlier version:

conda install pytorch torchvision cudatoolkit=10.0 -c pytorch

After rebuilding Detectron2, the problem was solved!

yohanshin · 2020-06-15T02:47:05Z

> > I was able to reproduce the same error when I used the wrong version of CUDA.
> > What I did:
> > I installed PyTorch with conda install pytorch torchvision cudatoolkit=10.1 -c pytorch, but my local CUDA runtime and nvcc are version 10.0.
> > In this case, I observe the same error.
> > Please check whether your CUDA version is correct.
>
> So, how did you solve this?

Oh, sorry for the late response; I totally missed it. As I mentioned, my previous CUDA version was 10.1, but I had installed PyTorch and Detectron2 builds for CUDA 10.2. So I reinstalled both to match my CUDA version. I think @zjZSTU's solution is close to mine, so you can refer to it.

weston100 mentioned this issue Oct 14, 2019

LVIS Training fails to find ROIAlign_forward_cuda #63

Closed

ppwwyyxx added the installation / environment label Oct 14, 2019

This was referenced Oct 15, 2019

Hi, the demo script gives segmentation fault on the example provided. #42

Closed

Core dumped after running demo code #78

Closed

ppwwyyxx closed this as completed Oct 29, 2019

gunshi mentioned this issue Nov 4, 2019

RuntimeError: CUDA error: no kernel image is available for execution on the device #235

Closed

XuanyuanDi mentioned this issue Nov 7, 2019

RuntimeError: Not compiled with GPU support (ROIAlign_forward at /home/hd/detectron2_repo/detectron2/layers/csrc/ROIAlign/ROIAlign.h:73) #267

Closed

chrischoy mentioned this issue Jan 8, 2020

RuntimeError: invalid device function at src/convolution.cu:283 NVIDIA/MinkowskiEngine#73

Closed

servercalap mentioned this issue Feb 17, 2020

custom dataset and custom train_net.py runtime error #893

Closed

nicolasugrinovic mentioned this issue Mar 24, 2020

CUDA error: invalid device function in kaolin.metrics.point.chamfer_distance NVIDIAGameWorks/kaolin#182

Closed

hyangwinter mentioned this issue Sep 2, 2020

CUDA kernel failed : invalid device function hyangwinter/flownet3d_pytorch#2

Open

ShoufaChen mentioned this issue Jan 19, 2021

RuntimeError: Not compiled with GPU support although print(torch.cuda.is_available(), CUDA_HOME) successfully #2501

Closed

alexandrosstergiou mentioned this issue Jan 19, 2021

cudaCheckError() failed : invalid device function alexandrosstergiou/SoftPool#13

Closed

github-actions bot locked as resolved and limited conversation to collaborators Apr 12, 2021
