This repository has been archived by the owner on Nov 21, 2023. It is now read-only.

Encountered CUDA error: invalid device function Error from operator: #245

Open
anatlin opened this issue Mar 7, 2018 · 12 comments

anatlin commented Mar 7, 2018

Expected results

This test should pass:

python2 $DETECTRON/tests/test_spatial_narrow_as_op.py 

Actual results

RuntimeError: [enforce fail at context_gpu.h:171] . Encountered CUDA error: invalid device function Error from operator: 
input: "A" input: "B" input: "C_grad" output: "A_grad" name: "" type: "SpatialNarrowAsGradient" device_option { device_type: 1 cuda_gpu_id: 0 } is_gradient_op: true
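
When comparing reports like this one, it can help to pull out just the CUDA error string, the operator type, and the GPU id. A minimal sketch, assuming the message always follows the layout shown above; the function name `summarize_cuda_error` and the dictionary keys are made up for illustration and are not part of Caffe2 or Detectron.

```python
import re

# Error text copied from the report above.
ERROR_TEXT = (
    'RuntimeError: [enforce fail at context_gpu.h:171] . Encountered CUDA error: '
    'invalid device function Error from operator: \n'
    'input: "A" input: "B" input: "C_grad" output: "A_grad" name: "" '
    'type: "SpatialNarrowAsGradient" device_option { device_type: 1 cuda_gpu_id: 0 } '
    'is_gradient_op: true'
)


def summarize_cuda_error(text):
    """Extract the CUDA error string, operator type, and GPU id (hypothetical helper)."""
    error = re.search(r'Encountered CUDA error: (.*?)\s*Error from operator', text, re.S)
    # Only the operator's own `type: "..."` field is quoted; `device_type: 1` is not.
    op_type = re.search(r'type: "([^"]+)"', text)
    gpu_id = re.search(r'cuda_gpu_id: (\d+)', text)
    return {
        'cuda_error': error.group(1) if error else None,
        'op_type': op_type.group(1) if op_type else None,
        'gpu_id': int(gpu_id.group(1)) if gpu_id else None,
    }


summary = summarize_cuda_error(ERROR_TEXT)
print(summary)
```

Fields that cannot be found come back as `None`, so the helper does not raise on messages with a slightly different layout.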

Detailed steps to reproduce

python2 $DETECTRON/tests/test_spatial_narrow_as_op.py 

System information

  • Operating system: Linux 4.9.76.1.amd64-smp #1 SMP Thu Jan 11 22:28:16 CET 2018 x86_64 GNU/Linux
  • Compiler version: gcc version 4.9.2 (Debian 4.9.2-10)
  • CUDA version: Cuda compilation tools, release 8.0, V8.0.44
  • cuDNN version: cudnn-8.0-linux-x64-v7
  • NVIDIA driver version: Driver Version: 375.39
  • GPU models (for all devices if they are not all the same): Tesla K40m
  • PYTHONPATH environment variable: ?
  • python --version output: Python 2.7.14 :: Anaconda, Inc.
  • Anything else that seems relevant: I have followed the installation instructions for Caffe2 using Anaconda on Ubuntu. Everything went well so far.
conda install -c caffe2 caffe2-cuda8.0-cudnn7

avilash commented Mar 9, 2018

Any updates on this issue?

@manoshape

Any updates on this issue?


xfarxod commented Mar 14, 2018

Any updates on this issue?


avilash commented Mar 14, 2018

Please build Caffe2 from source.
It works on an AWS instance when Caffe2 is built from source.
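
One possible reason a source build helps, which nothing in this thread confirms: the CUDA runtime documents `invalid device function` as meaning that the requested device function does not exist or is not compiled for the proper device architecture. A prebuilt binary only carries kernels for the architectures it was compiled for, while a local build can target the GPU that is actually installed. The sketch below applies CUDA's compatibility rules to the two GPUs reported here. The architecture lists are hypothetical, since the thread does not say what the conda package was built for, and `binary_can_run` is an illustrative helper, not a Caffe2 API.

```python
# Compute capabilities of the GPUs reported in this thread, from NVIDIA's
# published tables. Assumption: "Geforce 1060" means the GeForce GTX 1060.
COMPUTE_CAPABILITY = {
    "Tesla K40m": (3, 5),
    "GeForce GTX 1060": (6, 1),
}


def binary_can_run(capability, sass_archs, ptx_archs):
    """Apply CUDA's compatibility rules to one GPU (hypothetical helper).

    sass_archs: (major, minor) targets the binary has machine code for.
    ptx_archs:  (major, minor) targets the binary embeds PTX for.
    """
    major, minor = capability
    # Machine code runs on the same major version with an equal or higher minor.
    has_sass = any(m == major and n <= minor for m, n in sass_archs)
    # PTX can be JIT-compiled for any equal or higher compute capability.
    has_ptx = any((m, n) <= (major, minor) for m, n in ptx_archs)
    return has_sass or has_ptx


# HYPOTHETICAL build targets, chosen only to exercise the function. The thread
# does not say which architectures the conda package was compiled for.
hypothetical_sass = [(5, 2), (6, 0)]
hypothetical_ptx = []

for name, cap in sorted(COMPUTE_CAPABILITY.items()):
    print(name, cap, binary_can_run(cap, hypothetical_sass, hypothetical_ptx))
```

To apply this to a real install, the hypothetical lists would have to be replaced with the architectures the binary was actually built for.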


ljd16 commented Apr 20, 2018

Any updates on this issue?

@ggaaooppeenngg

Any updates on this issue?


arasharchor commented Apr 21, 2018

System information
Operating system: Ubuntu16.04
CUDA version: Cuda compilation tools, release 8.0, V8.0.44
cuDNN version: cudnn-8.0-linux-x64-v7
GPU models (for all devices if they are not all the same): Geforce 1060
PYTHONPATH environment variable: Anaconda2.7
caffe2 binary was installed using: conda install -c caffe2 caffe2-cuda8.0-cudnn7

Detectron$ python2 -c 'from caffe2.python import core' 2>/dev/null && echo "Success" || echo "Failure"
Success
Detectron$ python2 -c 'from caffe2.python import workspace; print(workspace.NumCudaDevices())'
1

export PATH=/usr/local/cuda-8.0/bin:$PATH
echo $LD_LIBRARY_PATH
/usr/local/cuda-8.0/lib64:/home/majid/softwares/cudnn/8.0-7.1/lib64
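
The two one-liners above can be folded into a single check. A minimal sketch, assuming only the `workspace.NumCudaDevices()` call already used in this comment; the function name `check_caffe2_environment` and the report keys are made up for illustration. Both checks pass on this machine even though the operators later fail, so a clean report does not rule out the error.

```python
import importlib
import os


def check_caffe2_environment():
    """Report whether caffe2 imports and how many CUDA devices it sees (hypothetical helper)."""
    raw_path = os.environ.get('LD_LIBRARY_PATH', '')
    report = {
        'import_ok': False,
        'num_cuda_devices': None,
        # Split the search path into entries, dropping empty ones.
        'ld_library_path': [p for p in raw_path.split(os.pathsep) if p],
    }
    try:
        workspace = importlib.import_module('caffe2.python.workspace')
    except Exception:
        # Covers a missing package as well as shared-library load failures.
        return report
    report['import_ok'] = True
    report['num_cuda_devices'] = workspace.NumCudaDevices()
    return report


report = check_caffe2_environment()
print(report)
```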
@rbgirshick I just experienced the same issue.

When I run

python2 $DETECTRON/tests/test_spatial_narrow_as_op.py 

I get the following error:

RuntimeError: [enforce fail at context_gpu.h:171] . Encountered CUDA error: invalid device function Error from operator: 
input: "A" input: "B" input: "C_grad" output: "A_grad" name: "" type: "SpatialNarrowAsGradient" device_option { device_type: 1 cuda_gpu_id: 0 } is_gradient_op: true

After running

python2 tools/infer_simple.py     --cfg configs/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml     --output-dir /tmp/detectron-visualizations     --image-ext jpg     --wts https://s3-us-west-2.amazonaws.com/detectron/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train:coco_2014_valminusminival/generalized_rcnn/model_final.pkl     demo
python2 tools/train_net.py     --cfg configs/getting_started/tutorial_1gpu_e2e_faster_rcnn_R-50-FPN.yaml     OUTPUT_DIR /tmp/detectron-output

I get the following error:

RuntimeError: [enforce fail at context_gpu.h:155] . Encountered CUDA error: invalid device function Error from operator: 
input: "gpu_0/conv1" input: "gpu_0/res_conv1_bn_s" input: "gpu_0/res_conv1_bn_b" output: "gpu_0/conv1" name: "" type: "AffineChannel" device_option { device_type: 1 cuda_gpu_id: 0 }


mfe7 commented Apr 25, 2018

I have also gotten around this error by building Caffe2 from source.


arasharchor commented Apr 25, 2018

@mfe7, I was able to compile Caffe2 from source after many desperate attempts. In the end the solution was not that complicated. I had been using virtualenv and compiling everything locally. Once I installed every package, including Caffe2, with sudo permission on Ubuntu, it worked like a charm, and I was able to train on my own custom dataset with amazing results. I am currently trying to compile it on another machine where I have no sudo permission. If I manage that, I will try to post an update here. I am also preparing a bash script that you can run easily if you have sudo permission.


ljd16 commented May 10, 2018

I compiled Caffe2 from source without sudo, and the error disappeared.
https://caffe2.ai/docs/getting-started.html?platform=ubuntu&configuration=compile


rowanz commented Aug 11, 2018

I was encountering a variant of this issue when using the unsupported Python 3 fork. However, I found that I didn't have to build Caffe2 from source; installing an older version was enough:
conda install -c caffe2 caffe2-cuda8.0-cudnn7=0.8.dev=py36_2018.05.14
Hope this helps someone 😄


Sqrt5 commented Aug 17, 2018

Same error:
RuntimeError: [enforce fail at context_gpu.h:181] . Encountered CUDA error: invalid device function Error from operator:
input: "A" input: "B" output: "C" name: "" type: "SpatialNarrowAs" device_option { device_type: 1 cuda_gpu_id: 0 }
I use Python 2.7, though, so I installed the older py27 build and the problem was solved:
conda remove caffe2-cuda8.0-cudnn7
conda install -c caffe2 caffe2-cuda8.0-cudnn7=0.8.dev=py27_2018.05.14
Thanks to @rowanz for the reply.
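
The two pinned installs quoted in this thread differ only in the Python tag of the build string. A minimal sketch that assembles the command for either one, assuming nothing beyond the package name and the two build strings given above; `pinned_install_command` and `KNOWN_BUILDS` are illustrative names, and whether builds for other Python versions exist is not stated here.

```python
# Build strings quoted in the comments above; no other builds are known from this thread.
KNOWN_BUILDS = {
    'py27': '0.8.dev=py27_2018.05.14',
    'py36': '0.8.dev=py36_2018.05.14',
}
PACKAGE = 'caffe2-cuda8.0-cudnn7'


def pinned_install_command(python_tag):
    """Return the conda command (as an argument list) for a known pinned build."""
    if python_tag not in KNOWN_BUILDS:
        raise ValueError('no pinned build known for %r' % python_tag)
    spec = '{}={}'.format(PACKAGE, KNOWN_BUILDS[python_tag])
    return ['conda', 'install', '-c', 'caffe2', spec]


print(' '.join(pinned_install_command('py27')))
```

Returning an argument list rather than a single string makes it easy to pass the result to `subprocess.check_call` without shell quoting.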
