RuntimeError: cublas runtime error : the GPU program failed to execute at /pytorch/aten/src/THC/THCBlas.cu:425 #14

Open
dbdxnuliba opened this issue Oct 9, 2022 · 2 comments

Comments

@dbdxnuliba

(CenterPose) dell1804@dell1804-G3-3590:~/center_pose_ws/CenterPose/src$ python demo.py --demo ../data/book.jpg --arch dlav1_34 --load_model ../models/CenterPose/book_v1_140.pth
/home/dell1804/anaconda3/envs/CenterPose/lib/python3.6/site-packages/sklearn/utils/linear_assignment_.py:22: FutureWarning: The linear_assignment_ module is deprecated in 0.21 and will be removed from 0.23. Use scipy.optimize.linear_sum_assignment instead.
FutureWarning)
Fix size testing.
training chunk_sizes: [1]
The output will be saved to /home/dell1804/center_pose_ws/CenterPose/src/lib/../../exp/object_pose/default
heads {'hm': 1, 'wh': 2, 'hps': 16, 'reg': 2, 'hm_hp': 8, 'hp_offset': 2, 'scale': 3}
Creating model...
loaded ../models/CenterPose/book_v1_140.pth, epoch 140
THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=383 error=11 : invalid argument
Traceback (most recent call last):
File "demo.py", line 156, in <module>
demo(opt, meta)
File "demo.py", line 83, in demo
ret = detector.run(image_name, meta_inp=meta)
File "/home/dell1804/center_pose_ws/CenterPose/src/lib/detectors/base_detector.py", line 474, in run
images, self.pre_images, pre_hms, pre_hm_hp, pre_inds, return_time=True)
File "/home/dell1804/center_pose_ws/CenterPose/src/lib/detectors/object_pose.py", line 135, in process
output = self.model(images, pre_images, pre_hms, pre_hm_hp)[-1]
File "/home/dell1804/anaconda3/envs/CenterPose/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/home/dell1804/center_pose_ws/CenterPose/src/lib/models/networks/pose_dla_dcn.py", line 531, in forward
x = self.dla_up(x)
File "/home/dell1804/anaconda3/envs/CenterPose/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/home/dell1804/center_pose_ws/CenterPose/src/lib/models/networks/pose_dla_dcn.py", line 441, in forward
ida(layers, len(layers) - i - 2, len(layers))
File "/home/dell1804/anaconda3/envs/CenterPose/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/home/dell1804/center_pose_ws/CenterPose/src/lib/models/networks/pose_dla_dcn.py", line 415, in forward
layers[i] = upsample(project(layers[i]))
File "/home/dell1804/anaconda3/envs/CenterPose/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/home/dell1804/center_pose_ws/CenterPose/src/lib/models/networks/pose_dla_dcn.py", line 387, in forward
x = self.conv(x)
File "/home/dell1804/anaconda3/envs/CenterPose/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/home/dell1804/center_pose_ws/CenterPose/src/lib/models/networks/DCNv2/dcn_v2.py", line 128, in forward
self.deformable_groups)
File "/home/dell1804/center_pose_ws/CenterPose/src/lib/models/networks/DCNv2/dcn_v2.py", line 31, in forward
ctx.deformable_groups)
RuntimeError: cublas runtime error : the GPU program failed to execute at /pytorch/aten/src/THC/THCBlas.cu:425

(CenterPose) dell1804@dell1804-G3-3590:~/center_pose_ws/CenterPose/src$ nvidia-smi
/usr/bin/nvidia-modprobe: unrecognized option: "-s"

ERROR: Invalid commandline, please run /usr/bin/nvidia-modprobe --help for usage information.

/usr/bin/nvidia-modprobe: unrecognized option: "-s"

ERROR: Invalid commandline, please run /usr/bin/nvidia-modprobe --help for usage information.

Sun Oct 9 11:10:47 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.141.03 Driver Version: 470.141.03 CUDA Version: 11.4 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:01:00.0 Off | N/A |
| N/A 50C P8 2W / N/A | 1083MiB / 3911MiB | 14% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 1165 G /usr/lib/xorg/Xorg 226MiB |
| 0 N/A N/A 1846 G /usr/bin/gnome-shell 50MiB |
| 0 N/A N/A 3778 G ...428520904353170423,131072 72MiB |
| 0 N/A N/A 24592 C python 727MiB |
+-----------------------------------------------------------------------------+

Python 3.6.15 | packaged by conda-forge | (default, Dec 3 2021, 18:49:41)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> import torch
>>> torch.__version__
'1.1.0'
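
The report above shows the PyTorch version and the driver, but not which CUDA toolkit the PyTorch build targets. The "CUDA Version: 11.4" line in nvidia-smi is the highest version the driver supports, not the version PyTorch was compiled against. A minimal sketch for collecting the missing facts follows; the function name `describe_environment` is made up for illustration, and it only reports values, it does not diagnose the error.

```python
def describe_environment():
    """Collect version facts that help when chasing a cuBLAS/CUDA runtime error."""
    info = {
        "torch": None,
        "built_for_cuda": None,
        "cuda_available": False,
        "devices": [],
    }
    try:
        import torch
    except ImportError:
        # torch is not installed in this interpreter; return the empty report.
        return info

    info["torch"] = torch.__version__
    # CUDA toolkit version the installed wheel was built with (None for CPU-only builds).
    info["built_for_cuda"] = torch.version.cuda
    info["cuda_available"] = torch.cuda.is_available()

    if info["cuda_available"]:
        for index in range(torch.cuda.device_count()):
            info["devices"].append({
                "name": torch.cuda.get_device_name(index),
                # (major, minor) tuple describing the GPU architecture.
                "compute_capability": torch.cuda.get_device_capability(index),
            })
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(key, "=", value)
```

Pasting this output next to the traceback makes it easier to see whether the installed build and the GPU are a plausible match.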

@dbdxnuliba
Author

I fixed the issue by using:
torch==1.4.0
torchvision==0.5.0
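
One way to catch a drifted environment before running demo.py is to compare the installed versions against the pair reported to work here. This is only a sketch: `REPORTED_PINS`, `release_part`, and `find_mismatches` are hypothetical names, and the pins are simply the two versions quoted above, not an official requirement of the project.

```python
import re

# The version pair reported to work in this thread (an observation, not a spec).
REPORTED_PINS = {"torch": "1.4.0", "torchvision": "0.5.0"}


def release_part(version):
    """Return the leading dotted number of a version string, e.g. '1.4.0+cu100' -> '1.4.0'."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    return match.group(0) if match else version


def find_mismatches(installed, pins=REPORTED_PINS):
    """Map each package that is missing or differs from its pin to (found, wanted)."""
    mismatches = {}
    for package, wanted in pins.items():
        found = installed.get(package)
        if found is None or release_part(found) != wanted:
            mismatches[package] = (found, wanted)
    return mismatches
```

In a real environment the `installed` dictionary could be built from `torch.__version__` and `torchvision.__version__`; an empty result means both packages match the reported pair.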

@OliviaZhang1996

If the torch and torchvision versions change, the DCNv2 code must be changed to match. The correct version of DCNv2 can be downloaded from the corresponding branch. For example, if your torch version is 1.6.0, download https://github.com/lucasjinreal/DCNv2_latest/archive/refs/heads/pytorch1.6.zip and replace the existing DCNv2 code with it, then run sh ./make.sh. make.sh runs smoothly after deleting its top two lines.
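
The branch naming in that URL can be turned into a small helper. The sketch below assumes that other branches follow the same `pytorch<major>.<minor>` pattern as the one linked above; only the pytorch1.6 branch is confirmed in this thread, so any other generated URL should be checked in the DCNv2_latest repository before use. The name `dcnv2_archive_url` is made up for illustration.

```python
import re

# Pattern taken from the single URL quoted in this thread; other branches are an assumption.
URL_TEMPLATE = (
    "https://github.com/lucasjinreal/DCNv2_latest/archive/refs/heads/"
    "pytorch{major}.{minor}.zip"
)


def dcnv2_archive_url(torch_version):
    """Build the archive URL for the branch named after the torch major.minor version."""
    match = re.match(r"(\d+)\.(\d+)", torch_version)
    if match is None:
        raise ValueError("unrecognised torch version: %r" % (torch_version,))
    major, minor = match.groups()
    return URL_TEMPLATE.format(major=major, minor=minor)


if __name__ == "__main__":
    print(dcnv2_archive_url("1.6.0"))
```

Passing `torch.__version__` works as well, since build suffixes such as `+cu101` are ignored by the pattern.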
