RuntimeError: CUDA error: invalid configuration argument (with python trainGMMOT.py) #13

Closed
1359347500cwc opened this issue Nov 15, 2021 · 3 comments

@1359347500cwc

When I train the model on a GTX TITAN (12 GB) with CUDA 10.2, it reports RuntimeError: CUDA error: invalid configuration argument.

Here is the full traceback:

(GMTracker) cwc@imc-Z9PE-D8-WS:~/GMTracker$ python trainGMMOT.py
MOT17-04
210
/home/cwc/anaconda3/envs/GMTracker/lib/python3.6/site-packages/torch_geometric/deprecation.py:13: UserWarning: 'data.DataLoader' is deprecated, use 'loader.DataLoader' instead
  warnings.warn(out)
Start training...
Epoch 0/1
lr = 1.00e-05

Traceback (most recent call last):
  File "trainGMMOT.py", line 183, in <module>
    scheduler
  File "trainGMMOT.py", line 104, in train_model
    loss.backward()
  File "/home/cwc/anaconda3/envs/GMTracker/lib/python3.6/site-packages/torch/tensor.py", line 195, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/cwc/anaconda3/envs/GMTracker/lib/python3.6/site-packages/torch/autograd/__init__.py", line 99, in backward
    allow_unreachable=True)  # allow_unreachable flag
  File "/home/cwc/anaconda3/envs/GMTracker/lib/python3.6/site-packages/torch/autograd/function.py", line 77, in apply
    return self._forward_cls.backward(self, *args)
  File "/home/cwc/GMTracker/qpth/qpth/qp.py", line 144, in backward
    ctx.Q_LU, ctx.S_LU, ctx.R = pdipm_b.pre_factor_kkt(Q, G, A)
  File "/home/cwc/GMTracker/qpth/qpth/solvers/pdipm/batch.py", line 395, in pre_factor_kkt
    G_invQ_GT = torch.bmm(G, G.transpose(1, 2).lu_solve(*Q_LU))
RuntimeError: CUDA error: invalid configuration argument

@jiaweihe1996
Owner

jiaweihe1996 commented Nov 15, 2021

What is your PyTorch version? In my experiments, PyTorch 1.4.0 reports a MAGMA warning about a large matrix rather than a CUDA error, so the problem may lie in later PyTorch versions. See locuslab/qpth#37 and pytorch/pytorch#61815.
A practical solution is to use PyTorch 1.4.0, or a PyTorch build that includes commit 6d21e36f210b5e377941c98568099c819aaaea01.
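
To answer the version question, a small script along these lines could be run inside the GMTracker environment. It is only a sketch: the helper names `parse_version` and `is_newer_than_1_4` are made up for illustration and are not part of GMTracker, qpth, or PyTorch. It prints the installed PyTorch version and the CUDA version the build was compiled against, and says whether the install is newer than 1.4.0.

```python
import re


def parse_version(version):
    """Return (major, minor, patch) from a version string such as '1.9.0+cu102'.

    Hypothetical helper for this example; a missing patch number is treated as 0.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % (version,))
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def is_newer_than_1_4(version):
    """True if the given version string is strictly newer than 1.4.0."""
    return parse_version(version) > (1, 4, 0)


try:
    import torch
except ImportError:
    # PyTorch is not installed in this environment; skip the report.
    torch = None

if torch is not None:
    print("torch version:", torch.__version__)
    print("CUDA version of the build:", torch.version.cuda)
    print("CUDA available:", torch.cuda.is_available())
    print("newer than 1.4.0:", is_newer_than_1_4(torch.__version__))
```

Posting this output alongside the traceback makes it easier to tell whether the environment really runs 1.4.0 or a different build.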

@1359347500cwc
Author

My PyTorch version is 1.4.0, but it still reports this error.

@jiaweihe1996
Owner

Have you looked at pytorch/pytorch#61815?
