train.py hangs after launch, then fails with RuntimeError: CuDNN error: CUDNN_STATUS_EXECUTION_FAILED #170

Open
miomiora opened this issue Dec 11, 2023 · 0 comments

Initialzing...
Initializing data source...
Data initialization complete.
Initializing model...
Model initialization complete.
Training START
Traceback (most recent call last):
  File "train.py", line 21, in <module>
    m.fit()
  File "/home/czk/code/GaitSet/model/model.py", line 159, in fit
    feature, label_prob = self.encoder(*seq, batch_frame)
  File "/home/czk/anaconda3/envs/GaitSet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/czk/anaconda3/envs/GaitSet/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 121, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/czk/anaconda3/envs/GaitSet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/czk/code/GaitSet/model/network/gaitset.py", line 90, in forward
    x = self.set_layer1(x)
  File "/home/czk/anaconda3/envs/GaitSet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/czk/code/GaitSet/model/network/basic_blocks.py", line 24, in forward
    x = self.forward_block(x.view(-1,c,h,w))
  File "/home/czk/anaconda3/envs/GaitSet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/czk/code/GaitSet/model/network/basic_blocks.py", line 11, in forward
    x = self.conv(x)
  File "/home/czk/anaconda3/envs/GaitSet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/czk/anaconda3/envs/GaitSet/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 301, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: CuDNN error: CUDNN_STATUS_EXECUTION_FAILED

After I run python train.py, nothing happens; after a long wait, it fails with the error above.

pip list
Package         Version
--------------- ------------
certifi         2021.5.30
cffi            1.14.6
imageio         2.15.0
mkl-fft         1.0.6
mkl-random      1.0.1
numpy           1.15.4
opencv-python   4.1.2.30
pandas          1.1.5
Pillow          8.4.0
pip             21.2.2
pycparser       2.21
python-dateutil 2.8.2
pytz            2023.3.post1
scipy           1.5.4
setuptools      58.0.4
six             1.16.0
TBB             0.2
torch           0.4.1
wheel           0.37.1
xarray          0.16.2
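
The pip list shows the torch version but not the CUDA and cuDNN versions that build was compiled against, or the GPU model, which are usually the first things asked for with this kind of error. Below is a minimal sketch for collecting them; the function name `collect_gpu_env` is hypothetical and not part of GaitSet, and it assumes only standard PyTorch attributes.

```python
import importlib


def collect_gpu_env():
    """Return a dict describing the installed torch build and the visible GPU."""
    info = {"torch": None, "cuda_build": None, "cudnn": None,
            "cuda_available": False, "device": None, "capability": None}
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return info  # torch is not installed in this environment
    info["torch"] = torch.__version__
    info["cuda_build"] = torch.version.cuda  # None for CPU-only builds
    info["cuda_available"] = torch.cuda.is_available()
    if info["cuda_available"]:
        info["cudnn"] = torch.backends.cudnn.version()
        info["device"] = torch.cuda.get_device_name(0)
        info["capability"] = torch.cuda.get_device_capability(0)
    return info


if __name__ == "__main__":
    for key, value in collect_gpu_env().items():
        print("{}: {}".format(key, value))
```

Posting this output together with the `nvidia-smi` driver version would make it easier to tell whether the GPU is newer than what the torch 0.4.1 binary was built for. That mismatch is only one possible cause and is not confirmed here.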

While python train.py is running, GPU utilization stays very low.
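
Since the traceback ends in the first convolution call, one way to narrow this down might be to run a single small convolution on the GPU with cuDNN enabled and then disabled. This is a diagnostic sketch, not a fix; `conv_smoke_test` is a hypothetical helper, and the layer and tensor shapes are arbitrary illustrative values rather than GaitSet's actual configuration.

```python
def conv_smoke_test(use_cudnn):
    """Run one Conv2d forward pass on the GPU and report what happened."""
    try:
        import torch
    except ImportError:
        return "torch-missing"
    if not torch.cuda.is_available():
        return "no-cuda"
    previous = torch.backends.cudnn.enabled
    torch.backends.cudnn.enabled = use_cudnn  # toggle the cuDNN code path
    try:
        conv = torch.nn.Conv2d(1, 32, kernel_size=5, padding=2).cuda()
        x = torch.randn(8, 1, 64, 44).cuda()  # illustrative shape only
        conv(x)
        torch.cuda.synchronize()  # surface asynchronous CUDA errors here
        return "ok"
    except RuntimeError as err:
        return "failed: {}".format(err)
    finally:
        torch.backends.cudnn.enabled = previous  # restore the global setting


if __name__ == "__main__":
    print("cudnn on :", conv_smoke_test(True))
    print("cudnn off:", conv_smoke_test(False))
```

If the call fails with cuDNN on but succeeds with it off, that would point at the cuDNN library bundled with this build. If both runs hang or fail, the problem more likely sits in the CUDA build, driver, or GPU combination.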
