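This error occurs every time after 8 batches have run. Changing the batch size does not help; it always fails after 8 batches.
The machine has 3 GPUs installed, and I set GPU_ID = 1.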
Files already downloaded and verified
Files already downloaded and verified
Train - Epoch 1, Batch: 0, Loss: 2.296886, Time 5.307902
Train - Epoch 1, Batch: 1, Loss: 2.301040, Time 0.105161
Train - Epoch 1, Batch: 2, Loss: 2.300776, Time 0.110913
Train - Epoch 1, Batch: 3, Loss: 2.303986, Time 0.104652
Train - Epoch 1, Batch: 4, Loss: 2.289750, Time 0.100140
Train - Epoch 1, Batch: 5, Loss: 2.315252, Time 0.099318
Train - Epoch 1, Batch: 6, Loss: 2.298506, Time 0.106323
Train - Epoch 1, Batch: 7, Loss: 2.310294, Time 0.106855
Traceback (most recent call last):
File "/work/sunbiao/AdderNetCUDA-LingYeAI/main.py", line 146, in <module>
main()
File "/work/sunbiao/AdderNetCUDA-LingYeAI/main.py", line 142, in main
train_and_test(e)
File "/work/sunbiao/AdderNetCUDA-LingYeAI/main.py", line 135, in train_and_test
train(epoch)
File "/work/sunbiao/AdderNetCUDA-LingYeAI/main.py", line 90, in train
output = net(images)
File "/home/nature/anaconda3/envs/addernet/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/work/sunbiao/AdderNetCUDA-LingYeAI/densenet.py", line 83, in forward
x = self.trans3(self.dense3(x))
File "/home/nature/anaconda3/envs/addernet/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/nature/anaconda3/envs/addernet/lib/python3.9/site-packages/torch/nn/modules/container.py", line 141, in forward
input = module(input)
File "/home/nature/anaconda3/envs/addernet/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/work/sunbiao/AdderNetCUDA-LingYeAI/densenet.py", line 17, in forward
y = self.conv1(func.relu(self.bn1(x)))
File "/home/nature/anaconda3/envs/addernet/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/work/sunbiao/AdderNetCUDA-LingYeAI/adder.py", line 104, in forward
output = adder2d_function(x, self.adder, self.stride, self.padding)
File "/work/sunbiao/AdderNetCUDA-LingYeAI/adder.py", line 39, in adder2d_function
out = out.permute(3, 0, 1, 2).contiguous()
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
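
As the message above suggests, the line reported in the traceback (`out.permute(...).contiguous()`) may not be where the fault actually occurs, because CUDA kernels are launched asynchronously. A minimal sketch of rerunning the script with synchronous launches follows. The helper names `blocking_env` and `rerun_blocking` are hypothetical, and `main.py` is assumed to be the entry point shown in the traceback; this only helps locate the failing kernel and is not a fix.

```python
import os
import subprocess
import sys


def blocking_env(base=None):
    """Return a copy of the environment with synchronous CUDA launches enabled."""
    # Copy so the current process's environment is left untouched.
    env = dict(os.environ if base is None else base)
    # With this set, each kernel launch blocks until the kernel finishes,
    # so an error is reported at the call that actually caused it.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def rerun_blocking(script="main.py"):
    """Run the training script (assumed name) in a child process for debugging."""
    result = subprocess.run([sys.executable, script], env=blocking_env())
    return result.returncode


# Example (run from the repository root):
# rerun_blocking("main.py")
```

The same effect can be had from a shell by prefixing the command with `CUDA_LAUNCH_BLOCKING=1`. Training will run slower in this mode, so it is meant for debugging only.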