I am using a V100 GPU, which has 16 GB of memory. Here is the error log:
07/10 07:05:24 PM valid 000 2.609589e+00 47.656250 76.562500
THCudaCheck FAIL file=/pytorch/aten/src/THC/generic/THCStorage.cu line=58 error=2 : out of memory
Traceback (most recent call last):
  File "train_imagenet.py", line 230, in <module>
    main()
  File "train_imagenet.py", line 152, in main
    valid_acc_top1, valid_acc_top5, valid_obj = infer(valid_queue, model, criterion)
  File "train_imagenet.py", line 214, in infer
    logits, _ = model(input)
  File "/home/ubuntu/workspace/.torch-env/lib/python3.5/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/workspace/darts/cnn/model.py", line 207, in forward
    s0, s1 = s1, cell(s0, s1, self.drop_path_prob)
  File "/home/ubuntu/workspace/.torch-env/lib/python3.5/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/workspace/darts/cnn/model.py", line 51, in forward
    h1 = op1(h1)
  File "/home/ubuntu/workspace/.torch-env/lib/python3.5/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/workspace/darts/cnn/operations.py", line 66, in forward
    return self.op(x)
  File "/home/ubuntu/workspace/.torch-env/lib/python3.5/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/workspace/.torch-env/lib/python3.5/site-packages/torch/nn/modules/container.py", line 91, in forward
    input = module(input)
  File "/home/ubuntu/workspace/.torch-env/lib/python3.5/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/workspace/.torch-env/lib/python3.5/site-packages/torch/nn/modules/conv.py", line 301, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: cuda runtime error (2) : out of memory at /pytorch/aten/src/THC/generic/THCStorage.cu:58
It is hard to tell without more details, such as the PyTorch version. If you are using PyTorch 0.4, be sure to wrap the validation code in torch.no_grad(); otherwise autograd keeps the intermediate activations during inference and you are likely to run out of memory. I would also try a smaller batch size and check memory consumption during both training and validation.
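For illustration, here is a minimal sketch of a validation loop wrapped in torch.no_grad(). The infer function below is a hypothetical stand-in, not the actual code from train_imagenet.py, and the tiny linear model and random batches are assumptions chosen only so the example runs on a CPU.

```python
import torch
import torch.nn as nn


def infer(valid_queue, model, criterion):
    """Hypothetical validation loop that does not build the autograd graph."""
    model.eval()  # switch dropout / batch norm to inference behaviour
    total_loss, total_correct, total_seen = 0.0, 0, 0
    with torch.no_grad():  # activations are not stored for a backward pass
        for input, target in valid_queue:
            logits = model(input)
            loss = criterion(logits, target)
            n = input.size(0)
            total_loss += loss.item() * n
            total_correct += (logits.argmax(dim=1) == target).sum().item()
            total_seen += n
    return total_loss / total_seen, total_correct / total_seen


# Toy stand-ins (assumptions): a linear classifier and three random batches.
model = nn.Linear(16, 10)
criterion = nn.CrossEntropyLoss()
valid_queue = [(torch.randn(4, 16), torch.randint(0, 10, (4,))) for _ in range(3)]

valid_loss, valid_acc = infer(valid_queue, model, criterion)
print("valid loss %.4f, top-1 accuracy %.4f" % (valid_loss, valid_acc))
```

Gradient tracking is switched back on automatically when the with block exits, so the training loop that follows is unaffected.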