THCudaCheck FAIL #12

cpietsch · 2019-04-12T22:46:12Z

I am running on an RTX 2080 with CUDA 10.1.

Does this mean one RTX 2080 is not enough?

python test.py --name coco_pretrained --dataset_mode coco --dataroot '/home/chrispie/projects/SPADE/datasets/coco_stuff' 

....
dataset [CocoDataset] of size 8 was created
Network [SPADEGenerator] was created. Total number of parameters: 97.5 million. To see the architecture, do print(network).
THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=405 error=11 : invalid argument
Traceback (most recent call last):
  File "test.py", line 36, in <module>
    generated = model(data, mode='inference')
  File "/home/chrispie/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/chrispie/projects/SPADE/models/pix2pix_model.py", line 58, in forward
    fake_image, _ = self.generate_fake(input_semantics, real_image)
  File "/home/chrispie/projects/SPADE/models/pix2pix_model.py", line 197, in generate_fake
    fake_image = self.netG(input_semantics, z=z)
  File "/home/chrispie/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/chrispie/projects/SPADE/models/networks/generator.py", line 91, in forward
    x = self.head_0(x, seg)
  File "/home/chrispie/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/chrispie/projects/SPADE/models/networks/architecture.py", line 60, in forward
    dx = self.conv_0(self.actvn(self.norm_0(x, seg)))
  File "/home/chrispie/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 485, in __call__
    hook(self, input)
  File "/home/chrispie/.local/lib/python3.6/site-packages/torch/nn/utils/spectral_norm.py", line 100, in __call__
    setattr(module, self.name, self.compute_weight(module, do_power_iteration=module.training))
  File "/home/chrispie/.local/lib/python3.6/site-packages/torch/nn/utils/spectral_norm.py", line 86, in compute_weight
    sigma = torch.dot(u, torch.mv(weight_mat, v))
RuntimeError: cublas runtime error : the GPU program failed to execute at /pytorch/aten/src/THC/THCBlas.cu:116


banyet1 · 2019-04-13T11:43:56Z

It's working flawlessly on my 2080 Ti with CUDA 10.1 and tensorflow=1.12.

cpietsch · 2019-04-13T12:21:42Z

Thanks, I have probably messed up my installation. Which Ubuntu version are you running?
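
The thread never pins down the root cause. One plausible explanation (an assumption, not confirmed by anyone here) is that the installed PyTorch binary was built against a CUDA toolkit older than 10.0, which predates the RTX 2080's compute capability 7.5. A minimal sketch for checking the installation independently of SPADE: it reports the relevant versions and, if a GPU is visible, repeats the same `torch.mv` / `torch.dot` pattern that fails in the traceback. The function name `describe_environment` and the dictionary keys are made up for illustration.

```python
import platform
import sys


def describe_environment():
    """Collect version facts relevant to a possible CUDA/PyTorch mismatch."""
    info = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "torch": None,
        "torch_built_with_cuda": None,
        "cuda_available": False,
        "device_name": None,
        "compute_capability": None,
        "cublas_smoke_test": "skipped",
    }

    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this interpreter; report what we have.
        return info

    info["torch"] = torch.__version__
    # CUDA version the PyTorch binary was compiled against (None for CPU builds).
    info["torch_built_with_cuda"] = torch.version.cuda
    info["cuda_available"] = torch.cuda.is_available()
    if not info["cuda_available"]:
        return info

    info["device_name"] = torch.cuda.get_device_name(0)
    info["compute_capability"] = torch.cuda.get_device_capability(0)

    # Same call pattern as spectral_norm.compute_weight in the traceback,
    # on tiny random tensors, to see whether cuBLAS works at all.
    try:
        weight = torch.randn(4, 3, device="cuda")
        u = torch.randn(4, device="cuda")
        v = torch.randn(3, device="cuda")
        sigma = torch.dot(u, torch.mv(weight, v))
        info["cublas_smoke_test"] = "ok (sigma=%.4f)" % sigma.item()
    except RuntimeError as exc:
        info["cublas_smoke_test"] = "failed: %s" % exc

    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print("%-24s %s" % (key + ":", value))
```

If the smoke test fails outside SPADE as well, the problem most likely lies in the PyTorch/CUDA/driver combination rather than in this repository; comparing `torch_built_with_cuda` with the CUDA version the driver supports would be the next thing to look at.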

banyet1 · 2019-04-13T14:37:18Z

Ubuntu 18.04.2 LTS, and I'm using Anaconda.
Good luck!

taesungp · 2019-04-14T00:21:14Z

I will close this issue as it looks like an issue outside our scope.

taesungp closed this as completed Apr 14, 2019
