CUDA error: CUBLAS_STATUS_EXECUTION_FAILED #14

Closed
pwichmann opened this issue Nov 29, 2020 · 8 comments

Comments

@pwichmann

Hi Alexandre,

Many thanks for the paper and the code. Excellent work!

I am getting a CUDA error for which I have no immediate explanation. Maybe you have an idea (also in case of other users experiencing the same issue). Googling the error message resulted in some hits but none that I could work with.

Where does the error occur?

  • notebooks/fonts.ipynb
  • Random font generation

Environment

  • Ubuntu 20.04
  • Cuda compilation tools, release 11.1, V11.1.105; Build cuda_11.1.TC455_06.29190527_0
  • Python 3.7 (installed requirements.txt)
  • GPU RTX 3090

Error message

Immediate code context:

z = get_z()

for c in glyph2label:
    sample_class(c, z=z, with_points=True, with_handles=True, with_moves=False)

Error message:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-14-5f29ff27738d> in <module>
      2 
      3 for c in glyph2label:
----> 4     sample_class(c, z=z, with_points=True, with_handles=True, with_moves=False)

<ipython-input-8-af5b7f4d729f> in sample_class(label, z, temperature, filename, do_display, return_svg, return_png, *args, **kwargs)
      6 
      7     label, = batchify((torch.tensor(label_id),), device=device)
----> 8     commands_y, args_y = model.greedy_sample(None, None, None, None, label=label, z=z)
      9     tensor_pred = SVGTensor.from_cmd_args(commands_y[0].cpu(), args_y[0].cpu())
     10 

~/Documents/q/04_deepsvg/deepsvg/deepsvg/model/model.py in greedy_sample(self, commands_enc, args_enc, commands_dec, args_dec, label, z, hierarch_logits, concat_groups, temperature)
    416                       concat_groups=True, temperature=0.0001):
    417         if self.cfg.pred_mode == "one_shot":
--> 418             res = self.forward(commands_enc, args_enc, commands_dec, args_dec, label=label, z=z, hierarch_logits=hierarch_logits, return_tgt=False)
    419             commands_y, args_y = _sample_categorical(temperature, res["command_logits"], res["args_logits"])
    420             args_y -= 1  # shift due to -1 PAD_VAL

~/Documents/q/04_deepsvg/deepsvg/deepsvg/model/model.py in forward(self, commands_enc, args_enc, commands_dec, args_dec, label, z, hierarch_logits, return_tgt, params, encode_mode, return_hierarch)
    375 
    376         out_logits = self.decoder(z, commands_dec_, args_dec_, label, hierarch_logits=hierarch_logits,
--> 377                                   return_hierarch=return_hierarch)
    378 
    379         if return_hierarch:

~/anaconda3/envs/deepsvg/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

~/Documents/q/04_deepsvg/deepsvg/deepsvg/model/model.py in forward(self, z, commands, args, label, hierarch_logits, return_hierarch)
    250             if hierarch_logits is None:
    251                 src = self.hierarchical_embedding(z)
--> 252                 out = self.hierarchical_decoder(src, z, tgt_mask=None, tgt_key_padding_mask=None, memory2=l)
    253                 hierarch_logits, z = self.hierarchical_fcn(out)
    254 

~/anaconda3/envs/deepsvg/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

~/Documents/q/04_deepsvg/deepsvg/deepsvg/model/layers/transformer.py in forward(self, tgt, memory, memory2, tgt_mask, memory_mask, tgt_key_padding_mask, memory_key_padding_mask)
    235                          memory_mask=memory_mask,
    236                          tgt_key_padding_mask=tgt_key_padding_mask,
--> 237                          memory_key_padding_mask=memory_key_padding_mask)
    238 
    239         if self.norm is not None:

~/anaconda3/envs/deepsvg/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

~/Documents/q/04_deepsvg/deepsvg/deepsvg/model/layers/improved_transformer.py in forward(self, tgt, memory, memory2, tgt_mask, tgt_key_padding_mask, *args, **kwargs)
    126     def forward(self, tgt, memory, memory2=None, tgt_mask=None, tgt_key_padding_mask=None, *args, **kwargs):
    127         tgt1 = self.norm1(tgt)
--> 128         tgt2 = self.self_attn(tgt1, tgt1, tgt1, attn_mask=tgt_mask, key_padding_mask=tgt_key_padding_mask)[0]
    129         tgt = tgt + self.dropout1(tgt2)
    130 

~/anaconda3/envs/deepsvg/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

~/Documents/q/04_deepsvg/deepsvg/deepsvg/model/layers/attention.py in forward(self, query, key, value, key_padding_mask, need_weights, attn_mask)
    159                 training=self.training,
    160                 key_padding_mask=key_padding_mask, need_weights=need_weights,
--> 161                 attn_mask=attn_mask)

~/Documents/q/04_deepsvg/deepsvg/deepsvg/model/layers/functional.py in multi_head_attention_forward(query, key, value, embed_dim_to_check, num_heads, in_proj_weight, in_proj_bias, bias_k, bias_v, add_zero_attn, dropout_p, out_proj_weight, out_proj_bias, training, key_padding_mask, need_weights, attn_mask, use_separate_proj_weight, q_proj_weight, k_proj_weight, v_proj_weight, static_k, static_v)
     90         if torch.equal(query, key) and torch.equal(key, value):
     91             # self-attention
---> 92             q, k, v = F.linear(query, in_proj_weight, in_proj_bias).chunk(3, dim=-1)
     93 
     94         elif torch.equal(key, value):

~/anaconda3/envs/deepsvg/lib/python3.7/site-packages/torch/nn/functional.py in linear(input, weight, bias)
   1370         ret = torch.addmm(bias, input, weight.t())
   1371     else:
-> 1372         output = input.matmul(weight.t())
   1373         if bias is not None:
   1374             output += bias

RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`
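
For reference, the failing call can be reproduced outside the model with a plain `F.linear` on the GPU (a minimal sketch, not part of the notebook; the shapes are only illustrative):

```python
import torch
import torch.nn.functional as F

device = torch.device("cuda")

# Roughly mimic the self-attention input projection that fails in
# functional.py: a (seq_len, batch, embed_dim) input against a
# (3 * embed_dim, embed_dim) in-projection weight.
query = torch.randn(8, 1, 256, device=device)
in_proj_weight = torch.randn(3 * 256, 256, device=device)
in_proj_bias = torch.randn(3 * 256, device=device)

out = F.linear(query, in_proj_weight, in_proj_bias)  # calls cublasSgemm internally
torch.cuda.synchronize()  # surface any asynchronous CUDA error here
print(out.shape)  # torch.Size([8, 1, 768]) if cuBLAS is healthy
```

If this snippet fails with the same CUBLAS_STATUS_EXECUTION_FAILED, the problem is in the PyTorch/CUDA setup rather than in the deepsvg code.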
@pwichmann
Author

pwichmann commented Nov 29, 2020

The same problem occurs in notebooks/latent_ops.ipynb, in the "Addition" cell:

z_list = []

for i in range(500):
    tensors, fillings = dataset._load_tensor(dataset.random_id())
    t_sep = tensors[0]
    t_sep_rm, fillings_rm = t_sep[:-1], fillings[:-1]

    if len(t_sep) >= 2:
        z1 = encode(dataset.get_data(t_sep, fillings))
        z2 = encode(dataset.get_data(t_sep_rm, fillings_rm))
        z_list.append(z2 - z1)
z_rmv = torch.cat(z_list).mean(dim=0, keepdims=True)

@alexandre01
Owner

Hi @pwichmann,

I'm doing a fresh install right now to check whether I get the same issue. Otherwise, I'll try to investigate the error message you are receiving.

@pwichmann
Author

Thanks, @alexandre01

One difference is the CUDA version, but I could not determine whether that could be the cause.

@alexandre01
Owner

Okay, I installed everything with the following configuration:

  • Ubuntu 20.04
  • CUDA 11.0
  • GPU GTX 1080

and it worked without error... Trying to reproduce your error message now.

@alexandre01
Owner

Hmm.. Yeah, googling didn't help me much either. I suspect this might be a problem with PyTorch 1.4 not supporting your exact CUDA build. It's tricky to give any advice here because I always just install CUDA using Ubuntu's Software & Updates (additional drivers) tool, and I don't have an RTX card either..
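
One quick way to check this hypothesis (a sketch, assuming the deepsvg conda environment is active) is to compare the GPU's compute capability with what the installed PyTorch build was compiled for:

```python
import torch

print("torch:", torch.__version__)            # e.g. 1.4.0
print("built for CUDA:", torch.version.cuda)  # CUDA version the wheel was compiled against
print("device:", torch.cuda.get_device_name(0))
print("compute capability:", torch.cuda.get_device_capability(0))  # an RTX 3090 reports (8, 6)

# Newer PyTorch builds expose the compiled-in architectures; guard the call
# since older versions (such as 1.4) may not have it.
if hasattr(torch.cuda, "get_arch_list"):
    print("supported archs:", torch.cuda.get_arch_list())
```

If the card reports capability (8, 6) but the installed wheel only ships kernels for older architectures (the PyTorch 1.4 / CUDA 10.x binaries stop around sm_75), cuBLAS calls can fail exactly like the traceback above.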

@pwichmann
Author

I was able to solve the problem. It was NOT caused by your code but, indeed, by the PyTorch/CUDA version combination.
A fresh install of the most recent PyTorch release fixed it:

pip install torch==1.7.0+cu110 torchvision==0.8.1+cu110 torchaudio===0.7.0 -f https://download.pytorch.org/whl/torch_stable.html
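
For anyone landing here with the same setup, a quick sanity check after the upgrade (a sketch, nothing deepsvg-specific) is to confirm the new build and run a small GEMM on the card:

```python
import torch

print(torch.__version__, torch.version.cuda)  # expect 1.7.0+cu110 / 11.0
print(torch.cuda.get_device_name(0))          # GeForce RTX 3090

a = torch.randn(1024, 1024, device="cuda")
b = torch.randn(1024, 1024, device="cuda")
c = a @ b                   # goes through cublasSgemm
torch.cuda.synchronize()    # raises here if cuBLAS is still unhappy
print("matmul OK:", c.shape)
```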

@pwichmann
Author

Thank you, @alexandre01!

@alexandre01
Owner

Wow, great, I'm glad you managed to get it working!
