Crash in cublasGemmEx on Titan RTX 24GB #65

Closed

zoanthal opened this issue Mar 2, 2023 · 1 comment


zoanthal commented Mar 2, 2023

Hi all,
I am attempting to run the example.py script on a Titan RTX 24GB. The model loads fine with max_batch_size = 1 and a single prompt, but I get the following error during generation. Any assistance would be appreciated.
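
For reference, I am launching it the way the README describes; the checkpoint and tokenizer paths below are placeholders, not my real paths:

torchrun --nproc_per_node 1 example.py --ckpt_dir /llamapath/model/7B --tokenizer_path /llamapath/model/tokenizer.model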

Per nvidia-smi:
NVIDIA-SMI 530.30.02, Driver Version: 530.30.02, CUDA Version: 12.1

Error:
File "/llamapath/llama/example.py", line 73, in <module> fire.Fire(main) File "/llamapath/anaconda3/envs/llamaconda/lib/python3.9/site-packages/fire/core.py", line 141, in Fire component_trace = _Fire(component, args, parsed_flag_args, context, name) File "/llamapath/anaconda3/envs/llamaconda/lib/python3.9/site-packages/fire/core.py", line 475, in _Fire component, remaining_args = _CallAndUpdateTrace( File "/llamapath/anaconda3/envs/llamaconda/lib/python3.9/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace component = fn(*varargs, **kwargs) File "/llamapath/llama/example.py", line 65, in main results = generator.generate(prompts, max_gen_len=256, temperature=temperature, top_p=top_p) File "/llamapath/llama/llama/generation.py", line 42, in generate logits = self.model.forward(tokens[:, prev_pos:cur_pos], prev_pos) File "/llamapath/anaconda3/envs/llamaconda/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context return func(*args, **kwargs) File "/llamapath/llama/llama/model.py", line 235, in forward h = layer(h, start_pos, freqs_cis, mask) File "/llamapath/anaconda3/envs/llamaconda/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl return forward_call(*input, **kwargs) File "/llamapath/llama/llama/model.py", line 193, in forward h = x + self.attention.forward(self.attention_norm(x), start_pos, freqs_cis, mask) File "/llamapath/llama/llama/model.py", line 121, in forward xq, xk, xv = self.wq(x), self.wk(x), self.wv(x) File "/llamapath/anaconda3/envs/llamaconda/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl return forward_call(*input, **kwargs) File "/llamapath/anaconda3/envs/llamaconda/lib/python3.9/site-packages/fairscale/nn/model_parallel/layers.py", line 290, in forward output_parallel = F.linear(input_parallel, self.weight, self.bias) RuntimeError: CUDA error: CUBLAS_STATUS_INVALID_VALUE when callingcublasGemmEx( handle, opa, opb, m, n, k, &falpha, a, CUDA_R_16F, lda, b, CUDA_R_16F, ldb, &fbeta, c, CUDA_R_16F, ldc, CUDA_R_32F, CUBLAS_GEMM_DFALT_TENSOR_OP)


zoanthal commented Mar 2, 2023

Credit to:
https://stackoverflow.com/questions/74137511/runtimeerror-cuda-error-cublas-status-invalid-value-on-pytorch-lightning

I created a new conda environment with python=3.10, installed PyTorch, ran the repo's requirements and install steps, and now it is magically working.
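
Roughly what I ran, in case it helps someone else (the environment name is arbitrary, and pip pulled whatever torch build was current; nothing here is pinned):

# Recreate the environment on Python 3.10 instead of 3.9.
conda create -n llamaconda310 python=3.10
conda activate llamaconda310
pip install torch                  # CUDA-enabled wheel on Linux
cd /llamapath/llama
pip install -r requirements.txt    # repo requirements step
pip install -e .                   # repo install step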

zoanthal closed this as completed Mar 2, 2023