
[FT][ERROR] CUDA runtime error: CUBLAS_STATUS_EXECUTION_FAILED #411

Closed

gongel opened this issue Jan 1, 2023 · 5 comments
Labels
bug Something isn't working

Comments


gongel commented Jan 1, 2023

Branch/Tag/Commit

nccl_dependent_refine

Docker Image Version

.

GPU name

V100

CUDA Driver

CUDA 10.1

Reproduced Steps

  1. Background: I added an XGLM model to FT (similar to GPT); the GPT model itself does not raise the error.
  2. Main error message: for fp16, the error is raised at this line of code: https://github.com/NVIDIA/FasterTransformer/blob/nccl_dependent_refine/fastertransformer/open_decoder.h#L735; for fp32 there is no problem (see the sketch after these steps).
  3. Related issue: the error is the same as in "CUDA runtime error: CUBLAS_STATUS_EXECUTION_FAILED for fp16 #181", but my model weights are pretrained, so I cannot modify them.
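For context, here is a minimal sketch of the kind of fp16 cublasGemmEx call that fails there. This is my reconstruction, not FT's actual code, and the sizes are hypothetical; on V100 with CUDA 10.x the Tensor Core path generally requires m, n, k and the leading dimensions to be multiples of 8, and a misaligned size can surface as CUBLAS_STATUS_EXECUTION_FAILED only at execution time:

```cpp
// Minimal repro sketch of the kind of fp16 GEMM that fails in
// open_decoder.h. My reconstruction, not FT's code; sizes are hypothetical.
#include <cstdio>
#include <cstdlib>
#include <cublas_v2.h>
#include <cuda_fp16.h>
#include <cuda_runtime.h>

#define CHECK_CUBLAS(call)                                                   \
    do {                                                                     \
        cublasStatus_t s_ = (call);                                          \
        if (s_ != CUBLAS_STATUS_SUCCESS) {                                   \
            fprintf(stderr, "[FT][ERROR] cublas status %d at %s:%d\n",       \
                    (int)s_, __FILE__, __LINE__);                            \
            exit(EXIT_FAILURE);                                              \
        }                                                                    \
    } while (0)

int main() {
    // hidden_units = 1024 is aligned; a padded inner dimension such as
    // k = 1027 is not a multiple of 8 and may fail only at execution time.
    int m = 1024, n = 8, k = 1024;

    half *A, *B, *C;
    cudaMalloc(&A, sizeof(half) * m * k);
    cudaMalloc(&B, sizeof(half) * k * n);
    cudaMalloc(&C, sizeof(half) * m * n);

    cublasHandle_t handle;
    CHECK_CUBLAS(cublasCreate(&handle));

    // fp16 inputs with fp16 compute, as FT uses in half mode; alpha/beta
    // must point to half values when the compute type is CUDA_R_16F.
    half alpha = __float2half(1.0f);
    half beta  = __float2half(0.0f);
    CHECK_CUBLAS(cublasGemmEx(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                              m, n, k,
                              &alpha,
                              A, CUDA_R_16F, m,   // lda
                              B, CUDA_R_16F, k,   // ldb
                              &beta,
                              C, CUDA_R_16F, m,   // ldc
                              CUDA_R_16F,         // compute type (CUDA 10.x API)
                              CUBLAS_GEMM_DEFAULT_TENSOR_OP));

    cublasDestroy(handle);
    cudaFree(A); cudaFree(B); cudaFree(C);
    return 0;
}
```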

Thank you very much for helping me.

gongel added the bug (Something isn't working) label on Jan 1, 2023

byshiue (Collaborator) commented Jan 2, 2023

Can you try the GPT model on the latest main branch? FT already supports XGLM; there is an example at https://github.com/triton-inference-server/fastertransformer_backend/blob/main/docs/gpt_guide.md#run-xglm


gongel (Author) commented Jan 2, 2023

Thanks for your reply. The GPT model does not encounter the error on either the latest main branch or the nccl_dependent_refine branch, but XGLM in fp16 on the nccl_dependent_refine branch still raises it. "FT has supported XGLM" is great! Where is the FT XGLM model code? I can't find it, and I would appreciate it if you could point me to it so I can compare it with the GPT model.


byshiue (Collaborator) commented Jan 2, 2023

XGLM shares the same code as GPT; it is supported by the ParallelGpt class.


gongel (Author) commented Jan 3, 2023

OK, thanks!


gongel (Author) commented Jan 5, 2023

Fixed. Doing 'bias_pad' is what raises this error in fp16 mode.
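For anyone who hits the same thing: my best guess at the mechanism (an assumption on my part, not verified in the FT source) is that the bias padding changed a GEMM dimension so that it was no longer a multiple of 8, which the fp16 Tensor Core kernels on CUDA 10.x reject at execution time. A sketch of alignment-preserving padding (pad_to_multiple_of_8 is a made-up helper, not an FT function):

```cpp
// Hypothetical helper: round a padded dimension up to a multiple of 8 so the
// fp16 Tensor Core GEMM stays aligned; not part of FasterTransformer.
inline int pad_to_multiple_of_8(int n) { return ((n + 7) / 8) * 8; }

// e.g. padding a hidden/vocab dimension to 1027 leaves the fp16 GEMM
// misaligned, while pad_to_multiple_of_8(1027) == 1032 keeps it safe.
```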

gongel closed this as completed on Jan 5, 2023