
fastertransformer speed slower than pytorch #325

Open
lucasjinreal opened this issue Sep 21, 2022 · 3 comments

Comments

@lucasjinreal

I am running ViT and got an unexpected result:

FP32 op time :  2464.206495285034 ms
FP32 torch time :  2419.6650743484497 ms

It's even slower than PyTorch.
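For reference, GPU timings like these are sensitive to warmup and missing device synchronization. A minimal, self-contained sketch of a benchmarking helper (my own, not from FasterTransformer) that reduces that noise:

```python
import time

def benchmark(fn, warmup=10, iters=100, sync=None):
    """Return the average latency of fn() in milliseconds.

    sync: optional callable run before starting and stopping the clock,
    e.g. torch.cuda.synchronize when fn launches CUDA kernels; without it,
    the timer would only measure asynchronous kernel launches.
    """
    # Warmup iterations exclude one-time costs (CUDA context init, autotuning).
    for _ in range(warmup):
        fn()
    if sync:
        sync()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    if sync:
        sync()
    return (time.perf_counter() - start) / iters * 1000.0
```

Usage on a GPU would look like `benchmark(lambda: model(x), sync=torch.cuda.synchronize)`, assuming a PyTorch model and input already on the device.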

@byshiue
Collaborator

byshiue commented Sep 22, 2022

This can happen because GEMM takes almost all of the time under FP32.

In that case, small noise in the GEMM timing can noticeably affect the latency. In your case, the relative difference is about 2%, which may just be noise; for workloads like this, FT and PyTorch should have similar latency.

We don't suggest using FP32 for transformer models because FP16 brings a large speedup without an accuracy drop.
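On the PyTorch side, switching the comparison baseline to FP16 is a one-line change. A minimal sketch (the helper name `to_fp16` is mine, not an FT or PyTorch API):

```python
import torch

def to_fp16(model, example_input):
    """Cast a model and its input to FP16 for half-precision inference.

    On GPUs with tensor cores, FP16 GEMMs are typically several times
    faster than FP32 with little to no accuracy loss for inference.
    """
    model = model.half().eval()
    example_input = example_input.half()
    return model, example_input
```

Alternatively, `torch.autocast("cuda", dtype=torch.float16)` around the forward pass keeps the weights in FP32 and casts per-op, which is the safer option when some layers are numerically sensitive.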

@lucasjinreal
Author

@byshiue I found I hadn't run the GEMM search, so the default GEMM algorithm was used. Will searching for the best algorithm improve the speed a bit? And can the GEMM info file be reused across different PCs with the same GPU model?

@byshiue
Collaborator

byshiue commented Dec 2, 2022

Sorry for the delayed reply. Searching for the best algorithm may improve the speed; it is case by case.
In general, the GEMM info file can be used on different devices with the same GPU.
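For context, the workflow looks roughly like this. The FT docs describe per-model GEMM search binaries that write their result to `gemm_config.in` in the working directory; the exact binary name and argument list below are assumptions based on the repo's ViT example and vary by model and FT version:

```shell
# Run the GEMM auto-tuning once for your problem shapes; the chosen
# algorithms are recorded in gemm_config.in, which FT reads at startup.
# Binary name and arguments are model-specific -- check the FT docs.
./bin/vit_gemm <batch_size> <img_size> <patch_size> <embed_dim> <head_num> <with_cls_token> <is_fp16>

# The resulting gemm_config.in can be copied to another machine,
# provided it has the same GPU model.
```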
