About inference speed compared to TRT fp16? [Urgent!] #490
Update: [timing table comparing fp32 and fp16 at batch sizes 8, 64, and 128; the measured values were not preserved in this extract]
In conclusion, I have compared model outputs between LightSeq and the PyTorch implementation after resolving a lot of implementation differences. It seems the results are not as fully comparable as the author claimed, and performance in the long-sequence, large-batch scenario is a bit disappointing. Anyway, thanks for your great work.
I'll leave it at that.
Have you compared LightSeq fp16 with TRT fp16?
It seems that TRT fp16 is much faster than LightSeq fp16 on ViT. Is there something I'm missing or doing wrong?
I’ve got some really disappointing results:
batch-128 image inference [vit-large]:
--- huggingface fp32: 6203 ms
--- ls fp32: 7408 ms [Q1: why is this slower than pure PyTorch inference?]
--- trt fp16: 1924 ms
--- ls fp16: 3701 ms [Q2: why is this so much slower than TRT? Where is LightSeq's advantage?]
GPU: T4
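For reference, here is a minimal sketch of how I assume timings like the ones above would be measured (the checkpoint name and loop counts are my assumptions, not from the original post). It times the HuggingFace ViT-large forward pass at batch 128 in fp32 and fp16; a LightSeq or TRT engine would be timed the same way by swapping in its model object:

```python
# A minimal benchmark sketch, assuming the numbers above were measured this way.
import time

import torch
from transformers import ViTModel


def bench(model, pixel_values, warmup=3, iters=10):
    """Return mean forward latency in ms for one batch."""
    with torch.no_grad():
        for _ in range(warmup):            # exclude CUDA init/autotune cost
            model(pixel_values=pixel_values)
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            model(pixel_values=pixel_values)
        torch.cuda.synchronize()           # wait for all async kernels to finish
    return (time.perf_counter() - start) / iters * 1000


# Checkpoint name is an assumption; any ViT-large checkpoint works the same way.
model = ViTModel.from_pretrained("google/vit-large-patch16-224").cuda().eval()
x = torch.randn(128, 3, 224, 224, device="cuda")  # batch 128, 224x224 RGB

print(f"fp32: {bench(model, x):.0f} ms")
print(f"fp16: {bench(model.half(), x.half()):.0f} ms")
```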
BTW: I compiled the LightSeq wheel (version 3.0.1) on an A100 and installed it with pip on the T4. Could that hurt performance?
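For what it's worth, the A100 is compute capability 8.0 (sm_80) and the T4 is 7.5 (sm_75), so whether a wheel built on the A100 runs at full speed on the T4 depends on which architectures its CUDA kernels were compiled for. A quick check of the running device (this snippet is illustrative, not LightSeq-specific):

```python
import torch

# Kernels compiled only for sm_80 (A100) ship no native code for sm_75 (T4);
# at best the driver JIT-compiles embedded PTX at load time, which can
# change performance. Checking which device the wheel is actually running on:
major, minor = torch.cuda.get_device_capability()
print(f"Running on sm_{major}{minor}")  # T4 -> sm_75, A100 -> sm_80
```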