Which framework maximizes inference speed? #877
Comments
Hello Simon, thank you for your interest in CLIP-as-service. We do not currently have the exact comparison you are looking for, but we do have benchmarks of the different models. You can find them here: https://clip-as-service.jina.ai/user-guides/benchmark/. We are in the process of evaluating various inference frameworks, such as AITemplate/dynamo, to determine the most suitable one for each model. Unfortunately, we do not have enough hardware to measure the metrics across different hardware, and some frameworks do not perform well on certain hardware (AITemplate on V100).
Thanks for your feedback, it fills a few cells in the matrix :) I may run
some benchmarks later if I don't find the answer. I think TensorRT is the
fastest, but I need to check.
On Sat, Dec 17, 2022, 11:44, Jie Fu ***@***.***> wrote:
… For model comparison, we have a benchmark for this:
https://clip-as-service.jina.ai/user-guides/benchmark/
Also, we are now working on different inference frameworks such as
AITemplate / dynamo to find a suitable one for CLIP-as-service.
Unfortunately, we are not able to test the metrics on different hardware,
since we don't have sufficient hardware to test on and some frameworks
don't work well on specific hardware (AITemplate on V100).
Does this answer your question?
Hello,
Do you have a speed comparison of the different CLIP models along the following matrix?
I am especially interested in the last dimension: which framework/compiler is currently the best for maximizing the speed of vision transformers?
Thank you,
Simon
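
For anyone who wants to fill in cells of such a matrix on their own hardware, a framework-agnostic timing harness might look like the sketch below. It is only an illustration, not something from CLIP-as-service: `benchmark`, `compare`, and the placeholder workloads are hypothetical names, and each candidate is assumed to be a zero-argument callable wrapping one inference call.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(fn: Callable[[], object], warmup: int = 3, runs: int = 20) -> Dict[str, float]:
    """Time a zero-argument callable and return latency statistics in milliseconds."""
    # Warm-up calls are discarded so one-off costs (compilation, cache fills)
    # do not skew the measured runs.
    for _ in range(warmup):
        fn()
    timings_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        # Note: for GPU workloads the callable itself must wait for the device
        # to finish (e.g. by synchronizing) before returning, or the timing
        # only covers the kernel launch.
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return {
        "mean_ms": statistics.mean(timings_ms),
        "median_ms": statistics.median(timings_ms),
        "min_ms": min(timings_ms),
        "max_ms": max(timings_ms),
    }


def compare(candidates: Dict[str, Callable[[], object]], **kwargs) -> Dict[str, Dict[str, float]]:
    """Run `benchmark` on each named candidate with the same settings."""
    return {name: benchmark(fn, **kwargs) for name, fn in candidates.items()}


if __name__ == "__main__":
    # Placeholder workloads standing in for real model calls
    # (hypothetical; replace with one inference call per framework).
    candidates = {
        "baseline": lambda: sum(i * i for i in range(20_000)),
        "alternative": lambda: sum(i * i for i in range(10_000)),
    }
    for name, stats in compare(candidates).items():
        print(f"{name}: median {stats['median_ms']:.3f} ms")
```

Using the same inputs, batch size, and precision for every candidate keeps the numbers comparable; the median is usually a more stable summary than the mean when a few runs are slowed by other activity on the machine.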