Optimization for T5 transformer models. #10613
Comments
Will add to our backlog.
Hello, after using Hugging Face (more details on their forum: https://discuss.huggingface.co/t/optimum-t5-for-inference/16695/5), do you have any status update on this? Thanks in advance.
@Ierezell, optimization of the T5 model is planned (likely in the 1.13 release).
Did this happen? I'm still seeing this message.
@wangyems, could you give an update on the T5 optimizations?
Can anyone tell me the progress on including T5 in ORTOptimizer/ORTQuantizer?
The T5 optimizer is complete. Try the following to generate an optimized fp16 model:
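The original command was not preserved in this thread. A typical invocation of ONNX Runtime's transformer optimizer script looks like the sketch below; the input/output file names are placeholders, and the exact flag set should be checked against your installed onnxruntime version:

```shell
# Fuse attention/layer-norm subgraphs in an exported T5 ONNX model
# and convert weights to fp16 (file names are hypothetical examples).
python -m onnxruntime.transformers.optimizer \
    --input t5_model.onnx \
    --output t5_model_fp16.onnx \
    --model_type t5 \
    --float16
```

`--float16` converts the optimized graph to half precision, which is mainly beneficial on GPU; omit it to keep an fp32 optimized model.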
You can also try beam search optimization with T5:
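The beam search command was likewise elided. ONNX Runtime ships a generation-conversion script that fuses the full beam search loop into a single ONNX graph; a sketch of its use for T5 follows (the model name, output path, and beam count are illustrative assumptions):

```shell
# Export T5 with a built-in BeamSearch operator so generation runs
# entirely inside ONNX Runtime (parameters are example values).
python -m onnxruntime.transformers.convert_generation \
    --model_type t5 \
    -m t5-small \
    --output t5_beam_search.onnx \
    --num_beams 4
```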
Is your feature request related to a problem? Please describe.
No, it's not a problem but a feature request.
Describe the solution you'd like
So far, layer fusion-based optimization is available for BERT, GPT-2, BART, etc., but not for T5. It would be good to implement it for T5 as well, since the T5 model is becoming quite popular.
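For context, the fusion-based optimization requested here is exposed for the already-supported architectures through the `onnxruntime.transformers` Python API as well as the CLI. A minimal sketch for a BERT model (file names, head count, and hidden size are placeholder assumptions for a base-sized model) looks like:

```python
# Sketch: graph-fusion optimization for an already-supported model type.
# "bert.onnx" is a hypothetical exported model; num_heads/hidden_size
# must match the actual architecture being optimized.
from onnxruntime.transformers import optimizer

opt_model = optimizer.optimize_model(
    "bert.onnx",
    model_type="bert",
    num_heads=12,
    hidden_size=768,
)
opt_model.save_model_to_file("bert_optimized.onnx")
```

The feature request amounts to adding `"t5"` as an accepted `model_type` with T5-specific fusion patterns, analogous to the existing BERT/GPT-2/BART support.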