
Optimization for T5 transformer models. #10613

Open
VikasOjha666 opened this issue Feb 21, 2022 · 7 comments
Labels
model:transformer issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc.

Comments

@VikasOjha666

Is your feature request related to a problem? Please describe.
No; this is a feature request rather than a problem report.

System information

  • ONNX Runtime version (you are using): 1.9

Describe the solution you'd like
Layer-fusion-based optimization is currently available for BERT, GPT-2, BART, etc., but not for T5. It would be good to implement it for T5 as well, since the T5 model is becoming quite popular.


@wangyems wangyems added the model:transformer issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc. label Feb 23, 2022
@wangyems
Member

Will add to our backlog.

@ierezell

ierezell commented Jun 6, 2022

Hello,

After trying Hugging Face Optimum, I found that seq2seq export will soon be possible (huggingface/optimum#199). It works well without optimization, but optimizing T5 models would require an onnxruntime/transformers/onnx_model_XXX.py implementation, and the T5 one is missing.

(More details on their forum: https://discuss.huggingface.co/t/optimum-t5-for-inference/16695/5)

Do you have any status update on this?
I could spend some time on it if needed.

Thanks in advance,
Have a great day

@tianleiwu
Contributor

@Ierezell, optimization of the T5 model is planned (likely in the 1.13 release).
Contributions are welcome.

@p-christ

Did this happen? I'm still seeing this message:

KeyError: "ONNX Runtime doesn't support the graph optimization of t5 yet. Only ['bert', 'gpt2', 'bart'] are supported. If you want to support t5 please propose a PR or open up an issue in ONNX Runtime:https://github.com/microsoft/onnxruntime."

@tianleiwu
Contributor

@wangyems, could you give an update on the T5 optimizations?

@giantvision

giantvision commented Sep 28, 2023

Can anyone tell me the status of adding T5 support to ORTOptimizer / ORTQuantizer?

@tianleiwu
Contributor

The T5 optimizer is completed. Try the following to generate an optimized fp16 model:

python -m onnxruntime.transformers.models.t5.convert_to_onnx -m t5-small --output ./onnx -o --use_gpu -p fp16

You can also try beam search optimization with T5:

python -m onnxruntime.transformers.convert_generation -m t5-small --model_type t5 --output t5_small_beam_search.onnx --use_gpu --past_present_share_buffer --use_decoder_masked_attention
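A note on using the beam-search graph: it takes its generation parameters (beam count, length limits, penalties) as graph inputs rather than Python arguments. Below is a minimal sketch of assembling that feed dict. The input names and dtypes are assumptions based on how the convert_generation tool typically wires the graph; verify them against your own export with session.get_inputs().

```python
import numpy as np

def build_beam_search_inputs(input_ids, num_beams=4, max_length=64):
    """Assemble the feed dict for an ORT beam-search T5 graph.

    Input names/dtypes here are assumptions; check session.get_inputs()
    on your exported model before relying on them.
    """
    return {
        # Tokenized batch, shape (batch_size, sequence_length)
        "input_ids": np.asarray(input_ids, dtype=np.int32),
        # Scalar generation parameters, each passed as a 1-element tensor
        "max_length": np.array([max_length], dtype=np.int32),
        "min_length": np.array([1], dtype=np.int32),
        "num_beams": np.array([num_beams], dtype=np.int32),
        "num_return_sequences": np.array([1], dtype=np.int32),
        "length_penalty": np.array([1.0], dtype=np.float32),
        "repetition_penalty": np.array([1.0], dtype=np.float32),
    }

# Usage (requires onnxruntime and the exported model from the command above):
#   import onnxruntime as ort
#   sess = ort.InferenceSession("t5_small_beam_search.onnx")
#   outputs = sess.run(None, build_beam_search_inputs(token_ids))
```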
