
Optimization for T5 transformer models. #10613

Open
VikasOjha666 opened this issue Feb 21, 2022 · 7 comments
Labels
model:transformer issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc.

Comments

@VikasOjha666

Is your feature request related to a problem? Please describe.
No; this is a feature request rather than a problem report.

System information

  • ONNX Runtime version (you are using): 1.9

Describe the solution you'd like
Layer-fusion-based optimization is currently available for BERT, GPT-2, BART, etc., but not for T5. It would be good to implement it for T5 as well, since the T5 model is becoming quite popular.


@wangyems wangyems added the model:transformer issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc. label Feb 23, 2022
@wangyems
Member

Will add to our backlog.

@ierezell

ierezell commented Jun 6, 2022

Hello,

After trying Hugging Face Optimum, I found that seq2seq export will soon be possible (huggingface/optimum#199). It works well without optimization, but optimizing T5 models would require an onnxruntime/transformers/onnx_model_XXX.py implementation, and the T5 one is missing.

(More details on their forum: https://discuss.huggingface.co/t/optimum-t5-for-inference/16695/5)

Do you have any status update on this?
I could spend some time on it if needed.

Thanks in advance,
Have a great day

@tianleiwu
Contributor

@Ierezell, optimization of the T5 model is planned (likely in the 1.13 release).
Contributions are welcome.

@p-christ

Did this happen? I'm still seeing this message:

KeyError: "ONNX Runtime doesn't support the graph optimization of t5 yet. Only ['bert', 'gpt2', 'bart'] are supported. If you want to support t5 please propose a PR or open up an issue in ONNX Runtime:https://github.com/microsoft/onnxruntime."

@tianleiwu
Contributor

@wangyems, could you give an update on the T5 optimizations?

@giantvision

giantvision commented Sep 28, 2023

Can anyone tell me the status of adding T5 support to ORTOptimizer / ORTQuantizer?

@tianleiwu
Contributor

The T5 optimizer is completed. Try the following to generate an optimized fp16 model:

python -m onnxruntime.transformers.models.t5.convert_to_onnx -m t5-small --output ./onnx -o --use_gpu -p fp16

You can also try beam search optimization with T5:

python -m onnxruntime.transformers.convert_generation -m t5-small --model_type t5 --output t5_small_beam_search.onnx --use_gpu --past_present_share_buffer --use_decoder_masked_attention
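A note on using the beam-search graph: it takes its generation parameters (beam count, length limits, penalties) as graph inputs rather than Python arguments. Below is a minimal sketch of assembling that feed dict. The input names and dtypes are assumptions based on how the convert_generation tool typically wires the graph; verify them against your own export with session.get_inputs().

```python
import numpy as np

def build_beam_search_inputs(input_ids, num_beams=4, max_length=64):
    """Assemble the feed dict for an ORT beam-search T5 graph.

    Input names/dtypes here are assumptions; check session.get_inputs()
    on your exported model before relying on them.
    """
    return {
        # Tokenized batch, shape (batch_size, sequence_length)
        "input_ids": np.asarray(input_ids, dtype=np.int32),
        # Scalar generation parameters, each passed as a 1-element tensor
        "max_length": np.array([max_length], dtype=np.int32),
        "min_length": np.array([1], dtype=np.int32),
        "num_beams": np.array([num_beams], dtype=np.int32),
        "num_return_sequences": np.array([1], dtype=np.int32),
        "length_penalty": np.array([1.0], dtype=np.float32),
        "repetition_penalty": np.array([1.0], dtype=np.float32),
    }

# Usage (requires onnxruntime and the exported model from the command above):
#   import onnxruntime as ort
#   sess = ort.InferenceSession("t5_small_beam_search.onnx")
#   outputs = sess.run(None, build_beam_search_inputs(token_ids))
```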
