Closed
Labels
module: onnx — Related to torch.onnx
triaged — This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
Description
🚀 The feature, motivation and pitch
ONNX has introduced the Attention operator in opset 23 (https://onnx.ai/onnx/operators/onnx__Attention.html#l-onnx-op-attention-23). This operator could be used when exporting scaled_dot_product_attention to ONNX format. Currently, scaled_dot_product_attention is broken down into its constituent ops during ONNX export, which complicates the model and makes it harder to identify the attention block when compiling the network for inference on custom HW backends.
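A minimal sketch of the export scenario in question, assuming PyTorch is installed. The module below makes a single fused scaled_dot_product_attention call; when exported with torch.onnx.export today, this call is decomposed into primitive ops (MatMul, Softmax, etc.) rather than mapping to the opset-23 Attention operator. The module and tensor shapes here are illustrative, not from the original report:

```python
import torch
import torch.nn.functional as F


class Attn(torch.nn.Module):
    """Toy module wrapping a single fused attention call."""

    def forward(self, q, k, v):
        # One logical attention block in PyTorch; in the exported
        # ONNX graph this currently appears as many constituent ops,
        # making the block hard to pattern-match on HW backends.
        return F.scaled_dot_product_attention(q, k, v)


# (batch, num_heads, seq_len, head_dim) — arbitrary example shapes
q = torch.randn(1, 4, 8, 16)
k = torch.randn(1, 4, 8, 16)
v = torch.randn(1, 4, 8, 16)
out = Attn()(q, k, v)

# Export would then be e.g.:
#   torch.onnx.export(Attn(), (q, k, v), "attn.onnx")
# The request is for this to emit a single Attention node under opset >= 23.
```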
Alternatives
No response
Additional context
No response