Create a constant to initializer pass #2156


Closed
justinchuby opened this issue Apr 1, 2025 · 3 comments · Fixed by #2160


justinchuby commented Apr 1, 2025

https://gist.github.com/justinchuby/cf1699d05baeac281fb3e82f9d0fc473

justinchuby changed the title from the gist link to Create a constant to initializer pass on Apr 1, 2025
justinchuby (Collaborator, Author) commented:

Maybe call it constant manipulation or something

titaiwangms (Contributor) commented:

If we add this pass, should it run before or after model.optimize() in the middle of exporting?
https://github.com/pytorch/pytorch/blob/203a27e0cecce5b9050218c9d6a56c8cd2ebd0a5/torch/onnx/_internal/exporter/_compat.py#L184

Where we usually run graph passes is in _core.py (before optimize):
https://github.com/pytorch/pytorch/blob/203a27e0cecce5b9050218c9d6a56c8cd2ebd0a5/torch/onnx/_internal/exporter/_core.py#L1161

justinchuby (Collaborator, Author) commented:

I think it should be called in optimize(), not in PyTorch. It's a pass to mend the behavior of optimize().
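For context, the pass under discussion would walk the graph, find Constant nodes, and lift their tensor values into the graph's initializers so downstream passes see constants uniformly. A minimal toy sketch of the idea in plain Python (the Node/Graph classes here are illustrative stand-ins, not the actual onnxscript IR API):

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    op_type: str
    output: str
    attrs: dict = field(default_factory=dict)

@dataclass
class Graph:
    nodes: list
    initializers: dict = field(default_factory=dict)

def constants_to_initializers(graph: Graph) -> Graph:
    """Move each Constant node's value into graph.initializers,
    keyed by the node's output name, and drop the node."""
    kept = []
    for node in graph.nodes:
        if node.op_type == "Constant" and "value" in node.attrs:
            graph.initializers[node.output] = node.attrs["value"]
        else:
            kept.append(node)
    graph.nodes = kept
    return graph

g = Graph(nodes=[
    Node("Constant", output="w", attrs={"value": [1.0, 2.0]}),
    Node("MatMul", output="y"),
])
constants_to_initializers(g)
print(g.initializers)                 # {'w': [1.0, 2.0]}
print([n.op_type for n in g.nodes])   # ['MatMul']
```

The inverse direction ("and back", per the original gist title) would materialize initializers as Constant nodes again, which is why a name like "constant manipulation" was floated for the pass family.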

@titaiwangms titaiwangms linked a pull request Apr 2, 2025 that will close this issue
titaiwangms added a commit to microsoft/onnxruntime that referenced this issue Apr 4, 2025
### Description

Essentially, the vision model is traced differently (this time without an attention mask), and the input indices of op.Add and op.MatMul can differ. Also, fp16 and fp32 need different tracing patterns (op.Cast).

1. Add another traced pattern to CLIP attention to cover the no-attention_mask case
2. Accept different input indices on op.Add and op.MatMul (be more general)
3. Handle the different fp16/fp32 pattern (op.Cast after op.Softmax)
4. Refactor test_fastgelu.py to cover torch.onnx.export(..., dynamo=True)
5. Add a gemma3 vision attention (SigLip) test to cover both fp16 and fp32
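Point 2 above amounts to matching a node's inputs order-insensitively, since tracing can place the mask on either input of op.Add. A hedged sketch of that idea, independent of the actual onnxruntime fusion code (the node representation and helper name here are made up for illustration):

```python
def other_add_input(node: dict, wanted: str):
    """If `wanted` is one of the Add node's two inputs, return the
    other input, regardless of which index `wanted` appears at.
    Returns None when the node doesn't match."""
    if node.get("op") != "Add":
        return None
    a, b = node["inputs"]
    if a == wanted:
        return b
    if b == wanted:
        return a
    return None

# The mask may be traced as either the first or the second input.
print(other_add_input({"op": "Add", "inputs": ["mask", "scores"]}, "mask"))  # scores
print(other_add_input({"op": "Add", "inputs": ["scores", "mask"]}, "mask"))  # scores
```

Writing the pattern this way keeps a single matcher covering both traced variants instead of duplicating the fusion rule per input order.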

### Motivation and Context

These changes are needed to optimize the Gemma3 multi-modal model.
https://huggingface.co/google/gemma-3-4b-it

NOTE: some related follow-ups (upstream optimizations to
onnxscript-optimizer):
microsoft/onnxscript#2158
microsoft/onnxscript#2156
justinchuby added a commit that referenced this issue Apr 10, 2025
Fix #2156

---------

Co-authored-by: Justin Chu <justinchuby@users.noreply.github.com>
quic-zhaoxul pushed a commit to CodeLinaro/onnxruntime that referenced this issue Apr 17, 2025