# Create a constant to initializer pass #2156
## Comments
Maybe call it constant manipulation or something.
If we make this pass, should it run before model.optimize(), or after model.optimize() in the middle of exporting? We usually run graph passes in _core.py (before optimize):
It should be called in optimize(), not in PyTorch, I think. It's a pass that mends the behavior of optimize().
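A minimal sketch of what such a pass could look like, written against the plain `onnx` protobuf API for illustration only; the real pass would live in onnxscript's IR and would also need to handle the `value_*` attribute variants, sparse constants, subgraphs, and name collisions:

```python
import onnx


def lift_constants_to_initializers(model: onnx.ModelProto) -> onnx.ModelProto:
    """Move plain Constant nodes into graph.initializer so later passes see them as initializers."""
    graph = model.graph
    kept_nodes = []
    for node in graph.node:
        # Only the simple `value` (TensorProto) form is handled here; `value_float`,
        # `value_ints`, sparse tensors, etc. would need extra handling.
        if (
            node.op_type == "Constant"
            and len(node.attribute) == 1
            and node.attribute[0].name == "value"
        ):
            tensor = node.attribute[0].t
            # Initializers are looked up by name, so reuse the node's output name.
            tensor.name = node.output[0]
            graph.initializer.append(tensor)
        else:
            kept_nodes.append(node)
    del graph.node[:]
    graph.node.extend(kept_nodes)
    return model
```

Running it inside optimize(), as suggested above, would keep the exporter code in _core.py unchanged; running it before optimize() would only change which constants the optimizer sees as initializers.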
### Description

Essentially, the vision model is traced differently (this time without an attention mask), and the input indices of op.Add and op.MatMul can differ. fp16 and fp32 also need different tracing patterns (op.Cast).

1. Add another traced pattern to CLIP attention to cover the no-attention_mask case
2. Accept different input indices on op.Add and op.MatMul (be more general)
3. Handle the different fp16 and fp32 patterns (op.Cast after op.Softmax)
4. Refactor test_fastgelu.py to cover torch.onnx.export(..., dynamo=True) (see the sketch after this comment)
5. Add a Gemma3 vision attention (SigLIP) test to cover both fp16 and fp32

### Motivation and Context

These changes are needed to optimize the Gemma3 multi-modal model: https://huggingface.co/google/gemma-3-4b-it

NOTE: some related follow-ups (upstream optimizations to the onnxscript optimizer): microsoft/onnxscript#2158, microsoft/onnxscript#2156
Fix #2156

Co-authored-by: Justin Chu <justinchuby@users.noreply.github.com>
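For item 4 above, a hedged sketch of the export call the refactored test exercises, assuming a recent PyTorch where torch.onnx.export(..., dynamo=True) returns an ONNXProgram; the FastGelu module here is an illustrative stand-in, not the actual test fixture:

```python
import torch


class FastGelu(torch.nn.Module):
    """Tanh-approximation GELU, used here only as an example module."""

    def forward(self, x):
        return 0.5 * x * (1.0 + torch.tanh(0.7978845608 * (x + 0.044715 * x * x * x)))


# Dynamo-based export path that the refactored test_fastgelu.py covers.
onnx_program = torch.onnx.export(FastGelu(), (torch.randn(2, 8),), dynamo=True)
onnx_program.save("fastgelu.onnx")
```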
https://gist.github.com/justinchuby/cf1699d05baeac281fb3e82f9d0fc473