-
Notifications
You must be signed in to change notification settings - Fork 66
Extend optimize_for_ort to cover passes #2274
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Extend optimize_for_ort to cover passes #2274
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2274 +/- ##
==========================================
- Coverage 73.76% 73.75% -0.02%
==========================================
Files 239 239
Lines 30904 30907 +3
Branches 3494 3494
==========================================
- Hits 22797 22796 -1
- Misses 6907 6911 +4
Partials 1200 1200 ☔ View full report in Codecov by Sentry. |
Please also consider whether this method should be optimize in-place or not. I think we can make it in-place now that shape-inference itself is in-place. |
I think making it out-of-place is safer, in case we have passes in the future that need to be functional? |
# https://github.com/microsoft/onnxruntime/blob/74dcf7e296639095dfa55d31336998b6f719ed76/onnxruntime/python/tools/transformers/dynamo_onnx_helper.py#L172 | ||
common_passes.ClearMetadataAndDocStringPass(), | ||
# https://github.com/microsoft/onnxruntime/blob/74dcf7e296639095dfa55d31336998b6f719ed76/onnxruntime/python/tools/transformers/dynamo_onnx_helper.py#L139 | ||
common_passes.LiftConstantsToInitializersPass(lift_all_constants=False, size_limit=1), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have another pass called LiftSubgraphInitializersToMainGraphPass
. Do we know if it's needed in genAI? @kunal-vaishnavi
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the pass logic is in DynamoOnnxHelper
, then it is used for ONNX Runtime GenAI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't really produce graphs with subgraph initializers. I think we are ok either way
@@ -135,4 +135,18 @@ def optimize_for_ort( | |||
) | |||
# Apply the ORT pattern rewrite rules. | |||
rewrite(model, ORT_PATTERN_REWRITE_RULES) | |||
return model, fusion_count | |||
|
|||
passes = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
passes = [ | |
passes = ir.passes.Sequential( |
] | ||
optimize_for_ort_passes = ir.passes.Sequential(*passes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
] | |
optimize_for_ort_passes = ir.passes.Sequential(*passes) | |
) |
I will set up Whisper and test it before merge this PR microsoft/onnxruntime#24382 |
Fix #2261
A draft for discussion. We should cover all post-processing the model shipping needs