Pass for Nvidia ModelOpt graph surgery framework #2377
jambayk merged 6 commits into microsoft:main
Signed-off-by: Hrishith Thadicherla <hthadicherla@nvidia.com>
@jambayk Can you review this PR? This is a graph surgery pass for the NVIDIA stack. The surgeries themselves will be implemented in ModelOpt; the Olive pass essentially just calls them.
Signed-off-by: Hrishith Thadicherla <hthadicherla@nvidia.com>
Force-pushed: 6e92bd3 to 5b287d5
The pipeline is failing because the graph_surgeon dependency uses onnx.helper.bfloat.float32_to_bfloat16, which has been deprecated in onnx since version 1.18 and was removed in 1.20.
This is a conflict between onnx_graphsurgeon and onnx and needs to be resolved within onnx_graphsurgeon. I have raised a PR in ModelOpt with a temporary workaround: NVIDIA/Model-Optimizer#1204. After it is merged, this failure should resolve itself.
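For context on what the removed helper did, here is a minimal NumPy sketch of a float32 to bfloat16 conversion. It is not the workaround from NVIDIA/Model-Optimizer#1204, whose contents are not shown in this thread; the function name `float32_to_bfloat16_bits`, the round-to-nearest-even policy, and the canonical NaN pattern are assumptions made for illustration.

```python
import numpy as np


def float32_to_bfloat16_bits(values):
    """Return bfloat16 bit patterns (as uint16) for float32 inputs.

    Hypothetical helper for illustration only; not an onnx or ModelOpt API.
    """
    arr = np.atleast_1d(np.asarray(values, dtype=np.float32))
    bits = arr.view(np.uint32)
    # Round to nearest even: add 0x7FFF plus the lowest bit that will be kept,
    # then drop the low 16 bits of the float32 representation.
    rounding_bias = ((bits >> 16) & 1) + np.uint32(0x7FFF)
    rounded = (bits + rounding_bias) >> 16
    # Map every NaN to one quiet-NaN pattern so rounding cannot turn it into inf.
    rounded = np.where(np.isnan(arr), np.uint32(0x7FC0), rounded)
    return rounded.astype(np.uint16)
```

bfloat16 keeps the sign bit, the 8 exponent bits, and the top 7 mantissa bits of a float32, which is why the conversion reduces to a rounded 16-bit shift.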
The PR has been merged. Can you review it again, @jambayk?
Tests need to be skipped until the new release of modelopt due to incompatibility with the latest onnx versions.
@hthadicherla Can you update this PR? We plan to release a new Olive version this Friday, and this PR will be in the new release.
Signed-off-by: Hrishith Thadicherla <hthadicherla@nvidia.com>
Force-pushed: 2b09f13 to ba3f2a7
@xiaoyu-work @jambayk I have added a skip to the test. Can you re-approve it now?
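A version-gated skip of this kind could look roughly like the sketch below. The actual condition used in test/passes/onnx/test_nvmo_graph_surgery.py is not shown in this thread, so the 1.20 threshold (taken from the removal noted above), the helper names, and the reason string are all assumptions. The standard-library `unittest.skipIf` is used here to keep the sketch dependency-free; `pytest.mark.skipif` takes the same condition-and-reason shape.

```python
import unittest

# Assumption: onnx 1.20 is the first release the current modelopt cannot work with.
FIRST_INCOMPATIBLE_ONNX = (1, 20)


def parse_major_minor(version_string):
    """Extract (major, minor) from a version string such as '1.19.1'."""
    major, minor = version_string.split(".")[:2]
    return int(major), int(minor)


def needs_skip(onnx_version):
    """True when the given onnx version is at or past the incompatible release."""
    return parse_major_minor(onnx_version) >= FIRST_INCOMPATIBLE_ONNX


def skip_if_onnx_incompatible(onnx_version):
    """Build a skip decorator; hypothetical helper, not part of Olive."""
    return unittest.skipIf(
        needs_skip(onnx_version),
        "modelopt release is incompatible with onnx >= 1.20",
    )


class GraphSurgerySmokeTest(unittest.TestCase):
    # In a real test the argument would be onnx.__version__.
    @skip_if_onnx_incompatible("1.20.0")
    def test_placeholder(self):
        self.fail("not reached while the skip condition holds")
```

Keeping the condition in a small pure function makes it easy to drop the skip once a compatible modelopt release is available.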
Describe your changes
Add NVModelOptGraphSurgery pass to integrate NVIDIA ModelOpt graph surgeries into Olive. Supports all existing surgeries in ModelOpt like GQA fusion, DQ-Transpose and all future surgeries that will be added in ModelOpt.
Changes:
- olive/passes/onnx/nvmo_graph_surgery.py
- olive_config.json
- test/passes/onnx/test_nvmo_graph_surgery.py
- pass.rst and onnx-transformations.md
Usage
Example
Checklist before requesting a review
lintrunner -a
Release note: Added NVModelOptGraphSurgery pass for running NVIDIA ModelOpt graph surgeries on ONNX models.