phi2 conversion/optimization script #19338

Merged
gh-yewang merged 49 commits into main from wangye/phi2_doc
Feb 5, 2024
Conversation

@gh-yewang
Contributor

@gh-yewang gh-yewang commented Jan 30, 2024

Description

This PR adds:
- an ONNX conversion script for dynamo-exported phi2,
- an optimization script,
- an inference example script.

A readme file is added as documentation. https://github.com/microsoft/onnxruntime/tree/wangye/phi2_doc/onnxruntime/python/tools/transformers/models/phi2#readme

Motivation and Context

@gh-yewang gh-yewang marked this pull request as ready for review January 30, 2024 22:24
Contributor

@github-advanced-security left a comment


CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

Contributor

@github-advanced-security left a comment


lintrunner found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

Comment thread onnxruntime/python/tools/symbolic_shape_infer.py Outdated
Comment thread onnxruntime/python/tools/transformers/dynamo_onnx_helper.py Fixed
Comment thread onnxruntime/python/tools/transformers/dynamo_onnx_helper.py Fixed
Comment thread onnxruntime/python/tools/transformers/models/phi2/README.md Outdated
Comment thread onnxruntime/python/tools/transformers/models/phi2/README.md
@gh-yewang
Contributor Author

TODO: add an option to export a model for vllm

Comment thread onnxruntime/python/tools/transformers/models/phi2/README.md Outdated
Comment thread onnxruntime/python/tools/transformers/models/phi2/requirements.txt Outdated

```python
def unroll_function(self, func_name: str) -> None:
    """
    Unrolls the function with the given name in the model.
    """
```
Contributor


Can this be done with the onnx inliner? https://onnx.ai/onnx/api/inliner.html#inline-selected-functions

Contributor Author


Unfortunately, no.

Contributor


Were there any limitations? Could you share more? Thanks!

Contributor Author


First, function unrolling is semantically different from the inliner: the former does not inline functions recursively. Second, I tried the inliner initially, but the output ONNX file was invalid. You can try it yourself with the original model exported from dynamo.

Contributor


Re the first point: I guess you mean you want to do selective inlining (maybe just a single function)? (The inliner API does support that.) Re the second point: where can I find the original model? Thanks!

Contributor


The onnx inliner has a `function_ids` argument to specify which functions to inline by (domain, name). For the invalid output ONNX issue, did you try with the latest onnx release? @wangyems

Contributor Author


@BowenBao I tried `function_ids` at the very beginning a few weeks ago, but let me try again with the main branch.
@gramalingam you can get the original file by running convert_to_onnx.py without arguments: https://github.com/microsoft/onnxruntime/tree/wangye/phi2_doc/onnxruntime/python/tools/transformers/models/phi2#readme

@gh-yewang gh-yewang requested a review from tianleiwu February 2, 2024 21:07
Comment thread onnxruntime/python/tools/transformers/models/phi2/README.md
@gh-yewang gh-yewang merged commit aaf32fb into main Feb 5, 2024
@gh-yewang gh-yewang deleted the wangye/phi2_doc branch February 5, 2024 18:15
gh-yewang added a commit that referenced this pull request Feb 8, 2024
### Description

1. Add an option to export ONNX compatible with ort_vllm. This ensures that the ONNX model only leverages paged attention from vllm. It is intended for internal use, so it is not mentioned in the readme.
2. Add details on ORT installation (#19338 (comment)).

### Motivation and Context

---------

Co-authored-by: wejoncy <wejoncy@163.com>
skottmckay pushed a commit that referenced this pull request Feb 15, 2024
rohan11235813 pushed a commit to quadric-io/onnxruntime that referenced this pull request Aug 19, 2025
rohan11235813 pushed a commit to quadric-io/onnxruntime that referenced this pull request Sep 15, 2025
10 participants