Conversation
CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.
lintrunner found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.
TODO: add an option to export a model for vllm
def unroll_function(self, func_name: str) -> None:
    """
    Unrolls the function with the given name in the model.
Can this be done with the onnx inliner? https://onnx.ai/onnx/api/inliner.html#inline-selected-functions
Were there any limitations? Could you share more? Thanks!
First, function unrolling is semantically different from inlining: the former does not inline functions recursively. Second, I experimented with the inliner initially, but the output ONNX file was invalid. You can try it yourself with the original model exported from dynamo.
Re. the first point: I guess you mean you want to do selective inlining (maybe just a single function)? The inliner API does support that. Re. the second point: where can I find the original model? Thanks!
The onnx inliner has a function_ids argument to specify which functions to inline, identified by (domain, name). For the invalid-output issue, did you try with the latest onnx release? @wangyems
@BowenBao I tried function_ids at the very beginning, a few weeks ago, but let me try again with the main branch.
@gramalingam you can get the original file by running convert_to_onnx.py without arguments: https://github.com/microsoft/onnxruntime/tree/wangye/phi2_doc/onnxruntime/python/tools/transformers/models/phi2#readme
### Description
1. Add an option to export an ONNX model compatible with ort_vllm. This ensures the ONNX model relies only on the paged attention from vLLM. It is intended for internal use, so it is not mentioned in the readme.
2. Add details on ORT installation (#19338 (comment)).

### Motivation and Context

---------
Co-authored-by: wejoncy <wejoncy@163.com>
Description
This PR adds:
1. an ONNX conversion script for dynamo-exported Phi-2,
2. an optimization script,
3. an inference example script.
A readme file is added as documentation: https://github.com/microsoft/onnxruntime/tree/wangye/phi2_doc/onnxruntime/python/tools/transformers/models/phi2#readme
Motivation and Context