Add the ability to choose the ONNX runtime execution provider in ORTModel
#137
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
Looks good to me, thanks for this addition!
```diff
@@ -130,12 +130,16 @@ class ModelArguments:
         default=None,
         metadata={"help": "Where do you want to store the pretrained models downloaded from huggingface.co"},
     )
+    ort_provider: str = field(
```
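The hunk is truncated at the new field. A hypothetical completion of that field, assuming a CPU default (the actual default value and help text in the PR may differ):

```python
from dataclasses import dataclass, field

@dataclass
class ModelArguments:
    # ... existing arguments such as cache_dir elided ...

    # Hypothetical completion of the truncated hunk above; the real
    # default and help string in the PR may differ.
    ort_provider: str = field(
        default="CPUExecutionProvider",
        metadata={"help": "ONNX Runtime execution provider used for inference."},
    )
```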
Not sure if `ort_provider` should be an attribute of `OptimizationArguments`.
Also, when `optimization_level > 1`, the optimized graph might be hardware-dependent: `optimize_for_gpu` and `ort_provider` should be set accordingly.
@echarlaix What do you think of removing the argument in the examples and simply using the following?

```python
if optim_args.opt_level > 1 and optim_args.optimize_for_gpu:
    execution_provider = "CUDAExecutionProvider"
else:
    execution_provider = "CPUExecutionProvider"
```
I think it could make sense to let the user select the execution provider, both to keep the same behavior as in the quantization examples and because the user might want to use `CUDAExecutionProvider` with an optimization level of 1.
I see, makes sense. In this case, you suggest keeping `ort_provider` as a parameter with `CPUExecutionProvider` as the default, while overwriting `optim_args.optimize_for_gpu` to `True` if `optimization_level > 1`? Is this what you were suggesting above?
My proposal is to let the user select the optimization parameters freely (and not overwrite anything), but to raise an error when those parameters are incompatible, in order to keep things as transparent and understandable as possible. What do you think?
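For illustration, a minimal sketch of what such incompatibility checks could look like, reusing the argument names from the snippet above (the exact conditions and error messages are assumptions, not the PR's actual code):

```python
# Sketch: reject provider/optimization combinations that would produce a
# graph the chosen execution provider cannot run. Names are assumptions.
if optim_args.opt_level > 1:
    if optim_args.optimize_for_gpu and model_args.ort_provider == "CPUExecutionProvider":
        raise ValueError(
            "opt_level > 1 with optimize_for_gpu=True produces a GPU-specific graph; "
            "use CUDAExecutionProvider instead of CPUExecutionProvider."
        )
    if not optim_args.optimize_for_gpu and model_args.ort_provider == "CUDAExecutionProvider":
        raise ValueError(
            "opt_level > 1 with optimize_for_gpu=False produces a CPU-specific graph; "
            "set optimize_for_gpu=True to run it with CUDAExecutionProvider."
        )
```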
Agreed. As you saw, I added the following checks: https://github.com/fxmarty/optimum/blob/638e80e2bd8e86630162a964f98ae51d771fb407/examples/onnxruntime/optimization/text-classification/run_glue.py#L202-L222. I hope it is not too heavy.
It looks great, thanks for the addition!
What does this PR do?

A small PR to add the ability to choose the ONNX Runtime execution provider in `ORTModel`. I did not change the `README.md` for readability. Also, I added the `ort_provider` parameter to `OptimizationArguments` in the examples, which is maybe suboptimal.
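For background, choosing an execution provider in plain ONNX Runtime looks like the sketch below; presumably the new `ort_provider` argument is forwarded to the underlying `InferenceSession` in some such way (the forwarding detail and the model path are assumptions):

```python
import onnxruntime as ort

# List the providers available in this onnxruntime build, e.g.
# ["CUDAExecutionProvider", "CPUExecutionProvider"] for a GPU build.
print(ort.get_available_providers())

# Create a session pinned to a preferred provider; onnxruntime falls back
# to the next provider in the list if the first one is unavailable.
# "model.onnx" is a placeholder path.
session = ort.InferenceSession(
    "model.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
```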