🦅 Phi2 Fine tune example #1030

Conversation
Please make the adapter merge optional.
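A minimal sketch of what making the merge optional could look like, assuming a hypothetical `--merge_adapter` argparse flag (the flag name is illustrative, not from this PR):

```python
# Hypothetical sketch: expose the adapter merge as an opt-in CLI flag.
parser.add_argument(
    "--merge_adapter",
    action="store_true",
    help="If set, merge the fine-tuned adapter weights into the base model.",
)
```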
@@ -35,7 +53,7 @@ Above commands will generate optimized models with given model_type and save the
Besides, for a better generation experience, this example also lets users use [Optimum](https://huggingface.co/docs/optimum/v1.2.1/en/onnxruntime/modeling_ort) to generate optimized models.
Then users can call `model.generate` easily to run inference with the optimized model.
```bash
# optimum optimization
```
If we don't support it, should we remove this?
For models that were not fine-tuned, Optimum works well.
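For context, a minimal sketch of the Optimum path under discussion, assuming the optimized model and tokenizer were saved to a local directory (`phi2_optimized` is a placeholder name, not from the PR):

```python
from optimum.onnxruntime import ORTModelForCausalLM
from transformers import AutoTokenizer

# "phi2_optimized" is a placeholder for the directory holding the exported ONNX model.
model = ORTModelForCausalLM.from_pretrained("phi2_optimized")
tokenizer = AutoTokenizer.from_pretrained("phi2_optimized")

inputs = tokenizer("Write a haiku about spring.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```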
```python
if args.finetune_method:
    pass_flows[0].append(args.finetune_method)
    template_json["systems"]["local_system"]["config"]["accelerators"][0]["device"] = "gpu"
    # torch fine tuning does not require an execution provider; just set it to CUDAExecutionProvider
```
I thought an EP was no longer mandatory after Mike's PR got merged. Is it still required?
If we don't provide an EP, it would loop over the installed EPs, which is not what we want.
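As an illustration of pinning the EP instead of letting the workflow enumerate them, a sketch that assumes the accelerator entry accepts an `execution_providers` list next to `device` (the key name is an assumption, not confirmed by this thread):

```python
# Sketch: pin a single EP on the accelerator so the workflow does not
# loop over every execution provider installed in the environment.
template_json["systems"]["local_system"]["config"]["accelerators"][0] = {
    "device": "gpu",
    "execution_providers": ["CUDAExecutionProvider"],  # assumed key name
}
```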
@jambayk, will "loop over the installed EPs" still be executed even if we are not running any pass that needs onnxruntime, like lora/snpe/openvino?
Yes, we haven't made the changes that would skip this for workflows with no ORT-targeting passes.
Mike's PR didn't really change any behavior of the workflows; it only updated the configs to collect the hardware/EP-related options together.
Yes, but the qlora/snpe/openvino passes are EP-agnostic, so even with "loop over the installed EPs", each pass only runs once.
The cache does take care of the rerun, but the looping behavior and the multiple footprints/outputs for workflows with no ONNX models are still not ideal. That's what would be good to improve.
@@ -59,9 +59,17 @@ def get_args(raw_args):
parser.add_argument(
    "--model_type",
What if `model_type` and `finetune_method` are None?
Fixed by raising an error.
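A minimal sketch of that validation, assuming plain argparse results (the exact message and placement are illustrative):

```python
# Sketch: reject the case where neither option was provided.
if args.model_type is None and args.finetune_method is None:
    raise ValueError("Either --model_type or --finetune_method must be provided.")
```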
…/phi2_fine_tune
Describe your changes
- Set `torch_dtype` as `float32` to avoid the inconsistent weights shape (adapters in bf16, but the base model in fp32); see the sketch below.
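A sketch of that dtype choice, assuming the base model is loaded with `transformers` (the loading details are illustrative; only the `torch_dtype` choice comes from the PR):

```python
import torch
from transformers import AutoModelForCausalLM

# Load the base model in fp32 so the bf16 adapters can be merged without
# dtype-induced weight mismatches, as described in the change note above.
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2",  # assumed model id for this example
    torch_dtype=torch.float32,
    trust_remote_code=True,
)
```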
Checklist before requesting a review
- `lintrunner -a`
(Optional) Issue link