Problem with latency #113
I've yet to use the library; I've only looked at the documentation. Could be something along the lines of:
Note the
Hello @PetroMaslov, to use ONNX with a PEFT LoRA model, please refer to the example in PR #118. Let us know if that solves the issue.
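The latency concern in this thread comes from the extra adapter matmuls LoRA adds at inference time. A toy sketch in plain Python (made-up dimensions and values, not the peft API) of why folding the adapter into the base weight, W' = W + (alpha/r)·B·A, removes that overhead while producing identical outputs:

```python
# Toy illustration (plain Python, not the peft API) of merging LoRA
# weights: the merged single-matmul path matches the base-plus-adapter
# path exactly. All shapes and values here are made up for the example.

def matvec(M, x):
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]

def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

d_in, d_out, r, alpha = 4, 3, 2, 8
scale = alpha / r

W = [[0.1 * (i + j) for j in range(d_in)] for i in range(d_out)]  # frozen base weight
A = [[0.01 * (i - j) for j in range(d_in)] for i in range(r)]     # LoRA down-projection
B = [[0.02 * (i + 1) for _ in range(r)] for i in range(d_out)]    # LoRA up-projection
x = [1.0, -2.0, 0.5, 3.0]

# Unmerged path: base matmul plus two extra adapter matmuls per layer.
y_adapter = [w + scale * b
             for w, b in zip(matvec(W, x), matvec(B, matvec(A, x)))]

# Merged path: fold scale * B @ A into W once, then a single matmul.
BA = matmul(B, A)
W_merged = [[w_ij + scale * ba_ij for w_ij, ba_ij in zip(w_row, ba_row)]
            for w_row, ba_row in zip(W, BA)]
y_merged = matvec(W_merged, x)

assert all(abs(a - b) < 1e-9 for a, b in zip(y_adapter, y_merged))
```

In peft this folding is what `merge_and_unload()` does; after merging, the model can be exported to ONNX like a plain transformers model, with no adapter overhead left at inference.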
@pacman100
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed, please comment on this thread.
@PetroMaslov after converting the LoRA-finetuned T5-large model to ONNX, did you see degraded model outputs? With a bigscience/bloom base model, I can run inference after exporting to ONNX, but the model's predictions become nonsensical 🤔 #670
Hi!
I trained t5-large using the following LoRA config:
I saved the model with this code:
model.save_pretrained(path)
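Note that calling `save_pretrained` on a PEFT model writes only the adapter weights and `adapter_config.json`, not the full t5-large weights; loading later requires the base model plus the adapter. A back-of-the-envelope sketch (t5-large's hidden size is 1024; the rank and layer count here are hypothetical) of why the adapter checkpoint is so much smaller than the base model:

```python
# Back-of-the-envelope sketch of why a PEFT save_pretrained() checkpoint
# is tiny: it stores only the LoRA A/B matrices, not the base weights
# they modify. Rank and layer count below are hypothetical.

d_model = 1024          # t5-large hidden size
r = 8                   # hypothetical LoRA rank
n_target_layers = 48    # hypothetical number of adapted projections

full_params_per_layer = d_model * d_model
lora_params_per_layer = r * d_model + d_model * r   # A (r x d) + B (d x r)

full_total = n_target_layers * full_params_per_layer
lora_total = n_target_layers * lora_params_per_layer

ratio = lora_total / full_total   # = 2r / d_model
print(f"LoRA stores {lora_total:,} params vs {full_total:,} "
      f"({ratio:.2%} of the adapted weights)")
```

So the saved adapter holds roughly 2r/d of the parameters of each weight it adapts; to ship a single standalone checkpoint instead, merge the adapter into the base model first and save the merged model.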
I have two questions:
Maybe I need to save the model in another way?
Could you please help me understand what I'm doing wrong?