Poor results when fine-tuning with alpaca_data.json and suggested settings.
#326
Comments
Do we have any randomness in the training? I did not see any code snippets that control randomness, such as …
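For reference, a minimal sketch of how randomness is usually pinned down in a Hugging Face training script (this is an assumption about what "controlling randomness" would look like, not code taken from this repo):

```python
# Not code from this repo: just the usual way to make a transformers/PyTorch
# training run reproducible when no explicit seeding is present.
from transformers import set_seed

set_seed(42)  # seeds Python's random, NumPy, and PyTorch (CPU and CUDA) generators
```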
A single epoch is too low. The current params with 4 LoRA modules and rank 16 need closer to 10 epochs. Also, it's better to use a cleaned Alpaca dataset instead of the legacy one.
Can I ask which four modules are used with the current "meta" parameters?
See just below: --group_by_length, though I do not recommend it.
Thanks!
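For context on the "4 LoRA modules and rank 16" mentioned above, a peft config along these lines is a reasonable sketch. The specific target modules (q_proj/k_proj/v_proj/o_proj) and hyperparameters here are assumptions based on common LLaMA fine-tuning setups, not the repo's authoritative settings; check the finetune.py flags in the README for the exact values.

```python
# Sketch only: target_modules and hyperparameters are illustrative.
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                                                     # LoRA rank discussed above
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # the "4 modules" (assumed)
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
```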
The reason it generated "### Instruction" is that your fine-tuning is insufficient. During fine-tuning we put an eos_token_id=2 into the tensor for each instance, so at a minimum your model weights need to learn when to generate "2" at the end of the output. In your example 1, the model was actually looping the instruction and response until it reached the max tokens; the `return output.split(self.template["response_split"])[1].strip()` in utils/prompter.py helps to cut the rest, but a "### Instruction ..." still remained.
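To illustrate that point, here is a rough sketch (not the exact code in utils/prompter.py, and the sample output is invented) of why the split leaves a trailing "### Instruction: ..." when the model never emits EOS:

```python
# Hypothetical model output: without a learned EOS token, the model keeps looping
# instruction/response pairs until it hits the max-token limit.
raw_output = (
    "### Instruction:\nTell me about alpacas.\n"
    "### Response:\nAlpacas are domesticated South American camelids.\n"
    "### Instruction:\nTell me about camels.\n"
    "### Response:\n..."
)

response_split = "### Response:"
# Splitting on the response marker and taking element [1] keeps only the text after
# the FIRST marker, so the looped "### Instruction: Tell me about camels." survives.
print(raw_output.split(response_split)[1].strip())
```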
I met the same problem. So should I just train for more epochs?
I also met this issue, and I'm using the command in the README. The problem may not be inadequate training?
Checklist: 3. Update the finetune.py file: add_eos_token=True.
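For anyone applying checklist item 3, the kind of change intended looks roughly like this. Names and defaults here are illustrative, so compare against the actual tokenize function in finetune.py:

```python
# Illustrative sketch, not the repo's exact code: the key point is appending the
# EOS token id so the model learns when to stop instead of looping instructions.
def tokenize(prompt, tokenizer, cutoff_len=256, add_eos_token=True):
    result = tokenizer(
        prompt,
        truncation=True,
        max_length=cutoff_len,
        padding=False,
        return_tensors=None,
    )
    if (
        add_eos_token
        and len(result["input_ids"]) < cutoff_len
        and result["input_ids"][-1] != tokenizer.eos_token_id
    ):
        result["input_ids"].append(tokenizer.eos_token_id)  # eos_token_id is 2 for LLaMA
        result["attention_mask"].append(1)
    result["labels"] = result["input_ids"].copy()
    return result
```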
Thanks for the quick reply! The problem was solved. #293
Thanks for all your responses. They are all useful for training.
@lywinged may I ask why you use an old version of peft? What problem does the latest peft cause? Thank you so much.
Because they had a bug where the adapter model would be overwritten and cleared after training, but they may have fixed it by now.
Env Settings
Running scripts:
Training log
Inference Results:
In example 1, some unrelated text is generated, namely "### Instruction: Tell me about camels." In example 2, the result is related but may not be correct.