Are there plans to open-source the finetuning code later, and will you try LoRA? #3
Comments
We are cleaning up the code and will open-source it later. In the meantime, much of the existing open-source code for finetuning decoder-only models can be used to finetune our model: if you need to train right away, take one of those codebases and simply replace its checkpoint with our open-sourced model. We currently have no plans to try LoRA.
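As a rough illustration of the swap-the-checkpoint suggestion above, here is a minimal sketch using the Hugging Face transformers Trainer. The model id, dataset, and hyperparameters below are placeholders of my own, not values from this project.

```python
from torch.utils.data import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

CHECKPOINT = "your-org/released-model"  # hypothetical id: substitute the open-sourced checkpoint

tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
if tokenizer.pad_token is None:          # some tokenizers ship without a pad token
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(CHECKPOINT)

class InstructionDataset(Dataset):
    """Tokenizes prompt+response strings for causal-LM finetuning."""
    def __init__(self, pairs, max_len=512):
        self.items = []
        for prompt, response in pairs:
            enc = tokenizer(prompt + response, truncation=True,
                            padding="max_length", max_length=max_len,
                            return_tensors="pt")
            self.items.append({
                "input_ids": enc["input_ids"][0],
                "attention_mask": enc["attention_mask"][0],
                # standard causal-LM loss: labels are the inputs themselves
                "labels": enc["input_ids"][0].clone(),
            })
    def __len__(self):
        return len(self.items)
    def __getitem__(self, i):
        return self.items[i]

train_data = InstructionDataset([("Translate to French: hello", " bonjour")])

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=4,
    num_train_epochs=3,
    learning_rate=2e-5,   # illustrative hyperparameters, not the project's
    logging_steps=10,
)

Trainer(model=model, args=args, train_dataset=train_data).train()
```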
I've tried LoRA and it didn't work well: once the model gets large, the training loss abruptly collapses to 0 and the eval loss is NaN across the board.
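For context on the comment above, a typical LoRA wiring with the peft library looks like the sketch below. This is an assumed setup, not the commenter's actual script; the bf16 choice reflects the common observation that fp16 overflow can produce exactly this loss-collapse/NaN pattern, though that is an assumption here, not a confirmed fix.

```python
import torch
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-7b1",      # assumption: the BLOOM-7B base discussed in this thread
    torch_dtype=torch.bfloat16,  # bf16 instead of fp16; fp16 overflow is a common
)                                # suspect when the loss collapses and eval goes NaN

config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                 # illustrative rank, not a tuned value
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["query_key_value"],  # BLOOM's fused attention projection
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # sanity check: only the adapter weights train
```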
Thanks for the reply! Could you share how many GPUs were used to train BLOOM-7B?
@TccccD I reproduced Stanford's training myself: 4×A100, about an hour and a half.
Do you mean training BLOOM-7B with Stanford's training setup, or LLaMA?
You mean the run on the ~50k English prompt dataset? That does seem feasible.
Hi, thanks to the authors for open-sourcing the data and models. Full-model finetuning and LoRA scripts are available for reference at: https://github.com/feizc/MLE-LLaMA