Code for training LLaMA Pro? #2
Comments
Yes, of course. I will organize the code soon. Thanks for your interest.
Hey @hills-code, could you also add code for converting a model by adding identity blocks for training?
I have added the block expansion script under the scripts folder. You can check it for reference. Hope it will be helpful!
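For reference, here is a minimal sketch of what block expansion looks like with a Hugging Face `LlamaForCausalLM`. This is not the repo's script; the checkpoint name, the `num_groups` value, and the helper name are illustrative assumptions. The key idea from the paper is that each copied block has its output projections (`o_proj`, `down_proj`) zeroed, so the residual connection makes the new block an identity mapping at initialization.

```python
# Minimal block-expansion sketch (illustrative, not the official script).
import copy
import torch
from transformers import LlamaForCausalLM

def expand_blocks(model, num_groups=8):
    """Interleave one copied, identity-initialized block after every group of layers."""
    layers = model.model.layers
    group_size = len(layers) // num_groups
    new_layers = torch.nn.ModuleList()
    for i, layer in enumerate(layers):
        new_layers.append(layer)
        if (i + 1) % group_size == 0:
            new_block = copy.deepcopy(layer)
            # Zero the output projections so the residual path makes the
            # copied block behave as an identity at initialization.
            new_block.self_attn.o_proj.weight.data.zero_()
            new_block.mlp.down_proj.weight.data.zero_()
            new_layers.append(new_block)
    model.model.layers = new_layers
    model.config.num_hidden_layers = len(new_layers)
    # Note: depending on the transformers version, each layer's
    # self_attn.layer_idx may also need re-indexing for KV caching.
    return model

model = LlamaForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")  # illustrative checkpoint
model = expand_blocks(model, num_groups=8)  # 32 layers + 8 new blocks -> 40 layers
model.save_pretrained("llama-2-7b-expanded")
```

With 32 original layers and 8 groups this yields 40 layers, matching the LLaMA Pro-8B configuration described in the paper.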
I will check that out, thanks!
@hills-code I was able to get it to work! By the way, do you only train the added layers, and not even the lm head?
@raghavgarg97 No, I don't think you can skip the "post-training" of the extra blocks on a new corpus (e.g. bigcode, as mentioned in the paper) before applying instruction tuning, because the purpose of LLaMA Pro is essentially to add new abilities without forgetting by introducing more layers. The new ability is gained by training the new layers on the new corpus.
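To make the "only the added layers are trained" point concrete, here is a sketch of the freezing step, assuming the expanded model from the snippet above (where every `(group_size + 1)`-th layer is a new identity-initialized copy). The helper name is hypothetical; the point is that embeddings, the lm head, and the original blocks stay frozen while only the expanded blocks receive gradients.

```python
# Freeze everything except the newly inserted blocks (illustrative sketch).
def freeze_except_new_blocks(model, group_size=4):
    # With the interleaving above, new blocks sit at indices 4, 9, 14, ...
    new_layer_indices = {
        i for i in range(len(model.model.layers)) if (i + 1) % (group_size + 1) == 0
    }
    for p in model.parameters():
        p.requires_grad = False  # freeze embeddings, lm_head, and original blocks
    for i, layer in enumerate(model.model.layers):
        if i in new_layer_indices:
            for p in layer.parameters():
                p.requires_grad = True  # only the expanded blocks are trained
    return model
```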
Does anybody know how to continue pre-training LLaMA Pro? @yhyu13 @raghavgarg97 @yxgeee @hills-code
Nah, the code has not been released yet.
Hi,
If I understand correctly, you used code from https://github.com/allenai/open-instruct as a base.
Would you release the full code for reproducing LLaMA2 Pro 8B?
Thanks!