Questions related to training #11
Comments
Thanks for your interest in our work! The direct answer to your Q1 is yes. We found that the best way to train a pre-experienced model is to account for diversity, so we compute embeddings for all the data and then select by diversity. For the second question, I don't know which base model and SFT data you are using, so I can't give a definite answer, but in most situations you don't need to modify it.
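The diversity-based selection described above could be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the greedy farthest-point strategy and the `select_diverse` helper are my assumptions about one simple way to "gain embeddings for all the data and select by diversity".

```python
import math

def cosine_distance(a, b):
    # 1 - cosine similarity between two embedding vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

def select_diverse(embeddings, k):
    """Greedy farthest-point selection (hypothetical sketch):
    pick k indices whose embeddings are maximally spread out."""
    selected = [0]  # seed with the first sample
    while len(selected) < k:
        best_idx, best_dist = None, -1.0
        for i, emb in enumerate(embeddings):
            if i in selected:
                continue
            # distance to the nearest already-selected sample
            d = min(cosine_distance(emb, embeddings[j]) for j in selected)
            if d > best_dist:
                best_idx, best_dist = i, d
        selected.append(best_idx)
    return selected
```

With embeddings computed for the full dataset, such a routine would return a small, diverse subset of indices to train the pre-experienced model on, rather than the whole dataset.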
Thank you for your reply! Because the earlier text describes "Learning from Brief Experience" by selecting a small amount of data, I wasn't sure it was right to feed all the data in for training. I'll try it. Thank you.
Ah, I'm not sure if there is still a misunderstanding. The pre-experienced model does indeed need only a small amount of data. The script "pre_experience_analysis.sh" you asked about does not "put all the data into training"; it selects a suitable small subset of the data for training the pre-experienced model.
Thank you for your reply. Maybe I didn't phrase my question accurately. Is my understanding correct? Thank you again.
Yes, I think you are correct~ |
Thank you for sharing this work!
I'm trying to train models with my Chinese SFT data, and I have a few questions:
My first step is to run "pre_experience_analysis.sh", but it seems to process all of my JSON data. Is that expected? It takes a long time, and the "start_idx" and "end_idx" arguments of "data_analysis.py" are not set in your code.
Do I need to modify the code for my own Chinese SFT data, or can I use it as is?