About ChatGPT three steps #2793
Comments
Same question. I also want to know how to use ColossalAI to implement the three-step training of ChatGPT.
I think 'Train with dummy prompt data' is the 3rd step of ChatGPT.
I have the same problem, please help.
I think `train_prompts.py` is the first step, training SFT, and `train_reward_models.py` is the second step, training the RM.
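For reference on the reward-model (RM) step, here is a minimal sketch of the usual recipe, assuming a GPT-2 backbone with a scalar value head and a pairwise ranking loss over a preferred ("chosen") and a worse ("rejected") response. The model name and data below are placeholders for illustration, not the repo's actual `train_reward_models.py`:

```python
# Minimal reward-model sketch: a GPT-2 backbone with a scalar value head,
# trained with a pairwise ranking loss so that the "chosen" response scores
# higher than the "rejected" one. Model name and data are placeholders.
import torch
import torch.nn as nn
from transformers import GPT2Model, GPT2Tokenizer

class RewardModel(nn.Module):
    def __init__(self, backbone_name="gpt2"):
        super().__init__()
        self.backbone = GPT2Model.from_pretrained(backbone_name)
        self.value_head = nn.Linear(self.backbone.config.hidden_size, 1)

    def forward(self, input_ids, attention_mask):
        hidden = self.backbone(input_ids=input_ids,
                               attention_mask=attention_mask).last_hidden_state
        # Score each sequence from the hidden state of its last non-padding token.
        last_index = attention_mask.sum(dim=1) - 1
        last_hidden = hidden[torch.arange(hidden.size(0)), last_index]
        return self.value_head(last_hidden).squeeze(-1)

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
model = RewardModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)

# Placeholder pair: same prompt, one preferred answer and one worse answer.
chosen = "Q: What is RLHF? A: Reinforcement learning from human feedback."
rejected = "Q: What is RLHF? A: I don't know."

batch = tokenizer([chosen, rejected], return_tensors="pt", padding=True)
scores = model(batch["input_ids"], batch["attention_mask"])
# Pairwise ranking loss: -log(sigmoid(score_chosen - score_rejected)).
loss = -torch.nn.functional.logsigmoid(scores[0] - scores[1])
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

In practice the chosen/rejected pairs come from human preference data, and the resulting scorer is what the third (RLHF) step later uses as its reward signal.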
Thank you for your reply.
Looking at the paper, I think the first and second steps use prompt data, and the last step does not seem to require prompt data, but I'm not sure either. In addition, do you know how to use the trained model for inference or deployment prediction?
As we can see from the figure in the paper, the third step uses the prompt data and the GPT-3 model to generate some results, and then uses reinforcement learning to learn how to choose better responses. So I think the third step is actually doing prompt training as well. As for the model training, I am also still exploring it, and there is a lack of data at the moment.
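To make that concrete, here is a heavily simplified sketch of what the third step roughly does: sample a response to a prompt from the current policy, score it with a reward model, and push the policy toward higher-reward outputs. This is a REINFORCE-style illustration with a placeholder reward function, not the PPO trainer used in this repo:

```python
# Simplified RLHF (step 3) sketch: generate responses, score them with a
# reward model, and increase the log-probability of high-reward responses.
# The reward function and model name below are placeholders.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
actor = GPT2LMHeadModel.from_pretrained("gpt2")  # stand-in for the SFT model from step 1
optimizer = torch.optim.Adam(actor.parameters(), lr=1e-5)

def reward_fn(texts):
    # Placeholder reward: in practice this is the reward model trained in step 2.
    return torch.tensor([min(len(t.split()), 50) / 50.0 for t in texts])

prompts = ["Explain reinforcement learning in one sentence."]

for prompt in prompts:
    inputs = tokenizer(prompt, return_tensors="pt")
    prompt_len = inputs["input_ids"].shape[1]

    # Sample a response from the current policy.
    gen = actor.generate(**inputs, do_sample=True, max_new_tokens=32,
                         pad_token_id=tokenizer.eos_token_id)
    text = tokenizer.decode(gen[0], skip_special_tokens=True)
    reward = reward_fn([text])

    # Re-run the forward pass to get log-probs of the sampled response tokens.
    logits = actor(gen).logits[:, :-1, :]
    labels = gen[:, 1:]
    logprobs = torch.log_softmax(logits, dim=-1).gather(-1, labels.unsqueeze(-1)).squeeze(-1)
    response_logprob = logprobs[:, prompt_len - 1:].sum(dim=-1)

    # REINFORCE: raise the log-prob of responses in proportion to their reward.
    loss = -(reward * response_logprob).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

A real PPO setup typically also keeps a frozen reference model and a critic, and adds a KL penalty so the policy does not drift too far from the SFT model.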
I think inference works like GPT-2: it also predicts tokens one by one, so you can load the last trained model and run inference the same way as with GPT-2.
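Along those lines, a minimal inference sketch, assuming the trained actor was saved in Hugging Face format (the checkpoint path and sampling settings below are placeholders):

```python
# Minimal GPT-2-style inference sketch: load the last trained model and
# generate a completion token by token. The checkpoint path is a placeholder.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

model_path = "path/to/trained_actor"  # placeholder: wherever the trained model was saved
tokenizer = GPT2Tokenizer.from_pretrained(model_path)
model = GPT2LMHeadModel.from_pretrained(model_path)
model.eval()

prompt = "How do I train a reward model?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    top_p=0.9,
    temperature=0.7,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```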
OK, my QQ: 805650606
Thanks for your reply!
Now we need to do the fine-tuning (1st) step by ourselves. Do you know of any fine-tuning code that could be integrated into this project?
Training with the Transformers framework is relatively simple, and there are plenty of tutorials on the web for fine-tuning, or you can refer to the official documentation.
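For example, a bare-bones SFT run with the Hugging Face Trainer could look roughly like this; the base model, dataset file, and hyperparameters are placeholders rather than the project's own SFT script:

```python
# Bare-bones supervised fine-tuning (SFT) sketch with Hugging Face Transformers.
# The model name, dataset file, and hyperparameters are placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "gpt2"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Placeholder dataset with a "text" column of prompt+response strings.
dataset = load_dataset("json", data_files="sft_data.json")["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sft_out", num_train_epochs=1,
                           per_device_train_batch_size=2, learning_rate=5e-5),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("sft_out")
```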
Thank you for your feedback, and sorry about the late reply.
Thanks! Could you please add vanilla inference code for ChatGPT?
Could you show this simple SFT code?
I have the same problem too. Could you show this simple SFT code?
Hi @graciechen @wqw547243068 @cloudfool We have updated a lot. Please check the latest code and doc. |
📚 The doc issue
Hello author!
I'm not sure about the training correspondence; maybe my understanding is wrong.
In /applications/ChatGPT/examples/, as far as I understand:
'Train with dummy prompt data' is the first step of ChatGPT,
'Train the reward model' is the second step of ChatGPT,
but I don't know how to do the third step (RLHF, fine-tuning the pre-trained language model with the reward model), and what is the 'Train with real prompt data' step about?