Hi author, why does the training data for the pretraining stage also use a prompt/response format? My understanding is that pretraining should be unsupervised language learning. What difference in results would there be between feeding in raw natural-language text versus prompt/response formatted data? What was your reason for organizing the pretraining data this way?
T5 is an encoder-decoder model, i.e. it does conditional generation. Doing unsupervised pretraining in a causal-LM-style format would have required major changes to the training code.
Pretraining with the <prompt, response> format was something of an experiment. As for its effect, I haven't run any ablation studies, so I can't draw conclusions.
If you need a decoder-only model with unsupervised pretraining, please refer to the project Phi2-mini-Chinese.
Why I organized the pretraining data as text-to-text: I was inspired by Microsoft's phi series of models, which are pretrained on high-quality datasets and perform very well, and high-quality text-to-text data is the easiest to obtain.
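The distinction the reply draws can be sketched with a minimal, hypothetical example of how one training example is structured for each architecture (the function and field names are illustrative, not the project's actual code; a whitespace split stands in for a real tokenizer):

```python
def seq2seq_example(prompt: str, response: str) -> dict:
    # T5-style conditional generation: the encoder reads the prompt and
    # the decoder is trained to emit the response, so a <prompt, response>
    # pair maps directly onto the model's input/label structure.
    return {"encoder_input": prompt, "decoder_labels": response}

def causal_lm_example(text: str) -> dict:
    # Decoder-only causal LM: one continuous token stream; the labels are
    # the input shifted by one position (next-token prediction), with no
    # built-in prompt/response split.
    tokens = text.split()  # stand-in for a real tokenizer
    return {"input_ids": tokens[:-1], "labels": tokens[1:]}

pair = seq2seq_example("What is the capital of France?", "Paris.")
stream = causal_lm_example("Paris is the capital of France .")
```

The point is structural: `<prompt, response>` data drops straight into the seq2seq format T5 already expects, whereas a causal LM only needs (and only consumes) a flat text stream.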
Thanks for the reply!