About the training corpus #36

Closed

ZenXir opened this issue Apr 4, 2023 · 1 comment

ZenXir commented Apr 4, 2023 (edited)

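For multi-turn dialogue training, the provided corpus comes in these forms:
Dialogues between User:xxx and GPT:xxx
Dialogues between User:xxx and ChatGPT:xxx
Dialogues between User:xxx and Assistant:xxx
Dialogues between Human:xxx and Assistant:xxx

The role labels are inconsistent. Does this affect multi-turn training? Do the role labels need to be unified?
If they do not need to be unified, can any multi-turn dialogue corpora simply be merged as long as the format is correct?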

Owner

Facico commented Apr 4, 2023

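The global instruction used in finetuning does not specify which role is which (as in the dialogue scenarios in interaction and chat), so in theory training with a variety of role names should make the model somewhat more robust. However, when the corpus is not very large, I think unifying the roles will probably work a bit better.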
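
For anyone who does want to unify the roles before merging corpora, here is a minimal sketch of one way to do it. It is not code from this repository: the `ROLE_MAP` table, the `normalize_roles` helper, and the choice of `User` / `Assistant` as the target labels are all assumptions made for illustration, and it assumes each turn starts on its own line with a role label followed by a colon.

```python
import re

# Hypothetical mapping from the labels seen in the corpora to one unified pair.
# The target names "User" and "Assistant" are an assumption, not a project convention.
ROLE_MAP = {
    "user": "User",
    "human": "User",
    "gpt": "Assistant",
    "chatgpt": "Assistant",
    "assistant": "Assistant",
}

# A role label at the start of a line, followed by an ASCII or full-width colon.
_ROLE_RE = re.compile(
    r"^\s*(User|Human|GPT|ChatGPT|Assistant)\s*[:：]\s*", re.IGNORECASE
)


def normalize_roles(dialogue: str) -> str:
    """Rewrite the leading role label of each turn to the unified label."""
    out = []
    for line in dialogue.splitlines():
        m = _ROLE_RE.match(line)
        if m:
            role = ROLE_MAP[m.group(1).lower()]
            # Keep the turn text, replace only the label and its colon.
            line = f"{role}:{line[m.end():]}"
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    print(normalize_roles("Human:hi\nChatGPT:hello"))
```

Lines that do not start with a known role label are passed through unchanged, so continuation lines inside a long turn are left alone.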

Facico closed this as completed
