Doesn't conversation data need to be organized into QA pairs? #1

Closed
standyyyy opened this issue Mar 7, 2024 · 2 comments
Comments

@standyyyy

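Something puzzles me. I have also been looking into fine-tuning on chat logs recently, and most approaches I found require organizing the data into QA pairs. In practice, though, the message right after a question is not necessarily its answer, so I am unsure how the data should be organized.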

@xming521
Owner

xming521 commented Mar 7, 2024

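Yes, the data does need to be organized into QA pairs. I provide a script for this, but it can only apply a set of rules to form QA pairs as best it can. Some dirty data will remain, and how much depends on the chat habits of the people involved. If the total amount of data is large, a few incorrect QA pairs are acceptable.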
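To make the idea of rule-based pairing concrete, here is a minimal sketch. It is not the script mentioned above; the `Message` and `Turn` types, the `"me"`/`"other"` sender labels, and both time thresholds are assumptions chosen for illustration. The rules are: merge consecutive messages from the same sender that arrive close together, then treat each turn from the other person that is followed quickly by a turn from you as one QA pair.

```python
from dataclasses import dataclass


@dataclass
class Message:
    sender: str     # hypothetical labels: "me" or "other"
    timestamp: int  # seconds since epoch
    text: str


@dataclass
class Turn:
    sender: str
    start: int      # timestamp of the first message in the turn
    end: int        # timestamp of the last message in the turn
    text: str


def merge_consecutive(messages, max_gap=60):
    """Merge back-to-back messages from one sender sent within max_gap seconds."""
    turns = []
    for msg in messages:
        last = turns[-1] if turns else None
        if last and last.sender == msg.sender and msg.timestamp - last.end <= max_gap:
            last.text += " " + msg.text
            last.end = msg.timestamp
        else:
            turns.append(Turn(msg.sender, msg.timestamp, msg.timestamp, msg.text))
    return turns


def build_qa_pairs(messages, me="me", max_reply_delay=3600):
    """Pair each turn from the other person with the reply that follows it."""
    turns = merge_consecutive(messages)
    pairs = []
    for prev, cur in zip(turns, turns[1:]):
        is_reply = prev.sender != me and cur.sender == me
        # A late reply is more likely a new topic than an answer, so skip it.
        if is_reply and cur.start - prev.end <= max_reply_delay:
            pairs.append((prev.text, cur.text))
    return pairs
```

Rules like these still produce wrong pairs when a reply addresses an earlier message or when topics interleave, which is the kind of dirty data described above.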

@standyyyy
Author

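OK, thanks for the reply. Last week I looked into methods or rules for finding, among the messages that follow, the one most likely to be the answer. The approach usually recommended is also similarity matching: score how related two sentences are, and use that to decide whether one is the reply to the other. I also came across a rather interesting idea: when one person sends several messages in a row, use BERT to judge how related three sentences are and decide whether to merge them into one, which cuts down on the number of separate utterances. Overall, though, I still haven't found a really good way to process conversation data.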
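The similarity-matching idea can be sketched as follows. This is an illustration under stated assumptions, not a method from this repository: the function names and the `threshold` value are hypothetical, and character-bigram cosine similarity stands in for a real sentence-embedding model such as BERT so that the example runs with the standard library alone. Lexical overlap is a weak proxy, since a valid answer often shares few characters with its question.

```python
import math
from collections import Counter


def char_ngrams(text, n=2):
    """Count character n-grams, ignoring case and spaces."""
    text = text.lower().replace(" ", "")
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def pick_reply(question, candidates, threshold=0.1):
    """Return the candidate most similar to the question, or None if all score low."""
    q = char_ngrams(question)
    scored = [(cosine(q, char_ngrams(c)), i) for i, c in enumerate(candidates)]
    if not scored:
        return None
    # On ties, prefer the earlier message, which sits closer to the question.
    best_score, best_idx = max(scored, key=lambda s: (s[0], -s[1]))
    return candidates[best_idx] if best_score >= threshold else None
```

The same score could in principle drive the merge decision as well: consecutive messages from one sender are merged when their pairwise similarity exceeds a threshold. Swapping `cosine` over n-grams for cosine over embedding vectors keeps the rest of the logic unchanged.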
