Meaning of the `"xxx/z": [yyyy, yyyyy, yyyyy, yyyyyyy, ...]` entries in the user2history file of the TGREDIAL dataset #77

Closed
LQlq123 opened this issue Jul 20, 2023 · 4 comments


LQlq123 commented Jul 20, 2023

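The user2history file in the TGREDIAL dataset contains entries of the form `"xxx/z": [yyyy, yyyyy, yyyyy, yyyyyyy, ...]`. What do xxx, z, and yyyy mean? If xxx is the user ID, why does it not match the user_id in train_data? And what does appending z signify? This is not clear to me. Thanks for your help!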

Member

wxl1999 commented Jul 20, 2023

Hello, xxx should be the conv_id, z is the local_id (corresponding to a particular turn), and yyyy should be the global_id (corresponding to a particular item).
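As a minimal sketch of the key format described above, the snippet below splits each `"xxx/z"` key into a (conv_id, local_id) pair. The helper names `parse_history_key` and `index_history` and the toy data are hypothetical, not part of the dataset or the toolkit; the real file would be loaded with `json.load`, and the IDs are kept as strings because their exact type is not confirmed here.

```python
from typing import Dict, List, Tuple


def parse_history_key(key: str) -> Tuple[str, str]:
    """Split a key of the form 'conv_id/local_id' into its two parts."""
    # rsplit on the last '/' so the turn id is always the final component.
    conv_id, local_id = key.rsplit("/", 1)
    return conv_id, local_id


def index_history(user2history: Dict[str, List[int]]) -> Dict[Tuple[str, str], List[int]]:
    """Re-key the mapping by (conv_id, local_id); values are lists of item global_ids."""
    return {parse_history_key(key): list(items) for key, items in user2history.items()}


# Toy stand-in for the loaded JSON (made-up values, for illustration only).
toy_user2history = {
    "0/1": [101, 102],
    "0/3": [101, 102, 103],
    "10/2": [101],
}

index = index_history(toy_user2history)
for (conv_id, local_id), items in index.items():
    print(f"conv_id={conv_id} local_id={local_id} items={items}")
```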

Author

LQlq123 commented Jul 20, 2023

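Thank you for the reply! I still have one question, though. If xxx is the conv_id, then after cross-checking against train_data.json I found that conv_id 0, 1, and 10 all correspond to user_id 0, yet the lists for this same user differ. Do I need to take their union to obtain the user's final interaction record?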

Member

wxl1999 commented Jul 20, 2023

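The lists for the same user most likely differ because of time: each list corresponds to the user's behavior in the original interaction data before the current conversation took place, and different conversations happen at different times. You may want to confirm this with the dataset authors.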
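To inspect these per-conversation snapshots for one user, a sketch along the following lines may help. It assumes a `conv_to_user` mapping has already been built from the conv_id and user_id fields of train_data.json; that mapping, the helper names, and the toy values are all hypothetical. The `ordered_union` helper shows only one possible way to merge the snapshots, and whether a union is the right "final" history is not confirmed in this thread.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_histories_by_user(
    user2history: Dict[str, List[int]],
    conv_to_user: Dict[str, str],
) -> Dict[str, Dict[str, List[int]]]:
    """Collect every 'conv_id/local_id' snapshot under the user who owns that conversation."""
    grouped: Dict[str, Dict[str, List[int]]] = defaultdict(dict)
    for key, items in user2history.items():
        conv_id, _local_id = key.rsplit("/", 1)
        grouped[conv_to_user[conv_id]][key] = list(items)
    return dict(grouped)


def ordered_union(histories: Iterable[List[int]]) -> List[int]:
    """Merge snapshots, keeping the first occurrence of each item (one possible choice)."""
    seen = set()
    merged = []
    for items in histories:
        for item in items:
            if item not in seen:
                seen.add(item)
                merged.append(item)
    return merged


# Toy data (made up): three conversations that belong to the same user "0".
toy_user2history = {"0/1": [101, 102], "1/1": [101, 102, 103], "10/2": [101]}
toy_conv_to_user = {"0": "0", "1": "0", "10": "0"}

grouped = group_histories_by_user(toy_user2history, toy_conv_to_user)
merged = ordered_union(grouped["0"].values())
print(grouped["0"])
print(merged)
```

Keeping the snapshots separate preserves the "history before this conversation" semantics described above, which matters if a model should not see items the user interacted with only after a given conversation.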

Author

LQlq123 commented Jul 20, 2023 via email

LQlq123 closed this as completed Oct 28, 2023