The way negative samples are generated for the test set appears to have a bug #6
Comments
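Thank you for pointing this out. I will set aside some time later to revise how the test-set negative samples are drawn and to re-validate the experimental results. Thanks.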
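The following change was made in the generateEvaNegative function so that it avoids all positive samples, including those in train, val, and test. The code is as follows: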
Next is the training process:
Next are the results of the original code after 20 iterations:
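Comparing the two training runs, we can verify that avoiding all positive samples when drawing the test data does indeed improve the model's measured performance. Thank you for the guidance.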
Thank you for your reply!
Hello, author!
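In the generateEvaNegative function in DataModule.py, the negative samples randomly generated for a given user in the test set should avoid that user's positive samples in both the training set and the test set. However, hash_data in generateEvaNegative can only indicate whether the current sample is a positive sample in the test set. As a result, positive samples from the training set may be sampled as negatives in the test set, and the model's actual performance is underestimated. Is this perhaps a bug?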
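To make the concern concrete, here is a minimal sketch of evaluation-time negative sampling that excludes a user's positives from every split. It is not the code in DataModule.py: the function name `sample_eval_negatives`, its parameters, and the dictionary layout of the positives are all hypothetical, and item ids are assumed to be integers in `[0, num_items)`.

```python
import random


def sample_eval_negatives(user, num_items, positives_by_split, num_negatives, rng=None):
    """Sample item ids the user has not interacted with in any split.

    Hypothetical helper for illustration only; it assumes
    positives_by_split maps split name -> {user id -> set of item ids}.
    """
    rng = rng or random.Random()

    # Union of the user's positives across all splits (train, val, test).
    seen = set()
    for split in positives_by_split.values():
        seen |= split.get(user, set())

    # Guard against an endless loop when too few unseen items remain.
    if num_items - len(seen) < num_negatives:
        raise ValueError("not enough unseen items to sample from")

    # Rejection sampling: redraw whenever the item is a known positive.
    negatives = set()
    while len(negatives) < num_negatives:
        item = rng.randrange(num_items)
        if item not in seen:
            negatives.add(item)
    return sorted(negatives)


# Toy data: user 0 has positives in every split.
positives = {
    "train": {0: {1, 2, 3}},
    "val": {0: {4}},
    "test": {0: {5}},
}
negs = sample_eval_negatives(0, 10, positives, 4, random.Random(0))
```

If the lookup covered only the test split, items 1 to 4 could be drawn as negatives even though the model was trained to rank them highly, which would push the true test item down the ranking and lower the reported metric.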