Question about RE document-level extraction prediction results #530

Closed
Danmo121 opened this issue Jun 13, 2024 · 4 comments
Labels
question Further information is requested

Comments

@Danmo121

Danmo121 commented Jun 13, 2024

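  1. I trained on the provided English data. When I run prediction on the test data, each result contains only:
    "title": ,
    "h_idx": ,
    "t_idx": ,
    "r": "",
    There is no evidence or sent id. Shouldn't the output normally include evidence? Please advise, thanks!
    [Screenshot: Snipaste_2024-06-14_00-09-17]
  2. I have one more question. The train.yaml file contains these two parameters:
    num_class: 97
    num_labels: 4
    num_class is the number of relation types, but what does num_labels count? I don't quite understand it. Thanks!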
@Danmo121 Danmo121 added the question Further information is requested label Jun 13, 2024
@zxlzr
Contributor

zxlzr commented Jun 14, 2024

Hello, which method in DeepKE are you using? Please share some details so that we can help you.

@Danmo121
Author

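> Hello, which method in DeepKE are you using? Please share some details so that we can help you.

Hello, I am using the document-level relation extraction method under DeepKE\example\re\document, with the DocRED data you provide. I trained and ran prediction with the existing code as-is.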

@njcx-ai
Collaborator

njcx-ai commented Jun 14, 2024

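Hello, thank you very much for your interest in our work.

  1. The test prediction output is the relation between each entity pair, so sent_id should not be involved here.
  2. num_class is the number of relation types; num_labels should correspond to the balanced softmax classes in Equations 10-11 of the paper. You can check the source code for the details.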

@Danmo121
Author

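Thanks for the answer. I understand num_labels now. After looking again: h_idx is the head entity, t_idx is the tail entity, and r is the relation. h_idx and t_idx are indices into vertexSet; I had previously mistaken them for pos[]. Thanks again.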
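
For readers who hit the same confusion, the index convention can be illustrated with a minimal sketch. It assumes the usual DocRED layout, in which each document carries a vertexSet list of entities and each entity is a list of mentions with name, pos, and sent_id fields. The example document, the prediction record, and the helper resolve_prediction are hypothetical and not part of DeepKE.

```python
# Hypothetical, hand-made DocRED-style document (not real dataset content).
doc = {
    "title": "Example Doc",
    "sents": [["Alice", "was", "born", "in", "Paris", "."]],
    "vertexSet": [
        # Entity 0: one mention; pos is a token span inside sentence sent_id.
        [{"name": "Alice", "pos": [0, 1], "sent_id": 0, "type": "PER"}],
        # Entity 1: one mention.
        [{"name": "Paris", "pos": [4, 5], "sent_id": 0, "type": "LOC"}],
    ],
}

# Hypothetical prediction record with the four fields shown in the issue.
pred = {"title": "Example Doc", "h_idx": 0, "t_idx": 1, "r": "P19"}


def resolve_prediction(pred, docs_by_title):
    """Map h_idx / t_idx back to entities through vertexSet (not through pos)."""
    document = docs_by_title[pred["title"]]
    head_mentions = document["vertexSet"][pred["h_idx"]]
    tail_mentions = document["vertexSet"][pred["t_idx"]]
    return {
        # Use the first mention's surface form as the entity name.
        "head": head_mentions[0]["name"],
        "tail": tail_mentions[0]["name"],
        "relation": pred["r"],
        # Sentences in which each entity is mentioned, from the input data.
        "head_sent_ids": sorted({m["sent_id"] for m in head_mentions}),
        "tail_sent_ids": sorted({m["sent_id"] for m in tail_mentions}),
    }


resolved = resolve_prediction(pred, {doc["title"]: doc})
print(resolved)
```

The sent_id values recovered here come from the input mentions, not from the model. They only show where each entity appears and are not predicted evidence sentences.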
