
About using the dataset NYT-FB #1

Closed
vhientran opened this issue Jan 6, 2021 · 2 comments


@vhientran

Hi @ttthy ,

Sorry to bother you. I have a question about how you used the NYT-FB dataset in your experiments. The TACRED test set provides a relation type for every sentence, but I cannot find one for each sentence in NYT-FB. I obtained the NYT-FB dataset from Diego Marcheggiani, but most sentences have no relation type, as in this sample: https://github.com/diegma/relation-autoencoder/blob/master/data-sample.txt. How do you evaluate your system on NYT-FB without labels?

Thanks for your help!

@ttthy
Owner

ttthy commented Jan 9, 2021

Hi @angelotran05 ,

Please find the statistics for the positive (labelled) sentences in Table 3, Appendix A, of our paper.
There are 262 relation types in NYT-FB, and "...2.1% of the sentences in NYT-FB were aligned against Freebase’s triplets" (page 3, Section 3, Experiments and results, Datasets).
All data are used during training, but only the labelled sentences are used for evaluation (7,793 sentences in the dev set and 33,808 in the test set).
Let me know if you have other questions.

Best,

https://www.aclweb.org/anthology/2020.acl-main.669.pdf
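
For readers following along, the evaluation setup described in the reply (train on every sentence, score only the labelled ones) could be sketched roughly as below. This is a minimal illustration, not the authors' code: the `Example` record, its `relation` field, the `labelled_only` helper, and the toy sentences are all hypothetical, and the actual NYT-FB file layout may look quite different.

```python
from typing import List, Optional, TypedDict


class Example(TypedDict):
    """Hypothetical record layout; the real NYT-FB file format may differ."""
    sentence: str
    relation: Optional[str]  # None or "" when no Freebase triplet was aligned


def labelled_only(examples: List[Example]) -> List[Example]:
    """Keep only the sentences that carry a relation label."""
    return [ex for ex in examples if ex["relation"]]


# Toy data, invented purely for illustration.
test_split: List[Example] = [
    {"sentence": "A was born in B.", "relation": "rel_a"},
    {"sentence": "A met B yesterday.", "relation": None},
    {"sentence": "A visited B.", "relation": ""},
]

# Training would use every sentence; evaluation uses only the labelled subset.
train_set = test_split
eval_set = labelled_only(test_split)
print(len(eval_set), "of", len(test_split), "sentences kept for evaluation")
```

Filtering at evaluation time rather than at load time keeps the unlabelled sentences available for training, which matches the "all data are used during training" remark in the reply.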

@vhientran
Author

Hi @ttthy ,

I got it. Thank you very much for your help.

All the best,
