Where to get Support Documents for Cross-Domain Test? #1
Hi David,

Thank you for your interest and your questions!
The support documents for the cross-domain test set are sampled from the cross-domain test set itself. Sampling support documents from the in-domain set would not be possible (at least for the task format we chose) because the support documents need to contain instances of the relations which are to be extracted from the query documents. Since there is no overlap between relation types in the in-domain sets and the cross-domain set, a task in which the support and query documents are sampled from separate sets is not possible for the two datasets.
The answer to (1) is no, but if it were yes, it would indeed sound to me more like a zero-shot setting than a few-shot setting. However, I am not sure the support documents would be useful or usable input for such a task, as they would contain information about neither the query domain nor the relation types which are to be extracted.
The support documents for the in-domain test set are sampled from the in-domain test set. In general, support and query documents will always be sampled from the same set, because the support and query documents need to be annotated with the same relation types. Note that during testing, the model does not perform any (persistent) learning on the test episodes.

I hope that I was able to answer your questions!

Nicholas
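The sampling constraint described above can be sketched in a few lines of Python. This is a hypothetical illustration, not the benchmark's actual code: document structure, field names (`relations`), and the function name `sample_episode` are assumptions. The key invariants are that support and query documents come from the same split, and that every support document contains at least one of the relation types to be extracted from the query document.

```python
import random

def sample_episode(split, n_support=3, seed=None):
    """Hypothetical sketch of few-shot episode sampling.

    Support and query documents are drawn from the SAME split, so they
    are annotated with the same relation-type inventory. Each support
    document must share at least one relation type with the query.
    """
    rng = random.Random(seed)
    query = rng.choice(split)
    target_relations = query["relations"]
    # Support candidates: other documents in the same split that contain
    # an instance of at least one relation type found in the query.
    candidates = [d for d in split
                  if d["id"] != query["id"]
                  and d["relations"] & target_relations]
    support = rng.sample(candidates, min(n_support, len(candidates)))
    return support, query

# Toy split: each document annotated with a set of relation types.
split = [
    {"id": 0, "relations": {"founded_by", "located_in"}},
    {"id": 1, "relations": {"located_in"}},
    {"id": 2, "relations": {"founded_by"}},
    {"id": 3, "relations": {"member_of"}},
]
support, query = sample_episode(split, n_support=2, seed=0)
```

Sampling the query first and then filtering support candidates by shared relation types is one simple way to satisfy the constraint; it also makes clear why cross-set sampling fails here: with disjoint relation inventories between splits, the candidate list would always be empty.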
These are very clear answers. Thank you for the elaboration, and congrats on the acceptance to NAACL.
Hello Nicholas & Michael,
Your paper presents a thoughtful benchmark for doc-level RE, and I look forward to trying it out. It would be great if you could clarify a few simple questions:
Are the support documents for the cross-domain test set (consisting solely of SciERC samples) sampled from your Train+Dev set (62+16 relation types from DocRED)?
If the answer to question (1) is yes, is this not a zero-shot setting, if a "shot" is defined in terms of relation types encountered during training?
Just to be certain, where are the support docs for the in-domain test set (16 relation types from DocRED) sampled from: the Train+Dev set or the in-domain test set?
Figure illustrating my understanding of the train-test setup in this work: