
Where to get Support Documents for Cross-Domain Test? #1

Closed
davidleejy opened this issue Jun 30, 2022 · 2 comments


davidleejy commented Jun 30, 2022

Hello Nicholas & Michael,

Your paper presents a thoughtful benchmark for document-level RE, and I look forward to trying it out. It would be great if you could clarify a few simple questions:

  1. Are the support documents for the cross-domain test set (comprising solely SciERC samples) sampled from your Train+Dev set (62+16 relation types from DocRED)?

  2. If the answer to question (1) is yes, is this not a zero-shot setting, given that "shot" is defined in terms of relation types encountered during training?

  3. Just to be certain, where are the support documents for the in-domain test set (16 relation types from DocRED) sampled from: the Train+Dev set or the in-domain test set?

Figure illustrating my understanding of the train-test setup in this work:

[Screenshot: diagram of my understanding of the train-test setup]

@davidleejy davidleejy changed the title Support Documents for Cross-Domain Test? and other questions Where to get Support Documents for Cross-Domain Test? Jun 30, 2022
nicpopovic (Owner) commented

Hi David,

Thank you for your interest and your questions!

  1. Are the support documents for the cross-domain test set (comprising solely SciERC samples) sampled from your Train+Dev set (62+16 relation types from DocRED)?

The support documents for the cross-domain test set are sampled from the cross-domain test set itself. Sampling support documents from the in-domain set would not be possible (at least for the task format we chose) because the support documents need to contain instances of the relations which are to be extracted from the query documents. Since there is no overlap¹ between relation types in the in-domain sets and the cross-domain set, a task in which the support and query documents are sampled from separate sets is not possible for the two datasets.

  2. If the answer to question (1) is yes, is this not a zero-shot setting, given that "shot" is defined in terms of relation types encountered during training?

The answer to (1) is no, but if it were yes, it would indeed sound to me more like a zero-shot setting than a few-shot one. However, I am not sure the support documents would be useful or usable input for such a task, as they would contain information about neither the query domain nor the relation types to be extracted.

  3. Just to be certain, where are the support documents for the in-domain test set (16 relation types from DocRED) sampled from: the Train+Dev set or the in-domain test set?

The support documents for the in-domain test set are sampled from the in-domain test set. In general, support and query documents will always be sampled from the same set. This is because the support and query documents need to be annotated with the same relation types. Note that during testing, the model does not perform any (persistent) learning on the test episodes.
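The same-set constraint described above can be sketched in a few lines. This is a minimal toy illustration, not the benchmark's actual sampling code; the corpus, function, and parameter names here are all hypothetical. The key invariant is that every relation type to be extracted from a query document must also appear in at least one support document, which is guaranteed by drawing both from the same annotated set:

```python
import random

# Hypothetical toy corpus: each document lists the relation types it contains.
corpus = [
    {"id": "doc0", "relations": {"founded_by", "located_in"}},
    {"id": "doc1", "relations": {"located_in"}},
    {"id": "doc2", "relations": {"founded_by"}},
    {"id": "doc3", "relations": {"located_in", "founded_by"}},
    {"id": "doc4", "relations": {"founded_by"}},
]

def sample_episode(docs, n_support=1, seed=0):
    """Sample support and query documents from the SAME set, so that
    every relation type in a query document is exemplified by at
    least one support document."""
    rng = random.Random(seed)
    support = rng.sample(docs, n_support)
    # Relation types the support documents exemplify.
    support_relations = set().union(*(d["relations"] for d in support))
    # Queries: remaining documents whose relation types are all covered
    # by the support set (an uncovered type would make the episode
    # zero-shot for that type).
    queries = [
        d for d in docs
        if d not in support and d["relations"] <= support_relations
    ]
    return support, support_relations, queries

support, rels, queries = sample_episode(corpus, n_support=2, seed=1)
print("support:", [d["id"] for d in support])
print("relation types:", sorted(rels))
print("queries:", [d["id"] for d in queries])
```

Sampling support documents from a different set (e.g. Train+Dev for a SciERC query set) would leave `support_relations` disjoint from the query documents' relation types, so no valid queries could be formed, which is why cross-set sampling is ruled out above.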

I hope that I was able to answer your questions!

Nicholas

Footnotes

  1. As mentioned in the paper, there are technically two relation types contained in both DocRED and SciERC, but we remove these from the in-domain set.

davidleejy (Author) commented

These are very clear answers. Thank you for the elaboration, and congrats on the acceptance to NAACL.
