Duplicate tuples and tuples with none tail node #11

phosseini · 2021-05-25T19:35:02Z

There are duplicate tuples in all three splits of the data: ~68,626 in train, ~7,410 in dev, and ~8,473 in test (please correct me if I'm wrong). Why is that? And should we just ignore the duplicates when using the data? One example:

['PersonX answers the question', 'xAttr', 'knowledgeable']
['PersonX answers the question', 'xAttr', 'knowledgeable']

Also, there are tuples whose tail node value is none (these none-valued tuples are also among the duplicates). For example:

['PersonX accidentally threw ___', 'xIntent', 'none']
['PersonX accidentally threw ___', 'xIntent', 'none']

How should these none values be interpreted? Should we just ignore them? Or does none mean that the head has no relation of the given type? For instance, in the case of PersonX accidentally threw ___, does it mean PersonX has no xIntent? If so, how should we treat the following cases:

['PersonX accidently left', 'oReact', 'none']
['PersonX accidently left', 'oReact', 'sad']

Here the same head and relation appear once with a none tail node and once with a non-empty tail node.
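The counts and the mixed none/non-none cases described above could be checked with something like the sketch below. It assumes each split has already been loaded as a list of (head, relation, tail) tuples; the function names `duplicate_count` and `mixed_none_pairs` and the small `rows` sample are illustrative and not part of the dataset's own tooling.

```python
from collections import Counter, defaultdict

def duplicate_count(rows):
    """Count rows that repeat an earlier, identical (head, relation, tail) tuple."""
    counts = Counter(rows)
    return sum(n - 1 for n in counts.values())

def mixed_none_pairs(rows):
    """Return (head, relation) pairs that have both a 'none' tail and another tail."""
    tails = defaultdict(set)
    for head, relation, tail in rows:
        tails[(head, relation)].add(tail)
    return sorted(pair for pair, seen in tails.items()
                  if "none" in seen and len(seen) > 1)

# Sample rows taken from the examples in this post.
rows = [
    ("PersonX answers the question", "xAttr", "knowledgeable"),
    ("PersonX answers the question", "xAttr", "knowledgeable"),
    ("PersonX accidentally threw ___", "xIntent", "none"),
    ("PersonX accidentally threw ___", "xIntent", "none"),
    ("PersonX accidently left", "oReact", "none"),
    ("PersonX accidently left", "oReact", "sad"),
]

print(duplicate_count(rows))
print(mixed_none_pairs(rows))
```

Running the same two functions on each full split would give the per-split duplicate totals and the list of head/relation pairs where none coexists with a real tail.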

Thanks.

phosseini · 2021-06-21T17:53:10Z

@keisks Any thoughts on this?

puraminy · 2021-06-21T18:15:58Z

I am not a member of this project, but I am also working on Atomic. I hadn't noticed the duplicates. In my opinion, duplicate entries are redundant and may even have negative effects.

As for none, it means that there was no intention, or no involuntary effect in the case of xEffect or oEffect. In your example, since the person threw it accidentally, none means they had no intention.

Also, each head and relation pair can have multiple tails, which may differ in confidence or plausibility.

I am a Ph.D. student at the University of Tehran. I would like to get to know you and learn what you are working on with Atomic. We may be able to share our thoughts; my email is: pouramini -------- gmail

csbhagav · 2021-09-15T20:24:19Z

Sorry for the delay in addressing the issues. We are looking at this and will respond soon.

rlebras · 2021-09-15T20:57:01Z

Thank you @phosseini and @puraminy for your interest in our work!

As @puraminy mentioned, the tail node none indicates that the relation does not apply to the given head (e.g., if an event does not affect people other than PersonX, the tails for the oEffect, oReact, and oWant would be annotated as none - see Sap et al., 2019).

Duplicate tuples indicate that multiple annotators provided the same tail for a given head/relation pair. While these tuples are redundant, keeping them in the data accurately reflects the data collection process, and their counts can be used to estimate the degree of confidence in these tuples.
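As a rough illustration of using duplicates as a confidence signal, the sketch below collapses repeated tuples into a count plus the share of annotations for that head/relation pair. Treating that share as a confidence score is an assumption made here for illustration, not a method confirmed in this thread, and `tail_weights` is a hypothetical helper name.

```python
from collections import Counter

def tail_weights(rows):
    """Map each unique tuple to (count, share of its head/relation annotations)."""
    tuple_counts = Counter(rows)
    pair_counts = Counter((head, relation) for head, relation, _ in rows)
    return {
        (head, relation, tail): (n, n / pair_counts[(head, relation)])
        for (head, relation, tail), n in tuple_counts.items()
    }

# Sample rows taken from the examples earlier in this thread.
rows = [
    ("PersonX answers the question", "xAttr", "knowledgeable"),
    ("PersonX answers the question", "xAttr", "knowledgeable"),
    ("PersonX accidently left", "oReact", "none"),
    ("PersonX accidently left", "oReact", "sad"),
]

for key, (count, share) in tail_weights(rows).items():
    print(key, count, share)
```

The result has one entry per unique tuple, so it can double as a deduplicated view of the split while still recording how often each tail was given.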

cingtiye · 2022-11-22T03:31:17Z

Hi @rlebras

As @puraminy mentioned, there are tuples with different tail node values, while their head and relation are the same. For example,

["personX 'd better go", 'xAttr', 'avoidant']
["PersonX 'd better go", 'xAttr', 'weak']
["PersonX 'd better go", 'xAttr', 'hurried']
["PersonX 'd better go", 'xAttr', 'late']
["PersonX 'd better go", 'xAttr', 'Tardy']
["PersonX 'd better go", 'xAttr', 'busy']

Is it really necessary for language models, such as GPT-XL or BART, to generate multiple tail node values (avoidant [EOS], weak [EOS], hurried [EOS], etc.) for the same input (PersonX 'd better go xAttr [GEN])?

In my opinion, an LM should not be asked to generate different outputs for the same input during training (please correct me if I'm wrong).
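To make the one-to-many structure concrete, here is a small sketch that serializes tuples in the input/output format quoted above and then groups the targets by input. The helper names `to_examples` and `group_targets` are made up for illustration, and whether to train on every pair or to keep only one target per input is left open.

```python
from collections import defaultdict

def to_examples(rows):
    """Build one (source, target) string pair per tuple, in the format quoted above."""
    return [(f"{head} {relation} [GEN]", f"{tail} [EOS]")
            for head, relation, tail in rows]

def group_targets(examples):
    """Collect every target string that shares the same source string."""
    grouped = defaultdict(list)
    for source, target in examples:
        grouped[source].append(target)
    return dict(grouped)

# Three of the tuples listed above.
rows = [
    ("PersonX 'd better go", "xAttr", "weak"),
    ("PersonX 'd better go", "xAttr", "hurried"),
    ("PersonX 'd better go", "xAttr", "late"),
]

grouped = group_targets(to_examples(rows))
for source, targets in grouped.items():
    print(source, "->", targets)
```

Grouping like this shows how many distinct targets each input has, which is the quantity the question is about.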

Thanks.

csbhagav assigned rlebras Sep 15, 2021

rlebras closed this as completed Sep 15, 2021
