Original Dataset #7

Open
yin-hong opened this issue Sep 28, 2019 · 10 comments

Comments

@yin-hong

Hello! Could you share the original NYT and WebNLG datasets, including the train, dev, and test splits? Thanks a lot!

@tsujuifu
Owner

Hi Michael,
I got the dataset from here: https://github.com/xiangrongzeng/copy_re

@yin-hong
Author

Hi Michael,
I got the dataset from here: https://github.com/xiangrongzeng/copy_re

Thanks for your reply! I have downloaded this dataset. However, I found that entity types are not annotated in the WebNLG dataset. How do you handle this?

@tsujuifu
Owner

tsujuifu commented Sep 29, 2019 via email

@yin-hong
Author

Hi, Michael. The original WebNLG dataset has no entity type tags (NYT should have them). For the joint entity and relation extraction task, we only care about the relation type and the positions of the two entities, so we don't need entity type tags. Sincerely, Tsu-Jui
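To make the point above concrete, here is a minimal sketch of a sample annotated with only a relation type and two entity positions. The class name, field names, example sentence, and relation label are hypothetical illustrations and are not taken from the dataset files or from this repository.

```python
from typing import List, NamedTuple, Tuple


class Triple(NamedTuple):
    """One relation instance: two entity spans plus a relation label.

    Spans are (start, end) token indices with an inclusive end.
    No entity type is stored anywhere. (Hypothetical layout.)
    """
    head: Tuple[int, int]
    tail: Tuple[int, int]
    relation: str


def span_text(tokens: List[str], span: Tuple[int, int]) -> str:
    """Return the surface text covered by an inclusive token span."""
    start, end = span
    return " ".join(tokens[start:end + 1])


# Hypothetical example sentence and annotation.
tokens = "Alan Bean was born in Wheeler , Texas".split()
triples = [Triple(head=(0, 1), tail=(5, 5), relation="birthPlace")]

for t in triples:
    print(span_text(tokens, t.head), t.relation, span_text(tokens, t.tail))
```

Under this layout, the positions and the relation label are enough to recover each triple, which is why no entity type field is needed.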

So the loss function contains only the relation loss and no entity loss?

@tsujuifu
Owner

tsujuifu commented Sep 29, 2019 via email

@yin-hong
Author

Nope, it contains both entity and relation loss. For entities, though, I only care which of (B, I, E, S, O) a word belongs to:
B: beginning word of an entity
I: inner word of an entity
E: end word of an entity
S: the word is a single-word entity
O: the word does not belong to any entity
Hence, the entity loss comes from a 5-class classification.
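As a rough illustration of the tagging scheme described above, the sketch below converts entity spans into B/I/E/S/O tags and then into class ids for a 5-class classifier. The function name, the span format (inclusive token indices), the tag-to-id order, and the example sentence are assumptions made for this example; this is not the repository's preprocessing code.

```python
from typing import List, Tuple

# The five classes described above; the id order is an arbitrary choice here.
TAGS = ["B", "I", "E", "S", "O"]


def bieso_tags(num_tokens: int, spans: List[Tuple[int, int]]) -> List[str]:
    """Tag each token given entity spans.

    spans are (start, end) token indices with an inclusive end,
    assumed to be non-overlapping.
    """
    tags = ["O"] * num_tokens
    for start, end in spans:
        if start == end:
            tags[start] = "S"          # single-word entity
        else:
            tags[start] = "B"          # first word of the entity
            tags[end] = "E"            # last word of the entity
            for i in range(start + 1, end):
                tags[i] = "I"          # words strictly inside the entity
    return tags


# Hypothetical example: "Alan Bean" spans tokens 0..1, "Wheeler" is token 5.
tokens = "Alan Bean was born in Wheeler , Texas".split()
tags = bieso_tags(len(tokens), [(0, 1), (5, 5)])
label_ids = [TAGS.index(t) for t in tags]

print(list(zip(tokens, tags)))
print(label_ids)
```

The resulting `label_ids` would be the per-token targets of a 5-way classification loss, with no entity type involved.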

Thanks for your reply! I think I fully understand your idea now.

@zhihuatao

Hello, could you please tell me how to generate the pre_tr dataset?

@Wangyandong-master

Hello, could you please tell me how to generate the pre_tr dataset?

Hello, have you gotten the input files? Thanks a lot.

@weizhepei

@tsujuifu Thanks for the clarification. I'm trying to reproduce your excellent work, but I'm having trouble preparing the dataset. I checked the preprocessed dataset released by CopyR [Zeng et al., 2018] and found that the annotated entities are all single-word entities. In this case, should all the entity tags be 'B' when I prepare the training data for the Graph_rel model? Is there any plan to release the preprocessed dataset?

@131250208

Hi, CopyR uses the version that annotates only the last word of each entity. Do you follow the same preprocessing setting, or do you preprocess the original dataset released by CopyR and annotate the whole span? Thanks for your reply.
