Unable to reproduce the results of SDP #1

Closed
wangxinyu0922 opened this issue Sep 9, 2020 · 2 comments

Comments

wangxinyu0922 commented Sep 9, 2020

Hi, I tried to reproduce the SDP (+BERT) results from the paper with my own code, but I cannot match them on the PAS test sets or the PSD ID test set. In my experiments I use GloVe embeddings + lemmas + POS tags + BERT base embeddings with the biaffine parser, with equal weights on the arc and label losses, and I got:

| PAS ID | PAS OOD | PSD ID |
|--------|---------|--------|
| 94.80  | 92.78   | 80.86  |

These results are significantly lower than the ones you reported. I then tried the interpolation (0.025) used in Dozat's earlier work:

| PAS ID | PAS OOD | PSD ID |
|--------|---------|--------|
| 95.29  | 93.65   | 83.02  |
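For readers unfamiliar with the setting, here is a minimal sketch of what interpolating the two losses with a constant of 0.025 could look like. It assumes the convention in which the constant weights the label loss and its complement weights the arc loss; the function name and the scalar inputs are hypothetical and are not taken from either codebase.

```python
def interpolated_loss(arc_loss, label_loss, lam=0.025):
    """Combine arc and label losses with an interpolation constant.

    Assumption: lam weights the label loss and (1 - lam) weights the
    arc loss. Equal weighting corresponds to lam = 0.5.
    """
    return lam * label_loss + (1.0 - lam) * arc_loss


# Hypothetical loss values, for illustration only.
equal_weights = interpolated_loss(1.0, 2.0, lam=0.5)
interpolated = interpolated_loss(1.0, 2.0, lam=0.025)
print(equal_weights, interpolated)
```

With a small constant the arc loss dominates training, which is the main difference from the equal-weight setup described above.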

These results are still significantly below the accuracy reported in your paper, and they are close to the results reported in the ACL 2020 paper. Do you have any suggestions for closing the 0.8 F1 gap on the PAS datasets and the 3.8 F1 gap on the PSD in-domain dataset?

Collaborator

hankcs commented Sep 9, 2020

Hi, I looked into the processing script and found the reason. We were using semstr==1.1.0 when we did the experiments in late 2018. This semstr version added orphan relations from all orphans to the root:

https://github.com/danielhers/semstr/blob/da1d48fe973e7d675d3644e4a8fd0e1ac2ba68f1/semstr/conversion/dep.py#L554-L555

With the latest semstr, this behavior has been deprecated, which results in totally different corpora.

  • semstr==1.1.0.conllu
1	Consumers	consumer	NNS	NNS	_	2	aux_ARG1	2:aux_ARG1|3:verb_ARG1|5:verb_ARG1	_
2	may	may	MD	MD	_	0	root	0:root	_
3	want	want	VB	VB	_	2	orphan	2:orphan	_
4	to	to	TO	TO	_	2	orphan	2:orphan	_
5	move	move	VB	VB	_	3	verb_ARG2	3:verb_ARG2|4:comp_ARG1|10:adj_ARG1	_
6	their	their	PRP$	PRP$	_	2	orphan	2:orphan	_
7	telephones	telephone	NNS	NNS	_	5	verb_ARG2	5:verb_ARG2|6:det_ARG1	_
8	a	a	DT	DT	_	2	orphan	2:orphan	_
9	little	little	RB	RB	_	8	det_ARG1	8:det_ARG1	_
10	closer	close	RBR	RBR	_	9	noun_ARG1	9:noun_ARG1|11:prep_ARG1	_
11	to	to	TO	TO	_	2	orphan	2:orphan	_
12	the	the	DT	DT	_	2	orphan	2:orphan	_
13	TV	tv	NN	NN	_	2	orphan	2:orphan	_
14	set	set	NN	NN	_	11	prep_ARG2	11:prep_ARG2|12:det_ARG1|13:noun_ARG1	_
15	.	_	.	.	_	2	orphan	2:orphan	_
  • semstr==1.2.2.conllu
1	Consumers	consumer	_	NNS	_	3	nsubj	2:aux_ARG1|3:verb_ARG1|5:verb_ARG1	_
2	may	may	_	MD	_	3	aux	_	_
3	want	want	_	VB	_	0	root	2:aux_ARG2	_
4	to	to	_	TO	_	5	aux	_	_
5	move	move	_	VB	_	3	xcomp	3:verb_ARG2|4:comp_ARG1|10:adj_ARG1	_
6	their	their	_	PRP$	_	7	poss	_	_
7	telephones	telephone	_	NNS	_	5	dobj	5:verb_ARG2|6:det_ARG1	_
8	a	a	_	DT	_	9	det	_	_
9	little	little	_	RB	_	10	npadvmod	8:det_ARG1	_
10	closer	close	_	RBR	_	5	advmod	9:noun_ARG1|11:prep_ARG1	_
11	to	to	_	TO	_	10	prep	_	_
12	the	the	_	DT	_	14	det	_	_
13	TV	tv	_	NN	_	14	nn	_	_
14	set	set	_	NN	_	11	pobj	11:prep_ARG2|12:det_ARG1|13:noun_ARG1	_
15	.	_	_	.	_	3	punct	_	_

For this reason, our results might not be directly comparable to those in the papers you mentioned if their preprocessing is different. We should have noted the semstr version in the README too.
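A quick way to check which preprocessing a converted file went through is to count the edge labels in the DEPS column (the ninth tab-separated field) and see whether `orphan` edges are present. The following is a minimal sketch, not part of semstr or of our codebase; the function name `deps_label_counts` is hypothetical, and the two sample strings are rows copied from the listings above.

```python
from collections import Counter


def deps_label_counts(conllu_text):
    """Count edge labels found in the DEPS column of CoNLL-U text."""
    counts = Counter()
    for line in conllu_text.splitlines():
        line = line.strip()
        # Skip blank lines and sentence-level comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 10 or fields[8] == "_":
            continue
        # DEPS is a "|"-separated list of head:label pairs.
        for edge in fields[8].split("|"):
            _head, _, label = edge.partition(":")
            counts[label] += 1
    return counts


# Rows 3 and 5 as produced by semstr==1.1.0 and semstr==1.2.2.
old_rows = (
    "3\twant\twant\tVB\tVB\t_\t2\torphan\t2:orphan\t_\n"
    "5\tmove\tmove\tVB\tVB\t_\t3\tverb_ARG2\t3:verb_ARG2|4:comp_ARG1|10:adj_ARG1\t_"
)
new_rows = (
    "3\twant\twant\t_\tVB\t_\t0\troot\t2:aux_ARG2\t_\n"
    "5\tmove\tmove\t_\tVB\t_\t3\txcomp\t3:verb_ARG2|4:comp_ARG1|10:adj_ARG1\t_"
)

print(deps_label_counts(old_rows)["orphan"])
print(deps_label_counts(new_rows)["orphan"])
```

A non-zero `orphan` count suggests the file carries the extra orphan-to-root edges, so scores computed on it would include those edges.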

@wangxinyu0922
Author

That's great! This information is very helpful! Thank you for the help.
