Unable to achieve published result in DailyDialogue #4

Open
bsantraigi opened this issue May 14, 2019 · 3 comments

Comments

@bsantraigi

bsantraigi commented May 14, 2019

Hi,
I am trying to retrain your model as a baseline. So far, SWDA gives results in line with the paper, actually slightly better. But for the DailyDialog dataset, even after multiple runs, the best we got is the following (row 1 is on the validation set, row 2 on the test set):

A, E, G are the sim_bow scores.

| BLEU-R | BLEU-P | F1 | A | E | G |
| --- | --- | --- | --- | --- | --- |
| 0.305 | 0.170 | 0.218 | 0.940 | 0.609 | 0.857 |
| 0.298 | 0.163 | 0.211 | 0.940 | 0.605 | 0.857 |
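
As a quick sanity check on the numbers above, the F1 column can be recomputed from the two BLEU columns. This is a minimal sketch that assumes F1 is the harmonic mean of BLEU precision and BLEU recall; the function name `f1` is only illustrative and is not taken from the repository.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of the F1 column)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (BLEU-R, BLEU-P) pairs copied from the two rows reported above.
rows = [(0.305, 0.170), (0.298, 0.163)]

for recall, precision in rows:
    # Round to three decimals to match the precision used in the table.
    print(round(f1(precision, recall), 3))
```

Under that assumption the recomputed values agree with the reported F1 column (0.218 and 0.211), so the gap to the paper would come from the BLEU scores themselves rather than from how F1 is derived.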

Whereas the paper mentions the best results to be

[image: results reported in the paper]

Were any changes made to the code relative to the configuration described in the paper? I couldn't find any discrepancy. Can you point me to what might be the issue?

@guxd
Owner

guxd commented May 16, 2019

Thanks for pointing this out.
There seems to have been a large deviation from the original results recently.
Somebody else reported results better than those in the paper for the DailyDial dataset.
We are not sure whether this is due to a change in the environment beyond what is listed in "requirements.txt". We are looking into it and will let you know.

@bsantraigi
Author

OK, thanks. We were using an environment that follows requirements.txt exactly, though. Also, as you said, we noticed quite a bit of variance between runs, even when the seed is given as an argument.
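
For reference, run-to-run variance with a fixed seed often comes from a random number generator that was not seeded. The sketch below seeds the common ones in one place. It is a generic illustration, not the repository's own seeding code: `seed_everything` is a hypothetical helper, and NumPy and PyTorch are seeded only if they are installed.

```python
import random


def seed_everything(seed: int) -> None:
    """Seed the Python, NumPy and PyTorch RNGs (hypothetical helper)."""
    random.seed(seed)

    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed.

    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # Safe to call without a GPU.
        # Ask cuDNN for deterministic kernels and disable autotuning.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass  # PyTorch not installed; nothing to seed.


seed_everything(1234)
first = random.random()
seed_everything(1234)
second = random.random()
print(first == second)
```

Even with all of this in place, some GPU operations are not deterministic, so small differences between runs may remain.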

@Bortrex

Bortrex commented Aug 4, 2020

Also, #2341 showed that the NLTK library had some issues related to SmoothingFunction(), and it received an update to fix them.
Hence, it is no longer possible to achieve the same results.
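
Since the scores appear to depend on the library version, it may help to record the installed versions next to each set of results. The following is a small sketch using only the standard library (Python 3.8+). The package list is an assumption for illustration, and `installed_version` and `environment_report` are hypothetical helper names.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def environment_report(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version (None if not installed)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    # Assumed list of packages that could influence the reported metrics.
    for name, ver in environment_report(["nltk", "torch", "numpy"]).items():
        print(f"{name}: {ver or 'not installed'}")
```

Comparing this report between two machines, or against requirements.txt, makes it easier to tell whether a difference in BLEU comes from the model or from the evaluation library.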
