This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

Question About Performance #33

Closed
cocaer opened this issue Mar 8, 2019 · 1 comment

Comments

cocaer commented Mar 8, 2019

The paper reports a best en-fr BLEU of 33.4, but the README.md shows:
'epoch -> 7
valid_fr-en_mt_bleu -> 28.36
valid_en-fr_mt_bleu -> 30.50
test_fr-en_mt_bleu -> 34.02
test_en-fr_mt_bleu -> 36.62'
Does this difference come from the max_len parameter, which removes long sentences from the parallel test corpus?

glample (Contributor) commented Mar 8, 2019

No, the difference comes from the monolingual dataset, which is not the same. In the paper we use all of NewsCrawl; in the GitHub repository we use only NewsCrawl 2013 and 2014, I believe, which is more in-domain with newstest2014, on which we evaluate.

cocaer closed this as completed Mar 10, 2019